Showing posts with label meta-guide. Show all posts

04 September 2013

Dissecting the Summarization Process

This is in effect a mid-2013 progress update. As with many of my blog posts, this is as much a status update for me to get a better handle on where I'm at as it is to broadcast my progress.

mendicott.com is a blog reflecting on my journey with the overall project. This blog started seven years ago, in 2006, with my inquiry into The difference between a web page and a blog.... I had then returned from something like five years of world travel to find the digerati fawning over the blogosphere. At first, I failed to see the difference between a blog and a content management system (CMS) for stock standard web pages. Upon closer examination, I began to realize that the real difference lay in the XML syndication of blog feeds into the real-time web.

meta-guide.com is an attempt to blueprint, or tutorialize, the process. My original Meta Guide 1.0 development in ASP attempted to create automated, or robotic, web pages based on XML feeds from the real-time web. Meta Guide 2.0 development was based on similar feed bots, or Twitter bots, in an attempt to automate, or at least semi-automate, the rapid development of large knowledgebases from social media via knowledge silos. Basically, I use knowledge templates to automatically create the knowledge silos, or large knowledgebases. The knowledge templates are based on my own, proprietary "taxonomies", or more precisely faceted classifications, painstakingly developed over many years.
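As a rough illustration of how a knowledge template can sort feed items into knowledge silos, here is a minimal Python sketch. The facets, values, and keywords are invented for the example and are not the proprietary classifications described above; simple keyword matching stands in for whatever matching the real templates use.

```python
from collections import defaultdict

# Hypothetical facets, values, and keywords, invented for illustration only.
FACETS = {
    "topic": {
        "ecotourism": ["ecotourism", "eco-lodge"],
        "climate": ["climate", "warming"],
    },
    "country": {
        "au": ["australia", "byron bay"],
        "cr": ["costa rica"],
    },
}


def classify(text):
    """Return {facet: [matching values]} for one feed item."""
    lowered = text.lower()
    tags = {}
    for facet, values in FACETS.items():
        hits = [value for value, keywords in values.items()
                if any(keyword in lowered for keyword in keywords)]
        if hits:
            tags[facet] = hits
    return tags


def build_silos(items):
    """Group feed items into silos keyed by (facet, value)."""
    silos = defaultdict(list)
    for item in items:
        for facet, values in classify(item).items():
            for value in values:
                silos[(facet, value)].append(item)
    return dict(silos)


items = [
    "New eco-lodge opens near Byron Bay, Australia",
    "Climate change threatens Costa Rica cloud forests",
]
silos = build_silos(items)
```

Each item can land in several silos at once, one per facet value it matches, which is the main practical difference between a faceted classification and a single-tree taxonomy.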

gaiapassage.com aims to be an automated, or semi-automated, summarization of the knowledge aggregated from social media by feed bots via the proprietary faceted classifications, or knowledge templates. Right now, I'm doing a semi-automated summarization process with Gaia Passage, which consists of automated research in the form of knowledge silos being "massaged" in different ways, but ultimately manually writing the summarization in natural language. This is allowing me to analyze and attempt to dissect the processes involved in order to gradually prototype automation. Summarization technologies, and in particular summarization APIs, are still in their infancy. Examples of currently available summarization technologies include automatedinsights.com and narrativescience.com. The overall field is often referred to as automatic summarization.
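To make the idea of automatic summarization concrete, here is a minimal Python sketch of one textbook baseline, frequency-based extractive summarization. It is not the semi-automated process used for Gaia Passage, and it is not how the commercial services named above work; the stopword list and the sample text in the usage note are assumptions made for the example.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "for", "on", "it"}


def content_words(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)
```

For instance, calling summarize("Tourism depends on reefs. Reefs are bleaching as oceans warm. Hotels are cheap.") keeps only the first sentence, because "reefs" is the most frequent content word and that sentence has the highest average word frequency. Extraction of this kind only selects existing sentences; writing new natural language, as the Gaia Passage summaries do, is a much harder problem.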

In the future, the Gaia Passage human readable summarizations will need to be converted into a machine readable dialog system knowledgebase format. The dialog system is basically a chatbot, or conversational user interface (CUI), into a specialized database called a knowledgebase. Most common chatbot knowledgebases are based on, or compatible with, XML; AIML is one example. Voice technologies, both output and input, are generally an additional layer on top of the text based dialog system.
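As a minimal sketch of what such a conversion could look like, the Python below turns question and answer pairs into AIML categories using only the standard library. The function name to_aiml and the sample pair are hypothetical; the element names (aiml, category, pattern, template) are standard AIML, and patterns are conventionally written in upper case without punctuation.

```python
import xml.etree.ElementTree as ET


def normalize_pattern(question):
    """Upper-case the question and drop punctuation, as AIML patterns expect."""
    kept = "".join(c for c in question.upper() if c.isalnum() or c.isspace())
    return " ".join(kept.split())


def to_aiml(qa_pairs):
    """Build an AIML document with one category per (question, answer) pair."""
    root = ET.Element("aiml", version="1.0.1")
    for question, answer in qa_pairs:
        category = ET.SubElement(root, "category")
        ET.SubElement(category, "pattern").text = normalize_pattern(question)
        ET.SubElement(category, "template").text = answer
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample pair, invented for illustration.
aiml_xml = to_aiml([
    ("What is ecotourism?", "Responsible travel to natural areas."),
])
```

The hard part is not emitting the XML but deciding which questions each summary should answer; this sketch assumes that the pairs already exist.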

The two main bottlenecks I've come up against are, first, what I like to call artificial intelligence middleware, or frameworks, the "glue" to integrate the various processes, and second, adequate dialog system tools. In particular, I need chatbot knowledgebase tools with both "frontend" and "backend" APIs (application programming interfaces), in other words a dialog system API on the frontend with a backend API into the knowledgebase for dynamic modification. My favorite cloud based "middleware" is Yahoo! Pipes, which is generally referred to as a mashup platform (aka mashup enabler) for feed based data; however, there are severe performance issues with Yahoo! Pipes, so I don't really consider it to be a production ready tool. Like Yahoo! Pipes, my ideal visual, cloud based AI middleware would be language agnostic, eliminating the need to decide on a single programming language for a project. I have also looked into scientific computing packages, such as LabVIEW, Mathematica, and MATLAB, for use as potential AI middleware. Additionally, there are a variety of both natural language and intelligent agent frameworks available. Business oriented cloud based integration, including visual cloud based middleware, is often referred to as iPaaS (integration Platform as a Service), integration PaaS or "Integration as a Service".

The recent closing of the previously open Twitter API, which now requires OAuth authentication, has set my feed bot, or "smart feed", development back by years. Right now, I'm stuck trying to figure out the best way, if any, to use the new Twitter OAuth with Yahoo! Pipes, for instance via YQL. And if that were not enough, the affordable and user-friendly dialog system API that I was using, verbotsonline.com, went out of business. There are a number of dialog system API alternatives, even cloud based dialog systems, but they are neither free nor cheap, especially for significant throughput volumes. Still to do: 1) complete the Gaia Passage summarizations; 2) make Twitter OAuth work, use a commercial third party data source (such as datasift.com, gnip.com or topsy.com), or abandon Twitter as a primary source (for instance, concentrate on other social media APIs instead, such as Facebook); 3) continue the search for a new and better dialog system API provider.

Most basically, the Gaia Passage project is a network of robots that will not only monitor social media buzz about both the environment and tourism but also interpret the inter-relations, the causes and effects, between environment and tourism -- such as how climate change affects the tourism industry, negatively or positively, or even what effects the weather has on crime trends for a particular destination -- and allow these interpreted inter-relations, or "conclusions", to be queried via natural language. If this can be accomplished with any degree of satisfaction, either fully automated or semi-automated, then the system could just as easily be applied to any other vertical. Proposals from potential sponsors, investors, or technology partners are welcomed, and may be sent to mendicot [at] yahoo.com.

13 March 2013

A New Website For A New Age: GaiaPassage.com

GaiaPassage.com is subtitled "Marcus L Endicott's favorite tips for green travel around the world". I'm calling it a deep green, eco-centric travel guide to the whole Earth. My Gaia Passage project will be a handwritten ecotourism guide to the entire world, organized by the circa 250 country code top-level domains (ccTLDs). The general idea is to write a "white paper" for every country in the world, on environmental and cultural conditions, issues, and who is doing what about them, as well as examining both how they affect tourism and how tourism affects those issues. Anyone could write a lot about something, but the idea here is to provide "snapshots", or "bite sized" summaries, of only the best information and contacts. The name "Gaia Passage" originally came from my pre-Internet (mid-1980s) travel tips newsletter. The site is a work in progress; so far, I've completed the entire Western Hemisphere.

GaiaPassage.com is handwritten, but based on automated research and an automated outline. Primary research is based on data mining 20 years of Green Travel archives. Secondary research is based on multiple years of Meta Guide Twitter bot archives. Significance is based on primary sources in the form of root website domains, and/or secondary sources in the form of Wikipedia entries. In other words, if there is not a root website domain name or a Wikipedia entry then it is unlikely to be included. (However, almost anything may be included in Wikipedia - if properly referenced.)

I have noticed that many websites of smaller concerns are going down, offline, apparently due to the economic downturn. However, social media such as Twitter and Facebook do present affordable alternatives to owning a root domain website, and I will take these into consideration when appropriate. (In other words, when something is really cool.)  I have also noticed a lot of people using Weebly to make free websites. (Note, GaiaPassage.com currently uses the free Google Sites platform.)

In the early evolution of a website, especially large projects, it's important to first have the "containers" in place as "placeholders", which is no small task in itself. With circa 250x countries and territorial entities, that's a whole year's fulltime work for one man, revising one country per working day. This would mean initial completion by December 2013. Eventually, GaiaPassage.com entries may morph into socialbots, or conversational assistants, containing not only all the knowledge about sustainable tourism gleaned from past Green Travel archives, but also current knowledge resulting from the Meta Guide Twitter bots.

In my previous blog post, 250 Conversational Twitter Bots for Travel & Tourism, I detailed my 250x Meta Guide Twitter bots, one for every country and territory in the Internet ccTLD. Basically, I've spent the past five years working on artificial intelligence and conversational agents - and tweeting about it all the while (links below). I had been using Twitter extensively as a framework; however, Twitter has become increasingly protectionist, most dramatically illustrated by the high profile 2012 Twitter-LinkedIn divorce. The Twitter API has become a moving target, and it is just too costly for me to keep playing catch-up. In short, I find the "Facebook complex" of Twitter management immensely annoying, and decided to stop contributing original content; so, my New Year's resolution was to stop tweeting manually, at least for all of 2013. Further, my excellent dialog system API, VerbotsOnline.com, went out of business in 2012. Every other good dialog system API I found to replace it turned out to be much too expensive. As a result, all my conversational agents are shut down, at least for 2013. My hope is that the sector will shake out and/or advance during the year, and better or at least more affordable conversational tools will become available next year.

19 June 2012

250 Conversational Twitter Bots for Travel & Tourism

The reason I haven't updated this blog in almost a year is that I have moved most online development to my Meta-Guide.com website. In the previous two postings, I began testing my content repatriation strategy, in other words aggregating my own content from around the web, which I've continued on the Meta Guide website, in fact concentrating on seeding new webpages from mining the past four years of my own tweets. I have also made a prototype summarizer, which I am now training on my Meta Guide website in order to extract content from it to add on top of the mined tweets when building out new webpages. At the moment, I have three immediate goals. I would like to reach 10,000 tweets, 1,000 Meta Guide webpages, and 100 theses in AI & NLP (from the past 10 years). I only have about another 3,000 tweets to go, so maybe another year, about 300 webpages left to make, and less than 30 more theses to discover.

This past weekend, within view of the spectacular Colorado Rocky Mountains, I succeeded after some struggle in making my 250x Meta Guide Twitter bots conversationally interactive on Twitter. These are 250x manually constructed Twitter bots, one for every country, based on country code top-level domain. That includes one for each of the 193 member states of the United Nations, plus an additional 57 various and sundry territories included in the ccTLD. All of these Meta Guide Twitter bots are powered by my @VagaBot, a single cloud-based Verbot engine from Verbotsonline, using the undocumented API and connected to Twitter via Yahoo! Pipes. Previously they have just been retweeters, aggregating country-specific travel and tourism tweets. The next phase of development will involve marrying the incoming retweets to the outgoing responses in some meaningful way, in other words datamining the incoming retweets and attempting to process them semantically into answers.

You should now be able to @sign tweet any of the Meta Guide Twitter bots with questions. Currently, message turnaround time is running up to 30 minutes, which is par for the course on Twitter. Among other things, replies contain lines from my travel books; see Vagabond Globetrotting 3 & From the Balkans to the Baltics. If you are interested in learning more about me and what I do, I recommend watching both Part 1 & Part 2 of my recent videos on "Open Chatbot Standards for a Modular Chatbot Framework", presented in Philadelphia at Chatbots 3.2: Fifth Colloquium On Conversational Systems. If you need help with socialbots for your social CRM, I am available for consulting; just check my Contact page for details, follow me on Twitter, or connect on LinkedIn, and let's Skype!


15 June 2008

Twitter, Bots & Twitterbotting

Micro-blogging is a form of blogging that allows users to write brief text updates, which may be viewed by anyone or restricted to a user group. Such messages can be submitted by a variety of means, including SMS, IM, Email or Web.

Twitter is the prototypical micro-blogging service and allows users to send text-based posts up to 140 characters long, called "tweets", to the Twitter web site. One of the main advantages of using Twitter is that it provides a functional gateway between the web and the mobile phone via SMS text messaging compatibility. Christina Laun recently posted a handy primer, Twitter for Librarians: The Ultimate Guide.

There are now a growing number of Twitter applications for travel and tourism:
  • The Multimap Twitter bot helps you to access maps, directions and local information by sending messages via twitter.
  • The Nelso Twitter bot will help you find bars, restaurants, hotels, shopping, and other businesses in Europe.
  • The Twanslate Twitter bot is capable of translating anything you throw at it, and is handy for on-the-go translation when all you have is your phone.
I’ve now added feeds from Twitter for all 234 countries to my Destination Meta-Guide.com 2.0 semantic mashup, for instance at:
I’ve also created two Twitterbots already:
Twitter bots are actually special Twitter users that provide information, either upon request or as it becomes available. There are at least two good web sites about Twitter bots:
  • twitterbotting.com is a site to help folks get quick info about creating new Twitterbots.
  • retweet.com helps to discover Twitter, one bot at a time.
A web feed is a data format used to provide users with frequently updated content. RSS is a web feed format used to publish frequently updated content, such as blog entries, news headlines, and podcasts. Yahoo! Pipes is a web application for building applications that aggregate web feeds, web pages, and other services. A combination of data from more than one source in a single integrated application is called a mashup.
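To make the feed and mashup definitions concrete, here is a minimal Python sketch that merges two RSS 2.0 feeds into one list, newest first. It is only a stand-in for the kind of aggregation Yahoo! Pipes does visually; the two inline feeds, their titles, and the example.com links are invented for illustration, and a real mashup would fetch live feeds over HTTP instead.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny, made-up RSS 2.0 feeds standing in for live sources.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Reef report</title><link>http://example.com/a1</link>
<pubDate>Mon, 02 Jun 2008 10:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Rail pass news</title><link>http://example.com/b1</link>
<pubDate>Tue, 03 Jun 2008 09:00:00 +0000</pubDate></item>
<item><title>Hostel guide</title><link>http://example.com/b2</link>
<pubDate>Sun, 01 Jun 2008 08:00:00 +0000</pubDate></item>
</channel></rss>"""


def parse_items(rss_xml, source):
    """Extract title, link, and publication date from each RSS item."""
    channel = ET.fromstring(rss_xml).find("channel")
    return [
        {
            "source": source,
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # RSS 2.0 dates use the RFC 822 format, which email.utils parses.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        }
        for item in channel.findall("item")
    ]


def mashup(*feeds):
    """Merge (xml_text, source_name) feeds into one list, newest first."""
    merged = [entry for xml_text, name in feeds
              for entry in parse_items(xml_text, name)]
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)


merged = mashup((FEED_A, "a"), (FEED_B, "b"))
```

The merged list could then be written back out as a new feed, which is essentially what gets passed on to a service that posts feed items to Twitter.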

Web feeds or mashups can be sent into Twitter with twitterfeed.com, and feeds can be sent out of Twitter with loudtwitter.com. Feeds can also be exported from Twitter using sites like tweetscan.com or summize.com.

Using the Twitter Facebook application I’ve managed to get Twitter talking to the Facebook status message. I’ve also added the Twitter Badge for Blogger to my blog (at right). And thanks to a new ping.fm beta account, I’ve been able to add my LinkedIn status message into this loop.

Now if I can just send Twitter feeds into a chatbot knowledgebase….

08 June 2007

green-travel taxonomy

Over the past 10 years or so I’ve been gradually developing a taxonomy, or classification system, for “green travel”, or more accurately green or sustainable tourism, an extension of my work with globetrotting and backpacker tourism. In other words, what are the key concepts involved in responsible tourism? A two dimensional taxonomy becomes an ontology when applied in three dimensions, as relationships among the concepts emerge. Taxonomies and ontologies are useful in artificial intelligence applications, such as bots.
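One way to picture the step from taxonomy to ontology is the minimal Python sketch below. The concepts and relations are invented for illustration and are not taken from the actual green travel taxonomy: the taxonomy is a plain parent-to-children tree, and the ontology is the same tree expressed as "is_a" triples plus extra, typed relationships between concepts.

```python
# Hypothetical concepts, invented for illustration only.
TAXONOMY = {
    "sustainable tourism": ["ecotourism", "community tourism"],
    "ecotourism": ["wildlife watching"],
}

# Relationships that a pure tree cannot express (also hypothetical).
EXTRA_RELATIONS = [
    ("climate change", "affects", "wildlife watching"),
    ("ecotourism", "funds", "conservation"),
]


def taxonomy_triples(tree):
    """Express each parent/child link as a (child, 'is_a', parent) triple."""
    return [(child, "is_a", parent)
            for parent, children in tree.items()
            for child in children]


def related(concept, triples):
    """Return every triple in which the concept is subject or object."""
    return [t for t in triples if concept in (t[0], t[2])]


ontology = taxonomy_triples(TAXONOMY) + EXTRA_RELATIONS
wildlife_links = related("wildlife watching", ontology)
```

A bot working from the triples can answer not only "what kind of thing is this?" but also "what affects it?", which the tree alone cannot.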

I’ve spent much of the past decade tinkering with and tweaking the http://meta-guide.com site, which emerged from the old green-travel.com site. Today this would be called a “mashup”. Lately, I seem to have hit on a particularly useful algorithm, and have in effect taught the meta-guide to tell me everything in the popular press about “green travel” happening in our world today, in a more useful format, country by country… for nearly every “country” on Earth…. In particular, it returns the latest information about climate change and global warming in relation to tourism, in addition to ecotourism and sustainable tourism developments, etc.

I recommend trying the random country feature; let me know what you think, either in the green-travel group at http://groups.yahoo.com/group/green-travel or directly to me!

Marcus Endicott http://mendicott.com

31 December 2006

ByronBot - Byron Bay, Australia

Lately I've been working with chatterbot technology. A chatterbot is a computer program designed to simulate an intelligent conversation with one or more human users via auditory or textual methods. I've been using AIML, or Artificial Intelligence Markup Language, an XML dialect for creating natural language software agents.

My latest creation is ByronBot (http://www.mendicott.com/byronbay/), which you can ask questions about Byron Bay and the Rainbow Region of northern New South Wales, Australia - where I now live.

Previously, I created the meta-guide geobot (http://www.meta-guide.com/), which knows continents, regions, all countries, all of their capitals, and can provide more such as maps, travel books and cool travel videos - as well as country-specific information about ecotourism and sustainable tourism.

One associate has compared ByronBot with Microsoft's Ms. Dewey (http://www.msdewey.com/), an Adobe Flash-based experimental interface for Windows Live Search.