Wednesday, 20 March 2013

Tweeting at the Budget

Use this tool to track the level of tweeting from Parliament today after another budget, one that could make or break the future of Osborne, once seen as the next Prime Minister of the UK.

Sunday, 17 March 2013

Tracking St Patrick's Day in Dublin

The level of tweeting is not very high, which is perhaps a bit surprising. Tracking the tweets, we see:

  • 5% of tweets use the term Irish
  • 5% of tweets use the term Ireland
  • 7% use some form of Pat, Patty or Patrick
  • 15% of tweets are not in English
  • 17% of tweets are RTs
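
Shares like the ones above can be computed with a few lines of Python. This is only a minimal sketch: the sample tweets are made up, the function and variable names are hypothetical, and the "RT @" prefix test is an assumed way of spotting retweets rather than the method actually used for these figures. Language detection is left out because it needs an external library.

```python
import re

def share(tweets, predicate):
    """Return the percentage of tweets for which predicate(text) is true."""
    if not tweets:
        return 0.0
    hits = sum(1 for text in tweets if predicate(text))
    return 100.0 * hits / len(tweets)

# Made-up sample, for illustration only (not real data).
sample_tweets = [
    "Happy St Patrick's Day from Dublin!",
    "RT @someone: Parade on O'Connell Street now",
    "Proud to be Irish today",
    "Ireland in green",
    "Pint time",
]

# Whole-word, case-insensitive patterns for each term of interest.
IRISH = re.compile(r"\birish\b", re.IGNORECASE)
IRELAND = re.compile(r"\bireland\b", re.IGNORECASE)
PAT = re.compile(r"\b(pat|patty|patrick)\b", re.IGNORECASE)

stats = {
    "Irish": share(sample_tweets, lambda t: IRISH.search(t) is not None),
    "Ireland": share(sample_tweets, lambda t: IRELAND.search(t) is not None),
    "Pat/Patty/Patrick": share(sample_tweets, lambda t: PAT.search(t) is not None),
    # Assumption: a retweet is a tweet whose text starts with "RT @".
    "RTs": share(sample_tweets, lambda t: t.startswith("RT @")),
}

for label, pct in stats.items():
    print(f"{label}: {pct:.0f}% of tweets")
```

Note that a single tweet can count toward several categories, so the percentages need not add up to 100.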

Wednesday, 13 March 2013

#SXSW mapped by social media

Track the real-time Twitter levels in Austin, Texas. During #SXSW, levels rose to several hundred geo-tagged tweets an hour from within just a 0.5 km square area.

I doubt anyone would be surprised that #SXSW is a major source of social media activity, but just how major is impressive.

The map above shows active social media sites in Austin. Note that the Foursquare icons are trending, not just check-ins. And the Flickr images are popular images that have been downloaded many times. So we see that for the #SXSW event, posts to Foursquare and Flickr are massive. Map made with the app.

And Twitter is very active all around the center of the city. None of this is surprising for the premier hipster techie event of the year.

Friday, 8 March 2013

My misgivings on machine learning

Machine learning offers the promise of rapid automated 'thinking', but is it well grounded in cognitive science?
As part of the London Data Science meetups I was able to see a demo of scikit-learn, the open source machine learning library for Python. I understand that machine learning has been a major element of the expansion of search and social networks. Without machine learning, things like sentiment detection, search, and recommendations would not work.

But I have to ask how real these things are. Take the example of Google search. We might say that Google search does a good job of finding what we want on the Internet, but is this true? Google search is essentially measured only against a few other search engines. We don't search the Internet by hand ourselves, so we really don't know if Google is doing a good job or not; it's just that Google is the best tool we have. The same goes for detecting sentiment in forum posts or making friend recommendations: are these tasks connected to anything real?

There is a real risk that the precision of machine learning could mask its artificial nature: that the results of most machine learning systems are utterly self-referential, establishing facts about things that only exist in the made-up world of the internet and social media.

I have not been impressed by Facebook's ability to find my friends, or Google's ability to recommend ads I really want to see. I am a bit concerned that machine learning is just becoming a ghost in the machine, a part of the simulation that the web substitutes for reality. People have masses of data and want a cheap way to say something about it, so they run a training set through a random tree model and get sufficient reliability on a test set. But is that how reality really works?

How about the black swans, the outliers that contain almost all the really important features of life? In my own study of Twitter I find that locations have a high degree of predictability as to how much tweeting will come from them at a given time. But it's the times when the model breaks down, when tweeting is higher or lower than I would forecast, that are really interesting.
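
To make the idea of "times when the model breaks down" concrete, here is a rough sketch in Python. It is not the forecasting model used in my study: it assumes a very simple baseline (the average of past counts for the same hour of day) and flags hours whose count sits more than a chosen number of standard deviations from that average. The function name, the threshold of 2.0, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev

def flag_unusual_hours(history, observed, threshold=2.0):
    """Flag hours where today's tweet count departs from the baseline.

    history:  dict mapping hour of day -> list of past tweet counts
    observed: dict mapping hour of day -> today's tweet count
    Returns a list of (hour, count, z_score) for hours where the
    absolute z-score exceeds the threshold.
    """
    flagged = []
    for hour, count in sorted(observed.items()):
        past = history.get(hour, [])
        if len(past) < 2:
            continue  # not enough history to estimate the spread
        baseline = mean(past)
        spread = stdev(past)
        if spread == 0:
            continue  # identical past counts: z-score is undefined
        z_score = (count - baseline) / spread
        if abs(z_score) > threshold:
            flagged.append((hour, count, round(z_score, 2)))
    return flagged

# Made-up counts for one location, for illustration only.
history = {9: [10, 12, 11, 13], 10: [20, 22, 19, 21]}
observed = {9: 12, 10: 45}

flagged = flag_unusual_hours(history, observed)
for hour, count, z_score in flagged:
    print(f"hour {hour}: {count} tweets (z = {z_score})")
```

In this sample, hour 9 falls within its usual range while hour 10 is far above it, so only hour 10 is flagged. Those flagged hours are the ones worth looking at by hand.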

Wednesday, 6 March 2013

Tale of 2 Cities: Twitter in Caracas after #Chavez

Tracking geo-tagged Twitter traffic in Caracas, Venezuela, after the death of Hugo Chavez. Not surprisingly, there are a massive number of tweets from the Presidential Palace referencing Chavez, with the hashtag #ChavezVive very popular.

See tweets coming from the Caracas Presidential Palace and their RTs:

Statistics from the Palace show a major concentration on Chavez:
  • 70% of tweets are retweets originally from the Presidential Palace, showing extremely high amplification.
  • 40% of tweets mention Chavez by name
  • 20% mention the word camarada (comrade)
  • Tweeting is not remarkably high, though, with tweets not topping 20 per hour.

But go across town and things look very different. Only about 5% of tweets mention Chavez by name. Tweets are about much more everyday matters in this part of town: football, movies, cheap internet, Pokemon, even some criticism of the government. Clearly, some people in Caracas feel no need to tweet about the death of the President. Does this reflect apathy among some of the richer people in Caracas, who can afford mobile phones with the OTT data services needed for Twitter?