Thursday, 13 October 2011

Scale variance in geo-tagged tweets

The image above shows the cluster of tweets in the middle of a work day in London against a black background, and the map below shows the same cluster with major features visible. Seen from the level of nations, tweets follow clear geographic and social patterns, but as you look closer and closer the clustering becomes more random, though it still follows some clear rules.

Geo-tagging of tweets suffers from the often poor quality of geo-location on mobile phones. Still, it is remarkable how the pattern of tweets follows main roads and clusters around train station hubs. Clearly, real-world transport hubs are also virtual-world Internet hubs. The structures of traffic flow and commerce that condition our real world also structure our cyberworld. That is, the real world deeply structures the production of content on the Internet, which carries the mark of geo-social patterns in both the production and consumption of data.

Distribution of tweets for nations on the North Atlantic
But as you look closer, you see the clusters becoming more random and more evenly spread within a smaller space. That is to say, geo-tweeting does follow a power law at large scales, with the vast majority of tweets coming from a few urban areas, but as far as we can see this tweeting is not scale invariant. That is when you zoom out you see tweets highly concentrated in a few locations. But as you zoom in on tweets this power law vanishes, and you see the tweets spreading out in a more random fashion.
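
One rough way to put a number on "concentrated when zoomed out, spread out when zoomed in" is to bin the tweets into a grid and ask what share of them falls in the busiest cells. The sketch below is only an illustration, not the method used for the maps here: the function name `top_cell_share`, the degree-based grid, and the 10% cutoff are all assumptions made for the example.

```python
import math
from collections import Counter


def top_cell_share(points, cell_deg, top_fraction=0.1):
    """Share of points that fall in the busiest `top_fraction` of occupied cells.

    points       -- iterable of (lat, lon) pairs in decimal degrees
    cell_deg     -- grid cell size in degrees (an illustrative choice of unit)
    top_fraction -- fraction of occupied cells counted as "the busiest"
    """
    points = list(points)
    if not points:
        return 0.0

    # Count how many points land in each grid cell.
    counts = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in points
    )

    # Rank the occupied cells from busiest to quietest and keep the top slice.
    ranked = sorted(counts.values(), reverse=True)
    k = max(1, math.ceil(len(ranked) * top_fraction))
    return sum(ranked[:k]) / len(points)
```

Running this on the same set of tweets with a coarse grid (say `cell_deg=1.0`) and then a fine one (say `cell_deg=0.001`) gives two shares to compare. If the pattern described above holds, the coarse grid should give a share close to 1 and the fine grid a noticeably lower one.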

Distribution viewed at a close scale for London
There are probably a number of reasons for this. First and foremost, there is a limit to power laws as you get smaller and smaller. It is possible that 90% of the tweeting for a nation could come from a single city, but it is highly unlikely that a single building or street could have 90% of the tweets for a city. There are limits to how much you can compact or stack up humans.

Also, as the scale gets smaller and smaller, random errors in the geo-tags coming from the devices themselves start to show. We find that a phone will often place someone up to 500 meters from their actual location. On machines using Wi-Fi this error can be even higher.
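
To get a feel for why a 500 meter error matters at street level but hardly at all at city level, here is a small simulation sketch. It makes several assumptions that are not from the data: positions are given in meters on a flat local plane, the error is uniform over a disc of the given radius, and `misplaced_fraction` is a name made up for this example.

```python
import math
import random


def misplaced_fraction(points_m, cell_m, max_error_m, seed=0):
    """Fraction of points that land in a different grid cell after a random error.

    points_m    -- list of (x, y) true positions in meters (flat-plane assumption)
    cell_m      -- side length of the square grid cells, in meters
    max_error_m -- largest displacement a device error can cause, in meters
    seed        -- seed for the random generator, so runs are repeatable
    """
    if not points_m:
        return 0.0

    rng = random.Random(seed)
    moved = 0
    for x, y in points_m:
        # Assumed error model: uniform over a disc of radius max_error_m.
        distance = max_error_m * math.sqrt(rng.random())
        bearing = rng.uniform(0.0, 2.0 * math.pi)
        nx = x + distance * math.cos(bearing)
        ny = y + distance * math.sin(bearing)

        true_cell = (math.floor(x / cell_m), math.floor(y / cell_m))
        seen_cell = (math.floor(nx / cell_m), math.floor(ny / cell_m))
        if true_cell != seen_cell:
            moved += 1
    return moved / len(points_m)
```

With cells 100 meters across and an error of up to 500 meters, most points end up in the wrong cell, so any fine-grained clustering gets smeared out. With cells tens of kilometers across, only the few points near a cell edge can be misplaced.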

So geo-tweeting shows power law relations on very large scales, covering entire cities or larger, but not on smaller scales. This makes it different from many other parts of the web, which show power law relations on every scale. For example, the popularity of blogs follows a power law: a few blogs get vastly more hits than most. Within topics this also holds, with a few blogs dominating each subject. Even within blogs themselves you will almost always see a few pages that get the vast majority of views or links.
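
A quick, informal check for a power law is to rank the counts (tweets per city, hits per blog, views per page) from largest to smallest and fit a straight line to log(count) against log(rank). The sketch below does that with plain least squares. This is a crude estimator and only a rough diagnostic, and the function name `rank_size_slope` is an assumption made for the example.

```python
import math


def rank_size_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    counts -- positive counts, one per item (city, blog, page, ...)

    A roughly constant negative slope across the whole ranking is what a
    power law would look like on a log-log plot. This is a rough check,
    not a rigorous statistical test.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")

    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]

    # Ordinary least squares for a line y = a + slope * x.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Applied to tweet counts per city, and then to tweet counts per street or block within one city, the two fits can be compared. The argument above would lead you to expect a clean straight line in the first case and a poor fit, or a much flatter slope, in the second.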

So one of the local impacts of Web 3.0 is likely to be a reduction in the concentration of content at the local level.


  1. re: "That is when you zoom out you see tweets highly concentrated in a few locations. But as you zoom in on tweets this power law vanishes, and you see the tweets spreading out in a more random fashion."

    I think this is a sign of it following a sigmoid, bounded growth model, rather than an unbounded growth model. Charlie Stross had a great blog entry on this.

    I think most real-world systems do this, rather than grow exponentially ad infinitum, because there are finite amounts of resources and the surface of the Earth is finite. And, as I understand it, the universe itself has finite space (though it continues to expand).

    So with, e.g., the growth of animal species, the bounding factor is usually something like the amount of food and the number of predators.

    With Twitter density, I suppose the bounding factor is that people don't like to imitate a sardine in a tin.

  2. Thanks for that.

    What is interesting here, though, is not an issue of growth but of small-scale organization. Our culture and geography can impose power law distributions at very large scales, and the larger the scale, the more closely a power law holds. It's not a limit upon growth, since the network becomes more power-law-like as it grows; it's a limit upon organization at the local scale.

    Much of it certainly has to do with resources, but it may also have political and civic implications. Smaller-scale use of the web may simply be more democratic than global-scale use. The Internet may start democratic and emerge as centralized and celebrity-dominated at larger scales.