Mine’s worth $0.407. What’s your personal data worth? (2013)

Via my daily Quartz email comes a link to a Financial Times interactive graphic that lets you know roughly what a marketer might pay for your personal data.

The average person’s data often retails for less than a dollar.

General information about a person, such as their age, gender and location is worth a mere $0.0005 per person, or $0.50 per 1,000 people. A person who is shopping for a car, a financial product or a vacation is slightly more valuable to companies eager to pitch those goods.

Click on the pic to read the rest and try the interactive.

Screengrab of FT's interactive personal data graphic



Create your own data: using sensors for journalism (2013)

Matt Waite writes on The Source about building cheap drought sensors that individually track your backyard’s drought status and collectively paint a picture of a city’s weather conditions.

The problem with most news apps and data journalism is that they rely on the government to produce the data. If the government keeps numbers and you can pry it loose, game on. But what happens when the government doesn’t keep the data? Or you have a reason to believe it’s fatally flawed? Or what if you just want more?

When faced with this problem, most journalists will shrug their shoulders and give up. No data, no story, right? There are examples of journalists developing their own data, but few of them do it on a large scale.

Sensors open up a whole new world of data to journalists–data they themselves collect. That idea has both mundane and profound impacts. Journalists building their own devices means they’re taking a giant leap up in authority–from the device’s construction and capabilities to the data itself. There’s no source to call about the data because you are the source. And, you decide the scale of your data.

Read the rest here.


“The screen will shape our environment in the 21st century” (2012)

I missed Wilson Miner’s talk at Webstock (can’t remember why just now). So I was very pleased to come across this version of his talk online. There are some lovely insights here about change – and how profoundly the world our readers and customers inhabit is changing right now.

The screen

What’s your time to screen in the morning? How long does it take you to go from being dead asleep to in front of a screen? 5 seconds? 30 seconds? 5 minutes?

… All these little gaps in our lives are slowly, or quickly, filling up with screens.

The car shaped our environment in the 20th century, in this huge, tectonic way.

I don’t think it’s a stretch to say that the screen will be as important to shaping our environment in the 21st century.

What goes on those screens is pretty important.

The things that we choose to surround ourselves with will shape what we become.

We’re not just making pretty interfaces. We’re actually in the process of building an environment where we’ll spend most of our time for the rest of our lives.

We’re the designers. We’re the builders.

What do we want that environment to feel like?

What do we want to feel like?

Wilson Miner – When We Build from Build on Vimeo.

Change is profound

“All media are extensions of some human faculty. Mental or physical. Electric circuitry displaces the other senses and alters the way we think. The way we see the world and ourselves. When these changes are made, men change.”  – Marshall McLuhan.

Change is fast

It’s not enough to rely on what we know, or what we think we know, what we expect people to know.

If the environment is always changing, we need to be always learning.

To do that we have to let go of what we know.

“At times of change, the learners are the ones who will inherit the world, while the knowers will be beautifully prepared for a world which no longer exists.” – Eric Hoffer.


See how your data series relates to search terms using Google Correlate

I don’t have time to play with this at the moment but I like the sound of Google Correlate, which was launched a little while ago. This is from the Google blog:

It all started with the flu. In 2008, we found that the activity of certain search terms are good indicators of actual flu activity. Based on this finding, we launched Google Flu Trends to provide timely estimates of flu activity in 28 countries. Since then, we’ve seen a number of other researchers—including our very own—use search activity data to estimate other real world activities.

However, tools that provide access to search data, such as Google Trends or Google Insights for Search, weren’t designed with this type of research in mind. Those systems allow you to enter a search term and see the trend; but researchers told us they want to enter the trend of some real world activity and see which search terms best match that trend. In other words, they wanted a system that was like Google Trends but in reverse.

This is now possible with Google Correlate, which we’re launching on Google Labs. Using Correlate, you can upload your own data series and see a list of search terms whose popularity best corresponds with that real world trend.

As Google and others point out, the tool needs to be handled with care, not least since people sometimes confuse correlation with causation. That’s the point Google makes in the comic it created to explain Correlate in simple terms (gotta love the comic).

There’s also useful information in Google’s FAQ and Tutorial.
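For a rough feel of what “best corresponds” means, here’s a minimal sketch in Python. It assumes the matching is done with a plain Pearson correlation coefficient, which is my simplification and not a description of Google’s actual method. The function names (pearson, rank_terms) and all of the figures are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def rank_terms(target, candidates):
    """Return (term, r) pairs sorted from highest to lowest correlation."""
    scored = [(term, pearson(target, series)) for term, series in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weekly figures, invented purely for this example.
flu_cases = [10, 20, 30, 40, 50]
search_volume = {
    "flu symptoms": [12, 19, 33, 41, 48],
    "beach holiday": [50, 40, 30, 20, 10],
    "how to scan a document": [5, 9, 4, 8, 6],
}

for term, r in rank_terms(flu_cases, search_volume):
    print(f"{term}: {r:+.2f}")
```

A high score only says that two lines move together over the same period. It says nothing about why, which is exactly the caution Google keeps repeating.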

Flowing Data took a look and played around with both sensible and nonsensical correlations:

You can also see how your data is related geographically. For example, annual rainfall (left) strongly correlates with searches for “disney vacation package.” Although, it looks like distance is a strong factor in the latter, which should be a reminder that correlation is different from causation. Google is careful to point this out in their FAQ and explanation of the tool.

Nevertheless, it’s fun to poke around and sometimes see the non-sensical correlations. For example, the strongest correlation with “flowingdata” is “how to scan a document,” because the growth rates of both seem similar.


Vanessa Fox had a look at Rebecca Black, Glee and March Madness:

The comic Google created to explain the new tool is careful to point out (multiple times!) that correlation does not necessarily equal causation. The states where Glee is performing in concert and searches for [the dreamiest] may have the same spikes, but that doesn’t necessarily mean the two are related.

They might be though. At O’Reilly’s Where 2.0 conference last month, I did an Ignite talk showing that people were interested in Rebecca Black everywhere, but were only really interested in March Madness in states that had teams participating.

Interest in Rebecca Black:




Data visualisation – The Typical Human (2011)

At a workshop run by David McCandless at Webstock recently, one of the ideas that came out of the group brainstorming sessions was to show what the average human looks like.

We wanted to look at the typical human at different ages, in different countries; at how many limbs they had, how well they could see, what living aids they used and so on.

Someone may or may not go ahead with that idea.

In the meantime, National Geographic has taken care of the first part of it – what the typical human looks like.

And they’ve looked at how much space the 7 billion humans take up on planet Earth.

The power of dedicated time… and spreadsheets

One of the great things about attending a workshop is that you give yourself time to focus on one thing for a while.

And in the case of a Webstock workshop, there’s also the comfort that comes with realising that other people have spreadsheets of ideas kicking around on their laptops. Not little scraps of notes, or post-it notes on a whiteboard, but spreadsheets organised into columns and rows and sections with shading and emphasis. Phew. Not just me, then.

I was in a workshop today with David McCandless, the London-based data journalist behind Information is Beautiful, who’s here to speak at the Webstock conference in Wellington this week (Feb 14-18 2011).

Early on he showed us a spreadsheet he’d used while working up ideas for his book Information is Beautiful – a process that worked through eliminations, to defining simple questions to answer, and on to answering them and presenting the information visually in a way that informed rather than baffled.

He talked about the power of boredom, ignorance, bewilderment and frustration (perhaps with a news story) as spurs to find out more, make it interesting, make it relevant and compelling.

He talked about the need for:

Integrity = truth, consistency, honesty, accuracy
Function = easiness, usefulness, usability, fit
Form = beauty, structure, appearance
Interestingness = relevant, meaningful, new

We got into small groups to come up with 10 concepts we’d like to understand more about. What a relief to sit for an hour and hash out ideas – and actually get past the fleeting-idea-while-doing-something-else stage. Was impressed by the kinds of concepts and breadth of them. Got to love the way people think.

Later we sketched out ways for visualising the ideas – more challenging than it sounds. My biggest problem was narrowing down the focus of what I was trying to communicate. Nothing new under the sun, eh.

Taking away quite a bit to think about. So, thanks David.

I’m going to be taking a few notes this week and will be posting them here (it’s a TiddlyWiki, which is still my preferred notepad for conferences since it allows me to update as I go and organise navigation as I go). They’re very rough notes, but you’re welcome to have a look if you’re interested.

Animated map of Auckland’s buses and ferries

I love where Chris McDowall is going with his animation and visualisation work and look forward to seeing more from him.

On Sciblogs, he’s published an animated map of Auckland’s buses and ferries over a single day.

An animated map of Auckland’s public transport network from Chris McDowall on Vimeo.

The animation begins at 3am on a typical Monday morning. A pair of blue squiggles depict the Airport buses shuttling late night travellers between the Downtown Ferry Terminal and Auckland International. From 5am, a skeleton service of local buses begins making trips from the outer suburbs to the inner city and the first ferry departs for Waiheke Island. Over the next few hours the volume and frequency of vehicles steadily increases until we reach peak morning rush hour. By 8am the city’s major transportation corridors are clearly delineated by a stream of buses filled with commuters. After 9am the volume of vehicles drops a little and stays steady until the schools get out and the evening commute begins. The animation ends at midnight with just a few night buses moving passengers away from the central city.

Some things to note:
The steady pulse of the Devonport Ferry.
The speed at which buses hurtle down the Northern Motorway’s new bus lanes.
The interplay between buses and ferries on Waiheke Island.
The sheer mind-boggling complexity of the system.

Chris describes the process (and a few limitations he found with it) over on his blog at sciblogs.co.nz. Worth a look.

Dan Nguyen’s coding for journalists 101

A terrifically useful and generous post from Dan Nguyen introducing journalists to programming concepts and enough script writing to scrape some data from a web page.

Couldn’t come at a better time since data – and the ability to find it, analyse it and share it in edible chunks – is becoming an increasingly important component of journalism.

As Tim Berners-Lee, inventor of the World Wide Web, said recently:

“Journalists need to be data-savvy. It used to be that you would get stories by chatting to people in bars, and it still might be that you’ll do it that way some times.

“But now it’s also going to be about poring over data and equipping yourself with the tools to analyse it and picking out what’s interesting. And keeping it in perspective, helping people out by really seeing where it all fits together, and what’s going on in the country.”

The Guardian story those Berners-Lee quotes come from also talks about City University’s new MA in interactive journalism, led by Jonathan Hewett and Paul Bradshaw, which will teach “data journalism” as part of its curriculum – “sourcing, reporting and presenting stories through data-driven journalism, and visualising and presenting data (including databases, mapping and other interactive graphics).”

What a great initiative. Here’s hoping we can follow up on this idea in New Zealand.

Here’s a taste of Nguyen’s DIY coding post:

This is my attempt to walk someone through the most basic computer science theory so that he/she can begin collecting data in an automated way off of web pages, which I think is one of the most useful (and time-saving) tools available to today’s journalist. And thanks to the countless hours of work by generous coders, the tools are already there to make this within the grasp of a beginning programmer.

You just have to know where the tools are and how to pick them up.

Click here for this page’s table of contents. Or jump to the theory lesson. Or to the programming exercise. Or, if you already know what a function and variable is, and have Ruby installed, go straight to two of my walkthroughs of building a real-world journalistic-minded web scraper: Scraping a jail site, and scraping Pfizer’s doctor payment list.

He goes on to explain and provide useful links about the basics of HTML, attributes, links, using Firebug, installing Ruby, strings, variables, comparison operators, arrays, hashes, conditional branches and more. Then you get to write a script.
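To give a flavour of where those building blocks lead, here’s a minimal sketch of pulling links out of a page. Note that it’s in Python, not the Ruby that Nguyen teaches, and it isn’t taken from his tutorial. It uses only Python’s standard library, and the sample page, the /records/ paths and the names LinkCollector and extract_links are all invented for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs from every <a href="..."> tag."""

    def __init__(self):
        super().__init__()
        self.links = []      # a list (Ruby would call it an array)
        self._href = None    # href of the <a> tag we are currently inside
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs; turn it into a dict (a hash)
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a link with an href attribute
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# A made-up page standing in for a real site.
SAMPLE_PAGE = """
<html><body>
  <h1>Public records</h1>
  <a href="/records/1">First record</a>
  <a href="/records/2">Second record</a>
  <a name="top">an anchor with no href</a>
</body></html>
"""


def extract_links(html):
    """Parse an HTML string and return its links as (text, href) pairs."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


for text, href in extract_links(SAMPLE_PAGE):
    if href.startswith("/records/"):  # a conditional branch
        print(text, "->", href)
```

A real scraper would first download the page and would need to cope with messier HTML, but the ingredients are the ones listed above: strings, variables, lists, hashes and a conditional.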

I got sidetracked (by paid work) so didn’t finish the tutorial but will definitely be returning and spending more time on this over the Christmas break.

In the meantime I am enjoying playing with Ruby in my browser to learn a bit about how Ruby works (a phrase guaranteed to strike fear into the hearts of the allaboutthestory.com developers, I’m sure).