Knight awards $3.2m to eight open government projects (2013)

Knight Foundation

The Knight News Challenge on Open Gov has awarded more than $3.2m in funding to eight projects working to make public information more accessible and useful.

They are:

Civic Insight: Providing up-to-date information on vacant properties so that communities can find ways to make tangible improvements to local spaces;

OpenCounter: Making it easier for residents to register and create new businesses by building open source software that governments can use to simplify the process;

Open Gov for the Rest of Us: Providing residents in low-income neighborhoods in Chicago with the tools to access and demand better data around issues important to them, like housing and education;

Launching a public policy simulator that helps people visualize the impact that public policies like health care reform and school budget changes might have on local economies and communities;

Oyez Project: Making state and appellate court documents freely available and useful to journalists, scholars and the public, by providing straightforward summaries of decisions, free audio recordings and more;

Making government contract bidding more transparent by simplifying the way smaller companies bid on government work;

GitMachines: Supporting government innovation by creating tools and servers that meet government regulations, so that developers can easily build and adopt new technology;

Plan in a Box: Making it easier to discover information about local planning projects, by creating a tool that governments and contractors can use to easily create websites with updates that also allow public input into the process.

Knight has more detailed information about each of the projects here. The next Knight News Challenge, the second of two this year, will be announced soon.

Iceland journalism haven, Wikileaks needs cash (2010)

This could turn out to be interesting.

In a post on the Guardian’s Organ Grinder blog, Wikileaks editor Julian Assange talks about how Iceland could become a journalism haven.

I’m excited about what is happening in Iceland, which has started to see the world in a new way after its mini-revolution a year ago. Over the past two months I have been part of a team in Iceland advising parliamentarians on a cross-party proposal to turn it into an international “journalism haven” – a jurisdiction designed to attract organisations into publishing online from Iceland, by adopting the strongest press and source protection laws from around the world.

Because of the economic meltdown in the banking sector, which, per capita, was the largest of any western country, Icelanders believe that fundamental change is needed in order to prevent such events from taking place again. Those changes include not just better regulation of banks, but better media oversight of dirty deals between banks and politicians.

The “Icelandic Modern Media Initiative”, a proposal that binds the government to draft legislation to develop an attractive package of free speech and openness laws, including source protection, internal media communications protection, protection from libel tourism, immunity for intermediaries such as ISPs, and a tight statute of limitations on litigation. It is to be filed by tomorrow and has cross-party support, including from the governing coalition. Although the political environment in Iceland is still highly charged over the 6 March referendum about the Icesave dispute, it is expected to be voted through.

Wikileaks meanwhile has cut back its operations while it fundraises, saying it’s short $70,000 for the year’s basic operations and needs another $400,000 if it wants to pay staff. Here’s what visitors to the front page see.

Copyright, findability and other ideas from #ndf 2009

I was at the National Digital Forum conference in Wellington earlier this week mingling with people involved in digitising and curating New Zealand’s cultural heritage material – people from museums, galleries, archives, libraries.

I was struck by a few commonalities between the cultural heritage sector (known as GLAM – Galleries, Libraries, Archives, Museums) and the digital news media.

Both deal with sizeable repositories of digital content, for a start, and are grappling with how best to manage those assets, ensure their longevity and make them readily discoverable.

Here are a few thoughts on a couple of themes that I picked up on from the conference, which was held at Te Papa (Museum of New Zealand). The conference was nicely organised, had some interesting guest speakers from here and overseas, and was very enjoyable (my thanks to the organisers).

Copyright


Since passive audiences have become active users of content, we’re all trying to figure out how to manage content ownership online and get a balance between commercial imperatives, the costs of digitisation, and the need to enable innovation and maintain a lively public domain of enduring use to citizens.

This is a big issue, and complex, and I don’t propose doing it justice in this post. I just want to acknowledge that it’s an issue affecting all branches of the creative industries and wonder out loud if we can’t jobshare the task of finding local solutions.

Five years ago copyright didn’t get a mention in a journalism curriculum. Now I feel duty-bound to raise it, introduce Creative Commons, have discussions about how to use images found on Flickr and Google, and introduce questions to ask yourself when publishing your own work – who do you want to be able to use it, how do you want them to be able to use it, do you want to be credited, how will you enforce your rights and so on.

Libraries and museums, meanwhile, have to track down who holds the copyright on historical images and material, decide what to do if the holder cannot be found, very often seek permission to use the material, and determine how to indicate to end users what they are entitled to do with the material (without making them read dense legislation, clauses and exceptions).

Then there are the people, like those at NZ On Screen, dealing with archival film and television material, who also have to hunt down copyright holders, very often consult dozens of people about a single video clip (producer, director, writers, etc) and manage how end users interact with the material.

Meanwhile there are anomalies in the way we reference material. We think nothing of grabbing a couple of paragraphs from a report or speech or blogpost to include in a news story or essay or artwork, but we tend to feel differently about grabbing a few paragraphs out of an audio or video clip to use in a news story or essay or artwork.

Content ownership, use and licensing aren’t simple. Laws and regulations vary in different jurisdictions, how they’re applied varies even within jurisdictions, and they are often densely written and impenetrable to your average end user. Creative Commons stands out not only for giving content creators simple licences to choose from but also for creating simple icons to describe them that are instantly recognisable.

To extend that kind of simplicity to digital content management in the New Zealand context would be fantastic.

There was also a clearly articulated need for greater education about copyright/fair use issues.

There was a suggestion at the conference that members of the forum should work together on a coherent and simple set of guides/licences/icons for New Zealand.

If that conversation continues, my instinct is that the news media should be involved. I suspect we have insights from our industry to share, and would benefit from learning more about the issues and insights of others.

After all, journalists need cultural and heritage collections for research and should be linking to them for the benefit of readers, and I suspect the news media could learn a lot about managing archives from the GLAM folk.

Visual and digital literacy

Newsrooms everywhere are trying to get journalists comfortable online and competent at storytelling in visual, aural and written forms (video, audio, images, text) so they can get their product out to their customers in whatever format they demand.

Journalism schools are finding ways to do the same while still teaching traditional skills such as writing clearly, checking facts, attributing information, providing context, avoiding ambiguity and being fair and balanced and accurate.

It’s deceptively difficult, in my experience.

You think to yourself, ‘I’ll introduce Flickr, that’s a useful resource’, then find yourself talking about how to shoot images, crop images, caption images and add metadata, search engines 101, how to use software such as Photoshop or Gimp, choose file sizes, understand compression and loss and file types, manage uploads and downloads, collaborate on content creation, use in-house content management systems, manage online accounts and profiles, understand privacy controls, host images for blogs, deal with links and broken links, and consider copyright and how to apply and acknowledge it in a variety of scenarios. Phew.

It’s not just newsrooms. The GLAM crowd face similar challenges of bringing their staff up to speed in these and other skills, because they too have to learn how to give their audiences what they want in a variety of engaging formats.

I get the feeling we’re all still finding our way and could use a bit of help.

Making our stuff findable

We can build beautiful, rich websites till the cows come home but they’re no good to anyone if people can’t easily find all that lovely content lurking beneath the homepage. That’s as true for news websites as it is for cultural archives and exhibitions, and it’s a topic that arose often in conversation at the NDF conference.

I’ve been cooling on destination websites for a while. You need to have a destination website, of course, but you need even more to have your content out where your audience is so they can trip over it often and usefully.

I often think it would be nice to create a website from the premise that you publish content all over the web and use the home site to curate it, rather than aggregating/curating first and then pushing out from your home site.

Either way, the big deal in making our content findable is…

Joining the dots

We reinvent the wheel a lot online, and we duplicate content and destinations. That’s partly because we’re all separate organisations doing our own thing. It’s partly because our stuff isn’t findable enough – I often go looking for information and come up empty, even though I know it must be out there somewhere.

But I think it’s also partly because we don’t try hard enough. We don’t allocate enough time for staff to go searching around topic areas, vet what they find, select the most relevant for users’ benefit, and think about how best to link to it.

News websites are perhaps the worst culprits. Some still don’t link out at all, to anything or anyone. Others have begun throwing in a few links to public documents and have finally brought themselves to link to, gosh, YouTube clips that they’re writing stories about. Others are doing a much better job.

But there’s often not enough evidence of news organisations behaving like members of society. There’s little thought about what a reader coming to a given news story might want to know about its background or what other questions it may raise for them. There’s little interaction with cultural, non-profit, government and other organisations with rich content that would be useful to readers.

There’s often little thought about how to provide useful links. Links in stories and listed at the bottom of the page are a great start, but how about ways to search other sites from the keywords generated by a news story, a way to book tickets to the show you’ve reviewed, a link to an online bookseller from a book review, or a map showing the location of the story topic and a way to click through and explore the location?

Easier said than done, I know, but still.

Those are just a few things chasing round in my mind after the NDF conference. There are many more. We were shown some great sites and exhibitions as well, which I’ll try to collate into another blogpost in a while.

Data, labels and the power of microformats (2009)

Somehow a week’s slipped by since the Open Govt Data Barcamp and Hackfest in Wellington without me giving an update here.

It was a great event with around 150 people putting their heads together on the Saturday to brainstorm the why, where, who and how of making more government data readily available to the public in usable formats.

On the Sunday work started on a handful of projects including gathering case studies, and a local version of the Sahana open source disaster management system. I wrote more about the event and the ideas underpinning it in a guest post on Idealog, and you can see more about the projects on the OpenNZ Wiki.

For a bit of background, you might want to watch Glen Barnes, one of the people behind the Open Data Catalogue, talking to Russell Brown on TVNZ’s Media7 about open data.

One of the themes of the barcamp was the need for structured data, a subject which is also relevant to the management of news content.

By structured I mean that the information in a document, database, spreadsheet – or news story – is consistently entered and marked up with additional information, or metadata, which describes the content in a way that both machines and humans can understand.

In the case of data it means parts of it can be extracted for use elsewhere. Think of pulling election polling data for a particular area and displaying it on a map – useful for your readers but only possible if the polling data is labelled in a way that a computer program can understand and if parts of the data – eg figures for a certain region – can be separated out and extracted on their own.
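To make that concrete, here is a minimal sketch in Python of pulling one region's figures out of consistently labelled data. The column names (`region`, `party`, `votes`) and every figure are made up for illustration; real polling data would have its own labels.

```python
import csv
import io

# Made-up sample data: column names and figures are illustrative only.
POLLING_CSV = """region,party,votes
North,Party A,1200
North,Party B,950
South,Party A,800
South,Party B,1100
"""


def votes_for_region(csv_text, region):
    """Return {party: votes} for one region from consistently labelled CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["party"]: int(row["votes"])
        for row in reader
        if row["region"] == region
    }


print(votes_for_region(POLLING_CSV, "South"))
```

Because every row carries the same labels, a program can separate out the figures for one region without a person retyping anything, and a map or chart can be built on top of the result.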

In the case of a news story it means people can see at a glance where the story was filed, when, by whom, what sources were used, its copyright licence and, perhaps, information on your editorial policy.


Among other things, this makes your stories more visible to search engines and therefore easier to find online, and boosts the value of your news archive by giving stories a longer and more useful life.

A project set up by the UK’s Media Standards Trust and Sir Tim Berners-Lee, with funding from the Knight Foundation and others, has come up with some straightforward suggestions for how news companies can better structure stories.

The project, Value Added News, has a nice introductory slide show about the value of semantic information and advice on how to add that information, using the hAtom and hNews microformats.

hAtom provides semantic information for basic aspects of a news story – such as who wrote it, who it was written for and when it was published. Information that should already be machine-readable but in most cases isn’t. Integrating hAtom will make key elements of your content machine readable, but won’t distinguish your content specifically as news or identify usage rights.

Value Added News Mark-Up (hnews)

To distinguish your content as news, and embed information about usage rights, a site should support additional levels of semantic standards. For this we recommend hNews. hNews is an extension to the hAtom microformat developed in partnership with AP and released into the public domain.

This is interesting stuff and AP, for one, has begun using these microformats.
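For the curious, here is a rough sketch in Python of what "machine readable" means in practice. The sample story is invented, and the class names (`entry-title`, `author`, `fn`, `published`, `dateline`) and the `item-license` link are used as I understand them from the hAtom and hNews drafts, so check the specs before relying on them. The fragment is not a complete or validated hNews entry, and the small parser is my own illustration, not part of either standard.

```python
from html.parser import HTMLParser

# Invented story fragment; not a complete or validated hNews entry.
SAMPLE_HTML = """
<div class="hentry hnews">
  <h1 class="entry-title">Sample headline</h1>
  <span class="author vcard"><span class="fn">A. Reporter</span></span>
  <span class="dateline">Wellington</span>
  <abbr class="published" title="2009-09-05T09:00:00+12:00">5 September 2009</abbr>
  <a rel="item-license" href="https://creativecommons.org/licenses/by/3.0/">Some rights reserved</a>
</div>
"""


class StoryParser(HTMLParser):
    """Collect a handful of labelled fields from marked-up story HTML."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._pending = None      # field whose text should arrive next
        self._in_author = False   # True after "author" until its "fn" is seen

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        rels = set((attrs.get("rel") or "").split())
        if "entry-title" in classes:
            self._pending = "entry-title"
        if "dateline" in classes:
            self._pending = "dateline"
        if "author" in classes:
            self._in_author = True
        if "fn" in classes and self._in_author:
            self._pending, self._in_author = "author", False
        if "published" in classes and attrs.get("title"):
            # The machine-readable date lives in the title attribute.
            self.fields["published"] = attrs["title"]
        if "item-license" in rels and attrs.get("href"):
            self.fields["item-license"] = attrs["href"]

    def handle_data(self, data):
        if self._pending and data.strip():
            self.fields[self._pending] = data.strip()
            self._pending = None


def parse_story(html):
    parser = StoryParser()
    parser.feed(html)
    return parser.fields


print(parse_story(SAMPLE_HTML))
```

The point is that the program never has to guess which text is the headline or the byline: the labels in the markup say so, which is what lets search engines and archives make use of the same information.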

Incidentally, the election map I mentioned above was one of the projects that got under way at last weekend’s hackfest.


It could end up being a nice app that would sit well on a news website. Now’s probably a very good time for news organisations to take an interest in open data and in the geeks who bring out the value in it.

Why I’m sponsoring Open Govt Data Bar Camp (2009)

First of all, what’s Open Govt Data Bar Camp?

“The New Zealand Open Government Bar Camp is an “unconference” for people who are interested in making government-held data more freely available for others to re-use. An “unconference” is an alternative participant-driven event, that avoids aspects of a conventional conference, such as high fees and sponsored presentations.

Web 2.0 developments have shown the potential of combining data from different sources made freely available on the Internet. The government holds a huge range of non-personal data which could form the basis of innovative services and applications by others on the Internet.

You should come if you are interested in government information policy, exploring ways to provide data, making entrepreneurial use of the Internet, or building working applications during a weekend.”

So why am I a) sponsoring it and b) attending?

Mostly because my instincts tell me this is important. I don’t know what will happen at the bar camp (August 28 and 29) or as a result of it. There’s even a good chance I won’t understand half of what’s being said, given that I’m not a programmer and cannot speak geek.

What I do know is that this is a conversation worth having and I want to be part of it.

Why? Because government data is public data. It’s data about us, that helps describe us and informs us, and is often provided by us. It’s ours. But too often public data is not published, which means it’s not available in the public arena. Other times data is published, but in a form that makes it hard to use. A PDF, for example, is great if you want to print out some information in the very format it was published in. But what if you want to lay it out differently? Or incorporate it in your website? Or take the data and mash it up to make it useful in a new way?

Some government data is published in usable form in New Zealand. A useful starting point for finding those datasets is the Open Data Catalogue, a newish website that aims to catalog all available government data. One of the creators of that site, Glen Barnes, will be at the bar camp and hopefully we’ll hear more about that project over the weekend.

I’d like to see more public data published in reusable formats to see what useful new applications can be found for it.

I see it as an essential ingredient in the evolution of new forms of journalism.

A good example of someone taking existing data and adding huge value to it is a US-based project which was led by a journalist, Adrian Holovaty, and which has just been acquired by MSNBC. It takes data from local government, libraries, police, emergency services and many more sources and ‘arranges it’ geographically so that visitors to the website can get a picture of what’s happening on their street. They can see figures for house listings, crime, new books available at the library, inspection ratings for local restaurants and much more, all on the one website. Previously, the visitor would have had to go to dozens of websites to find that range of information, or go to their local council building, or may not have been able to find it at all.
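As a very rough illustration of that idea of arranging data geographically, here is a Python sketch that groups records from different sources by street block. The records, field names and the `by_block` helper are all hypothetical; I have no inside knowledge of how the project itself is built.

```python
from collections import defaultdict

# Hypothetical records from different public sources; all values are made up.
RECORDS = [
    {"source": "police", "block": "100 Main St", "item": "Bicycle theft reported"},
    {"source": "library", "block": "200 High St", "item": "New books available"},
    {"source": "inspections", "block": "100 Main St", "item": "Cafe passed inspection"},
]


def by_block(records):
    """Group (source, item) pairs under the block they relate to."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["block"]].append((record["source"], record["item"]))
    return dict(grouped)


for block, items in by_block(RECORDS).items():
    print(block, items)
```

The grouping only works because each source records location in a consistent way, which is exactly why the format that public data is published in matters so much.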

Another project of interest to me is an open source initiative which aims to crowdsource crisis information by allowing people to add details to a map or timeline via text, email or web. The crisis could be a weather event, a medicine shortage, or a tumultuous election process. What results is a centralised visual display of information – which could easily also incorporate and/or link with data from government agencies and NGOs. I know there’s someone coming to the bar camp who’s interested in the Sahana open source disaster management system, which I’ll be interested to learn more about too.

These are just two examples of new forms of journalism and of using data in new and useful ways. There are many more possibilities. Let’s see what else we can come up with.

Guardian opens up its data and content with API (2009)

This is a pretty old post but I’m leaving it here for future interest. Failed links have been updated (where possible) or removed.

The big news on the evolving news scene this past week has been the Guardian creating an open API.

What does that mean?

It means the Guardian has opened up its stories and databases of information in a way that allows other websites to use Guardian content in new and interesting ways.

Say you have a website with a local focus, you could create a map of your area and let readers click on the town they’re interested in to see stories relating to that place. What they’ll see is stories written by the Guardian as well as those written by the website. That’s a big help for the small website, a useful service to readers and a boost for the Guardian, which gets even more people reading its stories and clicking through to its website.

Another website might take Guardian data and create interactive displays to allow readers to explore information in interesting ways. Another might add Guardian data to a knowledge bank that’s easily searchable, visually interesting and a boon for educators.

The point is that there are limitless ways people can use data and stories. The more people who play around with it, the more interesting resources and ways of displaying and interacting with data get developed. Everyone benefits. And none more than the Guardian, which, all going well, builds an ever greater readership and relationships with an ever greater number of developers and publishers.

The Guardian can also serve ads with the content it gives away through the open API, broadening its advertising reach.

Bill Thompson did a nice job explaining what the API is on This Way Up on Radio New Zealand National on Saturday.

Why is it important?

Because it recognises that the best way to get your content out to the most people is to let other people help you do it. It’s a kind of variation on more hands make light work. Only exponential.

It’s not smart to expect people to remember to come to your website each day to read your news stories (and view your ads). People are busy, distracted, forgetful and fickle. You have to find a way to get your news out to where your audience is hanging out online, keep reminding them you exist and enticing them back to your website or to use your mobile or social media products or to view your ads wherever they happen to be.

Feeding your news out on Twitter, Facebook, Bebo and other social networks helps. So does making it easy for people to Digg your content, bookmark it and share it on other social media sites. But having an open API takes sharing to a new level. It has the potential to spread your content around the internet to an extent and on a scale you could never achieve on your own.

As the Wired article points out, the Guardian is not alone: the New York Times has done something similar, and Google and others have been doing it for a while. But it’s relatively new to the news scene because newspapers have been coy about ‘giving away’ their content, preferring to keep everything tied up on their sites. Most are still not even linking out to other websites, let alone giving other websites direct access to their articles.

The details

Simon Willison was one of the developers of the Guardian’s open API and has blogged about the details. Here’s an excerpt about each of the two strands – the data and the content.

As a starting point, we’re publishing over 80 data sets, all using Google Spreadsheets which means it’s all accessible through the Spreadsheets Data API.

Here’s [news editor Simon Rogers’] take on it, from Welcome to the Datablog:

Everyday we work with datasets from around the world. We have had to check this data and make sure it’s the best we can get, from the most credible sources. But then it lives for the moment of the paper’s publication and afterward disappears into a hard drive, rarely to emerge again before updating a year later.

So, together with its companion site, the Data Store – a directory of all the stats we post – we are opening up that data for everyone. Whenever we come across something interesting or relevant or useful, we’ll post it up here and let you know what we’re planning to do with it.

It’s worth spending quite a while digging around the data. Most sets come with a full description, including where the data was sourced from. New data sets will be announced on the Datablog, which is cleverly subtitled “Facts are sacred”.

The Content API provides REST-ish access to over a million items of content, mostly from the last decade but with a few gems that are a little bit older. Various types of content are available—article is the most common, but you can grab information (though not necessarily content) about audio, video, galleries and more. You can retrieve 50 items at a time, and pagination is unlimited (provided you stay below the API’s rate limit).
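For anyone wondering what retrieving 50 items at a time looks like in code, here is a generic Python sketch of paging through results. It does not call the Guardian's API: `fetch_page` is a hypothetical stand-in for whatever request a real client would make, and `fake_fetch` simply simulates a service holding 120 items.

```python
def fetch_all(fetch_page, page_size=50, max_pages=None):
    """Collect items page by page until a short or empty page comes back.

    fetch_page is a hypothetical callable taking (page_number, page_size)
    and returning a list of items; it stands in for a real HTTP request.
    """
    items = []
    page = 1
    while max_pages is None or page <= max_pages:
        batch = fetch_page(page, page_size)
        items.extend(batch)
        if len(batch) < page_size:
            break  # a short page means there is nothing more to fetch
        page += 1
    return items


def fake_fetch(page, page_size):
    """Simulated service holding 120 numbered items."""
    data = list(range(120))
    start = (page - 1) * page_size
    return data[start:start + page_size]


print(len(fetch_all(fake_fetch)))
```

A real client would also pause between requests to stay under the rate limit mentioned above; `max_pages` is there as a simple safety cap.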

Articles are provided with their full body content, though this does not currently include any HTML tags (a known issue). It’s a good idea to review our terms and conditions, but you should know that if you opt to republish our article bodies on your site we may ask you to include our ads alongside our content in the future.

We serve 15 minute HTTP cache headers, but you are allowed to store our content for up to 24 hours. You really, really don’t want to store content for longer than that, as in addition to violating our T&Cs you might find yourself inadvertently publishing an article that has been retracted for legal reasons.
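The 24-hour rule is easy to picture as a small store that refuses to hand back anything older than a day. This Python sketch is my own illustration of the idea, not code from the Guardian; the `ExpiringStore` class and its methods are hypothetical names.

```python
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # the 24-hour storage limit quoted above


class ExpiringStore:
    """Keep content in memory and drop it once it is older than max_age."""

    def __init__(self, max_age=MAX_AGE_SECONDS, clock=time.time):
        self._max_age = max_age
        self._clock = clock   # injectable so the expiry can be tested
        self._items = {}      # key -> (stored_at, content)

    def put(self, key, content):
        self._items[key] = (self._clock(), content)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, content = entry
        if self._clock() - stored_at > self._max_age:
            del self._items[key]  # too old: forget it and ask again
            return None
        return content


store = ExpiringStore()
store.put("article-1", "Body text of a stored article")
print(store.get("article-1"))
```

When `get` returns nothing, the site would fetch a fresh copy, which is also how a retracted article would quietly drop out of circulation.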

Read the rest of what Simon has to say here. He notes that the response has been huge so “as a result it’s likely that API key provisions will be significantly lower than the overall demand for them. Please bear with us while we work towards a more widely accessible release.”

KiwiFoo ’09 and the power of conversation

This is a pretty old post but I’m leaving it here for future interest. Failed links have been updated (where possible) or removed.

I had the good fortune recently to attend KiwiFoo (aka Baa Camp), a kind of unconference which brings together a cluster of people from various fields who share at least one thing: a burning passion for what they do.

Hard to go wrong with a starting point like that and sure enough it proved a hugely entertaining and engaging weekend and something of a networking nirvana – every single conversation I had over those two days was interesting and useful and I made some great connections.

Sincere thanks go to the organisers Nat Torkington – who brought the idea of FOO (Friends of O’Reilly) to New Zealand – Jenine Abarbanel and Russell Brown for the invite, and for arranging such a stimulating event. My brain is still humming with ideas weeks later.

There’s no set agenda for KiwiFoo; instead you signal ahead of time what you’re interested in talking about and settle it down into scheduled one-hour sessions when you get there – then rearrange them until most people are happy with the spread.

For the most part the sessions are led by one or two people who get the ball rolling then open it up for discussion. Brilliantly simple and effective.

There were several highlights, including a very funny Saturday night town-hall debate which left me hankering for more oratory in my life (a certain lack of variation in the use of adjectives notwithstanding).

Another highlight for me was a session on the future of news in NZ which turned out to be lively and left me with the clear impression that people – all kinds of people – really care about keeping quality news alive.

We didn’t solve the problems of the world but started a good conversation and there were a few threads that have stayed with me. One is that the news media is broken, albeit not completely and in different ways for different people. Notably, a number of entrepreneurs talked to me about how little coverage there is of their sectors in mainstream news and what coverage there is often comes straight from press releases they’ve written themselves. They want their stories told in context and more coverage of the issues they face.

Advertising is a biggie – it’s not being sold well online and big agencies often don’t work for small publishers. And the need for good journalism to support democracy is paramount regardless of who’s publishing the news (newspapers, TV companies, bloggers, networks of independent journalists).

Probably the biggest takeaway for me was that the Future of News is a really big subject with multiple threads and I feel like we’re still looking for an effective framework and lexicon for discussing it. There was some suggestion that a MediaFoo might be a good idea and it certainly holds appeal for me. There’s a lot to talk about.

A particularly impressive outcome from KiwiFoo has been the way a core group of attendees have driven the #blackout campaign to halt the addition of the contentious S92A clause to NZ copyright law. The campaign is superbly documented on The Big Idea by Mohawk Media’s Helen Baxter.

Meanwhile, here are a few posts from other KiwiFoo campers: Mozilla hacker Robert O’Callahan, SixAparter David Recordon, Hard Newsman Russell Brown, Interclue’s Seth Wagoner, blogger David Farrar and the Strategist.

PA chooses open source (2008)

“We were being asked to do things that we just couldn’t bend the system any further to do.”

Sound familiar?

That’s PA’s IT development manager Paul Berman talking about the company’s decision to use an open source platform, Nuxeo, to build a better content management system.

Like everyone else on the planet PA is handling an increasing volume of video, audio and slide shows as well as text and they need a system that can manage that simply and efficiently.

Anyone who works in a newsroom will know that the legacy systems we have simply aren’t up to the job. Older web CMSs weren’t built for multimedia in any volume and no matter how many scripts the good people in IT write to make the ageing print CMS talk nicely to the ageing web CMS, it’s not a smooth process. In most newsrooms reporters still write into the print CMS.

Invariably getting something online involves multiple steps, things to remember, repetition and miles of mouse clicks and keystrokes.

That makes it difficult to distribute the burden of marking up and loading web copy among authors and editors, who are the natural candidates for the role in a web-first operation.

If it’s a really complicated process, they’re not going to want to do it, might not understand it well and cut corners, and will be distracted from the business at hand of researching and producing good stories.

“We have some really very specific requirements for our editorial applications in which we want to deliver a very effective tool for our journalists to do things with as few keystrokes as possible, efficiently and quickly,” Paul Berman said.

On the reason for choosing open source, he said: “It’s a mixture of the flexibility, the cost and the potential to scale it and make it really adaptable for our environment.”

Interesting to see a major news group choose open source given the customary preference in such organisations for proprietary ‘best of breed’ systems.