The Guardian has launched an
experiment in crowdsourcing following the publication of thousands
of MPs' expenses receipts.
The House of Commons has published 700,000 individual documents
in 5,500 PDF files on 646 MPs at parliament.uk.
The release covers four years' worth of expenses and claims, outlining
MPs' mortgages, second home purchases, moat cleaning and garden
furniture.
The newspaper has uploaded the documents to its own microsite and is
allowing people to investigate and analyse the data.
The tools enabling people to investigate MPs' expenses were
built by developers at the Guardian as a Django application
running on Amazon EC2. Developers said a major challenge was making
each page of the documents available for independent review.
The newspaper said it wants the public to help analyse the
information and potentially discover more news stories buried
within the material. It is hoping to build a picture of how MPs'
claims have changed over time, and find MPs who claimed small
amounts.
Janine Gibson, editor of Guardian.co.uk, said: "It's a huge
release of information, which manages to be both extremely open and
terribly closed at the same time. Open because it allows the public
unprecedented access to MPs' claims over a huge amount of time.
Closed because key address and personal details are blacked out,
and the information is impossible to analyse electronically."
She added that even if the documents do not lead to more
stories, it is hoped the site will help make the information more
transparent and more useful to users.
The site allows people to add narrative on individual expenses,
highlight documents of interest, tell the newspaper the context of
a receipt and how interesting it is, and enter the relevant
expenses figures and dates on each page.
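
As a rough illustration of the kind of record such a site might keep for each reviewed page, here is a minimal sketch in plain Python. It is not the Guardian's actual schema: the class names, field names and interest categories are all hypothetical, and plain dataclasses are used instead of Django models so that the example runs on its own.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import List, Optional


class Interest(Enum):
    """Hypothetical scale for how interesting a reader found a page."""
    NOT_INTERESTING = 0
    INTERESTING = 1
    INVESTIGATE = 2


@dataclass
class LineItem:
    """One expense figure (and optional date) a reader typed in from a page."""
    description: str
    amount: Decimal
    claimed_on: Optional[date] = None


@dataclass
class PageReview:
    """A single reader's review of one page of one MP's documents."""
    mp_name: str
    document_id: str
    page_number: int
    interest: Interest
    note: str = ""  # free-text narrative or context for the receipt
    items: List[LineItem] = field(default_factory=list)

    def total(self) -> Decimal:
        """Sum of the amounts entered for this page."""
        return sum((item.amount for item in self.items), Decimal("0"))


def pages_to_follow_up(reviews: List[PageReview]) -> List[PageReview]:
    """Return the reviews that readers flagged for further investigation."""
    return [r for r in reviews if r.interest is Interest.INVESTIGATE]


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    reviews = [
        PageReview(
            mp_name="Example MP",
            document_id="doc-001",
            page_number=3,
            interest=Interest.INVESTIGATE,
            note="Garden furniture receipt",
            items=[
                LineItem("Garden table", Decimal("120.50"), date(2007, 5, 1)),
                LineItem("Garden chairs", Decimal("79.50")),
            ],
        ),
        PageReview(
            mp_name="Example MP",
            document_id="doc-001",
            page_number=4,
            interest=Interest.NOT_INTERESTING,
        ),
    ]
    for review in pages_to_follow_up(reviews):
        print(review.document_id, review.page_number, review.total())
```

Keeping one record per page fits the developers' stated aim of making each page available for independent review, and storing amounts as Decimal avoids rounding errors when figures are added up.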