In the week before the election, Rob Ford released an 83-page document that listed 2,721 of the donors to his campaign. The file was a PDF, but unlike many PDF files, the data could not be easily extracted, meaning it was not possible to sort and edit the information in a spreadsheet or database.
In order to do an initial investigation into what the data revealed, the document was passed through Optical Character Recognition (OCR). We were able to put the data into a spreadsheet, but this came with some drawbacks: the OCR did not properly recognize all the characters.
For example:
Peter Viris became Fotor Viris
Jaspal Bhangal became Jos ou 13hongo1
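As a rough illustration of why this needs human eyes, here is a minimal Python sketch that flags names which look obviously garbled. It is not part of our actual process; the function name `needs_review` and the rule it applies (every word must start with a capital letter and contain only letters, apostrophes, hyphens or periods) are assumptions made up for this example.

```python
import re

# Hypothetical rule: a word "looks clean" if it starts with a capital
# letter and contains only ASCII letters, apostrophes, hyphens or periods.
CLEAN_WORD = re.compile(r"^[A-Z][A-Za-z'.\-]*$")


def needs_review(name):
    """Return True if the name is empty or any word in it fails the rule."""
    words = name.split()
    return not words or any(not CLEAN_WORD.match(word) for word in words)


if __name__ == "__main__":
    # The two OCR results quoted above, plus one correct name.
    for name in ["Fotor Viris", "Jos ou 13hongo1", "Peter Viris"]:
        print(name, "->", "review" if needs_review(name) else "looks ok")
```

A check like this catches "Jos ou 13hongo1" because of the digits and the lowercase word, but it lets "Fotor Viris" through, since that looks like a perfectly plausible name. It would also wrongly flag real names with lowercase particles or accented letters. Errors of that kind can only be caught by someone comparing the spreadsheet against the original document.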
We are looking for volunteers to go through this document and help us clean the data by putting in the correct names. In addition, we are hoping that volunteers can help us write quick profiles of some of the biggest donors on the list.
If you are interested in helping with this project and you have one or more hours you can contribute, please sign up to be part of the team that will be working on this. For one person it would be a mammoth task, but if 20 people each give an hour or two, it will be done before we know it.
The idea is that the results of the project will be released by December 1st, when Rob Ford takes office, and prior to that, anyone who helps with cleaning the data will have advance access to it.
If you are interested in helping, please click here to fill out this form.