This is a common question for anyone who reports on systemic issues with data and code. My first instinct is to be somewhat dismissive, because a single reporter at a tiny organization simply can’t cover that many data sources with enough rigor to have an impact. But instead of being dismissive, we need to be creative and work with people outside of journalism.
That’s why it’s so exciting that our readers have downloaded the Chicago parking ticket data and Chicago gang database so often that they have become two of the most popular offerings in ProPublica’s Data Store, our collection of free and commercial datasets. A small newsroom can only do so much. But together, with open data and open-source software, we can amplify the impact of work like our reporting on tickets and ticket debt.
One low-barrier way to start analyzing the data is a tool called Jupyter, which makes it easy to combine text, code and visualization in a “notebook.” These notebooks are accessible to folks without coding knowledge and are easy for novices to use. We’ve used such notebooks in the past to share our methodology. The more notebooks we build with the ticket data, the more anyone who wants to analyze the data will be able to draw on the work of others.
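As a rough illustration of what a first notebook cell might look like, here is a minimal sketch that counts tickets by violation type using only Python’s standard library. The column names (`violation_description`, `fine_amount`) and the three sample rows are made up for illustration; the actual files in the Data Store may be laid out differently.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for a ticket file.
# The column names and values are assumptions, not the real schema.
SAMPLE_CSV = """violation_description,fine_amount
EXPIRED PLATES,60
NO CITY STICKER,200
EXPIRED PLATES,60
"""


def count_by_column(csv_file, column):
    """Count how many rows share each value in the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# io.StringIO lets the inline sample behave like an opened file.
counts = count_by_column(io.StringIO(SAMPLE_CSV), "violation_description")

# List violation types from most to least common.
for violation, n in counts.most_common():
    print(violation, n)
```

In a notebook, the same function could be pointed at a downloaded file by passing it a file object from `open()` in place of the inline sample, with the column name adjusted to match the real data.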
We’re digging into how patterns in tickets may mirror patterns in policing, how tickets are issued for technicalities like a cracked windshield or other minor violations, how contested tickets and their outcomes vary, whether certain types of tickets and trends are related to gentrification, and more.
We’ve been working with civic data hacker Matt Chapman to give everyone access to accurate geographic data for every ticket in the database. Data scientist Matt Triano and senior software developer Ishmael Rufus have stepped up and provided leadership and significant contributions.
Know how to work with data? We can use your help. Want to learn? We’ll show you. We are meeting every Tuesday at Chi Hack Night to build a collection of examples showing how to analyze Chicago’s ticket data. Even if you aren’t able to join us in person, you can participate via GitHub. If you’re interested in working with the data yourself, get it at the Data Store.
We’ll release another 10 years of ticketing data in the coming weeks.
This story was originally published by ProPublica.