In this installment of the Innovation of the Month series (see last month's story here), we explore the University of Pennsylvania’s Master of Urban Spatial Analytics (MUSA) Practicum, in which graduate students work with city officials to develop data science tools that help their clients decide how best to allocate resources. The program is led by Professor Ken Steif along with Karl Dailey and Michael Fichman.
MetroLab’s Executive Director Ben Levine sat down with Professor Steif and some of the program’s graduate students to learn more.
Ben Levine: Can you explain the concept of urban spatial analytics? Is this a new practice or an existing field in which you are applying new approaches?
Ken Steif: Spatial analysis explores "where" questions. Typically, most phenomena are distributed across space according to some underlying theory. Examples include segregation, zoning, environmental degradation and the like. Spatial analytics is the study of these underlying spatial dynamics toward a broader understanding of why things locate where they do. In recent years, the Master of Urban Spatial Analytics (MUSA) program has tried to take a relatively uncommon approach, harnessing spatial dynamics for the purposes of prediction.
Levine: Can you tell me about the MUSA program and how the practicum fits into the coursework?
Steif: The MUSA program at Penn is at the intersection of data science and public policy. The program, which began as a GIS degree, is about a dozen years old. In recent years, we have shifted toward a more traditional analytical curriculum, building upon GIS skills with data visualization, machine learning, Web development and even how to communicate technical analyses to non-technical decision-makers. Aside from data-driven coursework, students are required to take urban-focused classes from across the university in order to ask interesting and important questions about their data.
Our goal is to train the next generation of data scientists working to convert government data into actionable public policy intelligence.
Levine: Why the emphasis on having students engage with city agencies?
Steif: The open data movement has been a catalyst for the evidence-based policy movement, but it's unclear whether governments have the resources to capitalize on these data. Besides the clear educational benefit of having the students work with real-world clients, we saw this opportunity both as a way to demonstrate novel, as yet unexplored public-sector use cases, and as a vehicle for releasing source code that other jurisdictions, collecting the same data, could use to replicate our algorithms.
This year, the MUSA Practicum consisted of five student-led projects, three of which are highlighted here. For more information on each of these projects, you can visit MUSA’s website.
Project Title: Property Tax Foreclosure Early Warning System: A Look into Philadelphia
Team members: Nate Klass, Justine Kone, and Sydney Goldstein
Project contact: Jonathan Pyle, attorney at Philadelphia Legal Assistance
Levine: Can you describe what your project focused on and what motivated you to address the particular challenge?
Nate Klass: Our project aimed to help legal advocates at Philadelphia Legal Assistance (PLA) identify residential property owners who are at high risk of tax foreclosure. Currently, outreach to homeowners at risk of property tax foreclosure is restricted to certain areas of Philadelphia due to funding. We hope our data-driven approach to predicting foreclosure can create a tool that will maximize resource allocation, allowing advocates to better target homeowners vulnerable to sheriff sale.
Justine Kone: We were motivated by our individual interests in foreclosure and housing in general, and wanted the opportunity to use data analysis, which can sometimes be quite impersonal, to help improve the lives of people in our community.
Sydney Goldstein: This project was also motivated by the need to increase housing stability for long-term and lower-income residents in Philadelphia. Working with Jonathan Pyle of PLA, we learned that these are the clients PLA works with most frequently and are the hardest to reach through the current outreach system.
Levine: What motivated your agency to participate in the MUSA practicum?
Jonathan Pyle: Philadelphia Legal Assistance is a legal services organization that addresses the legal needs of the low-income population of Philadelphia. In order to do that, we need to understand the systemic issues that affect low-income people. When trying to understand the issues that affect populations, it isn't enough to rely on anecdotes. We aim to be data-driven so that we can set priorities and allocate our limited resources empirically to maximize our impact. We collect a lot of data in the course of our work, but we don't have in-house expertise in advanced data analytics, so we cannot make full use of the data. The MUSA practicum offers us an opportunity to work as a team with data scientists. As part of the team, we provide data and subject matter expertise, while the MUSA team develops a predictive model, tests it thoroughly and builds an app.
Levine: What did you learn about government data and urban analytics from the process? What was the most surprising thing you learned during your research?
Klass: I think what was most interesting to me was just how much publicly available data is hard for most people to parse. It took us a significant amount of time to reshape our data into something that was usable for our modeling purposes. Without the data painstakingly put together by Jonathan, we really would not have been able to make this project a reality.
Kone: I completely agree with Nate. For me, I think this project also cemented the importance of using solid domain knowledge to inform the process. Much of the data we worked with was administrative court data, and if we hadn’t conducted our own research and relied on Jonathan’s knowledge, we wouldn’t have been able to interpret that data well enough to understand how different court events fit into the foreclosure process.
Goldstein: I also think this project showed the importance of understanding the resource allocation process for a specific research question. Throughout the entire project, we constantly came back to whether or not our results were reasonable given the use case. Understanding how outreach was currently being done was important to knowing how well our analysis performed.
Levine: Where will this project go from here?
Klass: We’re pretty hopeful this model will be used by Philadelphia Legal Assistance. We created a Web app for this project that Jonathan seemed pretty interested in using in conjunction with his current methods for outreach. We will send this app to him soon so that he can take it and hopefully run with it!
Kone: We’ve also made our code publicly available online, through the course GitHub page and also through our own pages. The idea is that if other people are interested in creating a similar warning system for their city, they can use our code as a framework.
Goldstein: We are also hopeful that Philadelphia Legal Assistance will not only adopt our app, as Nate mentioned, but that the probabilities we predicted may also help its decision-making process in the court cases it works on.
Levine: How will your agency be implementing these findings?
Pyle: We will use the findings to better understand the real estate tax foreclosure phenomenon and the effectiveness of interventions. We will explore questions like: what are the conditions under which the tax collection process results in low-income homeowners losing their homes? For which homeowners does legal assistance mean the difference between saving a home and losing it? Can data be used to identify the most vulnerable homeowners, so that we can target our outreach efforts at those homeowners? The answers to these questions will help Philadelphia Legal Assistance increase the effectiveness of its legal services to tax-delinquent homeowners, and will help the city of Philadelphia balance the need to collect real estate taxes with the need to avoid the harmful effects of foreclosure.
Project Title: Predicting Spatial Risk of Opioid Overdoses in Providence, RI
Team members: Jordan Butz and Annie Streetman
Project contacts: Dahianna Lopez, Data and Evaluation Manager, Healthy Communities Office, city of Providence; and Lt. Thomas Stegnicki, Quality Assurance, Providence Fire Department
Levine: Can you describe what your project focused on and what motivated you to address the particular challenge?
Jordan Butz: We built a spatial risk model of opioid overdoses for the city of Providence that assigns a level of overdose risk to each area of the city. The idea is that having a citywide risk map could assist Providence and local stakeholders in strategically allocating resources in a way that will achieve the greatest impact. We were motivated to pursue this project for two reasons. First, the opioid epidemic is one of the greatest public health crises facing both cities and rural areas, and we wanted to find a way to contribute to such a relevant and pressing issue. Second, when we began this project, Providence had just launched its Safe Stations program, which allows people struggling with substance use disorders to walk into any of the city’s 12 fire stations and be connected with supportive services.
Annie Streetman: We wanted to see if predictive modeling and machine learning could bolster the city’s existing efforts by identifying areas at high risk of overdoses where additional interventions could be sited or where the city could supplement its communication efforts. We ultimately compiled this information into an interactive tool that can be found here.
Levine: Can you talk about your agency’s participation in the MUSA practicum?
Dahianna Lopez: Like many cities across the country, Providence has been hard hit by the opioid epidemic. As a creative capital, the city was searching for innovative ways to address this public health emergency. Leading up to our participation in the MUSA practicum, our data team had devised a text analytics protocol for identifying opioid overdose cases from ambulance runs. The next step would be to analyze the data, but the capacity to do predictive analytics was limited. As such, participating in the MUSA practicum was timely and relevant.
Levine: What did you learn during your research?
Streetman: We learned how difficult it was to identify the number of overdoses that had occurred in the city. To assemble the data set of 2017 overdoses that we used for this analysis, the Providence Healthy Communities Office undertook a tremendous effort, using text analytics to identify overdose cases from EMS data.
Butz: Ideally, tagging an event as an overdose could be incorporated into 911 call or EMS run data so that the overdose data set could be updated with more regularity and capture events over a longer period of time. This would enable the predictive model to be validated on time-based variables in addition to randomized locations of Providence. While the city has expressed plans to collect this data in a more streamlined manner going forward, their text analytics work taught us that just because data doesn’t exist in a clear and accessible format doesn’t mean it’s impossible to obtain.
Levine: Where will the project go from here?
Streetman: We are thrilled that Providence has been very enthusiastic about the outcome of this project. We have transferred ownership back to them and they have indicated that there are plans to incorporate the results of our analysis into their current efforts and to continue the project going forward. We would love to work with them or other cities on continuing to develop this project in the future.
Levine: How will your agency be implementing these findings?
Chief Zachariah Kenyon: The Providence Fire Department (PFD) will use the interactive map to conduct targeted outreach for Safe Stations, a program aimed at connecting people struggling with addiction to recovery services. The PFD data team will also use the code provided by MUSA students to refine the predictive models over time and to ensure project sustainability. Finally, PFD will use this project to demonstrate the need for funding to establish a data analytics team. The project has caught the attention of the state Department of Health, and the PFD, in conjunction with the city’s Healthy Communities Office, is excited to share the information and contribute to the body of knowledge about the epidemic in Rhode Island.
Project Title: Predicting Lead Presence in Minneapolis
Team members: Evan Cernea, Maureen McQuilkin, and Xiao Wu
Project contact: Stacie Blaskowski, Data Scientist, city of Minneapolis
Levine: Can you describe what your project focused on and what motivated you to address the particular challenge?
Evan Cernea: Our project focused on using property-level data and neighborhood characteristics to predict the likelihood of lead presence in houses across Minneapolis. Although we were initially assigned this project, we became personally invested when we learned that the way most household lead is detected is after a child has tested positive for lead poisoning. We hope that our project will help the city become more accurate and proactive about its testing to prevent children from suffering the effects of a preventable illness.
Levine: What did you learn about government data and urban analytics from the process?
Maureen McQuilkin: In part we learned how lucky we were to work with Minneapolis, a city that is a leader in the civic open data movement and has many of its records publicly available and well documented. It was surprising to see other groups in the practicum struggle with data collection when we had access to such a wealth of data. But the amount of data also taught us a lot about project management: We had limited time to analyze a lot of data, and had to make decisions about what to include in and exclude from our predictions in the interest of time, at the cost of slight improvements to the model's predictive power.
Xiao Wu: It was surprising to learn about the policies in place to deal with residential lead paint. There are many rules and regulations about what renovators must do when they are improving a house to prevent lead dust exposure. It was especially salient in Minneapolis, where we learned that two-thirds of the houses were built before 1978.
Levine: Where will this project go from here?
Cernea: The project results have been transferred to the city of Minneapolis, which expressed interest in improving upon the code and predictions.
About MetroLab: MetroLab Network introduces a new model for bringing data, analytics, and innovation to local government: a network of institutionalized, cross-disciplinary partnerships between cities/counties and their universities. Its membership includes more than 35 such partnerships in the United States, ranging from mid-size cities to global metropolises. These city-university partnerships focus on research, development, and deployment of projects that offer technologically- and analytically-based solutions to challenges facing urban areas including: inequality in income, health, mobility, security and opportunity; aging infrastructure; and environmental sustainability and resiliency. MetroLab was launched as part of the White House’s 2015 Smart Cities Initiative. Learn more at www.metrolabnetwork.org or on Twitter @metrolabnetwork.