The Essential Starting Point: Data Quality
A government can use AI to become more data-driven in multiple ways. AI algorithms perform in-depth data analysis far more quickly than a person, producing faster insights into causes and solutions. And with the ability to process large volumes of data, it’s easier for an AI algorithm to detect trends and correlations that wouldn’t be readily apparent to a human analyst. These discoveries can help leaders make more effective policy and program decisions and can help program staff meet goals such as improving service delivery, preventing benefits fraud and increasing tax collections.
However, good insights depend on availability of quality data for the training data sets used by an AI model. These data sets can include a variety of structured or unstructured data, drawn from multiple sources. The critical factor is that all data has the quality and suitability to ensure the AI model’s output will stand up to scrutiny for fairness and accuracy.
For many governments, maintaining a high level of information quality is a challenge because data is often siloed into specific departments, programs and systems. Additionally, that data may be structured only for a specific type of report or transaction, making it difficult to access and analyze in a meaningful way with AI.
A government can improve data quality by focusing on three tasks. First, identify the right data to use and verify it is appropriate and valid for the intended AI analysis.
Next, perform basic data cleansing to eliminate duplicate, inconsistent, and inaccurate records and data fields. Consider creating central data stores that serve as a single source of truth for the data types processed by AI models.
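As an illustration of the cleansing step, here is a minimal Python sketch of normalizing text fields and removing duplicate records. The field names (`name`, `postcode`, `benefit`), the sample records and the choice of key fields are hypothetical; real cleansing rules would depend on the agency's own data.

```python
from typing import Iterable


def normalize(record: dict) -> dict:
    """Collapse whitespace and lowercase string fields so equivalent values compare equal."""
    return {
        key: " ".join(value.split()).lower() if isinstance(value, str) else value
        for key, value in record.items()
    }


def deduplicate(records: Iterable[dict], key_fields: tuple) -> list:
    """Keep the first record seen for each combination of key fields."""
    seen = set()
    unique = []
    for record in records:
        clean = normalize(record)
        key = tuple(clean.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(clean)
    return unique


# Hypothetical sample data: the first two rows describe the same person.
records = [
    {"name": "Ana Diaz ", "postcode": "2600", "benefit": "housing"},
    {"name": "ana  diaz", "postcode": "2600", "benefit": "housing"},
    {"name": "Li Wei", "postcode": "2601", "benefit": "childcare"},
]

cleaned = deduplicate(records, key_fields=("name", "postcode"))
print(len(cleaned), "unique records")
```

Exact matching after normalization only catches simple inconsistencies such as stray spaces or letter case. Misspellings and changed addresses would need fuzzier record-matching rules, which this sketch does not attempt.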
Finally, prevent bias by validating that the data used by AI models accurately reflects the affected community. Factors to check include the data’s source, how well it fits the AI model’s intended purpose, and whether the data fairly represents the diversity of the community.
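One simple way to check representation, sketched here in Python, is to compare each group's share of the training data with its share of the community. The group labels, the reference shares and the 5-percentage-point tolerance are all assumptions made for illustration; in practice the reference figures would come from a trusted source such as census data.

```python
from collections import Counter


def representation_gaps(sample_groups, population_shares, tolerance=0.05):
    """Return groups whose share of the sample differs from the
    reference population share by more than the tolerance."""
    counts = Counter(sample_groups)
    total = sum(counts.values())
    gaps = {}
    for group, expected in population_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            # Positive means over-represented, negative means under-represented.
            gaps[group] = round(observed - expected, 3)
    return gaps


# Hypothetical data set: 80 urban records and 20 rural records.
sample = ["urban"] * 80 + ["rural"] * 20
# Hypothetical reference shares for the affected community.
population = {"urban": 0.6, "rural": 0.4}

gaps = representation_gaps(sample, population)
print(gaps)
```

A check like this flags only gaps in group size. It says nothing about whether the records for each group are accurate or suitable, so it complements rather than replaces review by program staff.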
Building the AI Capability
An AI capability requires both technology and staff expertise. For the technology, AI models need powerful computing platforms to support high-volume data processing and distributed AI services. Commercial AI platforms are available from vendors, although some agencies may choose to build a platform in-house for higher customization and control.
For staff expertise, data scientists offer the understanding of how to build both predictive and prescriptive AI models for government purposes. It is critical that the department and program staff are involved from the beginning as data scientists can benefit from their experience and community knowledge. This program and community expertise can help design AI algorithms that reduce potential bias, as well as validate the AI analysis and recommendations.
If a government is unable to find or fund internal data science staff, outsourcing may be an option. In this case, a vendor develops the AI model and trains it with a government-provided data set. Once validated, that model can be deployed on a government’s platform for ongoing use by program, business and policy staff.
Maintaining Standards for Data Use
Data used by AI algorithms requires ongoing oversight to ensure continued compliance with legislative and regulatory requirements as well as internal policies. This oversight must also monitor compliance with regulatory restrictions on data sharing across departments, programs and agencies.
Oversight helps the organization address frequently raised issues of AI trust. For example, the logic and processing that AI algorithms use is often complex and hard to understand and explain. This lack of complete transparency can lead to more skepticism of AI among employees and constituents.
AI can also compound bias or errors that exist in data, which may create inaccurate, irrelevant or discriminatory analyses, profiling and decisions. Trust concerns especially apply to AI processing of personal or demographic data.
Several oversight activities, performed as a standard practice, can help alleviate concerns. These activities include:
- Checking data quality to verify it is current, accurate, relevant and fair
- Understanding the legislative and regulatory requirements around data use, especially for personally identifiable information (PII), where data anonymization tools can help
- Developing and enforcing internal policies that cover data privacy and ethics of data use for AI
- Verifying governance of AI tools and algorithms to ensure they are transparent and auditable about their use of data in automated processes, analyses and decisions
Consistent and executive-level attention to these factors is another key to fostering trust among employees and the public in a government’s use of AI technology.
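To make the PII item above more concrete, the following Python sketch replaces selected fields with keyed hashes before data reaches an AI model. This is pseudonymization, a weaker technique than full anonymization, and it is shown only as an illustration. The field names, the sample record and the key are hypothetical, and whether this approach satisfies a given regulation is a question for legal and privacy staff.

```python
import hashlib
import hmac

# Hypothetical list of fields treated as PII in this example.
PII_FIELDS = ("name", "tax_id")


def pseudonymize(record: dict, secret_key: bytes, pii_fields=PII_FIELDS) -> dict:
    """Return a copy of the record with PII fields replaced by keyed hashes.

    The same input and key always give the same token, so records can
    still be linked, but the original value is not stored in the output.
    """
    masked = dict(record)
    for field in pii_fields:
        if masked.get(field) is not None:
            digest = hmac.new(
                secret_key, str(masked[field]).encode("utf-8"), hashlib.sha256
            )
            masked[field] = digest.hexdigest()[:16]
    return masked


# Hypothetical key and record; a real key would be stored in a secrets manager.
key = b"example-key-not-for-production"
original = {"name": "Ana Diaz", "tax_id": "123-45-678", "benefit": "housing"}

masked = pseudonymize(original, key)
print(masked)
```

Non-PII fields such as `benefit` pass through unchanged, so the analysis can still use them. Combinations of such fields can sometimes re-identify a person, which is one reason this step does not remove the need for the oversight activities listed above.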
Making the Best Use of Data and AI
A well-defined AI framework is built from both technology elements and human guidance. Give an AI model the right data and run it on a powerful computing platform. Then guide it with the knowledge of experienced data science and program staff as well as oversight by appropriate policies. This framework can help a government leverage the power of AI to deliver positive impact for the public, while minimizing the risk of unintended negative outcomes.
To learn more about how to use AI to improve the future of government work, read the next article in our AI series, or to explore more thought leadership around AI, visit the SAP Institute for Digital Government.