According to a news release this month, the alliance will work together to define new industry benchmarks in four categories — safety, accountability, fairness and efficacy, or “S.A.F.E.” — to address data security, reliability and equity in learning.
Jim Larimore, chief officer for equity in learning at Riiid, said the digital learning market has yet to establish uniform standards to determine the quality, reliability and efficacy of emerging technologies such as AI and machine learning. In Larimore’s view, more needs to be done to test ed-tech tools “on the front end” and to address the risks that may come with them. The goal, he said, is to catalyze discussions and guide the development of emerging technologies in schools.
“We do feel that with the use of AI in education, we’re in a relatively early phase. It’s a growth area in education where we really have to slow up a little bit now to make sure that we all have a shared set of understandings or definitions for what we mean by data security or cybersecurity, and what we mean by accountability,” he said. “It comes down to a matter of knowing you have to act in trustworthy ways ... We think that the industry, or those involved in education technology, have a responsibility to show up that way.”
DXtera Institute President Dale Allen said stakeholders from about a dozen countries have signed up to participate in the alliance as of this week, following a formal announcement of the initiative earlier this month at the ASU+GSV summit in San Diego, organized by Arizona State University and the modern merchant bank Global Silicon Valley.
Participating organizations now include Carnegie Learning, ETS, GSV Ventures, Digital Promise, the German Alliance for Education, and Education Alliance Finland, among others.
“We’ve been working on this for about a year behind the scenes on a growing community perspective, talking to others around the world who are saying, ‘There are no benchmarks, we need to develop shared understanding, we need to come up with a way to certify something is safe,’” Allen said of the formation of the group, which intends to propose new guidelines by next year.
He said the alliance will explore how best to develop anonymized data sets for building AI tools that are equitable, fair and unbiased, as well as how to design randomized controlled trials and other means to objectively measure the efficacy of AI digital learning programs.
Allen noted stringent testing may also help determine whether products comply with existing data privacy rules, such as the European Union’s General Data Protection Regulation and data privacy laws in California.
“There’s a lot of work to be done to create benchmarks and agreed-upon definitions, standards and testing procedures,” Allen said of how to meet existing data guidelines. “This is going to be the way forward for collective action.”
According to Larimore, the alliance could play a role in improving existing technologies, such as proctoring tools that use webcams and facial recognition to monitor virtual students and discourage cheating.
Larimore said students with ADHD, for instance, have been wrongfully flagged by these anti-cheating programs, which monitor for behaviors deemed suspicious. He said this is partly due to a lack of behavioral data used to train the AI, which means it doesn’t account for ADHD symptoms such as fidgeting and difficulty maintaining focus.
Some universities have dropped the use of proctoring programs altogether after students with darker skin tones reported not being recognized by the software, a recurring limitation with AI facial recognition since its early development.
“One of the things we know of those proctoring systems that rely heavily on computer vision or facial recognition is that there are hardware and software limitations to that,” he said.
Allen said the AI ed-tech industry can tackle shortcomings like these by building on larger, more representative data sets, so that tools can take a wider range of scenarios into account when making decisions.
“If you don’t train those tools and those algorithms on a significantly diverse set of data to begin with, or a large data set that looks like a population that will use it, you are going to take [from] the biases of the developer of the algorithm,” he said.