With such risks in mind, the U.S. House Committee on Science, Space and Technology met May 11 to discuss the path to improving open source software cybersecurity.
“It’s safe to say that anyone who has used a computer has relied on open source software,” said Rep. Bill Foster, D-Ill., who convened the hearing.
In fact, technology firm Synopsys reviewed 2,409 commercial codebases and found open source in 97 percent of them. These codebases came from 17 sectors, ranging from energy to financial services.
Software that is “open source” is created by volunteers and made free for anyone to use or alter. The code is often submitted to repositories where others can download it, and it is frequently used as a building block of other open source and proprietary software.
MAKING CHANGE
Plenty of challenges face this unique sector, but any cybersecurity improvements will benefit users worldwide, said Brian Behlendorf, general manager of the Open Source Security Foundation (OpenSSF). OpenSSF is a Linux Foundation project aimed at helping various groups collaborate on improving the security of the open source ecosystem.
“The bad news is there is a lot of work to do, and a lot of different kinds of work is needed,” Behlendorf said. “The good news is we know what that work is, and we've got some proven tools and techniques that can scale up if the resources are made available.”
Some of the most impactful efforts include spreading education about the fundamentals of secure software development and encouraging use of more secure, memory-safe coding languages, Behlendorf said. His organization is working to bring training to more people.
The government can play an important role by providing more tools and standards for software supply chain security, along with easy-to-follow guidance on how developers can maintain software projects and how users can adopt them responsibly and securely, said Amélie Koran, nonresident senior fellow in the Cyber Statecraft Initiative at the Atlantic Council’s Scowcroft Center for Strategy and Security.
In some cases, awareness campaigns are needed to alert developers to existing resources, while other cases call for new supports, Koran said.
Some government efforts are forthcoming: The National Science Foundation (NSF) is expected to provide grants for securing elements of the open source ecosystem.
With so much open source software out there, it’s also important to determine which software to prioritize when directing resources. Government and open source community partners need to identify and focus first on the software that supports critical functions or underpins a vast array of other software, Behlendorf said.
Recent moves in that direction include a joint project from the Linux Foundation and the Laboratory for Innovation Science at Harvard, which cataloged the 1,000 open source libraries most widely used in “commercial and enterprise applications” and published the list in March 2022.
But lasting change will take sustained effort on many fronts.
“None of this will be a quick fix,” Koran said. “It requires a consistent, reliable effort from every group I've mentioned and from each angle to ensure success.”
THE COMPLEXITIES OF OPEN SOURCE SECURITY
Volunteer Force, Volunteers’ Focus? — The benefits and risks of open source go hand in hand.
Developers who cannot afford to build entire systems from scratch can create new offerings by using free open source components, said U.S. Air Force CIO Lauren Knausenberger.
That’s been a game-changer in artificial intelligence (AI), for example, where developers can tweak existing models to suit their needs — and skip the intense investment required to train and build models from the ground up, said Andrew Lohn, senior fellow at Georgetown University’s Center for Security and Emerging Technology (CSET).
“In much the same way that bakers today don’t grow their own grain or raise their own hens, most AI developers simply combine ready-made components then tweak them for their new applications,” Lohn said.
Volunteers power open source projects, although roughly half of contributors get compensation from their employers, according to a 2020 international survey from the Linux Foundation and the Laboratory for Innovation Science at Harvard (LISH).
There’s a risk that the work that grabs volunteers’ attention isn’t what’s most important to safeguarding the ecosystem.
Both paid and unpaid contributors dedicated relatively little time to security, the Linux Foundation–LISH survey found. Paid contributors said they put about 22 percent of their time into new code, but only about 2.5 percent into security. Unpaid contributors were similar, dedicating 26 percent of their time to new code and roughly 3 percent to security.
Such dynamics are playing out in the AI space, Lohn said.
“A lot of the research into ‘What are the vulnerabilities we need to be worried about?’ is being done bottom up,” Lohn said. “But academics often chase the most interesting problem, not the most relevant ones.”
Lohn suggested the government could help by identifying and encouraging focus on key issues.
Security of projects in general can also be boosted by ensuring that those working on open source are familiar with secure software development best practices, Behlendorf said. Such training can help developers anticipate how their code might be abused or used in unexpected ways and plan against it.
More Players, More Transparency — The volunteer nature of the work and the great number of people involved are both an asset and a liability.
It means that developers’ methods “vary tremendously and in ways that affect the quality and security of each piece,” Behlendorf said.
But having so many people reviewing and experimenting with code also increases the chances that errors and vulnerabilities are discovered and quickly repaired. The vulnerability in the open source Log4j software was detected and fixed faster than the similarly headline-topping vulnerability in SolarWinds’ proprietary software, for example, Knausenberger said.
“It's transparent: you can see the code base. You can even see how the developers go through thinking about how they'll fix a particular bug and the online dialog around adding a particular feature,” Knausenberger said. “It’s often more secure, in my opinion, because it is thoroughly reviewed and vetted by the community as well as battle tested by companies around the world.”
Because anyone can submit open source code, there’s a risk that malicious contributors will try to abuse the process, Lohn said. They could poison the data on which AI models train, for example, or insert hard-to-detect backdoors into pieces of open source software that would later allow hackers to compromise final projects.
Such efforts won’t necessarily get through. Nonprofits maintain and manage some of the open source repositories and check for security issues. In 2020, University of Minnesota researchers tested this process by submitting a Linux kernel software patch with a backdoor embedded; the developer community caught the effort and banned the university, Behlendorf said.
Being able to pinpoint the developers behind software can help users and reviewers decide whether to trust it. In the AI space, Lohn said it’d be useful to have a system ranking AI resources with letter grades based on how much is known about their origins. An “A” ranking might go to AI tools where the developer and chain of custody are known, for example.
The Users’ Role — Creating secure code is only part of the fix. Users adopting that software also need to maintain it well.
That includes ensuring they implement the code in ways that don’t introduce vulnerabilities.
For example, the Equifax data breach stemmed from poor practices around maintaining and configuring open source software, Koran said. Following guidance on how to “manage and use and integrate that code into the digital environment” could’ve prevented it.
Regularly checking for patches is also important, but can be a challenge if users don’t know whether their software includes the compromised code. Some hope a Software Bill of Materials (SBOM) could help.