Google, Elon Musk, and Mark Zuckerberg say their AI is open source, but a new definition may challenge that.
Makers of generative artificial intelligence (AI) models, such as Meta’s Llama or Elon Musk’s Grok, claim they are open source. But few agree on what open source AI actually is.
That could change with a new working definition of what the term means for AI, just released by the Open Source Initiative (OSI), the self-appointed steward of the term.
Open source generally means that a program’s source code is publicly available for anyone to use, modify, and distribute.
The OSI’s open source definition also requires software to comply with 10 criteria, including having a well-publicised means of obtaining the source code at a reasonable cost or for free, not being discriminatory, and a licence that does not restrict other software.
But AI systems are harder to assess against those 10 criteria, so the OSI has drawn up a separate definition specifically for AI.
The open source AI definition states that an open source AI system can be used for any purpose without seeking the company’s permission, and that researchers should be able to freely see how the system works.
It also states that the system can be modified for any purpose, including to change its output, and shared with others to use, with or without modifications, for any reason.
The definition states AI companies must also be transparent about the data used to train the system, the source code used to train and run it, and the weights, the numerical parameters that influence how an AI model performs.
Herein lies the problem. OpenAI, despite its name, is closed source in that its algorithms, models and data sets are kept secret.
But the models from Meta, Google, and Musk’s xAI, which the companies claim are open source, do not really qualify either, if you go by the OSI definition. This is because the companies are not transparent about what data is used to train the weights, which can cause copyright issues and raise ethical questions about whether the data is biased.
The OSI acknowledges that sharing full training data sets can be challenging, so this criterion is not black and white. Falling short of it does not automatically disqualify otherwise open source AI development from being considered “open source”.
The definition has been a couple of years in the making and will likely need to be updated as AI progresses.
The OSI developed the working definition in consultation with a 70-person group of researchers, lawyers, policymakers, activists, and representatives of big tech companies such as Microsoft, Meta, and Google.
“This definition will become a valuable resource to combat the widespread practice of ‘openwashing’ that is becoming quite rampant,” Mozilla representatives Ayah Bdeir, Imo Udom, and Nik Marda said in a statement sent to Euronews Next.
They explained that “openwashing” was where non-open models (or even open-ish models like Meta’s Llama 3) are promoted as leading “open source” options without contributing to the commons.
“Researchers have shown that ‘the consequences of open-washing are considerable’ and affect innovation, research and the public understanding of AI,” they added.
“We are the stewards, maintainers of the definition, but we don’t really have any strong powers to enforce it,” Stefano Maffulli, the OSI’s executive director, told Euronews Next in an interview in March.
He added that judges and courts around the world are starting to recognise that the open source definition is important, especially when it comes to mergers, but also to regulation.
Countries around the world are finalising how they will regulate AI, and open source software has been a point of contention.
“The open source definition serves as a barrier to identify false advertising,” said Maffulli.
“If a company says it’s open source, it must carry the values that the open source definition carries. Otherwise, it’s just confusing.”