Is Open-Source AI the Future of Innovation or a Pandora’s Box?

By Thomas Morgan

The discourse surrounding artificial intelligence has been non-stop so far this year. Some of the world’s most prominent tech companies – such as Microsoft, Google, OpenAI, Amazon, and Meta – are pumping billions of dollars into building the most powerful generative AI tools, all in the hope of dominating the AI landscape.

These developments have raised serious questions about the ethical handling of generative AI tools and how accessible they should be to the wider public. At the time of writing, OpenAI’s ChatGPT, powered by GPT-4, is one of the most prominent generative AI tools on the market. It’s a proprietary model, otherwise known as closed source – the code, model architecture, training data, and model weights are solely controlled by OpenAI and cannot be accessed by outside parties.

OpenAI has been heavily scrutinized for this, and for multiple reasons. Not only is the company not as open as its name suggests, it also avoids having to share exactly what was used to train the model. This raises serious concerns about what information ChatGPT is trained on and whether it can be trusted. OpenAI is also facing major copyright lawsuits: a group of eight US newspapers – alongside a separate suit from The New York Times – claims that OpenAI has used millions of copyrighted news articles (without permission or payment) to train its LLMs.

In response to the limitations of closed-source products came the open-source AI approach – where the source code, models, and training data are all freely accessible to the public. 

A big advocate for open-source AI is Mark Zuckerberg, who recently announced that Llama, Meta’s ever-evolving AI model, is to become an open-source model that will soon be “industry standard” – and ready to compete with ChatGPT.

The Future Benefits of Open-Source AI

According to Zuckerberg, Meta is fully committed to an open-source approach for Llama, believing that keeping it closed would only hold back Meta’s long-term growth. To maintain longevity, Zuckerberg argues that a collaborative effort – one that draws on knowledge and contributions from outside Meta – is how the company will win the AI war.

Here’s a list of the key benefits that Zuckerberg sees in open source:

  1. Customization and fine-tuning: The ability to customize AI models lets organizations meet their unique requirements – on-device and classification tasks call for small models, while more complicated tasks demand larger ones (see the fine-tuning sketch after this list).
  2. Self-dependency: With closed-source AI, you depend on something you have no control over. Open source allows full access in case changes or alterations are required, making it a far more reliable option.
  3. Protection of data: Companies need to protect their data at all costs, and handing your data over to closed-source providers comes with its own risks. With open source, you maintain complete control over your data handling.
  4. Affordability: According to Zuckerberg, running Llama 3.1 costs developers roughly 50% less than GPT-4o.
  5. Longevity: Open source is reportedly advancing at a faster rate than closed source.
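
To make the customization point concrete, here is a minimal sketch of what fine-tuning an open-weight model might look like in practice. It assumes the Hugging Face transformers, peft, and datasets libraries, the gated meta-llama/Meta-Llama-3.1-8B checkpoint (access must be requested from Meta), and a placeholder your_dataset.txt file – none of these specifics come from the article itself.

```python
# Minimal LoRA fine-tuning sketch for an open-weight model.
# Assumes: pip install transformers peft datasets
# and that Meta has granted you access to the gated Llama checkpoint.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

model_id = "meta-llama/Meta-Llama-3.1-8B"  # gated checkpoint (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# LoRA trains a small set of adapter weights instead of the full model,
# which is what makes this kind of customization affordable in practice.
model = get_peft_model(
    model,
    LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)

# "your_dataset.txt" is a placeholder for an organization's own text data.
dataset = load_dataset("text", data_files={"train": "your_dataset.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_data = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-custom",
                           per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=train_data,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama-custom-adapter")  # saves only the small adapter
```

Note that the training loop, the data, and the resulting adapter all stay in your hands – exactly the self-dependency and data-protection benefits listed above.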

This all sounds great, right? You get full access to a tool that’s just as powerful as ChatGPT, all while saving money and enjoying all the customization freedom you desire.

But what if the promise of open source is not all that it seems?

Is Open Source Really That Open?

All these promises surrounding open-source AI – or, specifically, Meta’s tool – may seem too good to be true, and that could be because they are!

Despite Meta’s apparent commitment to transparency, Llama is only partially open source. Meta provides access to the model’s weights, allowing users to adapt it. However, the original training data remains concealed, and users lack the information needed to replicate the model from scratch.
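
To illustrate what “weights but no data” means in practice, here is a small sketch – again assuming the Hugging Face transformers library and the gated Llama 3.1 checkpoint, neither of which is named in the article. The files Meta releases describe the network in full detail, but nothing in them documents the training corpus.

```python
# Open weights expose the architecture and parameters, but not the training data.
from transformers import AutoConfig

# The config ships alongside the weights and fully specifies the network...
config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3.1-8B")  # gated (assumption)
print(config.num_hidden_layers, config.hidden_size, config.vocab_size)

# ...but there is no equivalent artifact for the training corpus: the release
# contains nothing that says what data went in, so the model cannot be
# rebuilt from scratch, only adapted.
```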

This means that Meta could be accused of “open-washing” – when AI models that claim to be open source offer only partial transparency into the product. Critics have argued that this selective openness undermines the core principles of the open-source movement, which is built on transparency and collaboration.

“Meta is continuing the industry standard of open-washing in AI. Llama 3.1’s launch does not fit any of the proposed definitions of open-source AI, failing on the common step: data. Meta’s release documents detail the data being ‘publicly available’, with no definition or documentation.” – Nathan Lambert, ML expert, Allen Institute for AI

Although access to the weights still benefits developers who want to train the model on their own data, withholding full access to Llama prevents users from gaining a complete understanding of the system – and ultimately sells a false promise.

This could, of course, change over time as Meta continues to improve and enhance its tool, and the decision may rest on how safe it is to release such a high-powered open-source model to the masses. If Meta’s Llama succeeds in the AI market, open source could become the default standard for the entire industry – a prospect that many leading AI contributors fear.

Google, Microsoft, and OpenAI have even lobbied together to discourage the use of open-source models, arguing that AI should be kept closed source and harnessed safely.

While this is partly a strategic move by companies that see proprietary models as more profitable, there is genuine concern about powerful open-source models falling into the wrong hands – and about the irreversible consequences that could follow.

The True Danger of Open Source

According to Meta’s benchmarks, Llama 3.1 is just as capable as models twice its size and performs at the same level as GPT-4o. If the model is as powerful as Meta claims, is it a mistake to make it open source and accessible to anyone?

Andrea Miotti, executive director of the AI safety group Control AI, says that open-source developers like Meta are “refusing to take responsibility” for damage their AI technology causes, which could lead to “catastrophic consequences”. Alongside this, Hamza Chaudhry, a US policy specialist at the Future of Life Institute, had this to say: “Most authoritarian states will likely repurpose models like Llama to perpetuate their power and commit injustices”.

Irresponsible use of AI has already been a huge talking point this year – from Taylor Swift falling victim to uncomfortable deepfake creations, to an “AI heist” that cost a Hong Kong company $25M, to thousands of people being scammed by AI voices mimicking loved ones in emergencies. Many already feel threatened by AI, and open source opens up an entirely different can of worms in terms of what it can create, support, and encourage.

OpenAI, for example, builds significant safeguards into its models to prevent users from abusing them – a genuine benefit of closed-source AI. Stripping that protection away means that Llama, or other future high-powered open-source products, could be used for any purpose.

That sort of freedom is extremely susceptible to abuse, with critics going as far as to suggest it could be used to develop biological or chemical weapons. While this could be dismissed as scaremongering from skeptics, it’s not completely off the cards that Llama’s open model could be used and abused, as has happened with many new technologies.

Summary

There’s no doubt that fully open-source AI benefits the tech industry. The freedom to customize a model to suit a business’s needs offers enormous potential for how high-powered AI can transform day-to-day operations. However, with great power comes great responsibility, and there is no way of ensuring that open source will always be used with ethical intentions.

Let us know your thoughts in the comments below!

The Author

Thomas Morgan

Thomas is a Content Editor at Salesforce Ben.
