AI Policy and the importance of foundation models

Given recent developments in AI policy from the European Union, this piece focuses on understanding the terms and concepts related to foundation models, also referred to as General Purpose AI (GPAI) models or frontier models, and on the relevance of these models to managing AI policy in organisations.

In order to do so, this paper will cover these topics:

  • Brief discussion of the issues surrounding AI terminology

  • Description of foundation models 

  • Explanation of why foundation models are important for understanding AI policy

Pexels. An artist’s illustration of artificial intelligence (AI).

Issues surrounding AI terminology

You would be forgiven for not keeping abreast of all of the AI terms flying around, including diffusion models, multimodal models, GANs and transformer models, or for being tongue-tied when asked to define even a term as seemingly simple as ‘AI’. Many of the terms used in the field of AI do not, as of yet, have clear and agreed-upon definitions, including the terms ‘AI’ or even ‘intelligence’, for that matter.

One challenge in deriving a simple, agreed-upon definition for artificial intelligence (AI) is that AI does not refer to one thing: AI is an umbrella term. Definitions vary in the type of details included, such as the related fields (computer science or mathematics), the tools involved (digital computers and robots), and what AI actually does, such as completing tasks that previously required human thinking.

A second challenge for defining AI is that it is an emerging technology, and terminology is a social construct understood within its context. In fact, it is this vital connection between technology and society which is leading to a major shift in the governance of technologies such as AI. Technologies like AI do not operate in isolation; they are co-created with and deeply embedded in society, and are therefore referred to as ‘socio-technical systems’. As highlighted by academics studying the governance of socio-technical systems, there is a blurring of boundaries between state and non-state action. Evidence of this shift is currently at play in the AI governance and policy arena, where the governance of technology is changing in terms of its objectives, the actors involved and the activities undertaken. This will become relevant later in this piece, when we discuss the role of foundation models in recent AI policy discussions.

To demonstrate the range of definitions for AI, and in keeping with the theme of AI policy, the table below provides a few definitions used by globally respected bodies in the AI policy field. While not exhaustive, these definitions illustrate the difficulty of reaching consensus on a single definition for AI.

Table 1

Select list of definitions of artificial intelligence in the AI policy arena

Foundation models

On February 24, 2024, a landmark agreement took place among EU Member States to unanimously endorse the Artificial Intelligence Act, a set of rules for the “development, placement on the market and use of AI systems in the European Union, following a proportionate risk-based approach.” Included in this set of rules was the governance of “high-impact general-purpose AI models that can cause systemic risk in the future.”

So what exactly are general-purpose AI models?

A ‘general-purpose AI’ or ‘GPAI’ system is an emerging type of AI system, also commonly referred to as a ‘foundation model’, which is trained on massive datasets and designed to produce a wide range of outputs and support a wide range of use cases. GPAIs can be standalone systems or the ‘base’ for many applications. A commonly known example is OpenAI’s GPT-4, the foundation model for the conversational chat agent ChatGPT.

As depicted in the image below, a foundation model is trained on large amounts of data and can then be adapted and applied to a variety of tasks. Foundation models usually train on unlabelled datasets, which removes the time and expense of labelling items in datasets and thus increases the scale of the model. Foundation models also rely on ‘transfer learning’, meaning the models apply patterns learned from one task to another.
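The transfer-learning pattern above can be sketched in a few lines of code. This is a toy numpy illustration under stated assumptions, not a real foundation model: the pretrained weights are simulated rather than learned at scale, and all names (`W_base`, `extract_features`, `head`) are hypothetical. The point it shows is the workflow: a large ‘base’ layer is frozen and reused, while only a small task-specific head is fitted on the downstream data.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretraining": a base layer learned once on a large, generic dataset.
# Here the learned weights are simulated directly; a real foundation model
# would obtain them via large-scale training.
W_base = rng.normal(size=(8, 4))

def extract_features(X):
    """Frozen 'foundation' layer, reused unchanged across downstream tasks."""
    return np.tanh(X @ W_base)

# Downstream adaptation (transfer learning): fit only a small task-specific
# head on a modest dataset, reusing the frozen features.
X_task = rng.normal(size=(50, 8))
y_task = extract_features(X_task) @ np.array([1.0, -2.0, 0.5, 3.0])
head, *_ = np.linalg.lstsq(extract_features(X_task), y_task, rcond=None)

print(np.allclose(extract_features(X_task) @ head, y_task))  # True
```

Because only the small head is fitted, the downstream task needs far less data and compute than training a model from scratch, which is exactly why so many applications are built ‘on top of’ one shared base.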

After researchers defined foundation models, the term generative AI came onto the scene as an umbrella term to describe a range of techniques that can be used to turn text into images (diffusion models), generate natural language responses to a wide range of inputs (large language models), and generate text, translate languages and write different kinds of creative content (generative pre-trained transformer or GPT).

Foundation models are getting larger and more complex, and hundreds of models are now available. Increasingly, these large models are multimodal, meaning they accept multiple types of input and generate multiple types of output. For example, Google’s PaLM-E is a multimodal language model able to perform visual tasks (e.g. describing images, detecting objects, classifying scenes) and robotics tasks (e.g. moving a robot through space) (ref).

Instead of building new models from scratch, businesses and startups are customising pre-trained foundation models.

From a policy perspective, this makes regulation challenging because foundation models have created an entire ecosystem, one that requires governance of supply chains, deployers and developers.


Importance of foundation models for understanding AI policy

The ability to build different applications for many purposes ‘on top of’ foundation models makes them difficult to regulate. To regulate AI, one needs to clearly distinguish between types of models, their weights and their capabilities.

“Foundation models can be made available to downstream users and developers through different types of hosting and sharing. Some models are private and hosted inside a company (like Google DeepMind’s Gato), some are made widely available via ‘open source’ distribution (like HuggingFace’s BLOOM), and some are hosted on cloud computing platforms, like Microsoft Azure or Google Cloud, and made accessible via an application programming interface (API)” (ref). A single issue with a model at the foundation stage could create a cascading effect that causes problems for all subsequent downstream users.

The image below, from the Ada Lovelace Institute website, is an excellent diagram depicting the complex supply chain of a foundation model and gives insight into why these models present a regulatory quagmire for AI policymakers.

For businesses and startups that are building generative AI into their products, selecting a foundation model (FM) is one of the first and most critical steps.

A range of challenges still exist for developing AI products using foundation models (ref).

  1. Building applications: While foundation models (such as GPT-3, PaLM, Flamingo, DALL-E and Stable Diffusion) are adaptable to a range of downstream tasks and are producing some exciting results (such as GitHub Copilot), there are still development gaps in the applications that sit on top of these foundation models.

  2. Running models: Specialised hardware such as GPUs and a massive amount of computing power impose operational constraints that limit throughput and increase costs.

  3. Models do not always work: Developers often need to devise solutions or tools to improve inference, such as fine-tuning and distillation.

  4. Models are unpredictable: Hallucinations, and data ingested into these models from the darker corners of the internet, mean there are risks of biased, offensive and incorrect results.
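The distillation mentioned in point 3 can be sketched with a minimal numpy example. This is an illustrative toy under stated assumptions, not a production technique: the ‘teacher’ is a hypothetical two-layer network standing in for a large model we can only query, and the ‘student’ is a much smaller linear model trained to mimic the teacher’s outputs rather than the original labels.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 'teacher': a larger model we can only query for predictions.
W1, W2 = rng.normal(size=(16, 32)), rng.normal(size=(32, 1))
def teacher(X):
    return np.tanh(X @ W1) @ W2

# Distillation: fit a much smaller linear 'student' to reproduce the
# teacher's outputs ('soft labels') instead of training on raw labelled data.
X = rng.normal(size=(200, 16))
soft_labels = teacher(X)
W_student, *_ = np.linalg.lstsq(X, soft_labels, rcond=None)

# The student approximates the teacher at a fraction of the parameter count.
student_error = np.mean((X @ W_student - soft_labels) ** 2)
```

The appeal for the operational constraints in point 2 is that the distilled student is far cheaper to run, trading some accuracy for throughput and cost.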


Conclusion
Large-scale foundation models are a new application platform. For innovators, businesses and startups, this new AI paradigm likely feels like development toward their future utopia is not happening fast enough; for others, in the general public and in the regulation and policy space, the emergence of new techniques, terms and applications already feels like a destabilising shift in the universe.

It is no wonder that industry, policymakers and regulators are already using ‘frontier models’ as a term to describe an undefined group of cutting-edge, powerful models. Wait until we get to the term ‘new frontier’ models, and then what? We seem to be moving so fast that we cannot even find the words to describe what we are building. Speechless.

Thanks for reading.
