In the world of AI, what might be called "small language models" have been growing in popularity recently because they can be run on a local device instead of requiring data center-grade computers in the cloud.
First came Microsoft, which announced Phi-3-mini, a new, freely available lightweight AI language model that is simpler and less expensive to operate than traditional large language models (LLMs) such as OpenAI's GPT-4 Turbo.
Its small size is ideal for running locally, which could bring an AI model with capability similar to the free version of ChatGPT to a smartphone, with no Internet connection required.
How is Phi-3-mini different from LLMs?
Phi-3-mini is a small language model (SLM). Simply put, SLMs are streamlined versions of large language models: they are cheaper to develop and operate, and they perform better on smaller devices like laptops and smartphones.
How good are the Phi-3 models?
Phi-2 was introduced in December 2023 and reportedly equaled models like Meta’s Llama 2. Microsoft claims that Phi-3-mini is better than its predecessors and can respond on par with a model 10 times its size.
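Because Phi-3-mini is small enough to run on consumer hardware, it can be loaded locally with the Hugging Face Transformers library. The sketch below assumes the publicly released `microsoft/Phi-3-mini-4k-instruct` checkpoint; downloading the weights requires a network connection the first time, so the loading step is wrapped in a function rather than run on import.

```python
# Sketch: running Phi-3-mini locally via Hugging Face Transformers.
# Assumes the public "microsoft/Phi-3-mini-4k-instruct" checkpoint;
# the first call downloads several GB of weights, so loading is
# deferred to a function instead of happening at import time.

MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"

def generate_locally(prompt: str, max_new_tokens: int = 64) -> str:
    """Load Phi-3-mini and generate a completion on the local machine."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",       # use fp16/bf16 if the hardware supports it
        trust_remote_code=True,   # the repo ships custom modeling code
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

This is only a minimal sketch; in practice you would move the model to a GPU if one is available and apply the model's chat template for instruction-style prompts.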
The day after Microsoft's Phi-3 launch, Apple arrived with its own small AI models.
Apple introduced a set of tiny source-available AI language models called OpenELM that are small enough to run directly on a smartphone. They're mostly proof-of-concept research models for now, but they could form the basis of future on-device AI offerings from Apple.
Apple's new AI models, collectively named OpenELM for "Open-source Efficient Language Models," are currently available on Hugging Face under an Apple Sample Code License. Because the license imposes some restrictions, the models may not meet the commonly accepted definition of "open source," but the OpenELM source code is available.
The eight OpenELM models come in two flavors: four pretrained base models and four instruction-tuned variants:
OpenELM-270M
OpenELM-450M
OpenELM-1_1B
OpenELM-3B
OpenELM-270M-Instruct
OpenELM-450M-Instruct
OpenELM-1_1B-Instruct
OpenELM-3B-Instruct
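The checkpoint names above follow a regular pattern, so the full set can be enumerated programmatically, and any one of them can be loaded for local experimentation. The sketch below assumes the `apple/OpenELM-*` repositories on Hugging Face; note that OpenELM reuses the Llama 2 tokenizer rather than shipping its own, and loading weights requires a network connection, so that step is wrapped in a function.

```python
# Sketch: enumerating and loading the OpenELM checkpoints Apple
# published on Hugging Face. Repository names follow the pattern
# listed in the article.

SIZES = ["270M", "450M", "1_1B", "3B"]

# Four pretrained base models plus four instruction-tuned variants.
OPENELM_MODELS = [f"apple/OpenELM-{s}" for s in SIZES] + [
    f"apple/OpenELM-{s}-Instruct" for s in SIZES
]

def load_openelm(repo_id: str = "apple/OpenELM-270M-Instruct"):
    """Load an OpenELM checkpoint for local experimentation."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # OpenELM reuses the Llama 2 tokenizer rather than shipping its own;
    # the meta-llama repository is gated and requires accepting its license.
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        trust_remote_code=True,  # OpenELM ships custom modeling code
    )
    return model, tokenizer
```

At 270M parameters, the smallest variant is light enough to run comfortably on a laptop CPU, which is the point of the release.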
What’s good about it?
Apple’s commitment to on-device AI aligns with its privacy-focused approach. OpenELM is designed to run on smartphones and laptops, reducing reliance on cloud-based computation. Additionally, Apple’s move to release these models on Hugging Face promotes community collaboration and transparency, allowing developers to experiment with different applications.
While Apple has not yet integrated this new wave of AI language model capabilities into its consumer devices, the upcoming iOS 18 update (expected to be revealed in June at WWDC) is rumored to include new AI features that utilize on-device processing to ensure user privacy.
Apple’s OpenELM has its strengths, especially in the context of privacy and on-device processing. However, its limited performance and niche focus raise concerns about its broader applicability. The release feels more like a response to market pressures than a strategic innovation, leaving much to be desired.
Ultimately, the emergence of small language models signals a potential shift from expensive and resource-heavy LLMs to more streamlined and efficient language models, arguably making it easier for more businesses and organizations to adopt and tailor generative AI technology to their specific needs.
As language models evolve to become more versatile and powerful, it seems that going small may be the best way to go.