Generative AI Watch: Small Language Models’ Growing Role in a Multi-Model World

R. Bhattacharyya

Summary Bullets:

  • As training techniques improve, small language models (SLMs) are becoming increasingly accurate, adding to their appeal.
  • The smaller models make sense for simpler tasks; they can work offline and are a good alternative when organizations want to process information close to the source of collection.

The generative AI (GenAI) landscape has been evolving at breakneck speed since OpenAI's ChatGPT exploded onto the scene in late 2022. And despite the numerous new GenAI solutions and product enhancements already brought to market in the last 18 months, momentum around natural language processing (NLP) shows no signs of slowing down. The latest buzz worth paying attention to surrounds SLMs, which offer capabilities similar to those of large language models (LLMs) but require far less training data and processing power. Easier to adopt, less expensive to run, and with a smaller carbon footprint, these models have the potential to further accelerate the already rapid pace of GenAI adoption.

Last week saw announcements of new SLMs from two major AI platform providers. Microsoft announced the Phi-3 family of small language models, which includes Phi-3-mini (3.8 billion parameters), Phi-3-small (7 billion parameters), and Phi-3-medium (14 billion parameters). A few days later, H2O.ai released an upgraded foundational model, H2O-Danube2 (1.8 billion parameters), and a model for chat-specific use cases, H2O-Danube2-Chat. (For comparison, OpenAI's GPT-3 contains 175 billion parameters, and GPT-4 and Gemini 1.0 Ultra are rumored to contain over one trillion.) The Phi-3 and H2O-Danube models are by no means the only SLMs on the market; they are just the latest to appear in the increasingly crowded GenAI arena. For example, Google released Gemini Nano-1 (1.8 billion parameters) and Gemini Nano-2 (3.25 billion parameters) at the end of 2023.


As training techniques improve, smaller models with fewer parameters are becoming increasingly accurate, adding to their appeal. SLMs can be trained and fine-tuned more easily, making them an attractive option for companies that want to customize a language model. Additionally, since they use far less computing power than an LLM, they don't require a massive investment in expensive infrastructure and are therefore a much more feasible option for on-premises, edge, or on-device deployments. They can summarize documents, surface key insights from text, and create sales or marketing content. The smaller models make sense for simpler tasks, can work offline, and are a good alternative when organizations want to process information close to the source of collection, for example when building applications that require low latency or when keeping data on-premises is preferred. In contrast, LLMs are ideal for applications that involve orchestration of multiple tasks or that need to excel at advanced reasoning and analysis. However, they require a massive amount of infrastructure to host, and therefore generally force organizations to move their data to a third party that runs the model.
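The hardware gap between SLMs and LLMs is easy to see with a rough back-of-the-envelope estimate of weight memory: parameter count times bytes per parameter. The sketch below is illustrative only; the parameter counts come from the models mentioned above, while the fp16 and int4 precision figures are standard assumptions, not vendor specifications.

```python
# Rough memory footprint of model weights: parameters x bytes per parameter.
# Parameter counts are from the article; precision levels are illustrative
# assumptions (fp16 = 2 bytes/param, int4 quantized = 0.5 bytes/param).

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9

models = {
    "Phi-3-mini (3.8B)": 3.8,
    "H2O-Danube2 (1.8B)": 1.8,
    "GPT-3 (175B)": 175.0,
}

for name, billions in models.items():
    fp16 = weight_memory_gb(billions, 2.0)   # 16-bit floating point
    int4 = weight_memory_gb(billions, 0.5)   # 4-bit quantized
    print(f"{name}: ~{fp16:.1f} GB at fp16, ~{int4:.1f} GB at int4")
```

By this estimate, a quantized 3.8-billion-parameter model needs only a couple of gigabytes of memory for its weights, within reach of a laptop or phone, whereas a 175-billion-parameter model needs hundreds of gigabytes and thus multiple data-center GPUs, which is why LLMs are typically accessed through a third-party host.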


Even though most organizations are starting their GenAI journey with OpenAI (often via Azure), many will likely begin to explore alternative models before long. Some companies have noted that Azure costs are rising, which may prompt them to evaluate other options. Additionally, organizations have reported that the limit on the number of query requests OpenAI's models can serve in a given time period is holding them back from expanding GenAI deployments. For many organizations, the future will likely be a multi-model and hybrid-model environment: some applications, possibly customer-facing ones, will require one or more LLMs hosted in the cloud, whereas others will perform well with locally hosted SLMs. Finally, companies would be wise to diversify rather than put all their eggs in one basket. The GenAI market is young and constantly attracts new entrants. As it matures, there will inevitably be product withdrawals, startup failures, and consolidation through mergers and acquisitions. Diversification is a smart strategy at this point for long-term availability and performance optimization, and it also allows enterprises to apply pricing pressure through competition.

What do you think?
