SLM (Small Language Model)

Simple Definition

A Small Language Model (SLM) is an AI language model with significantly fewer parameters than the large models you typically hear about. OpenAI has not published GPT-4's size, but frontier models of that class are generally estimated to have hundreds of billions of parameters or more. SLMs typically range from hundreds of millions up to roughly 13 billion parameters, which makes them lighter, faster, and cheaper to run.

Examples: Microsoft Phi-4, Google Gemma, Meta Llama 3.2 (1B/3B), Apple’s on-device models.

SLM vs. LLM

Aspect           SLM                        LLM
Parameters       Millions to ~13B           Tens to hundreds of billions
Speed            Fast                       Slower
Cost             Low                        Higher
Where it runs    On-device, edge, local     Cloud servers
Capability       Good at focused tasks      Broader, more complex reasoning
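
The link between parameter count and where a model can run comes down to simple arithmetic. The sketch below is a rough estimate only: it assumes weight memory is the parameter count times the bytes stored per parameter, and it ignores activations, the KV cache, and runtime overhead. The function name weight_memory_gb and the example sizes are illustrative, not taken from any specific model card.

```python
def weight_memory_gb(num_params: float, bits_per_param: int = 16) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same precision.
    """
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Compare a few illustrative model sizes at 16-bit and 4-bit precision.
for label, params in [("1B", 1e9), ("3B", 3e9), ("13B", 13e9), ("70B", 70e9)]:
    fp16 = weight_memory_gb(params, 16)
    int4 = weight_memory_gb(params, 4)
    print(f"{label}: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit")
```

Under these assumptions a 3B model needs about 6 GB at 16-bit precision and about 1.5 GB at 4-bit, which is within reach of a phone or laptop, while a 70B model needs about 140 GB at 16-bit and typically calls for server hardware.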

Why SLMs Matter

Not every task needs the power of a massive model. SLMs are a better choice when:

  • Privacy matters — running on-device means your data never leaves your phone or computer
  • Speed is critical — SLMs respond faster, which matters for real-time apps
  • Cost is a factor — smaller models are far cheaper to run at scale
  • You’re working offline — SLMs can run without an internet connection
  • The task is narrow — summarizing, classifying, or answering specific questions doesn’t always need a frontier model
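
The criteria above can be sketched as a simple decision helper. This is a hypothetical heuristic, not an established rule: the TaskProfile fields and the suggest_model_class function are names invented for illustration, and the thresholds for a real project would depend on the actual workload.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical description of a task, mirroring the criteria listed above."""
    needs_privacy: bool = False      # data should stay on the device
    latency_critical: bool = False   # real-time responses are required
    cost_sensitive: bool = False     # the task runs at large scale
    must_work_offline: bool = False  # no internet connection available
    is_narrow: bool = False          # e.g. summarizing or classifying


def suggest_model_class(task: TaskProfile) -> str:
    """Return "SLM" or "LLM" as a first guess (illustrative heuristic only)."""
    # Assumption: offline or on-device privacy needs rule out a cloud-hosted model.
    if task.must_work_offline or task.needs_privacy:
        return "SLM"
    # A narrow task under speed or cost pressure is a good fit for a small model.
    if task.is_narrow and (task.latency_critical or task.cost_sensitive):
        return "SLM"
    # Otherwise default to the broader capability of a larger model.
    return "LLM"


print(suggest_model_class(TaskProfile(is_narrow=True, cost_sensitive=True)))
print(suggest_model_class(TaskProfile(latency_critical=True)))
```

The first call suggests an SLM because the task is both narrow and cost-sensitive. The second falls through to an LLM because speed alone, without a narrow task, does not settle the choice in this sketch.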

Where SLMs Are Used

  • AI features built into phones and laptops (Apple Intelligence, Copilot+ PCs)
  • Edge devices and IoT
  • Customer service bots with focused tasks
  • Coding assistants that run locally
  • On-device translation and transcription

The Tradeoff

SLMs are not as capable as frontier LLMs at complex reasoning, long documents, or nuanced creative tasks. You gain speed and privacy but give up raw power. The right choice depends on what you actually need the model to do.
