
What are LLMs?

AI · LLM

Building on my previous post, over the next few weeks I will be sharing some of the fundamental concepts of Generative AI (GenAI) that underpin its practical application.

First up is what kicked everything off: the release of ChatGPT in November 2022, the first widely adopted application built on a Large Language Model (LLM). The term “Large Language Model” gets thrown around a lot, so let’s simplify what it actually means and why it matters in real-world applications of GenAI.

An LLM is essentially a super-charged autocomplete. It doesn’t “think” or “know” facts the way people do; it predicts the next word (token) based on patterns it learned from a vast reading diet. Rather than processing language as full words or sentences, LLMs break inputs into tokens, tiny units like sub-words or punctuation, so they can analyse and predict based on patterns across these pieces. Tokens are important in the context of LLMs: they are the Lego bricks that underpin the model’s predictive capability, and from a commercial perspective they actually drive the associated costs of LLM usage.
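
To make tokens a bit more concrete, here is a minimal sketch using OpenAI’s open-source tiktoken tokeniser. This is purely illustrative, not part of any model itself, and the exact token ids and splits vary from model to model:

```python
# Minimal tokenisation sketch, assuming `pip install tiktoken`.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models

text = "LLMs predict the next token."
token_ids = enc.encode(text)                   # text -> list of integer token ids
pieces = [enc.decode([t]) for t in token_ids]  # each id decoded back to its text piece

print(token_ids)  # a list of integers, one per token
print(pieces)     # e.g. pieces like 'LL', 'Ms', ' predict', ' the', ' next', ' token', '.'
```

Notice that the sentence is not split neatly into words: common fragments get their own token, and providers typically bill by the number of tokens in and out, which is why token counts matter commercially.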

Training an LLM is an energy-hungry and expensive exercise. Sam Altman puts GPT-4’s training cost in the hundreds of millions of dollars. The model devours terabytes of books, websites, code, and chat logs, learning the statistical patterns that make one word likely to follow another. That same data, however, carries human biases and blind spots, flaws the model can echo unless they’re actively corrected, making guardrails an essential element of any GenAI deployment.

Whilst ChatGPT was the first widely adopted, consumer-facing LLM product, and arguably the most famous, there are several other LLMs that have been developed by leading tech companies:

OpenAI - Models: GPT-3.5, GPT-4o

Anthropic - Models: Claude 3 family (Haiku, Sonnet, Opus)

Google - Models: Gemini 1.5 Pro, Gemini 1.5 Flash, Gemini Nano

Meta (Facebook) - Models: Llama 3 8B & 70B, Llama 3.1 405B

xAI - Models: Grok 3, Grok 4 (plus “Grok for Government”)

Mistral AI - Models: Mistral Large 2, Mixtral 8x7B, Mistral Small 3

LLM in a nutshell: an LLM is a giant pattern-matching engine guessing the next word(s) based on the statistical probabilities of the data it was trained on. It is the central engine that powers any GenAI solution.
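
To picture what “guessing the next word based on statistical probability” means, here is a toy sketch. The words and probabilities are invented for illustration, not taken from any real model, but the step it shows, sampling one continuation in proportion to its probability, is what an LLM repeats token after token:

```python
# Toy next-word prediction: sample one continuation from a probability
# distribution. A real LLM does this with a learned distribution over
# tens of thousands of tokens; these numbers are made up.
import random

prompt = "The cat sat on the"
next_word_probs = {
    "mat": 0.62,
    "sofa": 0.21,
    "roof": 0.12,
    "keyboard": 0.05,
}

words = list(next_word_probs)
weights = list(next_word_probs.values())

next_word = random.choices(words, weights=weights, k=1)[0]
print(prompt, next_word)
```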

Next time we’ll dig into Retrieval-Augmented Generation (RAG) and how it turns that raw engine into a reliable knowledge assistant aligned with company processes and industry best practice.

Erik Cavan


Applied AI
