👋hiai · AI Education Series · Paper 1 of 5

What Is Generative AI?

A plain-English introduction to the technology behind the tools — what large language models are, how they actually work, and why they matter now.

hiai.studio
March 2026
1. The Moment We Are In


Something unusual is happening. A technology that was, until recently, a specialist research interest has become a general-purpose tool used by hundreds of millions of people. It writes, reasons, explains, translates, codes, and converses. It arrived not as a gradual improvement but as a step change — and the pace has not slowed.

For most people in professional life, the honest position is somewhere between curiosity and confusion. The tools are clearly useful. The underlying technology is obscure. The claims made about it range from sober to hysterical. Cutting through that noise to understand what this technology actually is — how it works, what it can reliably do, and where it falls short — is the foundation for using it well.

That is what this paper is for. No jargon that isn't explained. No hype in either direction. Just a clear account of what generative AI is, grounded in how the technology actually functions.

This Paper's Purpose

This is the first in a series of five papers. It provides the foundation: what generative AI is, what a large language model is, and how they work. Subsequent papers build on this — covering agents, knowledge bases, prompting, tools, and workflows. Each is designed to be read independently, but the concepts compound.


2. AI, Machine Learning, and Generative AI: What's the Difference?


These terms are used interchangeably in conversation, but they describe different, nested things. Understanding the distinctions helps enormously in understanding what today's tools can and cannot do.

Artificial Intelligence (AI)

The broad field. Any computer system designed to perform tasks that would typically require human intelligence — pattern recognition, decision-making, translation, game-playing. AI has existed as a research discipline since the 1950s.

Machine Learning (ML)

A subset of AI. Instead of being explicitly programmed with rules, ML systems learn patterns from data. A spam filter that learns from examples of spam is machine learning. So is the algorithm that recommends what to watch next.

Deep Learning

A subset of machine learning using large neural networks — loosely inspired by the brain — that can learn from vast quantities of data. Powers image recognition, speech-to-text, and most modern AI capabilities.

Generative AI (you are here)

A subset of deep learning focused specifically on generating new content — text, images, audio, code, video. Large language models (LLMs) are the most prominent type, producing fluent, human-like text in response to natural language input.

Generative AI is notable because it produces original output, not just a classification or a prediction. It does not retrieve a pre-written answer from a database. It constructs a response, word by word, from patterns it absorbed during training. That is what makes it feel qualitatively different from earlier AI systems — and what makes it genuinely useful for open-ended creative and analytical work.


3. How We Got Here: A Brief History


The current generation of AI tools did not appear from nowhere. They are the result of a seventy-year research arc, with a small number of pivotal moments that each changed the trajectory significantly.

1950s–1980s: Rules-based AI

Early AI systems followed explicit rules written by humans. "If the input contains X, respond with Y." Impressive in narrow domains but brittle — they failed the moment they encountered anything their rules did not anticipate. Natural language was effectively out of reach.

1990s–2000s: Statistical learning

Systems began learning from data rather than following hand-written rules. Early language models could predict the next word in a sequence based on which words tended to follow which. Useful, but limited — they had no grasp of meaning, only of frequency.

2010s: Deep learning breakthrough

Neural networks — mathematical structures loosely modelled on neurons — became practical as computing power and data availability grew. Systems could now recognise images, transcribe speech, and translate between languages with near-human accuracy. The foundations were laid.

2017–now: The Transformer era

A 2017 research paper — "Attention Is All You Need" — introduced the transformer architecture. This allowed models to process entire documents at once rather than word by word, capturing long-range relationships in language. Combined with training on internet-scale data, it produced the LLMs we use today. GPT, Claude, Gemini, and Llama are all transformers.

The Scale Moment

The jump from useful research tool to mass-market product came not from a new invention but from scale. Researchers discovered that increasing model size, data, and compute consistently produced better capabilities — and that beyond a certain threshold, emergent abilities appeared that had not been explicitly trained for. Reasoning, translation, and code generation were not programmed in. They emerged from scale.


4. What Is a Large Language Model?


A large language model is a neural network trained on a massive quantity of text to predict — with high accuracy — what comes next in a sequence of words. "Large" refers to the number of parameters (the learned numerical weights) inside the model. Modern LLMs contain hundreds of billions of parameters, each adjusted during training to improve prediction accuracy across trillions of words of text.

The scale is difficult to convey. To give it some shape:

~1T words of training data: roughly the text content of ten million novels, ingested during training.
100B+ parameters: numerical weights, each adjusted across billions of examples to improve prediction.
Months of training time: run on thousands of specialist processors simultaneously, at a cost of tens of millions of pounds.
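A quick back-of-envelope check gives those figures some texture. The bytes-per-parameter and words-per-novel values below are illustrative assumptions, not the specification of any particular model:

```python
# Back-of-envelope arithmetic for the scale figures above.
# bytes_per_param and words_per_novel are illustrative assumptions.
params = 100e9             # 100 billion learned weights
bytes_per_param = 2        # 16-bit floating point, a common format

storage_gb = params * bytes_per_param / 1e9
print(f"Weights alone: {storage_gb:,.0f} GB")    # prints "Weights alone: 200 GB"

words = 1e12               # ~1 trillion words of training text
words_per_novel = 100_000  # a long novel, roughly
print(f"Training text: ~{words / words_per_novel:,.0f} novels")
```

Even before any text is generated, simply storing the learned weights at this scale runs to hundreds of gigabytes.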

The result of this process is a model that has absorbed the statistical structure of language at a scale no human could approach — and, in doing so, has internalised a remarkable amount about how the world works, how arguments are constructed, how code functions, and how facts relate to each other. This is not because anyone programmed it with these things. It is because they are present, implicitly, in the text it was trained on.

The analogy that holds best is this: if you read everything ever written, you would know a very great deal — not because anyone taught you, but because knowledge is embedded in language. An LLM has done the reading. What it has not done is understand it in the way humans do — a distinction with important practical implications, covered in section 6.


5. How an LLM Actually Works


When you type a message to an AI and it responds, a specific sequence of operations occurs. Understanding this sequence dissolves a lot of the "magic" — and explains both the model's impressive capabilities and its characteristic failure modes.

Step one: tokenisation. Your text is not read as words. It is broken into tokens — chunks of characters that may be a whole word, part of a word, or a punctuation mark. This is how the model processes language internally.

How "The Paris agreement was signed in 2015" becomes tokens — the input text is broken into chunks the model can process numerically:

The | Par | is | agreement | was | signed | in | 20 | 15
Each token is converted to a number, then to a vector — a point in a high-dimensional space that captures its meaning and relationships to other tokens.
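The splitting step can be sketched in a few lines. This is a toy greedy longest-match tokeniser over a tiny hand-picked vocabulary; real tokenisers (such as byte-pair encoding) learn vocabularies of tens of thousands of subwords from data, so the vocabulary below is invented purely for illustration:

```python
# Toy greedy longest-match tokeniser. The tiny vocabulary is invented
# for illustration; real models learn ~50k-200k subwords from data.
VOCAB = {"The", " Par", "is", " agreement", " was", " signed", " in", " 20", "15"}

def tokenise(text):
    tokens = []
    i = 0
    while i < len(text):
        # Take the longest vocabulary entry matching at this position,
        # falling back to a single character if nothing matches.
        match = max((t for t in VOCAB if text.startswith(t, i)),
                    key=len, default=text[i])
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenise("The Paris agreement was signed in 2015"))
# ['The', ' Par', 'is', ' agreement', ' was', ' signed', ' in', ' 20', '15']
```

Note that "Paris" splits across two tokens — the model never sees the word as a single unit, only the pieces its vocabulary happens to contain.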

Step two: attention. The transformer architecture processes all tokens simultaneously and calculates relationships between them. "Paris" relates to "agreement." "2015" relates to "signed." These relationships — encoded as attention weights — allow the model to understand context, not just isolated words.
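A minimal sketch of that calculation, in pure Python: scaled dot-product attention, the core operation of the transformer. The 2-dimensional vectors are invented for illustration; real models use learned projections and vectors with hundreds or thousands of dimensions:

```python
import math

# Scaled dot-product attention over made-up 2-D vectors.
# The numbers are invented purely to show the mechanics.
tokens  = ["Paris", "agreement", "signed", "2015"]
vectors = [[1.0, 0.2], [0.9, 0.3], [0.1, 1.0], [0.2, 0.9]]

def attention_weights(query, keys):
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Softmax: turn scores into probabilities that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# How strongly does "Paris" attend to each token (including itself)?
weights = attention_weights(vectors[0], vectors)
for tok, w in zip(tokens, weights):
    print(f"{tok:>10}: {w:.2f}")
```

With these invented vectors, "Paris" attends most strongly to itself and to "agreement" — the mechanism by which related tokens influence each other's representations.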

Step three: prediction. The model produces a probability distribution across its entire vocabulary — tens of thousands of tokens — for what should come next. It selects the next token based on this distribution, then repeats the process, generating one token at a time until the response is complete.
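The generate-one-token-at-a-time loop can be illustrated with a toy bigram model: count which word follows which in a tiny invented corpus, then repeatedly append the most likely continuation. A real LLM runs the same loop, but draws from a transformer's probability distribution rather than raw counts:

```python
from collections import Counter, defaultdict

# Toy next-token prediction. The corpus is invented; real models
# learn their statistics from trillions of tokens.
corpus = ("the capital of france is paris . "
          "the capital of japan is tokyo . "
          "the capital of france is paris .").split()

# Count which token follows which.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def generate(prompt, steps=5):
    out = prompt.split()
    for _ in range(steps):
        options = follows[out[-1]]
        if not options:
            break
        # Greedy decoding: always take the most likely next token.
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

print(generate("the capital of"))
# the capital of france is paris . the
```

"france" wins over "japan" simply because it occurs more often after "of" in the corpus — prediction by frequency, not by knowledge of geography.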

You provide this → the model completes it:

  • "The capital of France is" → Paris (high probability — seen millions of times in training)
  • "Summarise this contract in plain English:" → a structured summary (learned from millions of examples of summaries)
  • "Write a Python function that sorts a list" → working code (pattern-matched from billions of lines of code in training data)

This mechanism — prediction based on learned statistical patterns — is simultaneously the source of the model's remarkable capability and its most important limitation. It is very good at producing text that looks like the correct answer. It is not, in a meaningful sense, reasoning its way to truth. This is why LLMs can be fluently, confidently wrong — a phenomenon known as hallucination — and why verification of factual claims always matters.

The core mechanism
Patterns in language → Statistical prediction → Generated text
Not retrieval. Not reasoning in the human sense. Pattern completion at extraordinary scale.

6. What LLMs Are Good At — and What They Are Not


The clearest mistake people make when starting with AI is treating it as either a search engine (factual lookup) or a human expert (reliable judgement). It is neither. Understanding what it genuinely does well, and where it reliably fails, determines whether you deploy it effectively or frustratingly.

Strong capabilities

  • Writing, editing, and adapting tone across any style
  • Summarising long or complex documents quickly
  • Drafting structured content from notes or bullet points
  • Explaining complex topics in plain language
  • Generating and debugging functional code
  • Translating between languages with high fluency
  • Brainstorming, exploring options, stress-testing ideas
  • Classifying, extracting, and structuring information from text
  • Reasoning through multi-step problems when prompted carefully

Notable limitations

  • Reliable recall of specific facts, dates, and statistics
  • Knowledge of events after the training data cutoff
  • Arithmetic and precise numerical reasoning
  • Genuine understanding β€” it predicts, not comprehends
  • Knowing when it does not know something
  • Consistent output across repeated identical prompts
  • Access to real-time data without tools
  • Taking action in the world without tools
  • Replacing human judgement on high-stakes decisions

The limitations are not reasons to avoid LLMs — they are design constraints. A system designed around these constraints, with appropriate verification steps and human oversight where it matters, is highly effective. A system that ignores them and treats the model as an infallible oracle will produce unreliable results and erode trust quickly.

The Hallucination Problem

Hallucination is the term for when a model generates confident-sounding but incorrect information. It is not random error — it follows the same mechanism as correct output. The model predicts what a plausible answer looks like, and in the absence of the correct information, that prediction can be wrong. It does not know it is wrong. This is why outputs that involve specific facts, figures, citations, or legal and medical information should always be verified against primary sources.


7. The Models You Have Heard Of


Several distinct families of large language model are now widely deployed. They differ in size, architecture, capabilities, and the organisations that built them — but they share the same fundamental mechanism described above. Understanding the landscape helps in choosing the right tool for a given context.

  • GPT-4o / GPT series (OpenAI): the model behind ChatGPT. Strong across writing, code, and reasoning. Multimodal — can process text, images, and audio. Widely integrated via API.
  • Claude series (Anthropic): designed with a particular focus on safety and reliability. Handles very long documents well. Strong for structured analysis and writing with a consistent voice.
  • Gemini series (Google DeepMind): deeply integrated with Google's products and search infrastructure. Strong multimodal capabilities. Available via Google Workspace and Vertex AI.
  • Llama series (Meta, open weights): released as open-weight models — the underlying parameters are publicly available. Can be run locally or self-hosted. Basis for many custom and fine-tuned variants.
  • Mistral / Mixtral (Mistral AI): European open-weight models with strong performance relative to size. Efficient to run; increasingly used in enterprise deployments where data privacy requires self-hosting.

The market is moving fast. Benchmarks that distinguish models today will be superseded by new releases within months. The practical guidance is to choose a model based on the specific task — some excel at code, others at long documents, others at instruction-following — rather than assuming one model is universally best. Most enterprise AI platforms give you the choice.


8. Common Misconceptions


Several persistent misconceptions shape how people approach — and often misuse — these tools. Addressing them directly is more useful than leaving them to be discovered through frustration.

  • It is searching the internet for answers. It is not. A standard LLM has no live internet connection. It generates responses from patterns learned during training. When it gives you a statistic or a citation, it has predicted what that information looks like β€” which is why specific facts should always be verified. (Models with web search tools are a different matter, covered in a later paper.)
  • It understands what it writes. This is philosophically contested and practically important. The model predicts plausible continuations of text with extraordinary accuracy. Whether that constitutes understanding is an open question β€” but behaviourally, it does not understand in the way a human expert does. It has no beliefs, no intentions, and no model of the world. It has learned patterns.
  • It is conscious or has feelings. It does not. LLMs produce text that describes emotions because they were trained on text written by humans describing emotions. The language is mimicked, not experienced.
  • Bigger models are always better. Not for every task. Smaller models, properly prompted and given relevant context, often outperform larger ones on specific tasks β€” while being faster and cheaper to run. The right model is the one appropriate to the task, not the largest available.
  • Once an AI says something is correct, it is. The model has no access to ground truth. It can be confidently wrong, and it cannot tell the difference from the inside. Confidence in the output does not imply accuracy. Verification always matters for consequential outputs.
  • It will replace human judgement. For well-defined, repetitive language tasks β€” drafting, summarising, classifying β€” it reduces effort significantly. For tasks requiring contextual judgement, ethical reasoning, or accountability, it is a tool in human hands, not a replacement for those hands.

9. Why This Matters Now


Generative AI has been described — with varying degrees of credibility — as the most significant technological shift since the internet, the industrial revolution, and the invention of the printing press. These comparisons are difficult to evaluate in the moment. What is observable is narrower and more useful: a specific class of work has become dramatically faster and cheaper to do.

Writing first drafts. Summarising long documents. Classifying and extracting information from text. Translating between languages. Generating functional code. Answering questions from a body of documents. Each of these tasks, done at professional quality, previously required skilled human time. Each of them can now be accomplished in seconds by a well-configured AI system at negligible marginal cost.

The organisations that treat this as a tactical time-saver — using AI to do one task slightly faster — will extract some value. The organisations that treat it as a structural opportunity — redesigning workflows around what AI can reliably do, reserving human effort for what it cannot — will extract a great deal more. The technology does not determine which category you fall into. The design choices do.

The Practical Starting Point
Understand the tool → Design around its strengths → Guard against its weaknesses
That sequence, applied consistently, is what separates useful AI from frustrating AI.
🧠 What It Is

A neural network trained on vast quantities of text to predict what comes next. Not search, not a database, not a reasoning engine — a very sophisticated pattern matcher.

⚙️ How It Works

Your input is tokenised. Attention mechanisms find relationships. Probabilities are calculated. The next token is selected. Repeat until the response is complete.

✅ Where It Excels

Writing, summarising, drafting, classifying, translating, coding. Tasks with a correct general shape that can be expressed in language and benefit from speed and scale.

⚠️ Where to Be Careful

Specific facts, recent events, arithmetic, high-stakes decisions. Confident output does not mean correct output. Verification matters wherever accuracy is critical.