
Embeddings, Vector Databases and Semantic Search

AI Engineering

What Are Embeddings?

Imagine you could turn any word into a set of coordinates on a map. Not a geographical map, but a meaning map. That's essentially what embeddings are.

Let's start with a simple example. Take the word "King". An embedding model reads that word and outputs a long list of numbers, say 1,536 of them (the number of dimensions OpenAI's embedding models use). Each number captures a different shade of meaning:

royalty, power, authority, and hundreds of other subtle dimensions that humans would struggle to articulate.

Now take the word "Queen". It gets its own list of 1,536 numbers. Here's where it gets interesting: those numbers are almost identical to King's, except in the dimensions related to gender. The model has learned, purely from reading text, that a Queen is essentially a King shifted along a gender axis.

This is the famous equation that made embeddings click for the AI world:

King − Man + Woman ≈ Queen

That's not a metaphor. You can literally do that arithmetic on the number arrays and land near Queen's embedding. The model didn't learn a rule about royalty or gender; it learned the shape of meaning itself.
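That arithmetic can be sketched in a few lines. The vectors below are hand-made toys (real embeddings have hundreds or thousands of learned dimensions, not four labelled ones), but the arithmetic is exactly the same:

```python
import numpy as np

# Toy 4-dimensional "embeddings". The dimension labels are invented for
# illustration: [royalty, power, gender (male = +1), humanness].
king  = np.array([0.9, 0.8,  1.0, 1.0])
queen = np.array([0.9, 0.8, -1.0, 1.0])
man   = np.array([0.1, 0.2,  1.0, 1.0])
woman = np.array([0.1, 0.2, -1.0, 1.0])

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

result = king - man + woman      # the famous equation
print(cosine(result, queen))     # ≈ 1.0: we landed on Queen
print(cosine(result, king))      # noticeably lower
```

With these toy vectors the arithmetic lands exactly on Queen; with real embeddings it lands *near* Queen, closer to it than to any other word.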

And it doesn't stop at single words. Modern embedding models can take entire sentences, paragraphs or documents and compress their meaning into that same format: a list of numbers that captures what it's about.

What Are Vector Databases?

So now you have thousands, maybe millions, of these number arrays. Each one represents a piece of content: a product description, a support ticket, a medical record, a song lyric. You need somewhere to store them, and more importantly, you need to search them fast.

A regular database is built for exact matches. Ask it for "Queen" and it finds rows where a column literally says "Queen". It has no idea that "female monarch" means the same thing.

A vector database stores those embedding arrays (called vectors) and is purpose-built for a completely different question: "What's closest to this?"

Think back to our meaning map. If you plotted King, Queen, Prince and Princess as dots, they'd form a tight cluster. The word "Bicycle" would be way off in the distance. A vector database lets you point at any spot on that map and instantly find the nearest neighbours, even across millions of points.

Popular vector databases include Pinecone, Weaviate, ChromaDB and pgvector (a PostgreSQL extension often used with Supabase). They use clever indexing algorithms so that finding the nearest neighbours among 100 million vectors takes milliseconds, not hours.

How Semantic Search Ties It All Together

So where does this lead? Traditional keyword search is fragile. If a customer searches your help centre for "my payment didn't go through", a keyword engine only finds articles containing those exact words. An article titled "Troubleshooting Failed Transactions" might never surface, even though it's exactly what they need.

Semantic search fixes this in three steps:

  1. Embed your content. Take every article, product listing, or document and pass it through an embedding model. Store the resulting vectors in a vector database.

  2. Embed the query. When a user searches for "my payment didn't go through", that sentence gets its own embedding, a point on the meaning map.

  3. Find the nearest neighbours. The vector database returns the content whose embeddings are closest to the query's embedding. Because "payment didn't go through" and "failed transaction" land in nearly the same spot on the meaning map, the right article surfaces — no keyword overlap required.

It's the same King/Queen principle at scale. Meaning is proximity, and proximity is search. That, in a nutshell, is semantic search: results ranked by related meaning, not exact text.
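The three steps can be sketched end to end. The `embed()` function below is a stand-in: it projects words onto a few hand-picked "meaning" axes so that related phrases land near each other. A real system would call an embedding model instead, but the embed-index-query shape is the same:

```python
import numpy as np

# Stand-in for a real embedding model: each axis groups words that a
# trained model would place close together. Purely illustrative.
MEANING_AXES = {
    0: {"payment", "transaction", "transactions", "charge", "billing"},
    1: {"failed", "declined", "didn't", "error", "troubleshooting"},
    2: {"shipping", "delivery", "address", "parcel"},
    3: {"password", "login", "reset", "resetting", "account"},
}

def embed(text):
    vec = np.zeros(len(MEANING_AXES))
    for word in text.lower().split():
        for axis, words in MEANING_AXES.items():
            if word.strip(".,?") in words:
                vec[axis] += 1.0
    return vec

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

# Step 1: embed your content and store the vectors.
articles = [
    "Troubleshooting Failed Transactions",
    "How to change your shipping address",
    "Resetting a forgotten password",
]
index = [(title, embed(title)) for title in articles]

# Step 2: embed the query.
query_vec = embed("my payment didn't go through")

# Step 3: return the nearest neighbour.
best = max(index, key=lambda pair: cosine(query_vec, pair[1]))
print(best[0])  # → "Troubleshooting Failed Transactions"
```

Note there is zero keyword overlap between the query and the winning title; they match only because their vectors point in the same direction.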

Practical Applications

Search

When to use it: Any time users search using natural language: help centres, e-commerce product catalogues, internal knowledge bases, legal document retrieval.

Real-world impact:

  • Airbnb reported a 12% increase in booking conversion after switching to embedding-based search for listings.

  • Organisations using semantic search for internal documentation typically see 30–50% reduction in time-to-answer for support teams.

  • E-commerce platforms using vector search report up to 20% higher click-through rates compared to keyword search because results match intent, not just words.

Clustering

When to use it: When you have a large volume of unstructured content and need to discover natural groupings: organising customer feedback, grouping support tickets by theme, or segmenting research papers by topic.

How it works: Embed all your items, then run a clustering algorithm (like k-means) on the vectors. Items with similar meanings naturally group together — without you ever defining the categories.
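As a sketch of that idea, here is a minimal k-means run on toy 2-D "embeddings" (in practice you would use a library implementation such as scikit-learn's `KMeans`, on real model output):

```python
import numpy as np

# Two themes of support tickets, faked as two tight blobs of vectors.
rng = np.random.default_rng(0)
billing  = rng.normal(loc=[0.0, 0.0], scale=0.1, size=(10, 2))
delivery = rng.normal(loc=[5.0, 5.0], scale=0.1, size=(10, 2))
X = np.vstack([billing, delivery])

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: assign to nearest centroid, then recompute."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # distance from every vector to every centroid
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its members
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels

labels = kmeans(X, k=2)
print(labels)  # the two themes separate without predefined categories
```

The algorithm never sees the words "billing" or "delivery"; the grouping falls out of the geometry alone.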

Real-world impact:

  • Support teams using embedding-based ticket clustering have reduced manual triage time by up to 60%, automatically routing tickets to the right team.

  • Marketing teams analysing open-ended survey responses report discovering 2–3x more actionable themes compared to manual tagging.

Recommendations

When to use it: Suggesting similar products, articles, courses, music or any content where "you liked this, so you might like that" is valuable.

How it works: Embed each item in your catalogue. When a user interacts with something, find the nearest neighbours to that item's embedding. Those are your recommendations.
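That lookup is small enough to sketch directly. The catalogue vectors below are invented stand-ins for real item embeddings:

```python
import numpy as np

# Toy catalogue: each item mapped to a made-up embedding.
catalogue = {
    "wireless mouse":     np.array([0.9, 0.1, 0.0]),
    "bluetooth keyboard": np.array([0.8, 0.2, 0.1]),
    "usb-c hub":          np.array([0.7, 0.3, 0.2]),
    "garden hose":        np.array([0.0, 0.1, 0.9]),
}

def recommend(item, k=2):
    """Top-k nearest neighbours of an item, excluding the item itself."""
    target = catalogue[item]
    def score(other):
        v = catalogue[other]
        return float(target @ v / (np.linalg.norm(target) * np.linalg.norm(v)))
    others = [name for name in catalogue if name != item]
    return sorted(others, key=score, reverse=True)[:k]

print(recommend("wireless mouse"))  # → ['bluetooth keyboard', 'usb-c hub']
```

The garden hose never makes the list, not because of any category rule, but because its vector points somewhere else entirely.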

Real-world impact:

  • Spotify's embedding-based recommendation engine drives over 30% of all listening hours through Discover Weekly and related features.

  • E-commerce recommendation engines powered by embeddings typically see a 10–25% increase in average order value compared to rule-based systems.

  • Content platforms report 40% higher engagement when using semantic similarity for "related articles" compared to tag-based matching.

Anomaly Detection

When to use it: Fraud detection, identifying unusual network activity, spotting defective products in manufacturing, or flagging outlier data points in any dataset.

How it works: Embed your "normal" data. Any new data point whose embedding lands far from every cluster is an anomaly. No rules, no thresholds to tune manually: the model learns what "normal" looks like.
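The core of that idea fits in a few lines: score each new point by its distance to the nearest "normal" embedding. The vectors here are toys standing in for real transaction embeddings:

```python
import numpy as np

# Embeddings of known-good ("normal") transactions, as a toy 2-D cloud.
normal = np.array([
    [1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2],
])

def anomaly_score(point, normal):
    """Distance to the nearest normal embedding; larger = more suspicious."""
    return float(np.linalg.norm(normal - point, axis=1).min())

typical = np.array([1.05, 1.0])
outlier = np.array([9.0, -3.0])
print(anomaly_score(typical, normal))  # small: close to the cloud
print(anomaly_score(outlier, normal))  # large: far from everything seen
```

Real systems layer more on top (density estimates, per-cluster statistics), but "far from everything seen before" is the underlying signal.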

Real-world impact:

  • Financial institutions using embedding-based anomaly detection report catching up to 30% more fraudulent transactions while reducing false positives by 25%.

  • Manufacturing quality control systems using visual embeddings detect defects with 95%+ accuracy, reducing manual inspection costs by up to 70%.

Diversity Measurement

When to use it: Ensuring a content feed, hiring pipeline, news aggregator or dataset covers a broad range of perspectives and topics rather than converging on a narrow subset.

How it works: Embed all items and measure how spread out they are in vector space. If all your news articles cluster in one region of the meaning map, your coverage is narrow. If they spread evenly, it's diverse.
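One simple way to quantify that spread is the mean pairwise distance between embeddings; a narrow feed scores low, a broad one scores high. A toy sketch:

```python
import numpy as np

def diversity(vectors):
    """Mean pairwise distance between embeddings; higher = more diverse."""
    vectors = np.asarray(vectors, dtype=float)
    n = len(vectors)
    dists = [np.linalg.norm(vectors[i] - vectors[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

narrow_feed = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]  # articles on one topic
broad_feed  = [[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]]  # spread across topics
print(diversity(narrow_feed), diversity(broad_feed))
```

Other spread measures (variance from the centroid, coverage of clusters) work on the same principle; this is just the simplest.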

Real-world impact:

  • News platforms using embedding-based diversity scoring increased topical coverage by 35% in their recommendation feeds.

  • HR teams measuring job description embeddings identified unconscious language bias that was discouraging diverse applicants, and saw a 15–20% increase in application diversity after corrections.

Classification

When to use it: Sentiment analysis, content moderation, spam filtering, intent detection in chatbots, medical record categorisation; in short, any task where you need to sort content into predefined categories.

How it works: Embed labelled examples from each category. To classify a new item, embed it and see which category's examples it's closest to. This approach often needs far less training data than traditional machine learning.
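At its simplest this is nearest-neighbour classification over labelled embeddings. The vectors below are invented; a real system would embed actual text:

```python
import numpy as np

# A handful of labelled example embeddings per category (toy 2-D vectors).
examples = [
    (np.array([0.9, 0.1]), "complaint"),
    (np.array([0.8, 0.2]), "complaint"),
    (np.array([0.1, 0.9]), "praise"),
    (np.array([0.2, 0.8]), "praise"),
]

def classify(vec):
    """Assign the label of the closest labelled example."""
    def dist(pair):
        return float(np.linalg.norm(pair[0] - vec))
    return min(examples, key=dist)[1]

print(classify(np.array([0.85, 0.15])))  # → "complaint"
print(classify(np.array([0.15, 0.85])))  # → "praise"
```

With four labelled examples the classifier already works on this toy data; that data efficiency is the appeal compared with training a model from scratch.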

Real-world impact:

  • Zero-shot classification using embeddings achieves 80–90% accuracy on many tasks with no task-specific training data at all, just category labels.

  • Companies replacing traditional NLP classifiers with embedding-based approaches report 50% less time spent on data labelling while maintaining comparable accuracy.

  • Content moderation systems using embeddings can detect harmful content in new and evolving forms that rule-based filters miss entirely.

The mental model is simple: embeddings turn meaning into math. Vector databases make that math searchable at scale. Semantic search is the user-facing result. Once that clicks, every application above (clustering, recommendations, anomaly detection, classification, diversity measurement) is just a different question you ask of the same underlying geometry.

Start small. Embed a hundred documents. Search them. The moment you see a query with zero keyword overlap return the perfect result, you'll understand why this matters.

Erik Cavan


Applied AI
