April 4, 2026 · 4 min read

How I Upgraded Search to Semantic Vector Embeddings

From keyword matching to meaning-based search using pgvector, Supabase, and Hugging Face — and why it’s the foundation for Ask Goose

Yield: Semantic search on joseandgoose.com — type “game theory” or “firewall” into the search bar
Difficulty: Intermediate (first time working with vector embeddings and pgvector)
Total Cook Time: ~2 hours in a single morning session

Ingredients

- Supabase project with the pgvector extension enabled
- @xenova/transformers running the all-MiniLM-L6-v2 model locally (build-time embeddings)
- Hugging Face Inference API serving the same model (query-time embeddings)
- Vercel API route to handle semantic queries
- The existing JSON search index and TL;DR summaries

What Changed and Why It Matters

The site already had a self-updating search bar — a build script generates a JSON index of every page, post, and feature, and the search bar filters that list client-side as you type. It was fast and reliable, but it could only find things when your words matched the text. Searching “firewall” returned nothing because no title or description contained the word “firewall,” even though an entire post existed about configuring one.

The upgrade: every piece of content on the site now has a vector embedding — a list of 384 numbers that represent the meaning of that content. When you search, your query gets converted to the same kind of vector, and the database finds whichever content is closest in meaning. The word “firewall” doesn’t need to appear anywhere — the model understands that firewalls relate to server security.
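The nearest-match idea reduces to cosine similarity between vectors. A minimal TypeScript sketch, using toy 3-dimensional vectors in place of the real 384-dimensional MiniLM embeddings (all numbers are invented for illustration):

```typescript
// Cosine similarity: 1 means identical direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical embeddings: the query "firewall" sits closer to the
// server-security post than to the game-theory post.
const query = [0.9, 0.1, 0.0];        // "firewall"
const securityPost = [0.8, 0.2, 0.1]; // "How I Secured the Home Linux Server"
const gamePost = [0.1, 0.9, 0.2];     // "How I Built Numerator"

cosineSimilarity(query, securityPost); // high: nearest match
cosineSimilarity(query, gamePost);     // low
```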

[Diagram: content plotted along two dimensions of embedding space. Posts cluster into SECURITY (Secured Linux, API Server), GAMES (Numerator, TrophyManager), and AUTOMATION (Garmin Recaps, Market Briefing, Cron Ops) regions. The search queries “firewall” and “game theory” each land nearest the content they describe.]

Each piece of content becomes a point in 384-dimensional space. Similar content clusters together. A search query lands near the content it’s about — even without matching any keywords.

How it works in practice

Keyword search still runs first, instantly, client-side — same as before. Vector search only fires as a fallback when keyword matching returns zero results. You get the speed of the old system for obvious queries and the intelligence of the new system for everything else.
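That two-tier flow can be sketched as follows; the item shape and the function names (`keywordSearch`, `semanticSearch`) are assumptions for illustration, not the site's actual code:

```typescript
interface IndexItem { title: string; description: string; url: string; }

// First pass: instant, client-side substring match over the JSON index.
function keywordSearch(index: IndexItem[], query: string): IndexItem[] {
  const q = query.toLowerCase();
  return index.filter(
    (item) =>
      item.title.toLowerCase().includes(q) ||
      item.description.toLowerCase().includes(q)
  );
}

async function search(
  index: IndexItem[],
  query: string,
  semanticSearch: (q: string) => Promise<IndexItem[]>
): Promise<IndexItem[]> {
  const hits = keywordSearch(index, query);
  if (hits.length > 0) return hits; // old system: fast path for obvious queries
  return semanticSearch(query);     // fallback: only fires on zero keyword hits
}
```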

The pgvector Setup in Supabase

Supabase runs on Postgres, and Postgres has an extension called pgvector that adds a native vector column type and distance operators. Enabling it was one line of SQL: create extension if not exists vector. After that, I created a content_embeddings table with a vector(384) column — each row stores one piece of content alongside its embedding.

🔧 Developer section: Schema and search function
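A sketch of what the schema and search function could look like. The table name `content_embeddings` and `vector(384)` column come from the setup described above; the function name `match_content`, its threshold, and the column list are illustrative assumptions:

```sql
-- Enable the extension (the one line of SQL mentioned above)
create extension if not exists vector;

-- One row per piece of content, embedding stored alongside it
create table if not exists content_embeddings (
  id text primary key,
  title text,
  url text,
  embedding vector(384)
);

-- Similarity search as a Postgres function, callable from the client
-- as a Supabase RPC. `<=>` is pgvector's cosine distance operator,
-- so 1 - distance gives similarity.
create or replace function match_content (
  query_embedding vector(384),
  match_threshold float,
  match_count int
)
returns table (id text, title text, url text, similarity float)
language sql stable
as $$
  select
    content_embeddings.id,
    content_embeddings.title,
    content_embeddings.url,
    1 - (content_embeddings.embedding <=> query_embedding) as similarity
  from content_embeddings
  where 1 - (content_embeddings.embedding <=> query_embedding) > match_threshold
  order by content_embeddings.embedding <=> query_embedding
  limit match_count;
$$;
```

Supabase exposes a function like this to clients through its RPC interface, which is what the API route calls at query time.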

Content embeddings are generated locally using @xenova/transformers with the all-MiniLM-L6-v2 model — a build script reads the search index and TL;DR summaries, generates vectors for all 24 items, and upserts them to Supabase. At query time, the Vercel API route sends your search text to Hugging Face’s hosted version of the same model, gets a vector back, and passes it to the Supabase RPC function. Same model on both sides means the vectors are comparable.
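The query-time half of that flow, sketched with injected dependencies so the shape is visible without real network calls. Here `embed` stands in for the Hugging Face Inference API call and `rpc` for the Supabase RPC invocation; both names and the `Match` shape are assumptions:

```typescript
interface Match { title: string; url: string; similarity: number; }

// Same model on both sides (build-time and query-time), so the
// 384-dimensional vectors are directly comparable.
async function semanticSearch(
  query: string,
  embed: (text: string) => Promise<number[]>,    // hosted all-MiniLM-L6-v2
  rpc: (embedding: number[]) => Promise<Match[]> // similarity search in Supabase
): Promise<Match[]> {
  const embedding = await embed(query);
  return rpc(embedding);
}
```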

Keyword vs. Semantic: Side-by-Side

Query              | Keyword Search       | Semantic Search
“firewall”         | No results           | How I Secured the Home Linux Server
“health data”      | No results           | How I Automated Daily Garmin Recaps
“game theory”      | No results           | Numerator + How I Built Numerator
“email automation” | No results           | How I Built a Market Briefing
“Numerator”        | Numerator (instant)  | Not needed — keyword handles it

Keyword search is still the first pass for exact matches. Semantic search catches everything keyword misses.

Where Semantic Search Got It Wrong

Searching “theory” by itself returned nothing. That felt wrong — there’s a post about building Numerator, which is based on a game theory puzzle. But “theory” alone is genuinely ambiguous. Game theory, music theory, color theory — the model can’t know which you mean, so every piece of content scores below the similarity threshold. Adding one word of context — “game theory” — immediately surfaced Numerator as the top result.

This is a feature, not a bug, but it’s worth naming: semantic search rewards specificity. Vague queries get vague results. The system is honest about what it doesn’t know rather than guessing — a property I’d rather keep than tune away.
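The threshold behavior can be shown with made-up scores; the 0.5 cutoff and the similarity numbers are assumptions for illustration, not the site's actual values:

```typescript
interface Scored { title: string; similarity: number; }

// Only results clearing the threshold are returned, best first.
function aboveThreshold(results: Scored[], threshold: number): Scored[] {
  return results
    .filter((r) => r.similarity > threshold)
    .sort((a, b) => b.similarity - a.similarity);
}

// "theory" alone is ambiguous, so every candidate scores low.
const vagueQuery: Scored[] = [
  { title: "How I Built Numerator", similarity: 0.42 },
  { title: "How I Built a Market Briefing", similarity: 0.31 },
];
// "game theory" is specific enough to score one candidate highly.
const specificQuery: Scored[] = [
  { title: "How I Built Numerator", similarity: 0.71 },
  { title: "How I Built a Market Briefing", similarity: 0.30 },
];

aboveThreshold(vagueQuery, 0.5);    // []: honest empty result, no guessing
aboveThreshold(specificQuery, 0.5); // Numerator surfaces as the top result
```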

Why This Enables a Future Feature: Ask Goose

Vector search is half of a pattern called RAG — retrieval-augmented generation. The idea: instead of asking an AI to answer a question from memory (where it might hallucinate), you first retrieve the most relevant content from your own data, then pass that content to the AI as context. The AI answers based on what you actually wrote, not what it imagines you wrote.

With embeddings in Supabase and a similarity search function already working, the retrieval half is done. Ask Goose — a conversational search feature where visitors can ask Goose questions about the site and get answers grounded in real content — becomes a matter of wiring the retrieval results into a Claude API call. The vector infrastructure built today is the foundation that makes it possible. Instead of building search and then rebuilding for AI, I built the AI-ready version first.
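Most of that remaining wiring is prompt assembly: retrieved content becomes context for the model. A hedged sketch, where `buildPrompt` and the instruction wording are assumptions and the actual Claude API call is omitted:

```typescript
interface Retrieved { title: string; url: string; text: string; }

// Ground the model in retrieved content so answers come from what was
// actually written, not from the model's memory.
function buildPrompt(question: string, retrieved: Retrieved[]): string {
  const context = retrieved
    .map((r) => `## ${r.title} (${r.url})\n${r.text}`)
    .join("\n\n");
  return [
    "Answer the question using only the context below.",
    "If the context doesn't cover it, say so instead of guessing.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```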

Pattern worth noting

Building the retrieval layer before the generation layer forces you to get the data quality right first. If the embeddings return irrelevant results, no amount of prompt engineering will fix the answers. By validating search quality now, Ask Goose inherits a tested foundation instead of debugging two problems at once.
