Semantic search
We use search all the time: getting answers to questions via Google, looking for products on Amazon, finding a video on YouTube, or even searching for a document in our local file system. However, not all search systems are created equal, and they can be implemented in a variety of ways.

Let’s say we have a corpus of blog posts that we want to be able to search through to find relevant articles to read. The table to store the blog posts looks like this (note the FULLTEXT index, used to perform faster text-matching searches):
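A minimal sketch, assuming MySQL syntax and hypothetical column names:

```sql
CREATE TABLE blog_posts (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    subtitle VARCHAR(255),
    body TEXT,
    -- Speeds up text-matching queries over the title and subtitle.
    FULLTEXT KEY ft_title_subtitle (title, subtitle)
);
```

A text-matching search for posts about dogs could then look like this:

```sql
SELECT id, title, subtitle
FROM blog_posts
WHERE MATCH (title, subtitle) AGAINST ('dogs' IN NATURAL LANGUAGE MODE);
```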
This could give some good results; however, there would be some instances where it is problematic.
Searching only for exact matches may miss some of the relevant results.
For example, a user might search for the term “dogs” and end up with some posts about dogs.
However, it would miss posts that avoid the term “dog” in favor of words like “puppy” or “hound.”
It also might miss documents that are about “wolves” or “coyotes.”
This search knows nothing about the meaning of the word “dog.”
This is where vector similarity search comes into play.
With this type of search, we would generate an embedding for each blog post in our data set.
An embedding is an N-dimensional vector that captures opaque meaning about some piece of data — in this case, the title + subtitle of a blog post.
This vector would then be stored right along with the corresponding blog post row in the database.
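A sketch of how this could look, assuming a VECTOR column type, a hypothetical VEC_COSINE_DISTANCE function, and a 1536-dimensional embedding model (names and sizes vary by database and model):

```sql
-- Store an embedding alongside each blog post row.
ALTER TABLE blog_posts ADD COLUMN embedding VECTOR(1536);

-- Semantic search: return the posts closest in meaning to the query.
-- @query_embedding holds the embedding of the user's search terms.
SELECT id, title, subtitle
FROM blog_posts
ORDER BY VEC_COSINE_DISTANCE(embedding, @query_embedding)
LIMIT 10;
```

A search for “dogs” would now also surface posts about puppies or hounds, because their embeddings sit near each other in the vector space.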
Recommendation systems
Recommendation systems are also common in many products. Amazon may recommend purchases similar to ones you view, and streaming services may recommend shows to you based on your watch history. Similar types of systems can be built using vector similarity search. Perhaps we have an e-commerce platform. In our database, we have a product table, a user table, and a purchase table to track which items each user purchases.
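A minimal sketch of those tables, with assumed column names:

```sql
CREATE TABLE product (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    description TEXT
);

CREATE TABLE user (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NOT NULL
);

-- Each row records one user purchasing one product.
CREATE TABLE purchase (
    user_id BIGINT UNSIGNED NOT NULL,
    product_id BIGINT UNSIGNED NOT NULL,
    purchased_at DATETIME NOT NULL,
    PRIMARY KEY (user_id, product_id, purchased_at)
);
```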
To support this, we can add a VECTOR column to the product table:
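Again, the dimensionality here is an assumption tied to the embedding model:

```sql
ALTER TABLE product ADD COLUMN embedding VECTOR(1536);
```

After populating these embeddings (for example, from each product’s name and description), we can gather the IDs of products similar to one a user recently purchased. The VEC_COSINE_DISTANCE function and the @purchased_embedding variable are illustrative assumptions:

```sql
-- Collect the five nearest products into a comma-separated list.
SELECT GROUP_CONCAT(id) INTO @recommendationIDs
FROM (
    SELECT p.id
    FROM product p
    ORDER BY VEC_COSINE_DISTANCE(p.embedding, @purchased_embedding)
    LIMIT 5
) AS nearest;
```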
We can then select the products whose IDs landed in @recommendationIDs and display those to the user.
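For example, since @recommendationIDs holds a comma-separated list of IDs:

```sql
SELECT * FROM product WHERE FIND_IN_SET(id, @recommendationIDs);
```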
Retrieval-Augmented Generation (RAG)
RAG is a popular technique for augmenting and enhancing the results produced by an LLM. LLMs such as GPT-4o or Claude 3.5 Sonnet are extremely powerful, as they have been trained on immense data sets. However, these LLMs are not trained on the entire universe of data, and it is often useful to pass them additional context to help answer a query.

Suppose we have a private question/answer platform, internal to our organization. None of the information on this platform is on the public internet, and none of it was used to train any public LLM. This platform stores questions and answers like so:
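A minimal sketch of the schema, with assumed column names:

```sql
CREATE TABLE question (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    body TEXT NOT NULL
);

CREATE TABLE answer (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    question_id BIGINT UNSIGNED NOT NULL,
    body TEXT NOT NULL,
    FOREIGN KEY (question_id) REFERENCES question (id)
);
```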
We can add VECTOR columns to the question and answer tables and populate them with embeddings:
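As before, the dimensionality and distance function below are assumptions:

```sql
ALTER TABLE question ADD COLUMN embedding VECTOR(1536);
ALTER TABLE answer ADD COLUMN embedding VECTOR(1536);
```

At query time, we embed the incoming question, pull the answers attached to the most similar stored questions, and pass that text to the LLM as extra context alongside the original question:

```sql
-- Retrieve context for RAG: answers to the three most similar questions.
SELECT a.body
FROM question q
JOIN answer a ON a.question_id = q.id
ORDER BY VEC_COSINE_DISTANCE(q.embedding, @question_embedding)
LIMIT 3;
```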