Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL now support the pgvector extension, bringing the power of vector search operations to PostgreSQL databases.

In this step-by-step tutorial, we will show you how to add generative AI features to your own applications with just a few lines of code using pgvector, LangChain and LLMs on Google Cloud. You can also follow along with our guided tutorial video. We’ll build a sample Python application that can understand and respond to human language queries about the data stored in your PostgreSQL database. Then, we’ll further push the creative limits of the application by teaching it to create new AI-generated product descriptions based on our existing dataset about children’s toys. Forget about boring “orange and white cat stuffed animal” descriptions of days past — with AI, you can now generate descriptions like “ferocious and cuddly companion to keep your little one company, from crib to pre-school”. Let’s see what our AI-generated taglines can do!

How to install pgvector in Cloud SQL and AlloyDB for PostgreSQL

The pgvector extension can be installed within an existing instance of Cloud SQL for PostgreSQL and AlloyDB for PostgreSQL using the CREATE EXTENSION command as shown below. If you do not have an existing instance, create one for Cloud SQL and AlloyDB.

postgres=> CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION

The new `vector` data type

Under the hood, the pgvector extension uses the PostgreSQL `CREATE TYPE` command to register a new data type called `vector`. PostgreSQL table columns can be defined using this new `vector` data type. A string with comma-separated numbers within square brackets can be used to insert values into this column as shown below. 

The following code snippet uses a three-dimensional vector as an example. In actual AI/ML applications, a vector has many more dimensions, typically in the order of hundreds.

postgres=> CREATE TABLE embeddings(
      id INTEGER,
      embedding vector(3)
);
CREATE TABLE
postgres=> INSERT INTO embeddings 
           VALUES
                   (1, '[1, 0, -1]'),
                   (2, '[1, 1, 1]'),
                   (3, '[1, 1, 50]');
INSERT 0 3

New similarity search operators in pgvector 

The pgvector extension also introduces new operators for performing similarity matches on vectors, allowing you to find vectors that are semantically similar. Two such operators are:

  • ‘<->’: returns the Euclidean distance between the two vectors. Euclidean distance is a good choice for applications where the magnitude of the vectors is important — for example, in mapping and navigation applications, or when implementing the K-means clustering algorithm in machine learning. 
  • ‘<=>’: returns the cosine distance between the two vectors. Cosine similarity is a good choice for applications where the direction of the vectors is important — for example, when trying to find the most similar document to a given document for implementing recommendation systems or natural language processing tasks. 

We use the cosine similarity search operator for our sample application.

Building the sample application

Let’s get started with building our application with pgvector and LLMs. Google will also use LangChain, which is an open-source framework that provides several pre-built components that make it easier to create complex applications using LLMs.

The entire application is available as an interactive Google Colab notebook for Cloud SQL PostgreSQL. You can directly run this sample application from your web browser without any additional installations, or writing a single line of code!

Follow the instructions in the Colab notebook to set up your environment. Note that if an instance with the required name does not exist, the notebook creates a Cloud SQL PostgreSQL instance for you. Running the notebook may incur Google Cloud charges. You may be eligible for a free trial that gets you credits for these costs. 

Loading our ‘toy’ dataset

The sample application uses an example of an e-commerce company that runs an online marketplace for buying and selling children’s toys. The dataset for this notebook has been sampled and created from a larger public retail dataset available at Kaggle. The dataset used in this notebook has only about 800 toy products, while the public dataset has over 370,000 products in different categories. 

After you set up the environment using the steps mentioned in the Colab notebook, load the provided sample dataset into a Pandas data frame. The first five rows of the dataset are shown for your reference below.lang-py

import pandas as pd
# Download and save the dataset in a Pandas dataframe.
DATASET_URL='https://github.com/GoogleCloudPlatform/python-docs-samples/raw/main/cloud-sql/postgres/pgvector/data/retail_toy_dataset.csv'
df = pd.read_csv(DATASET_URL)
df = df.loc[:, ['product_id', 'product_name', 'description', 'list_price']]
df.head(5)

Save the dataset in a PostgreSQL table called `products` that has a simple schema with four fields: product_id, product_name, description, and list_price.lang-py





# Save the Pandas dataframe in a PostgreSQL table.
import asyncio
import asyncpg
from google.cloud.sql.connector import Connector
async def main():
    loop = asyncio.get_running_loop()
    async with Connector(loop=loop) as connector:
        # Create connection to Cloud SQL database
        conn: asyncpg.Connection = await connector.connect_async(
            f"{project_id}:{region}:{instance_name}",  # Cloud SQL instance connection name
            "asyncpg",
            user=f"{database_user}",
            password=f"{database_password}",
            db=f"{database_name}"
        )
        # Create the `products` table.   
        await conn.execute("""CREATE TABLE products(
                                product_id VARCHAR(1024) PRIMARY KEY,
                                product_name TEXT,
                                description TEXT,
                                list_price NUMERIC)""")
        # Copy the dataframe to the `products` table.
        tuples = list(df.itertuples(index=False))
        await conn.copy_records_to_table('products', records=tuples, columns=list(df), timeout=10)
        await conn.close()
# Run the SQL commands now.
await main() # type: ignore

Generating the vector embeddings using Vertex AI

Google use the Vertex AI Text Embedding model to generate the vector embeddings for the text that describes various toys in our products table. At publication, the Vertex AI Text Embedding model only accepts 3,072 input tokens in a single API request. Therefore, as a first step, split long product descriptions into smaller chunks of 500 characters each.

Split long text into smaller chunks with LangChain

You can use the RecursiveCharacterTextSplitter method from LangChain library, which provides a nice and simple way to split the large text into smaller chunks.lang-py

from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    separators = [".", "\n"],
    chunk_size = 500,
    chunk_overlap  = 0,
    length_function = len,
)
chunked = []
for index, row in df.iterrows():
  product_id = row['product_id']
  desc = row['description']
  splits = text_splitter.create_documents([desc])
  for s in splits:
    r = { 'product_id': product_id, 'content': s.page_content }
    chunked.append(r)

Get the vector embeddings using Vertex AI

After you split long product descriptions into smaller chunks, you can generate vector embeddings for each chunk by using the Text Embedding Model available through Vertex AI. Notice how we can accomplish this in just a few lines of code!lang-py

from langchain.embeddings import VertexAIEmbeddings
from google.cloud import aiplatform
aiplatform.init(project=f"{project_id}", location=f"{region}")
embeddings_service = VertexAIEmbeddings()
batch_size = 5
for i in range(0, len(chunked), batch_size):
  request = [x['content'] for x in chunked[i: i + batch_size]]
  response = embeddings_service.embed_documents(request)
  # Post-process the generated embeddings.
  # ...

Use pgvector to store the generate embeddings

After creating the pgvector extension and registering a new vector data type, you can store a NumPy array directly into a PostgreSQL table.lang-py

from pgvector.asyncpg import register_vector
# ...
await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
await register_vector(conn)
# Create the `product_embeddings` table to store vector embeddings.
await conn.execute("""CREATE TABLE product_embeddings(
                             product_id VARCHAR(1024), 
                             content TEXT,
                             embedding vector(768))""")
# Store all the generated embeddings.
for index, row in product_embeddings.iterrows():
    await conn.execute("INSERT INTO product_embeddings VALUES ($1, $2, $3)"                                    
                       row['product_id'], 
                       row['content'], 
                       np.array(row['embedding']))

Finding similar toys using pgvector cosine search operator

By completing the steps above, you’ve just made an entire toy dataset searchable using simple English! Check it out in the image below. You can even filter toys based on a specific price range, demonstrating the hybrid search capabilities.

So, how does this work? Let’s break it down. 

Step 1: Generate the vector embedding for the incoming input query.lang-py

# Generate vector embedding for the user query.
from langchain.embeddings import VertexAIEmbeddings
embeddings_service = VertexAIEmbeddings()
qe = embeddings_service.embed_query([user_query])

Step 2: Use the new pgvector cosine similarity search operator to find related products 

Notice how you can combine the vector search operation with the regular SQL filters on the `list_price` column using the powerful PostgreSQL and pgvector query semantics.lang-py

# Use cosine similarity search to find the top five products 
# that are most closely related to the input query.
results = await conn.fetch("""
             WITH vector_matches AS (
                     SELECT product_id, 
                            1 - (embedding <=> $1) AS similarity
                     FROM product_embeddings
                     WHERE 1 - (embedding <=> $1) > $2
                     ORDER BY similarity DESC
                     LIMIT $3
             )
             SELECT product_name, 
                    list_price, 
                    description 
             FROM products
             WHERE product_id IN (SELECT product_id FROM vector_matches)
                   AND list_price >= $4 AND list_price <= $5
             """, 
             qe, similarity_threshold, num_matches, min_price, max_price)

Use case 1: Building an AI-curated contextual hybrid search

Now that you’ve learned how to find similar toy products, it’s time to super-charge your application with AI. Google will use large language models (LLMs) to make our application more intelligent and capable of answering user queries about these products.

The following example shows how a grandparent uses the AI-powered search interface to find a perfect toy for their grandkid by describing their needs in simple English!

What is happening behind the scenes here?

  • First, retrieve all the matching products and their descriptions using pgvector, following the same steps that we showed above.
  • Then, use the MapReduce Chain from LangChain library to build a high-quality prompt context by combining summaries of all similar toy products. 
  • Finally, invoke the Vertex AI text generation LLM model to get a well-formatted answer. See the code snippet below for an example.

lang-py

# Using LangChain for summarization and efficient context building.
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.llms import VertexAI
from langchain import PromptTemplate, LLMChain
from IPython.display import display, Markdown
llm = VertexAI()
map_prompt_template = """
              You will be given a detailed description of a toy product.
              This description is enclosed in triple backticks (```).
              Using this description only, extract the name of the toy,
              the price of the toy and its features.
              ```{text}```
              SUMMARY:
              """
map_prompt = PromptTemplate(template=map_prompt_template, input_variables=["text"])
combine_prompt_template = """
                You will be given a detailed description different toy products
                enclosed in triple backticks (```) and a question enclosed in
                double backticks(``).
                Select one toy that is most relevant to answer the question.
                Using that selected toy description, answer the following
                question in as much detail as possible.
                You should only use the information in the description.
                Your answer should include the name of the toy, the price of
                the toy and its features. 
                Your answer should be less than 200 words.
                Your answer should be in Markdown in a numbered list format.
                Description:
                ```{text}```
                Question:
                ``{user_query}``
                Answer:
                """
combine_prompt = PromptTemplate(template=combine_prompt_template, input_variables=["text", "user_query"])
docs = [Document(page_content=t) for t in matches]
chain = load_summarize_chain(llm,
                             chain_type="map_reduce",
                             map_prompt=map_prompt,
                             combine_prompt=combine_prompt)
answer = chain.run({
                      'input_documents': docs,
                      'user_query': user_query,
                    })
display(Markdown(answer))

Use case 2: Adding AI-powered creative content generation

Building a hybrid semantic search is a common, powerful example for using LLMs with vector embeddings. But there’s so much more you can do with this new technology! 

You can create an AI-powered creative content generation tool by adjusting LLM prompt input and model temperature settings. Temperature is an input parameter for an LLM prompt that ranges from zero to one, and it defines the randomness of the model’s output. A higher temperature will result in more creative response, while a lower temperature will result in more definitive and factual output.

In the following example, you can see how a seller uses generative AI to get auto-generated item descriptions for a new bicycle product that they want to add to the platform!

Like before, you can use the pgvector similarity search operator to find a similar product description, then use it as a prompt context to generate new creative output. Here is the LLM prompt code snippet that was used for the above output:lang-py

from langchain.llms import VertexAI
from langchain import PromptTemplate, LLMChain
from IPython.display import display, Markdown
template = """
            You are given descriptions about some similar kind of toys in the
            context. This context is enclosed in triple backticks (```).
            Combine these descriptions and adapt them to match the 
            specifications in the initial prompt. 
            All the information from the initial prompt must be included. 
            You are allowed to be as creative as possible,
            Describe the new toy in as much detail. Your answer should be
            less than 200 words.
            Context:
            ```{context}```
            Initial Prompt:
            {creative_prompt}
            Answer:
        """
prompt = PromptTemplate(template=template, input_variables=["context", "creative_prompt"])
# Increase the `temperature` to allow more creative writing freedom.
llm = VertexAI(temperature=0.7)
llm_chain = LLMChain(prompt=prompt, llm=llm)
answer = llm_chain.run({
    "context": '\n'.join(matches),
    "creative_prompt": creative_prompt,
    })
display(Markdown(answer))

Summary

This blog demonstrated just two examples of powerful features that you can implement by combining the power of relational databases with LLMs. 

Generative AI is a powerful paradigm shift in application development that lets you create novel applications to serve users in new ways — from answering patients’ complex medical questions to helping enterprises analyze cyberattacks. Google have shown just one small example of what generative AI can unlock; Google can’t wait to see what you build with it.