Vectorize and Workers AI

Vectorize allows you to generate vector embeddings using a machine-learning model, including the models available in Workers AI.

This guide will instruct you through:

Creating a Vectorize index.
Connecting a Cloudflare Worker to your index.
Using Workers AI to generate vector embeddings.
Using Vectorize to query those vector embeddings.

Prerequisites

To continue:

Sign up for a Cloudflare account ↗ if you have not already.
Install npm ↗.
Install Node.js ↗. Use a Node version manager like Volta ↗ or nvm ↗ to avoid permission issues and change Node.js versions. Wrangler requires a Node version of 16.17.0 or later.

1. Create a Worker

You will create a new project that will contain a Worker script, which will act as the client application for your Vectorize index.

Open your terminal and create a new project named embeddings-tutorial by running the following command:

npm create cloudflare@latest -- embeddings-tutorial

pnpm create cloudflare@latest embeddings-tutorial

yarn create cloudflare@latest embeddings-tutorial

For setup, select the following options:

For What would you like to start with?, choose Hello World example.
For Which template would you like to use?, choose Hello World Worker.
For Which language do you want to use?, choose TypeScript.
For Do you want to use git for version control?, choose Yes.
For Do you want to deploy your application?, choose No (we will be making some changes before deploying).

This will create a new embeddings-tutorial directory. Your new embeddings-tutorial directory will include:

A "Hello World" Worker at src/index.ts.
A wrangler.json configuration file. wrangler.json is how your embeddings-tutorial Worker will access your index.

2. Create an index

A vector database is distinct from a traditional SQL or NoSQL database. A vector database is designed to store vector embeddings, which are representations of data, but not the original data itself.

To create your first Vectorize index, change into the directory you just created for your Workers project:

cd embeddings-tutorial

To create an index, use the wrangler vectorize create command and provide a name for the index. A good index name is:

A combination of lowercase and/or numeric ASCII characters, shorter than 32 characters, starts with a letter, and uses dashes (-) instead of spaces.
Descriptive of the use-case and environment. For example, "production-doc-search" or "dev-recommendation-engine".
Only used for describing the index, and is not directly referenced in code.

In addition, define both the dimensions of the vectors you will store in the index, as well as the distance metric used to determine similar vectors when creating the index. This configuration cannot be changed later, as a vector database is configured for a fixed vector configuration.

Run the following wrangler vectorize command, ensuring that the dimensions are set to 768: this is important, as the Workers AI model used in this tutorial outputs vectors with 768 dimensions.

npx wrangler vectorize create embeddings-index --dimensions=768 --metric=cosine

✅ Successfully created index 'embeddings-index'

[[vectorize]]
binding = "VECTORIZE" # available in your Worker on env.VECTORIZE
index_name = "embeddings-index"

This will create a new vector database, and output the binding configuration needed in the next step.

3. Bind your Worker to your index

You must create a binding for your Worker to connect to your Vectorize index. Bindings allow your Workers to access resources, like Vectorize or R2, from Cloudflare Workers. You create bindings by updating your Wrangler file.

To bind your index to your Worker, add the following to the end of your Wrangler file:

wrangler.json
wrangler.toml

{
  "vectorize": [
    {
      "binding": "VECTORIZE",
      "index_name": "embeddings-index"
    }
  ]
}

[[vectorize]]
binding = "VECTORIZE" # available in your Worker on env.VECTORIZE
index_name = "embeddings-index"

Specifically:

The value (string) you set for <BINDING_NAME> will be used to reference this database in your Worker. In this tutorial, name your binding VECTORIZE.
The binding must be a valid JavaScript variable name ↗. For example, binding = "MY_INDEX" or binding = "PROD_SEARCH_INDEX" would both be valid names for the binding.
Your binding is available in your Worker at env.<BINDING_NAME> and the Vectorize client API is exposed on this binding for use within your Workers application.

4. Set up Workers AI

Before you deploy your embedding example, ensure your Worker uses your model catalog, including the text embedding model built-in.

From within the embeddings-tutorial directory, open your Wrangler file in your editor and add the new [[ai]] binding to make Workers AI's models available in your Worker:

wrangler.json
wrangler.toml

{
  "vectorize": [
    {
      "binding": "VECTORIZE",
      "index_name": "embeddings-index"
    }
  ],
  "ai": {
    "binding": "AI"
  }
}

[[vectorize]]
binding = "VECTORIZE" # available in your Worker on env.VECTORIZE
index_name = "embeddings-index"

[ai]
binding = "AI" # available in your Worker on env.AI

With Workers AI ready, you can write code in your Worker.

5. Write code in your Worker

To write code in your Worker, go to your embeddings-tutorial Worker and open the src/index.ts file. The index.ts file is where you configure your Worker's interactions with your Vectorize index.

Clear the content of index.ts. Paste the following code snippet into your index.ts file. On the env parameter, replace <BINDING_NAME> with VECTORIZE:

export interface Env {
  VECTORIZE: Vectorize;
  AI: Ai;
}
interface EmbeddingResponse {
  shape: number[];
  data: number[][];
}

export default {
  async fetch(request, env, ctx): Promise<Response> {
    let path = new URL(request.url).pathname;
    if (path.startsWith("/favicon")) {
      return new Response("", { status: 404 });
    }

    // You only need to generate vector embeddings once (or as
    // data changes), not on every request
    if (path === "/insert") {
      // In a real-world application, you could read content from R2 or
      // a SQL database (like D1) and pass it to Workers AI
      const stories = [
        "This is a story about an orange cloud",
        "This is a story about a llama",
        "This is a story about a hugging emoji",
      ];
      const modelResp: EmbeddingResponse = await env.AI.run(
        "@cf/baai/bge-base-en-v1.5",
        {
          text: stories,
        },
      );

      // Convert the vector embeddings into a format Vectorize can accept.
      // Each vector needs an ID, a value (the vector) and optional metadata.
      // In a real application, your ID would be bound to the ID of the source
      // document.
      let vectors: VectorizeVector[] = [];
      let id = 1;
      modelResp.data.forEach((vector) => {
        vectors.push({ id: `${id}`, values: vector });
        id++;
      });

      let inserted = await env.VECTORIZE.upsert(vectors);
      return Response.json(inserted);
    }

    // Your query: expect this to match vector ID. 1 in this example
    let userQuery = "orange cloud";
    const queryVector: EmbeddingResponse = await env.AI.run(
      "@cf/baai/bge-base-en-v1.5",
      {
        text: [userQuery],
      },
    );

    let matches = await env.VECTORIZE.query(queryVector.data[0], {
      topK: 1,
    });
    return Response.json({
      // Expect a vector ID. 1 to be your top match with a score of
      // ~0.89693683
      // This tutorial uses a cosine distance metric, where the closer to one,
      // the more similar.
      matches: matches,
    });
  },
} satisfies ExportedHandler<Env>;

6. Deploy your Worker

Before deploying your Worker globally, log in with your Cloudflare account by running:

npx wrangler login

You will be directed to a web page asking you to log in to the Cloudflare dashboard. After you have logged in, you will be asked if Wrangler can make changes to your Cloudflare account. Scroll down and select Allow to continue.

From here, deploy your Worker to make your project accessible on the Internet. To deploy your Worker, run:

npx wrangler deploy

Preview your Worker at https://embeddings-tutorial.<YOUR_SUBDOMAIN>.workers.dev.

7. Query your index

You can now visit the URL for your newly created project to insert vectors and then query them.

With the URL for your deployed Worker (for example,https://embeddings-tutorial.<YOUR_SUBDOMAIN>.workers.dev/), open your browser and:

Insert your vectors first by visiting /insert.
Query your index by visiting the index route - /.

This should return the following JSON:

{
  "matches": {
    "count": 1,
    "matches": [
      {
        "id": "1",
        "score": 0.89693683
      }
    ]
  }
}

Extend this example by:

Adding more inputs and generating a larger set of vectors.
Accepting a custom query parameter passed in the URL, for example via URL.searchParams.
Creating a new index with a different distance metric and observing how your scores change in response to your inputs.

By finishing this tutorial, you have successfully created a Vectorize index, used Workers AI to generate vector embeddings, and deployed your project globally.

Next steps

Build a generative AI chatbot using Workers AI and Vectorize.
Learn more about how vector databases work.
Read examples on how to use the Vectorize API from Cloudflare Workers.