Vectorize + Workers AI Semantic Search

Priority: P1 (High Value)

What is Vectorize?

A globally distributed vector database for storing and querying embeddings. Combined with Workers AI for embedding generation, it enables semantic search, recommendations, and RAG without external services.

Why This Matters for Company Manager

Current Search Infrastructure

The platform uses PostgreSQL full-text search (tsvector, ts_rank) in the press center:


-- Current search pattern (packages/services/src/press-center/search-service.ts)
SELECT *, ts_rank(search_vector, to_tsquery('french', $1)) AS rank
FROM "Article"
WHERE search_vector @@ to_tsquery('french', $1)
  AND "tenantId" = $2
ORDER BY rank DESC

This matches exact keywords only: queries that express the same idea in different words return nothing, and there is no notion of semantic similarity.

What Vectorize + Workers AI Enables

| Feature | Before | After |
| --- | --- | --- |
| Article search | Keyword matching | Semantic understanding |
| Product discovery | Category browsing | "Find products like this" |
| Recommendations | None | AI-powered similar items |
| Support tickets | Manual routing | Semantic matching to solutions |
| Content moderation | Rule-based | Embedding similarity detection |
| Multilingual search | Per-language index | Cross-language vectors |

Architecture


                   ┌───────────────────────────┐
                   │  Workers AI               │
                   │  (embedding generation)   │
                   │  @cf/baai/bge-base-en-v1.5│
                   └─────────────┬─────────────┘
                                 │ vectors
┌──────────────┐     ┌───────────▼─────────┐     ┌──────────────┐
│ Content      │────►│ Vectorize           │◄────│ Search Query │
│ (articles,   │     │ (vector store)      │     │ (user input) │
│  products,   │     │ 10M vectors/idx     │     │              │
│  tickets)    │     │ multi-namespace     │     │              │
└──────────────┘     └─────────────────────┘     └──────────────┘

Implementation

Step 1: Create Vectorize Index


# Article search index (384 dimensions for bge-small, 768 for bge-base)
npx wrangler vectorize create articles-index \
  --dimensions=768 \
  --metric=cosine

# Product recommendations index
npx wrangler vectorize create products-index \
  --dimensions=768 \
  --metric=cosine

# Support ticket similarity index
npx wrangler vectorize create tickets-index \
  --dimensions=768 \
  --metric=cosine

Step 2: Configure Worker


// wrangler.jsonc (search-worker)
{
  "name": "search-worker",
  "compatibility_flags": ["nodejs_compat"],
  "ai": { "binding": "AI" },
  "vectorize": [
    { "binding": "ARTICLES_INDEX", "index_name": "articles-index" },
    { "binding": "PRODUCTS_INDEX", "index_name": "products-index" },
    { "binding": "TICKETS_INDEX", "index_name": "tickets-index" }
  ]
}

Step 3: Embedding Generation + Indexing


// search-worker/src/index.ts

interface Env {
  AI: Ai;
  ARTICLES_INDEX: VectorizeIndex;
  PRODUCTS_INDEX: VectorizeIndex;
  TICKETS_INDEX: VectorizeIndex;
}

// Generate embedding for text
async function embed(env: Env, text: string): Promise<number[]> {
  const result = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
    text: [text],
  });
  return result.data[0]; // 768-dimensional vector
}

// Index an article
async function indexArticle(env: Env, article: Article) {
  const text = `${article.title} ${article.summary} ${article.content}`;
  const vector = await embed(env, text);

  await env.ARTICLES_INDEX.upsert([{
    id: article.id,
    values: vector,
    namespace: article.tenantId, // tenant isolation via namespace
    metadata: {
      title: article.title,
      author: article.author,
      publishedAt: article.publishedAt,
      category: article.category,
      tenantId: article.tenantId,
    },
  }]);
}
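One caveat for `indexArticle` above: embedding models truncate long inputs (bge-base accepts roughly 512 tokens), so concatenating title, summary, and full content can silently drop most of the article body. A common workaround is to split long text into chunks and index each chunk as its own vector. A minimal sketch (`chunkText` and the 1,500-character budget are illustrative assumptions, not part of the pipeline above):

```typescript
// Hypothetical helper -- splits long article text into roughly
// maxChars-sized chunks on word boundaries so each chunk stays
// within the embedding model's input window.
function chunkText(text: string, maxChars = 1500): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  let current: string[] = [];
  let length = 0;
  for (const word of words) {
    // Start a new chunk when adding this word would exceed the budget
    if (length + word.length + 1 > maxChars && current.length > 0) {
      chunks.push(current.join(" "));
      current = [];
      length = 0;
    }
    current.push(word);
    length += word.length + 1;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Chunk vectors can reuse the article id with a suffix (for example `${article.id}#0`, `${article.id}#1`) so query matches map back to the parent article.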

// Index a product
async function indexProduct(env: Env, product: Product) {
  const text = `${product.name} ${product.description} ${product.category}`;
  const vector = await embed(env, text);

  await env.PRODUCTS_INDEX.upsert([{
    id: product.id,
    values: vector,
    namespace: product.tenantId,
    metadata: {
      name: product.name,
      price: product.price,
      category: product.category,
      tenantId: product.tenantId,
    },
  }]);
}

Step 4: Semantic Search


// Semantic search endpoint
async function search(env: Env, query: string, tenantId: string, type: "articles" | "products") {
  const queryVector = await embed(env, query);

  const index = type === "articles" ? env.ARTICLES_INDEX : env.PRODUCTS_INDEX;

  const results = await index.query(queryVector, {
    topK: 20,
    namespace: tenantId,
    returnMetadata: "all",
    returnValues: false,
  });

  return results.matches.map(match => ({
    id: match.id,
    score: match.score,
    ...match.metadata,
  }));
}

Step 5: Similar Items / Recommendations


// "More like this" recommendations
async function findSimilar(env: Env, itemId: string, tenantId: string, type: "articles" | "products") {
  const index = type === "articles" ? env.ARTICLES_INDEX : env.PRODUCTS_INDEX;

  // Get the item's existing vector
  const existing = await index.getByIds([itemId]);
  if (!existing.length) return [];

  // Query for similar items (exclude self)
  const results = await index.query(existing[0].values!, {
    topK: 11, // extra 1 to exclude self
    namespace: tenantId,
    returnMetadata: "all",
  });

  return results.matches
    .filter(m => m.id !== itemId)
    .slice(0, 10);
}

Step 6: Hybrid Search (Semantic + Keyword)

Combine Vectorize results with PostgreSQL full-text for best results:


// In TRPC router -- hybrid search
export const searchRouter = createTRPCRouter({
  search: permissionProtectedProcedure(["content:read"])
    .input(z.object({ query: z.string(), type: z.enum(["articles", "products"]) }))
    .query(async ({ ctx, input }) => {
      // Parallel: semantic + keyword search
      const [semanticResults, keywordResults] = await Promise.all([
        // Semantic via Worker
        fetch(`${SEARCH_WORKER_URL}/search`, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({
            query: input.query,
            tenantId: ctx.tenantId,
            type: input.type,
          }),
        }).then(r => r.json()),

        // Keyword via PostgreSQL (existing)
        getService("search", ctx).then(s => s.search(input.query)),
      ]);

      // Merge and re-rank (reciprocal rank fusion)
      return mergeResults(semanticResults, keywordResults);
    }),
});
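`mergeResults` is referenced above but not defined; one common choice is reciprocal rank fusion, where each result's fused score is the sum of `1 / (k + rank)` across the lists it appears in. A minimal sketch (the `RankedResult` shape and the conventional `k = 60` constant are assumptions, not from the codebase):

```typescript
interface RankedResult { id: string; [key: string]: unknown }

// Reciprocal rank fusion: items ranked highly in either list float up,
// and items present in both lists accumulate score from each.
function mergeResults(
  semantic: RankedResult[],
  keyword: RankedResult[],
  k = 60,
): Array<RankedResult & { fusedScore: number }> {
  const scores = new Map<string, { item: RankedResult; score: number }>();
  for (const list of [semantic, keyword]) {
    list.forEach((item, rank) => {
      const entry = scores.get(item.id) ?? { item, score: 0 };
      entry.score += 1 / (k + rank + 1); // rank is 0-based
      scores.set(item.id, entry);
    });
  }
  return [...scores.values()]
    .sort((a, b) => b.score - a.score)
    .map(({ item, score }) => ({ ...item, fusedScore: score }));
}
```

Rank fusion works on positions rather than raw scores, which sidesteps the problem that cosine similarities and `ts_rank` values live on incomparable scales.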

Use Cases by Domain

1. Press Center Article Search

Before: tsvector keyword search
After: Semantic search + "related articles" + cross-language discovery


// User searches: "climate change impact on farming"
// Semantic: finds articles about agriculture, environment, sustainability
// Keyword: only finds articles with exact words "climate", "change", "farming"

2. Product Recommendations

Before: None (category browsing only)
After: "Similar products", "Customers also viewed", "Complete the look"


// User views: "Organic lavender essential oil"
// Recommends: other essential oils, aromatherapy products, organic skincare

3. Support Ticket Routing

Before: Manual routing by CustomerOperationsAgent
After: Match new tickets to resolved tickets for auto-suggestions


// New ticket: "My order hasn't arrived after 2 weeks"
// Finds similar resolved tickets with shipping delay solutions

4. AI Agent Enhancement

Enhance existing autonomy agents with vector context:


// ContentManagementAgent -- find content gaps
const existingContent = await env.ARTICLES_INDEX.query(topicVector, {
  topK: 5,
  namespace: tenantId,
});
// If low similarity scores → content gap → generate new content

5. Classified Ads Matching

Match buyer searches to seller listings semantically.

6. City Portal Entity Discovery

Semantic search across businesses and associations in city portal.

Multi-Tenant Isolation

Vectorize supports namespaces (50K per index) for partitioning vectors within an index. Use tenantId as the namespace:


// Tenant A's articles are isolated from Tenant B
await index.query(vector, {
  namespace: "tenant-uuid-a",  // only searches tenant A's vectors
  topK: 10,
});

This is perfect for Company Manager's multi-tenant architecture.
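Nothing in the API enforces that the namespace matches the caller's tenant, though; isolation holds only if every call site passes the right value. A defensive sketch (the `tenantQueryOptions` helper is hypothetical, not part of Vectorize):

```typescript
// Centralise query options so the namespace always comes from the
// authenticated tenant context, never from user-supplied input.
function tenantQueryOptions(
  tenantId: string,
  topK = 10,
): { topK: number; namespace: string } {
  if (!tenantId) throw new Error("tenantId required for namespace isolation");
  return { topK, namespace: tenantId };
}

// Usage (sketch): index.query(vector, tenantQueryOptions(ctx.tenantId, 10))
```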

Embedding Model Options

| Model | Dimensions | Speed | Quality | Languages |
| --- | --- | --- | --- | --- |
| `@cf/baai/bge-small-en-v1.5` | 384 | Fast | Good | English |
| `@cf/baai/bge-base-en-v1.5` | 768 | Medium | Better | English |
| `@cf/baai/bge-large-en-v1.5` | 1024 | Slow | Best | English |
| `@cf/baai/bge-m3` | Variable | Medium | Good | **Multilingual** |
| `@cf/google/embeddinggemma-300m` | Variable | Fast | Good | **100+ languages** |

Recommendation: Use `@cf/baai/bge-m3` or `@cf/google/embeddinggemma-300m` for multilingual support (French + English content in Company Manager).
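Note that an index only accepts vectors of the exact dimension it was created with, so the embedding model and the `--dimensions` flag must stay in sync. An illustrative guard (dimensions taken from the table above; the helper itself is an assumption, not an existing API):

```typescript
// Pin model -> dimension in one place so a model swap that forgets to
// recreate the index fails loudly instead of corrupting search results.
const MODEL_DIMENSIONS: Record<string, number> = {
  "@cf/baai/bge-small-en-v1.5": 384,
  "@cf/baai/bge-base-en-v1.5": 768,
  "@cf/baai/bge-large-en-v1.5": 1024,
};

function assertDimensions(model: string, vector: number[]): void {
  const expected = MODEL_DIMENSIONS[model];
  if (expected !== undefined && vector.length !== expected) {
    throw new Error(
      `${model} produces ${expected}-d vectors, got ${vector.length}`,
    );
  }
}
```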

Batch Indexing Pipeline

For initial data load and ongoing sync:


// Batch indexing Worker (triggered by Queue)
export default {
  async queue(batch: MessageBatch<IndexJob>, env: Env) {
    const vectors: VectorizeVector[] = [];

    for (const message of batch.messages) {
      const { id, text, tenantId, metadata } = message.body;
      const embedding = await embed(env, text);
      vectors.push({ id, values: embedding, namespace: tenantId, metadata });
    }

    // Batch upsert (up to 1000 vectors)
    await env.INDEX.upsert(vectors);
  },
};
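The handler above upserts whatever the batch contains; if a producer ever enqueues more than the per-call limit, the upsert needs to be split. A sketch of a generic chunking helper that keeps each call under 1,000 vectors (the `chunk` helper is illustrative, not a Vectorize API):

```typescript
// Split an array into fixed-size slices so each upsert call stays
// under the 1,000-vector batch limit.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("chunk size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Usage inside the queue handler (sketch):
// for (const slice of chunk(vectors, 1000)) {
//   await env.INDEX.upsert(slice);
// }
```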

Limits

| Metric | Free | Paid |
| --- | --- | --- |
| Indexes/account | 100 | 50,000 |
| Vectors/index | 10M | 10M |
| Dimensions | 1,536 max | 1,536 max |
| Namespaces/index | 1,000 | 50,000 |
| Metadata/vector | 10 KiB | 10 KiB |
| topK (with metadata) | 20 | 20 |
| topK (without metadata) | 100 | 100 |
| Batch upsert | 1,000 vectors | 1,000 vectors |
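The 10 KiB metadata cap is easy to exceed when copying article fields verbatim. A hedged pre-flight check, assuming the UTF-8 byte length of the JSON encoding approximates the stored size (actual accounting may differ):

```typescript
// Reject metadata over the 10 KiB per-vector cap before upserting,
// rather than failing mid-batch.
function metadataFits(
  metadata: Record<string, unknown>,
  limitBytes = 10 * 1024,
): boolean {
  const bytes = new TextEncoder().encode(JSON.stringify(metadata)).length;
  return bytes <= limitBytes;
}
```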

Pricing

Nearly free for moderate usage:

Example: 100K articles at 768 dimensions, 10K queries/day.

Estimated Impact