Embeddable Semantic Search Platform

A drop-in widget that adds semantic search and AI summaries to any website. It crawls a site, indexes the content as vector embeddings, and serves meaning-based search through a single embeddable script.

Turborepo, TypeScript, PostgreSQL, Redis, Weaviate, Fastify, Playwright, BullMQ
This is an embeddable semantic search platform: a single drop-in <script> tag that gives any website meaning-based search and fast AI summaries. It crawls a target site, indexes the content as vector embeddings, and serves results ranked by meaning rather than keyword overlap, so a query like “software with vector databases” finds the right page even when those exact words never appear on it.

It was my final-year project, built as a TypeScript monorepo (Turborepo) spanning a crawler (Playwright), an ingestion/embedding pipeline (BullMQ workers, Hugging Face Transformers), a Weaviate vector store, a Fastify search API, and the embeddable front-end widget, which is the same widget running live on this page.

Try it live

The search box below is the actual widget, indexing this very portfolio.

How it works

A site is crawled and the content is broken into chunks; each chunk is embedded with a local transformer model and stored in Weaviate alongside its source URL. At query time the search API embeds the query, retrieves the nearest chunks by vector similarity, and optionally passes the top results to an LLM to compose a short, grounded answer. Heavy work (crawling, embedding) runs asynchronously on Redis/BullMQ queues, and PostgreSQL/Prisma holds site config and metadata.
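
To make the chunk, embed, and retrieve flow described above concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions, not the project's actual code: `toyEmbed` is a hypothetical stand-in for the transformer model (it hashes words into buckets, so it only captures word overlap, not meaning), a plain array stands in for Weaviate, and the names `chunkText`, `indexPage`, and `search`, the chunk size, and the example URLs are all invented for the example. The queueing and the optional LLM summary step are left out.

```typescript
// One stored chunk: its text, the page it came from, and its embedding.
interface Chunk {
  url: string;
  text: string;
  vector: number[];
}

// Split page text into fixed-size word windows (assumed chunking strategy).
function chunkText(text: string, maxWords: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Hypothetical stand-in for the embedding model: a hashed bag-of-words
// vector. A real transformer embedding would place related phrases close
// together even when they share no words; this toy version cannot.
function toyEmbed(text: string, dims = 64): number[] {
  const vector = new Array<number>(dims).fill(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const word of words) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) % dims;
    }
    vector[hash] += 1;
  }
  return vector;
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Ingestion: chunk a crawled page and store each chunk with its source URL.
function indexPage(store: Chunk[], url: string, text: string, maxWords = 50): void {
  for (const piece of chunkText(text, maxWords)) {
    store.push({ url, text: piece, vector: toyEmbed(piece) });
  }
}

// Query time: embed the query, score every chunk, return the top K.
function search(store: Chunk[], query: string, topK = 3): Array<{ chunk: Chunk; score: number }> {
  const queryVector = toyEmbed(query);
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Usage: index two tiny "pages" and query them.
const store: Chunk[] = [];
indexPage(store, "https://example.com/search", "Semantic search ranks pages by vector similarity");
indexPage(store, "https://example.com/contact", "Email me about freelance work");
const results = search(store, "vector similarity search", 1);
console.log(results[0].chunk.url, results[0].score.toFixed(2));
```

In this sketch every chunk is scored against the query in a linear scan, which is fine for a handful of pages; a vector database such as Weaviate exists so that the nearest-neighbour lookup stays fast as the index grows.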

Gallery

The embeddable widget returning ranked results with an AI summary