Semantic Search

Overview

Semantic search uses AI-powered vector embeddings to understand the meaning behind your queries, not just match keywords. This helps you find relevant documentation even when you don't know the exact terms to search for.

Key Benefits

  • Natural language queries: Ask questions like "How do we handle multi-tenancy?" instead of guessing keywords
  • Topic discovery: Find related ADRs and architecture docs by concept
  • Cross-domain relationships: Discover connections across different parts of the system

How It Works

  1. Embedding Generation: Documentation is converted to vector embeddings using sentence-transformers/all-MiniLM-L6-v2
  2. Vector Storage: Embeddings are stored in the Qdrant vector database
  3. Semantic Search: Queries are converted to embeddings and compared using cosine similarity
  4. Result Ranking: Results are ranked by relevance score (0-1, higher is better)
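
The comparison and ranking steps can be sketched in a few lines of plain Python. This is an illustration only: the vectors are toy 3-dimensional stand-ins for real 384-dimensional embeddings, the document IDs are made up, and `cosine_similarity` and `rank` are hypothetical helper names rather than functions from this project.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def rank(query_vec: list[float], doc_vecs: dict[str, list[float]], limit: int = 5):
    """Score every document against the query and keep the best `limit`."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # higher score first
    return scored[:limit]


# Hypothetical documents with hand-written toy vectors.
docs = {
    "adr/auth.md": [1.0, 0.0, 0.0],
    "adr/database.md": [0.0, 1.0, 0.0],
    "guides/tokens.md": [0.8, 0.6, 0.0],
}
top = rank([1.0, 0.0, 0.0], docs, limit=2)
for doc_id, score in top:
    print(f"{doc_id}: {score:.2f}")
```

In the real system Qdrant performs this comparison over the stored vectors, so a linear scan like this is only useful for building intuition.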

Usage

CLI

# Literal search by component name
docs context sites

# Semantic search (auto-detected for multi-word queries)
docs context "authentication and authorization"

# Force semantic search with single word
docs context "sites" --semantic

# Control result count
docs context "database architecture" --limit 10

Python API

from tools.docs.agents import DocumentationAgent
from pathlib import Path

agent = DocumentationAgent(docs_root=Path("docs"))
results = agent.semantic_search("How do we handle JWT tokens?", limit=5)

for result in results:
    print(f"{result['doc_id']}: {result['score']:.2f}")
    print(f"  {result['content'][:100]}...")
    print(f"  Type: {result['metadata'].get('type', 'unknown')}")
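
Results with the shape used above (dicts with `doc_id`, `score`, `content`, and `metadata`) are easy to post-process. The sketch below assumes that shape and nothing more; `filter_by_score`, `group_by_type`, the 0.5 threshold, and the sample data are all illustrative and not part of `DocumentationAgent`.

```python
from typing import Any


def filter_by_score(results: list[dict[str, Any]], min_score: float = 0.5) -> list[dict[str, Any]]:
    """Keep only results at or above a relevance threshold (threshold is an assumption)."""
    return [r for r in results if r["score"] >= min_score]


def group_by_type(results: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group document IDs by their metadata 'type', defaulting to 'unknown'."""
    grouped: dict[str, list[str]] = {}
    for r in results:
        doc_type = r.get("metadata", {}).get("type", "unknown")
        grouped.setdefault(doc_type, []).append(r["doc_id"])
    return grouped


# Made-up sample data in the same shape the loop above iterates over.
sample = [
    {"doc_id": "adr/001.md", "score": 0.92, "content": "...", "metadata": {"type": "adr"}},
    {"doc_id": "guides/jwt.md", "score": 0.61, "content": "...", "metadata": {}},
    {"doc_id": "notes/misc.md", "score": 0.18, "content": "...", "metadata": {"type": "note"}},
]
relevant = filter_by_score(sample, min_score=0.5)
print(group_by_type(relevant))
```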

REST API

# Basic semantic search
curl -X POST http://localhost:8000/api/v1/docs/search \
  -H "Content-Type: application/json" \
  -d '{"query": "multi-tenancy implementation", "limit": 5}'

# With hybrid search
curl -X POST http://localhost:8000/api/v1/docs/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "limit": 10, "use_hybrid": true}'
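
The same requests can be issued from Python with only the standard library. This is a minimal sketch that mirrors the curl calls above; `build_search_request` and `search` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request

SEARCH_URL = "http://localhost:8000/api/v1/docs/search"


def build_search_request(query: str, limit: int = 5, use_hybrid: bool = False) -> urllib.request.Request:
    """Build the POST request without sending it, so it can be inspected or tested."""
    payload: dict[str, object] = {"query": query, "limit": limit}
    if use_hybrid:
        payload["use_hybrid"] = True
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, limit: int = 5, use_hybrid: bool = False, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body (requires the API to be running)."""
    request = build_search_request(query, limit, use_hybrid)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `search("authentication", limit=10, use_hybrid=True)` would correspond to the second curl example.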

API Response Format

{
  "results": [
    {
      "doc_id": "architecture/adr/001-hybrid-modular-ddd.md",
      "title": "Hybrid Modular DDD",
      "snippet": "MBPanel uses a hybrid modular Domain-Driven Design...",
      "score": 0.92,
      "metadata": {
        "type": "adr",
        "title": "Hybrid Modular DDD"
      }
    }
  ],
  "total": 5,
  "query": "domain driven design",
  "search_type": "semantic"
}
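
A client might map this response onto a small typed structure. The sketch below assumes only the fields shown in the example above; the `SearchHit` dataclass and `parse_search_response` are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SearchHit:
    doc_id: str
    title: str
    snippet: str
    score: float
    metadata: dict = field(default_factory=dict)


def parse_search_response(raw: str) -> list[SearchHit]:
    """Turn the JSON body into SearchHit objects, tolerating a missing 'metadata' key."""
    body = json.loads(raw)
    return [
        SearchHit(
            doc_id=item["doc_id"],
            title=item["title"],
            snippet=item["snippet"],
            score=float(item["score"]),
            metadata=item.get("metadata", {}),
        )
        for item in body.get("results", [])
    ]


raw = json.dumps({
    "results": [{
        "doc_id": "architecture/adr/001-hybrid-modular-ddd.md",
        "title": "Hybrid Modular DDD",
        "snippet": "MBPanel uses a hybrid modular Domain-Driven Design...",
        "score": 0.92,
        "metadata": {"type": "adr", "title": "Hybrid Modular DDD"},
    }],
    "total": 5,
    "query": "domain driven design",
    "search_type": "semantic",
})
hits = parse_search_response(raw)
print(hits[0].doc_id, hits[0].score)
```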

Query Tips

Best Practices

  1. Use natural language: Ask questions like you would to a colleague
       • ✅ "How do we handle authentication for external APIs?"
       • ❌ "authentication api external"

  2. Be specific: Include relevant context
       • ✅ "PostgreSQL connection pooling for high traffic"
       • ❌ "database"

  3. Include the operation: Say what you want to know
       • ✅ "How to create sites with custom domains"
       • ❌ "sites custom domains"

  4. Try variations: If one query doesn't work, rephrase
       • "multi-tenancy implementation" → "how we handle multiple tenants"

Example Queries

Query                               Finds
"How do we handle authentication?"  JWT, session, auth middleware docs
"database architecture decisions"   ADRs about PostgreSQL, caching
"multi-tenancy implementation"      Architecture docs on team isolation
"site creation workflow"            Sites domain, PRDs, implementation docs
"backup scheduling strategy"        Backups domain, tasks, Celery docs
"error tracking and logging"        Observability docs, error tracker
"CI/CD pipeline for deployments"    Architecture ADRs, GitHub Actions

Troubleshooting

No Results Found

Problem: Search returns empty results

Solutions:

  1. Reindex the docs: make docs-semantic-index
  2. Check Qdrant is running: docker ps | grep qdrant
  3. Verify index has documents:

curl http://localhost:6333/collections/docs | jq '.result.points_count'
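
The same check can be scripted. This sketch reads the `result.points_count` field that the `jq` filter above extracts; `fetch_collection_info` and `points_count` are hypothetical helper names, and treating a missing field as zero is an assumption made for convenience.

```python
import json
import urllib.request


def points_count(collection_info: dict) -> int:
    """Extract result.points_count, treating a missing or null value as 0."""
    return int(collection_info.get("result", {}).get("points_count") or 0)


def fetch_collection_info(base_url: str = "http://localhost:6333",
                          collection: str = "docs",
                          timeout: float = 5.0) -> dict:
    """GET the collection description from Qdrant (requires Qdrant to be running)."""
    with urllib.request.urlopen(f"{base_url}/collections/{collection}", timeout=timeout) as response:
        return json.load(response)


# Offline example with a hand-written payload in the shape the jq filter expects.
example = {"result": {"points_count": 128}}
if points_count(example) == 0:
    print("Index is empty: run make docs-semantic-index")
else:
    print(f"Index holds {points_count(example)} points")
```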

Slow Search Performance

Problem: First search is slow, subsequent searches are fast

Cause: The model is lazy-loaded on first use (sentence-transformers downloads and loads the model)

Solution: No action is needed. The first search takes roughly 2-3 seconds while the model loads; subsequent searches complete in about 100 ms.
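
The lazy-loading behaviour described above can be illustrated with a cached loader. This is a generic sketch, not the project's code: `get_model` is a hypothetical function, and the 50 ms sleep stands in for the real model load.

```python
import functools
import time


@functools.lru_cache(maxsize=1)
def get_model() -> object:
    """Load the model once; later calls return the cached instance."""
    time.sleep(0.05)  # stand-in for the expensive download/load step
    return object()


def timed(fn) -> float:
    """Return how long a single call to fn takes, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


first = timed(get_model)   # pays the loading cost
second = timed(get_model)  # served from the cache
print(f"first call: {first * 1000:.1f} ms, second call: {second * 1000:.3f} ms")
```

If the first-query delay matters, one option under this pattern is to call the loader once at process startup so that the cost is paid before any user query arrives.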

Connection Errors

Problem: "Qdrant connection failed" or "Service unavailable"

Solutions:

  1. Start Qdrant: docker compose up qdrant -d
  2. Check Qdrant health: curl http://localhost:6333/
  3. Check container logs: docker compose logs qdrant
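
When Qdrant has just been started, a script may need to wait until it answers. The sketch below polls the same URL as the health check above; `qdrant_is_up` and `wait_until_ready` are hypothetical helpers, and the attempt count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def qdrant_is_up(url: str = "http://localhost:6333/", timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any connection problem."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(check: Callable[[], bool] = qdrant_is_up,
                     attempts: int = 5,
                     delay: float = 1.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between failures."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the check as a parameter keeps the retry loop testable without a running container.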

Empty Index

Problem: "Semantic search index is empty"

Solution: Run indexing:

# Rebuild from scratch
make docs-semantic-rebuild

# Or via CLI
python -m tools.docs.cli semantic-index --force

Technical Details

Model Configuration

Parameter             Value
Model                 sentence-transformers/all-MiniLM-L6-v2
Embedding Dimension   384
Similarity Metric     Cosine similarity
Max Query Length      500 characters
Max Results           50 (default: 5)
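
A client can check the query length and result limits listed above before sending a request. This sketch is an assumption about how such a check might look: the document does not say how the service reacts to out-of-range values, so raising `ValueError` here is purely illustrative, as is the name `validate_search_params`.

```python
MAX_QUERY_LENGTH = 500  # characters, from the table above
MAX_RESULTS = 50        # upper bound on `limit`
DEFAULT_LIMIT = 5


def validate_search_params(query: str, limit: int = DEFAULT_LIMIT) -> tuple[str, int]:
    """Return the trimmed query and limit, or raise ValueError if they are out of range."""
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("query must not be empty")
    if len(cleaned) > MAX_QUERY_LENGTH:
        raise ValueError(f"query is {len(cleaned)} characters; maximum is {MAX_QUERY_LENGTH}")
    if not 1 <= limit <= MAX_RESULTS:
        raise ValueError(f"limit must be between 1 and {MAX_RESULTS}")
    return cleaned, limit


print(validate_search_params("  database architecture  ", limit=10))
```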

Vector Database

Parameter         Value
Database          Qdrant v1.7.4
Collection Name   docs
Port              6333 (internal), 56333 (external)
Persistence       Docker volume: qdrant_data

Performance Benchmarks

Operation              Target
Embedding 100 texts    < 100 ms
Search (1000 docs)     < 100 ms
Search (10000 docs)    < 500 ms
Indexing (1000 docs)   < 30 s
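
One way to compare a local run against these targets is a small timing harness. This is a sketch under assumptions: the keys in `TARGETS_MS` are invented labels for the rows above, and taking the best of several runs is one measurement choice among many (the document does not say how the targets are measured).

```python
import time
from typing import Callable

# Targets from the table above, in milliseconds; the key names are hypothetical.
TARGETS_MS = {
    "embed_100_texts": 100,
    "search_1000_docs": 100,
    "search_10000_docs": 500,
    "index_1000_docs": 30_000,
}


def measure_ms(fn: Callable[[], object], repeats: int = 5) -> float:
    """Run fn `repeats` times and return the fastest run in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1000)
    return best


def meets_target(name: str, fn: Callable[[], object], repeats: int = 5) -> bool:
    """True if the measured time is below the target for the named operation."""
    return measure_ms(fn, repeats) < TARGETS_MS[name]


# Placeholder workload; replace the lambda with a real search or indexing call.
print(meets_target("search_1000_docs", lambda: sum(range(1000))))
```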

Architecture

┌─────────────────┐      ┌──────────────┐      ┌─────────┐
│  CLI / Agent    │─────▶│ Semantic     │─────▶│ Qdrant  │
│  (user input)   │      │ SearchEngine │      │ Vector  │
└─────────────────┘      └──────────────┘      │ DB      │
                                               └───────┬─┘
                              ┌──────────────┐         │
                              │ Embedding    │◀────────┘
                              │ Generator    │
                              └──────────────┘
  1. Query Input: User provides query via CLI, Python API, or REST API
  2. Embedding: Query is converted to 384-dimensional vector
  3. Search: Qdrant finds similar vectors using cosine similarity
  4. Results: Ranked results returned with relevance scores
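
The four steps above can be sketched as three small components wired together. Everything here is a stand-in: `KeywordEmbedder` is a toy that counts a fixed vocabulary instead of running all-MiniLM-L6-v2, `InMemoryStore` replaces Qdrant with a linear scan, and the class and document names are hypothetical rather than taken from the project.

```python
import math


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class KeywordEmbedder:
    """Toy embedding generator: one dimension per vocabulary word."""
    VOCAB = ("auth", "token", "database", "backup")

    def embed(self, text: str) -> list[float]:
        lowered = text.lower()
        return [float(lowered.count(word)) for word in self.VOCAB]


class InMemoryStore:
    """Stand-in for the vector database: stores vectors and scans them linearly."""

    def __init__(self) -> None:
        self._points: dict[str, list[float]] = {}

    def upsert(self, doc_id: str, vector: list[float]) -> None:
        self._points[doc_id] = vector

    def search(self, vector: list[float], limit: int) -> list[tuple[str, float]]:
        scored = [(doc_id, _cosine(vector, v)) for doc_id, v in self._points.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]


class SearchEngine:
    """Embeds the query, then asks the store for the most similar vectors."""

    def __init__(self, embedder: KeywordEmbedder, store: InMemoryStore) -> None:
        self.embedder = embedder
        self.store = store

    def index(self, doc_id: str, text: str) -> None:
        self.store.upsert(doc_id, self.embedder.embed(text))

    def search(self, query: str, limit: int = 5) -> list[tuple[str, float]]:
        return self.store.search(self.embedder.embed(query), limit)


engine = SearchEngine(KeywordEmbedder(), InMemoryStore())
engine.index("auth.md", "Auth middleware validates each token")
engine.index("backups.md", "Backup jobs snapshot the database")
results = engine.search("how are auth tokens checked", limit=2)
print(results[0][0])
```

Keeping the embedder and the store behind separate objects mirrors the diagram and makes either one replaceable in tests.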