Semantic Search

Overview

Semantic search uses AI-powered vector embeddings to understand the meaning behind your queries, not just match keywords. This helps you find relevant documentation even when you don't know the exact terms to search for.

Key Benefits

Natural language queries: Ask questions like "How do we handle multi-tenancy?" instead of guessing keywords
Topic discovery: Find related ADRs and architecture docs by concept
Cross-domain relationships: Discover connections across different parts of the system

How It Works

Embedding Generation: Documentation is converted to vector embeddings using sentence-transformers/all-MiniLM-L6-v2
Vector Storage: Embeddings are stored in the Qdrant vector database
Semantic Search: Queries are converted to embeddings and compared using cosine similarity
Result Ranking: Results are ranked by relevance score (0-1, higher is better)
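
The comparison and ranking steps can be illustrated with a minimal sketch. The three-dimensional vectors and document IDs below are made-up stand-ins for the 384-dimensional embeddings the model produces, and Qdrant performs this comparison internally, so this only illustrates the idea and is not the project's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs, limit=5):
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors (hypothetical); real embeddings have 384 dimensions.
docs = {
    "auth.md": [0.9, 0.1, 0.0],
    "backups.md": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]
ranked = rank(query, docs)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.2f}")
```

The document whose vector points in nearly the same direction as the query vector receives the higher score and is listed first.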

Usage

CLI

# Literal search by component name
docs context sites

# Semantic search (auto-detected for multi-word queries)
docs context "authentication and authorization"

# Force semantic search for a single-word query
docs context "sites" --semantic

# Control result count
docs context "database architecture" --limit 10
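
The comments above imply a simple rule: multi-word queries are routed to semantic search, and single words are treated as literal lookups unless --semantic is passed. A minimal sketch of such a rule follows. The function name `choose_search_mode` is hypothetical, and the CLI's actual detection logic is not shown in this document and may differ.

```python
def choose_search_mode(query: str, force_semantic: bool = False) -> str:
    """Return "semantic" or "literal" for a query (assumed heuristic)."""
    if force_semantic:
        # Mirrors the --semantic flag: always use semantic search.
        return "semantic"
    # Assumption: "multi-word" means more than one whitespace-separated token.
    return "semantic" if len(query.split()) > 1 else "literal"


print(choose_search_mode("sites"))
print(choose_search_mode("authentication and authorization"))
print(choose_search_mode("sites", force_semantic=True))
```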

Python API

from pathlib import Path

from tools.docs.agents import DocumentationAgent

# Point the agent at the documentation root, then run a semantic query.
agent = DocumentationAgent(docs_root=Path("docs"))
results = agent.semantic_search("How do we handle JWT tokens?", limit=5)

# Each result is a dict with doc_id, score, content, and metadata keys.
for result in results:
    print(f"{result['doc_id']}: {result['score']:.2f}")
    print(f"  {result['content'][:100]}...")
    print(f"  Type: {result['metadata'].get('type', 'unknown')}")
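
Because each result is a plain dictionary with the keys used in the loop above, results can be post-processed with ordinary Python. The sketch below keeps only results above a score threshold and, optionally, of one document type. The helper name `filter_results`, the 0.5 default threshold, and the sample data are illustrative assumptions, not part of the agent's API.

```python
def filter_results(results, min_score=0.5, doc_type=None):
    """Keep results scoring at least min_score, optionally of one type."""
    kept = []
    for result in results:
        if result["score"] < min_score:
            continue
        if doc_type is not None and result["metadata"].get("type") != doc_type:
            continue
        kept.append(result)
    return kept


# Made-up results in the shape returned by semantic_search().
sample = [
    {"doc_id": "a.md", "score": 0.92, "content": "first", "metadata": {"type": "adr"}},
    {"doc_id": "b.md", "score": 0.31, "content": "second", "metadata": {"type": "prd"}},
    {"doc_id": "c.md", "score": 0.77, "content": "third", "metadata": {}},
]
strong_adrs = filter_results(sample, min_score=0.5, doc_type="adr")
print([r["doc_id"] for r in strong_adrs])
```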

REST API

# Basic semantic search
curl -X POST http://localhost:8000/api/v1/docs/search \
  -H "Content-Type: application/json" \
  -d '{"query": "multi-tenancy implementation", "limit": 5}'

# With hybrid search
curl -X POST http://localhost:8000/api/v1/docs/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "limit": 10, "use_hybrid": true}'
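
The same requests can be sent from Python with only the standard library. This is a sketch that mirrors the curl calls above: the endpoint and JSON fields come from those examples, while the function names and the timeout value are assumptions.

```python
import json
import urllib.request

# Endpoint taken from the curl examples above.
SEARCH_URL = "http://localhost:8000/api/v1/docs/search"


def build_search_request(query, limit=5, use_hybrid=False):
    """Build the POST request without sending it."""
    payload = {"query": query, "limit": limit}
    if use_hybrid:
        payload["use_hybrid"] = True
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, limit=5, use_hybrid=False, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_search_request(query, limit, use_hybrid)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `search("authentication", limit=10, use_hybrid=True)` corresponds to the second curl example and requires the API server to be running.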

API Response Format

{
  "results": [
    {
      "doc_id": "architecture/adr/001-hybrid-modular-ddd.md",
      "title": "Hybrid Modular DDD",
      "snippet": "MBPanel uses a hybrid modular Domain-Driven Design...",
      "score": 0.92,
      "metadata": {
        "type": "adr",
        "title": "Hybrid Modular DDD"
      }
    }
  ],
  "total": 5,
  "query": "domain driven design",
  "search_type": "semantic"
}
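
A client may want typed objects rather than raw dictionaries. The sketch below parses a response of the shape shown above. The `SearchHit` class and `parse_search_response` function are hypothetical names, and treating `title`, `snippet`, and `metadata` as optional is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SearchHit:
    doc_id: str
    score: float
    title: str = ""
    snippet: str = ""
    metadata: dict = field(default_factory=dict)


def parse_search_response(payload: dict) -> list:
    """Convert the "results" array of a search response into SearchHit objects."""
    return [
        SearchHit(
            doc_id=item["doc_id"],
            score=float(item["score"]),
            title=item.get("title", ""),
            snippet=item.get("snippet", ""),
            metadata=item.get("metadata", {}),
        )
        for item in payload.get("results", [])
    ]


# Trimmed copy of the example response above.
example = {
    "results": [
        {
            "doc_id": "architecture/adr/001-hybrid-modular-ddd.md",
            "title": "Hybrid Modular DDD",
            "score": 0.92,
            "metadata": {"type": "adr"},
        }
    ],
    "search_type": "semantic",
}
hits = parse_search_response(example)
print(hits[0].doc_id, hits[0].score)
```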

Query Tips

Best Practices

Use natural language: Ask questions as you would ask a colleague
✅ "How do we handle authentication for external APIs?"
❌ "authentication api external"
Be specific: Include relevant context
✅ "PostgreSQL connection pooling for high traffic"
❌ "database"
Include the operation: Say what you want to know
✅ "How to create sites with custom domains"
❌ "sites custom domains"
Try variations: If one query doesn't work, rephrase
"multi-tenancy implementation" → "how we handle multiple tenants"

Example Queries

Query	Finds
"How do we handle authentication?"	JWT, session, auth middleware docs
"database architecture decisions"	ADRs about PostgreSQL, caching
"multi-tenancy implementation"	Architecture docs on team isolation
"site creation workflow"	Sites domain, PRDs, implementation docs
"backup scheduling strategy"	Backups domain, tasks, Celery docs
"error tracking and logging"	Observability docs, error tracker
"CI/CD pipeline for deployments"	Architecture ADRs, GitHub Actions

Troubleshooting

No Results Found

Problem: Search returns empty results

Solutions:
1. Reindex the docs: make docs-semantic-index
2. Check that Qdrant is running: docker ps | grep qdrant
3. Verify that the index contains documents:

curl http://localhost:6333/collections/docs | jq '.result.points_count'
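
The same check can be scripted in Python. This sketch reads the `result.points_count` field that the jq filter above extracts. The base URL and collection name come from the curl command, while the function names and the choice to treat a missing or null count as zero are assumptions.

```python
import json
import urllib.request


def points_count(collection_info: dict) -> int:
    """Extract the number of indexed points from a collection info response."""
    count = collection_info.get("result", {}).get("points_count")
    return int(count or 0)  # assumption: missing or null means empty


def fetch_collection_info(base_url="http://localhost:6333", collection="docs", timeout=5):
    """GET the collection info from Qdrant and decode the JSON body."""
    url = f"{base_url}/collections/{collection}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def index_is_empty(collection_info: dict) -> bool:
    return points_count(collection_info) == 0
```

With Qdrant running, `index_is_empty(fetch_collection_info())` returning True suggests that the docs need to be reindexed.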

Slow Search Performance

Problem: First search is slow, subsequent searches are fast

Cause: The embedding model is lazy-loaded on first use (sentence-transformers downloads and loads the model)

Solution: This is expected behavior. The first search takes about 2-3 seconds while the model loads; subsequent searches take about 100 ms.
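
If the first-query delay matters, for example in a long-running service, one option is to trigger the load at startup rather than on the first user query. The sketch below shows the lazy-load-then-cache pattern with a stand-in loader. `load_model`, `get_model`, and `warm_up` are hypothetical names, and the tool's actual loading code is not shown in this document.

```python
import functools


def load_model():
    """Stand-in for the expensive model load (hypothetical)."""
    load_model.calls += 1
    return object()


load_model.calls = 0  # counts how often the expensive load actually runs


@functools.lru_cache(maxsize=1)
def get_model():
    """Load the model on first call and reuse the same instance afterwards."""
    return load_model()


def warm_up():
    """Call once at startup so the first user query does not pay the load cost."""
    get_model()


warm_up()
first = get_model()
second = get_model()
print(first is second, load_model.calls)
```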

Connection Errors

Problem: "Qdrant connection failed" or "Service unavailable"

Solutions:
1. Start Qdrant: docker compose up qdrant -d
2. Check Qdrant health: curl http://localhost:6333/
3. Check container logs: docker compose logs qdrant

Empty Index

Problem: "Semantic search index is empty"

Solution: Run indexing:

# Rebuild from scratch
make docs-semantic-rebuild

# Or via CLI
python -m tools.docs.cli semantic-index --force

Technical Details

Model Configuration

Parameter	Value
Model	sentence-transformers/all-MiniLM-L6-v2
Embedding Dimension	384
Similarity Metric	Cosine similarity
Max Query Length	500 characters
Max Results	50 (default: 5)
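
The query length and result limits in the table can be enforced on the client side before a request is sent. This is a minimal sketch: the constants come from the table, but the function name is hypothetical, and whether the service rejects, truncates, or clamps out-of-range input is not stated in this document, so raising an error here is an assumption.

```python
MAX_QUERY_LENGTH = 500  # characters, from the table above
MAX_RESULTS = 50
DEFAULT_LIMIT = 5


def validate_search_params(query: str, limit: int = DEFAULT_LIMIT):
    """Return (query, limit) if both are within the documented limits."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    if len(query) > MAX_QUERY_LENGTH:
        raise ValueError(f"query exceeds {MAX_QUERY_LENGTH} characters")
    if not 1 <= limit <= MAX_RESULTS:
        raise ValueError(f"limit must be between 1 and {MAX_RESULTS}")
    return query, limit


print(validate_search_params("database architecture", limit=10))
```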

Vector Database

Parameter	Value
Database	Qdrant v1.7.4
Collection Name	docs
Port	6333 (internal), 56333 (external)
Persistence	Docker volume: qdrant_data

Performance Benchmarks

Operation	Target
Embedding 100 texts	<100ms
Search (1000 docs)	<100ms
Search (10000 docs)	<500ms
Indexing (1000 docs)	<30s
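
The figures above are targets, not measured results. To compare your own measurements against them, a small timing helper is enough. In this sketch the dictionary keys and function names are hypothetical labels for the table rows, and the workload passed to `measure_ms` is whatever callable you want to time.

```python
import time

# Targets from the table above, in milliseconds (keys are hypothetical labels).
TARGETS_MS = {
    "embed_100_texts": 100,
    "search_1000_docs": 100,
    "search_10000_docs": 500,
    "index_1000_docs": 30_000,
}


def measure_ms(func, *args, repeats=5, **kwargs):
    """Return the best wall-clock time in milliseconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, (time.perf_counter() - start) * 1000)
    return best


def meets_target(name, elapsed_ms):
    """True if the measured time is strictly below the documented target."""
    return elapsed_ms < TARGETS_MS[name]


elapsed = measure_ms(sorted, list(range(1000)))
print(f"{elapsed:.3f} ms")
```

Taking the best of several runs reduces noise from one-off effects such as the model load on the first search.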

Architecture

┌─────────────────┐      ┌──────────────┐      ┌─────────┐
│  CLI / Agent    │─────▶│ Semantic     │─────▶│ Qdrant  │
│  (user input)   │      │ SearchEngine │      │ Vector  │
└─────────────────┘      └──────────────┘      │ DB      │
                                                        │
                              ┌──────────────┐         │
                              │ Embedding    │◀────────┘
                              │ Generator    │
                              └──────────────┘

Query Input: User provides query via CLI, Python API, or REST API
Embedding: Query is converted to a 384-dimensional vector
Search: Qdrant finds similar vectors using cosine similarity
Results: Ranked results returned with relevance scores
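
The four steps above can be sketched as a small class that wires an embedding function to a vector store. Everything here is illustrative: `SearchPipeline`, `VectorStore`, and `FakeStore` are hypothetical names, the fake store returns canned hits instead of querying Qdrant, and the project's real search engine is not shown in this document.

```python
from typing import Callable, List, Protocol, Sequence, Tuple


class VectorStore(Protocol):
    def search(self, vector: Sequence[float], limit: int) -> List[Tuple[str, float]]:
        ...


class SearchPipeline:
    def __init__(self, embed: Callable[[str], Sequence[float]], store: VectorStore):
        self._embed = embed
        self._store = store

    def search(self, query: str, limit: int = 5) -> List[dict]:
        vector = self._embed(query)                # step 2: query -> vector
        hits = self._store.search(vector, limit)   # step 3: nearest vectors
        ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)  # step 4
        return [{"doc_id": doc_id, "score": score} for doc_id, score in ranked]


class FakeStore:
    """In-memory stand-in for Qdrant that returns canned (doc_id, score) hits."""

    def __init__(self):
        self.seen_vector = None

    def search(self, vector, limit):
        self.seen_vector = list(vector)
        return [("b.md", 0.4), ("a.md", 0.9)][:limit]


store = FakeStore()
# Toy embedder: a one-dimensional vector holding the query length.
pipeline = SearchPipeline(embed=lambda text: [float(len(text))], store=store)
results = pipeline.search("jwt")
print(results)
```

Injecting the embedder and the store keeps each stage replaceable, which also makes the flow testable without a running Qdrant instance.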