Semantic Search¶
Overview¶
Semantic search uses AI-powered vector embeddings to understand the meaning behind your queries, not just match keywords. This helps you find relevant documentation even when you don't know the exact terms to search for.
Key Benefits¶
- Natural language queries: Ask questions like "How do we handle multi-tenancy?" instead of guessing keywords
- Topic discovery: Find related ADRs and architecture docs by concept
- Cross-domain relationships: Discover connections across different parts of the system
How It Works¶
- Embedding Generation: Documentation is converted to vector embeddings using sentence-transformers/all-MiniLM-L6-v2
- Vector Storage: Embeddings are stored in the Qdrant vector database
- Semantic Search: Queries are converted to embeddings and compared using cosine similarity
- Result Ranking: Results are ranked by relevance score (0-1, higher is better)
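The comparison and ranking steps above can be illustrated with a minimal sketch. It uses toy 3-dimensional vectors in place of real 384-dimensional embeddings, and the names `cosine_similarity`, `rank`, `TOY_DOCS`, and `TOY_QUERY` are hypothetical, chosen for illustration only. In the real system Qdrant performs this comparison, so this is not the project's implementation.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank(query_vec: list, doc_vecs: dict, limit: int = 5) -> list:
    """Return (doc_id, score) pairs sorted by descending similarity."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy vectors standing in for real embeddings (illustrative values only).
TOY_DOCS = {
    "auth.md": [0.9, 0.1, 0.0],
    "backups.md": [0.0, 0.2, 0.9],
    "tenancy.md": [0.5, 0.5, 0.1],
}
TOY_QUERY = [1.0, 0.2, 0.0]

if __name__ == "__main__":
    for doc_id, score in rank(TOY_QUERY, TOY_DOCS):
        print(f"{doc_id}: {score:.2f}")
```

Documents whose vectors point in nearly the same direction as the query vector score close to 1 and appear first.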
Usage¶
CLI¶
# Literal search by component name
docs context sites
# Semantic search (auto-detected for multi-word queries)
docs context "authentication and authorization"
# Force semantic search for a single-word query
docs context "sites" --semantic
# Control result count
docs context "database architecture" --limit 10
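The auto-detection described in the comments above could work roughly as follows. This is a sketch under the assumption that the only rule is "more than one word means semantic". The function name `choose_search_mode` is hypothetical, and the CLI's actual heuristic may differ.

```python
def choose_search_mode(query: str, force_semantic: bool = False) -> str:
    """Pick a search mode for a query.

    Assumed rule (not confirmed by the CLI source): multi-word queries
    are semantic, single-word queries are literal unless --semantic is set.
    """
    if force_semantic:
        return "semantic"
    return "semantic" if len(query.split()) > 1 else "literal"


if __name__ == "__main__":
    print(choose_search_mode("sites"))
    print(choose_search_mode("authentication and authorization"))
    print(choose_search_mode("sites", force_semantic=True))
```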
Python API¶
from tools.docs.agents import DocumentationAgent
from pathlib import Path
agent = DocumentationAgent(docs_root=Path("docs"))
results = agent.semantic_search("How do we handle JWT tokens?", limit=5)
for result in results:
    print(f"{result['doc_id']}: {result['score']:.2f}")
    print(f" {result['content'][:100]}...")
    print(f" Type: {result['metadata'].get('type', 'unknown')}")
REST API¶
# Basic semantic search
curl -X POST http://localhost:8000/api/v1/docs/search \
-H "Content-Type: application/json" \
-d '{"query": "multi-tenancy implementation", "limit": 5}'
# With hybrid search
curl -X POST http://localhost:8000/api/v1/docs/search \
-H "Content-Type: application/json" \
-d '{"query": "authentication", "limit": 10, "use_hybrid": true}'
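The same requests can be issued from Python with only the standard library. The sketch below assumes the endpoint and JSON fields shown in the curl examples. The helper names `build_search_request` and `search` are hypothetical and not part of the project's API.

```python
import json
import urllib.request

# Endpoint taken from the curl examples above.
SEARCH_URL = "http://localhost:8000/api/v1/docs/search"


def build_search_request(query: str, limit: int = 5,
                         use_hybrid: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST request for a search."""
    payload = {"query": query, "limit": limit}
    if use_hybrid:
        payload["use_hybrid"] = True
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, limit: int = 5, use_hybrid: bool = False,
           timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body."""
    request = build_search_request(query, limit, use_hybrid)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload logic testable without a running server.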
API Response Format¶
{
  "results": [
    {
      "doc_id": "architecture/adr/001-hybrid-modular-ddd.md",
      "title": "Hybrid Modular DDD",
      "snippet": "MBPanel uses a hybrid modular Domain-Driven Design...",
      "score": 0.92,
      "metadata": {
        "type": "adr",
        "title": "Hybrid Modular DDD"
      }
    }
  ],
  "total": 5,
  "query": "domain driven design",
  "search_type": "semantic"
}
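A client might turn this response into typed objects before using it. The sketch below assumes only the fields shown in the example response. `SearchHit` and `parse_search_response` are hypothetical names, and treating `title`, `snippet`, and `metadata` as optional is an assumption of this sketch.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SearchHit:
    doc_id: str
    title: str
    snippet: str
    score: float
    metadata: dict = field(default_factory=dict)


def parse_search_response(raw: str) -> list:
    """Decode a response body into SearchHit objects, best score first."""
    body = json.loads(raw)
    hits = [
        SearchHit(
            doc_id=item["doc_id"],
            title=item.get("title", ""),
            snippet=item.get("snippet", ""),
            score=float(item["score"]),
            metadata=item.get("metadata", {}),
        )
        for item in body.get("results", [])
    ]
    # Sort defensively in case the caller merged several responses.
    return sorted(hits, key=lambda hit: hit.score, reverse=True)
```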
Query Tips¶
Best Practices¶
- Use natural language: Ask questions like you would to a colleague
  - ✅ "How do we handle authentication for external APIs?"
  - ❌ "authentication api external"
- Be specific: Include relevant context
  - ✅ "PostgreSQL connection pooling for high traffic"
  - ❌ "database"
- Include the operation: Say what you want to know
  - ✅ "How to create sites with custom domains"
  - ❌ "sites custom domains"
- Try variations: If one query doesn't work, rephrase
  - "multi-tenancy implementation" → "how we handle multiple tenants"
Example Queries¶
| Query | Finds |
|---|---|
| "How do we handle authentication?" | JWT, session, auth middleware docs |
| "database architecture decisions" | ADRs about PostgreSQL, caching |
| "multi-tenancy implementation" | Architecture docs on team isolation |
| "site creation workflow" | Sites domain, PRDs, implementation docs |
| "backup scheduling strategy" | Backups domain, tasks, Celery docs |
| "error tracking and logging" | Observability docs, error tracker |
| "CI/CD pipeline for deployments" | Architecture ADRs, GitHub Actions |
Troubleshooting¶
No Results Found¶
Problem: Search returns empty results
Solutions:
1. Reindex the docs: make docs-semantic-index
2. Check Qdrant is running: docker ps | grep qdrant
3. Verify index has documents:
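One possible way to perform the check in step 3 is to ask Qdrant for the collection's point count through its REST collection-info endpoint. The sketch below assumes the collection is named `docs` and that Qdrant is reachable at `http://localhost:6333`, as in the health check elsewhere on this page. The helper names are hypothetical, and this is not necessarily the command the project uses for this step.

```python
import json
import urllib.request


def collection_url(base_url: str = "http://localhost:6333",
                   collection: str = "docs") -> str:
    """URL of Qdrant's collection-info endpoint."""
    return f"{base_url.rstrip('/')}/collections/{collection}"


def points_count(collection_info: dict) -> int:
    """Extract the number of stored points from a collection-info response."""
    return int(collection_info.get("result", {}).get("points_count") or 0)


def index_has_documents(base_url: str = "http://localhost:6333",
                        collection: str = "docs",
                        timeout: float = 5.0) -> bool:
    """Return True if the collection holds at least one point."""
    url = collection_url(base_url, collection)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return points_count(json.load(response)) > 0
```

A count of zero suggests the index needs to be rebuilt. A connection error points to the Qdrant container instead.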
Slow Search Performance¶
Problem: The first search is slow; subsequent searches are fast
Cause: The model is lazy-loaded on first use (sentence-transformers downloads and loads the model)
Solution: This is expected behavior. The first search takes roughly 2-3 seconds; subsequent searches take about 100 ms.
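The lazy-loading pattern behind this behavior can be sketched as follows. The loader here is a stand-in that sleeps briefly and returns a dummy function, since loading the real model requires the sentence-transformers package. All names (`_load_model`, `get_model`, `embed`, `warm_up`) are hypothetical and do not reflect the project's code.

```python
import time
from functools import lru_cache


def _load_model():
    """Stand-in for an expensive model load (sleeps instead of loading)."""
    time.sleep(0.05)
    # Dummy "embedding": a one-element vector holding the text length.
    return lambda text: [float(len(text))]


@lru_cache(maxsize=1)
def get_model():
    """Load the model on first call and reuse it afterwards."""
    return _load_model()


def embed(text: str) -> list:
    return get_model()(text)


def warm_up() -> None:
    """Pay the load cost up front, for example at service startup."""
    get_model()
```

Calling something like `warm_up()` when a long-running service starts would move the one-time cost away from the first user query. Whether the project exposes such a hook is not stated here.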
Connection Errors¶
Problem: "Qdrant connection failed" or "Service unavailable"
Solutions:
1. Start Qdrant: docker compose up qdrant -d
2. Check Qdrant health: curl http://localhost:6333/
3. Check container logs: docker compose logs qdrant
Empty Index¶
Problem: "Semantic search index is empty"
Solution: Run indexing:
# Rebuild from scratch
make docs-semantic-rebuild
# Or via CLI
python -m tools.docs.cli semantic-index --force
Technical Details¶
Model Configuration¶
| Parameter | Value |
|---|---|
| Model | sentence-transformers/all-MiniLM-L6-v2 |
| Embedding Dimension | 384 |
| Similarity Metric | Cosine similarity |
| Max Query Length | 500 characters |
| Max Results | 50 (default: 5) |
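The limits in this table could be enforced on the client side before a query is sent. The sketch below assumes that over-long queries are rejected and that oversized limits are clamped. The table does not say which behavior the service uses, so both choices are assumptions, and `normalize_search_params` is a hypothetical name.

```python
from typing import Optional, Tuple

# Values taken from the Model Configuration table.
MAX_QUERY_LENGTH = 500
DEFAULT_LIMIT = 5
MAX_LIMIT = 50


def normalize_search_params(query: str,
                            limit: Optional[int] = None) -> Tuple[str, int]:
    """Validate the query and return it with a limit in the range 1..MAX_LIMIT."""
    query = query.strip()
    if not query:
        raise ValueError("query must not be empty")
    if len(query) > MAX_QUERY_LENGTH:
        # Assumption: reject rather than silently truncate.
        raise ValueError(f"query exceeds {MAX_QUERY_LENGTH} characters")
    if limit is None:
        limit = DEFAULT_LIMIT
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # Assumption: clamp rather than reject oversized limits.
    return query, min(limit, MAX_LIMIT)
```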
Vector Database¶
| Parameter | Value |
|---|---|
| Database | Qdrant v1.7.4 |
| Collection Name | docs |
| Port | 6333 (internal), 56333 (external) |
| Persistence | Docker volume: qdrant_data |
Performance Benchmarks¶
| Operation | Target |
|---|---|
| Embedding 100 texts | <100ms |
| Search (1000 docs) | <100ms |
| Search (10000 docs) | <500ms |
| Indexing (1000 docs) | <30s |
Architecture¶
┌─────────────────┐      ┌──────────────┐      ┌─────────┐
│   CLI / Agent   │─────▶│   Semantic   │─────▶│ Qdrant  │
│  (user input)   │      │ SearchEngine │      │ Vector  │
└─────────────────┘      └──────────────┘      │   DB    │
                                                  │
                         ┌──────────────┐         │
                         │  Embedding   │◀────────┘
                         │  Generator   │
                         └──────────────┘
- Query Input: User provides query via CLI, Python API, or REST API
- Embedding: Query is converted to 384-dimensional vector
- Search: Qdrant finds similar vectors using cosine similarity
- Results: Ranked results returned with relevance scores