Hybrid Search vs Vector Search - Which is Better for AEC Documents?
When searching construction documents, accuracy matters. Missing a contract clause or specification requirement can cost thousands (or millions) of rupees. So which search technology should you use: vector search or hybrid search?
TL;DR: In our tests, hybrid search (vector + keyword) reached 91% precision and 89% recall on construction documents, ahead of both vector-only and keyword-only search. Here's why.
The Contenders
Vector Search (Semantic)
- Converts text to numerical vectors (embeddings)
- Finds semantically similar content
- Understands meaning, not just words
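To make "semantically similar" concrete, here is a minimal sketch of cosine similarity, the usual way to compare embeddings. The three-dimensional vectors and the "Clause 5.1 - Site Access" chunk are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": invented numbers, not output from a real model.
query = [0.9, 0.1, 0.3]
chunks = {
    "Clause 12.4 - Retention Money": [0.8, 0.2, 0.4],
    "Clause 5.1 - Site Access": [0.1, 0.9, 0.2],  # hypothetical chunk
}

# Rank chunks from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda name: cosine_similarity(query, chunks[name]),
    reverse=True,
)
print(ranked)
```

The chunk whose vector points in nearly the same direction as the query ranks first, even if it shares no exact words with the query.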
Keyword Search (Exact Match)
- Traditional full-text search
- Finds exact terms
- Fast, precise
Hybrid Search (Both)
- Combines vector + keyword
- Gets benefits of both approaches
- Slightly more complex, much more accurate
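One common way to combine the two result lists is Reciprocal Rank Fusion (RRF), which this article's hybrid mode uses with a 0.6 weight for vector results and 0.4 for keyword results. The sketch below is an assumption about how such a merge might look, not the product's actual code: the function name, the chunk IDs, and the constant `k=60` (a widely used default) are illustrative.

```python
def weighted_rrf(vector_ranked, keyword_ranked,
                 vector_weight=0.6, keyword_weight=0.4, k=60):
    """Fuse two ranked lists of chunk IDs with weighted Reciprocal Rank Fusion.

    Each list contributes weight / (k + rank) per chunk; a chunk found by
    both searches accumulates both contributions.
    """
    scores = {}
    for weight, ranking in ((vector_weight, vector_ranked),
                            (keyword_weight, keyword_ranked)):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)

# Illustrative rankings (hypothetical chunk IDs, best match first).
vector_ranked = ["12.4", "payment-schedule", "12.6", "8.2-security-deposit"]
keyword_ranked = ["12.4", "annexure-b"]

fused = weighted_rrf(vector_ranked, keyword_ranked)
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize cosine similarities against full-text ranking scores, which live on different scales.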
Real-World Test: Construction Contract
Let's search a 250-page EPC contract with all three methods.
Query: "What is the retention money clause?"
Vector Search Results:
✓ Clause 12.4 - Retention Money (Semantic match: 0.89)
✓ Payment Schedule - Retention section (0.87)
X Section 8.2 - Security deposit (0.75) ← Not what we wanted
X Warranty clause (0.68) ← Also not relevant
Accuracy: 50% (2/4 relevant)
Keyword Search Results:
✓ Clause 12.4 - "retention money" (Exact match)
✓ Annexure B - "retention" in payment table
X Missed: Clause 12.6 - talks about retention but uses "security withholding"
X Missed: GCC reference - uses "holdback amount"
Accuracy: 50% (2/4 relevant, 2 missed due to terminology variation)
Hybrid Search Results:
✓ Clause 12.4 - Retention Money (Keyword: 1.0, Vector: 0.89)
✓ Clause 12.6 - Security withholding (Vector: 0.82)
✓ Payment Schedule - Retention section (Keyword: 0.95, Vector: 0.87)
✓ GCC Reference - Holdback amount (Vector: 0.79)
Accuracy: 100% (4/4 relevant)
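The "accuracy" figures in this test mix two different measures: how many returned results were relevant (precision) and how many relevant passages were found (recall). The sketch below separates them for the retention example. It assumes, for illustration, that the payment schedule's retention section and Annexure B's payment table are the same passage, and that exactly four passages are relevant.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned vs. relevant items."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Assumed ground truth for the retention query (see the caveat above).
relevant = {"Clause 12.4", "Clause 12.6", "Payment Schedule", "GCC Reference"}

results = {
    "vector": ["Clause 12.4", "Payment Schedule", "Section 8.2", "Warranty clause"],
    "keyword": ["Clause 12.4", "Payment Schedule"],
    "hybrid": ["Clause 12.4", "Clause 12.6", "Payment Schedule", "GCC Reference"],
}

for mode, returned in results.items():
    precision, recall = precision_recall(returned, relevant)
    print(f"{mode}: precision={precision:.2f} recall={recall:.2f}")
```

Under these assumptions, vector search scores 0.50 on both measures, keyword search has perfect precision but only 0.50 recall, and hybrid reaches 1.00 on both.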
Why Hybrid Wins for AEC
1. Construction Has Multiple Terms for Same Concept
| Concept | Variations |
|---------|------------|
| Scope of Work | SOW, Detailed Scope, Work Breakdown, Project Scope, Services |
| Bill of Quantities | BOQ, BoQ, Bill, Quantities Schedule, Measurement Sheet |
| Liquidated Damages | LD, Delay Damages, Penalty, Damages for Delay |
| Completion | Substantial Completion, Final Completion, Handover, Commissioning |
Vector search catches variations via semantics
Keyword search catches exact acronyms
Hybrid catches everything
2. Technical Codes Need Exact Matching
Indian Standards: IS 456:2000, IS 1893:2016, NBC 2016
Vector search: Might confuse IS 456 with IS 800 (both are design codes)
Keyword search: Finds exact "IS 456:2000" ✅
Hybrid: Uses keyword for codes, vector for context around them ✅✅
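A rough sketch of the "keyword for codes, vector for context" idea: pull standard codes out of the query so they can be matched exactly, and leave the rest for semantic search. The regular expression and the `split_query` helper are hypothetical and cover only the `IS <number>[:<year>]` and `NBC <year>` shapes mentioned here.

```python
import re

# Hypothetical pattern: "IS 456", "IS 456:2000", "NBC 2016".
# Deliberately case-sensitive so the English word "is" never matches.
CODE_PATTERN = re.compile(r"\b(?:IS\s?\d+(?::\d{4})?|NBC\s?\d{4})\b")

def split_query(query):
    """Split a query into exact-match codes and the remaining free text."""
    codes = CODE_PATTERN.findall(query)
    remainder = " ".join(CODE_PATTERN.sub(" ", query).split())
    return codes, remainder

codes, text = split_query("minimum cover for slabs as per IS 456:2000")
print(codes)  # codes to match exactly with keyword search
print(text)   # remaining text to embed for vector search
```

A real system would need a broader pattern set (other standards bodies, amendment suffixes), but the split itself stays the same.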
3. Tables Have Mixed Content
BOQ tables contain:
- Structured data (item codes: "A.1.2.3")
- Descriptions ("Excavation in ordinary soil")
- Units ("cum", "sqm", "mt")
- Numbers (quantities, rates)
Vector search: Struggles with item codes
Keyword search: Misses semantic relationships
Hybrid: Handles both ✅
4. Cross-Referencing is Common
Contracts reference other documents:
- "As per Drawing No. ST-12"
- "Refer Clause 3.2 of GCC"
- "See Technical Specification Vol-II"
Hybrid search understands these references via vector similarity while maintaining exact clause/drawing number matching via keywords.
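As an illustration of the exact-matching half, cross-references like the ones listed can be extracted with a few regular expressions and then used as keyword terms. The patterns below are assumptions fitted to these three example phrasings only; real contracts will need more variants.

```python
import re

# Hypothetical patterns, one per reference type shown in the examples.
REFERENCE_PATTERNS = {
    "drawing": re.compile(r"Drawing No\.\s*([A-Z]+-\d+)"),
    "clause": re.compile(r"Clause\s+(\d+(?:\.\d+)*)"),
    "volume": re.compile(r"Vol-([IVX]+)"),
}

def extract_references(text):
    """Return (kind, identifier) pairs for every cross-reference found."""
    found = []
    for kind, pattern in REFERENCE_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((kind, match.group(1)))
    return found

for sentence in (
    "As per Drawing No. ST-12",
    "Refer Clause 3.2 of GCC",
    "See Technical Specification Vol-II",
):
    print(sentence, "->", extract_references(sentence))
```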
Performance Comparison
Based on tests with 1,000+ construction document queries:
| Metric | Vector Only | Keyword Only | Hybrid |
|--------|-------------|--------------|--------|
| Precision | 72% | 68% | 91% |
| Recall | 81% | 65% | 89% |
| F1 Score | 0.76 | 0.66 | 0.90 |
| Speed | Fast | Very Fast | Fast |
| Credit Cost | Low | Low | Low |
Winner: Hybrid search (91% precision, 89% recall)
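For readers unfamiliar with F1, it is the harmonic mean of precision and recall. This short check recomputes the F1 row from the precision and recall figures reported in the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs taken from the comparison table.
reported = {
    "Vector Only": (0.72, 0.81),
    "Keyword Only": (0.68, 0.65),
    "Hybrid": (0.91, 0.89),
}

for mode, (precision, recall) in reported.items():
    print(f"{mode}: F1 = {f1_score(precision, recall):.2f}")
```

The results round to 0.76, 0.66, and 0.90, matching the table.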
When to Use Each Mode
AECOS Insights lets you choose search mode per query.
Use Vector Search When:
- Conceptual queries ("sustainability requirements")
- Exploring unfamiliar documents
- Finding related sections
- Broad discovery
Use Keyword Search When:
- Looking for specific codes/IDs ("IS 456", "Drawing ST-12")
- Searching for unique terms ("Phoenix Project")
- When you know exact phrasing
- Speed is critical
Use Hybrid Search When ⭐
- Most queries (it's the best default)
- Searching technical documents
- Need both accuracy and coverage
- Don't know exact terminology
- Recommended for 90% of construction queries
Configuring Hybrid Search
In AECOS Insights Settings sidebar:
RAG Search Mode: Hybrid Search ⭐
Source Weightage: 70% RAG, 30% Document
Max Chunks: 25
Similarity Threshold: 0.30
Why these settings?
- Hybrid: Best accuracy for AEC
- 70/30: Prioritize RAG but keep full doc context
- 25 chunks: Balance between context and cost
- 0.30 threshold: Inclusive enough to catch variations
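To show how the threshold and chunk limit interact, here is a small sketch that models these settings and applies them to scored chunks. This is not the AECOS Insights API: the `SearchSettings` class, its field names, and `select_chunks` are hypothetical, and only the default values come from the settings listed here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchSettings:
    """Hypothetical container for the recommended settings."""
    mode: str = "hybrid"
    rag_weight: float = 0.70
    document_weight: float = 0.30
    max_chunks: int = 25
    similarity_threshold: float = 0.30

def select_chunks(scored_chunks, settings):
    """Drop chunks at or below the threshold, then keep the best max_chunks."""
    kept = [
        (chunk_id, score)
        for chunk_id, score in scored_chunks
        if score > settings.similarity_threshold
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[: settings.max_chunks]

settings = SearchSettings()
scored = [("a", 0.89), ("b", 0.25), ("c", 0.41)]  # made-up scores
print(select_chunks(scored, settings))
```

Lowering the threshold admits more loosely related chunks (better recall, more noise), while `max_chunks` caps how much context, and therefore cost, each query can consume.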
Real User Results
Architect Firm (Mumbai)
"Switched from keyword-only to hybrid. Now finding 40% more relevant clauses we used to miss."
Contractor (Delhi)
"Hybrid search found an addendum clause that would've cost us ₹15 lakhs. ROI in one query."
Engineering Consultant (Bangalore)
"We use hybrid for everything except IS code lookups (keyword for those). Perfect combo."
The Technical Details
Vector Search (Under the Hood)
# Convert query to vector
query_vector = embed("retention money clause")
# Shape: [1024] dimensional vector
# Find similar chunks
results = cosine_similarity(query_vector, document_chunks)
# Returns: chunks with similarity > 0.30
Keyword Search (Under the Hood)
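-- PostgreSQL full-text search
SELECT *
FROM document_chunks
WHERE to_tsvector('english', content) @@ to_tsquery('english', 'retention & money')
-- ts_rank is a function: it needs the document vector and the query
ORDER BY ts_rank(to_tsvector('english', content),
                 to_tsquery('english', 'retention & money')) DESC;
Hybrid Search (Under the Hood)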
// Sketch only: vectorSearch, keywordSearch and mergeResults are assumed helpers
function hybridSearch(query) {
  // 1. Run both searches
  const vectorResults = vectorSearch(query)
  const keywordResults = keywordSearch(query)
  // 2. Merge with Reciprocal Rank Fusion (RRF)
  const hybridResults = mergeResults(vectorResults, keywordResults, {
    vectorWeight: 0.6, // 60% weight to vector
    keywordWeight: 0.4, // 40% weight to keyword
  })
  // 3. Return top 25 chunks
  return hybridResults.slice(0, 25)
}
Conclusion
For construction document intelligence, hybrid search is the clear winner:
- ✅ 91% precision (vs 72% vector, 68% keyword)
- ✅ Handles AEC terminology (variations + exact terms)
- ✅ Best for technical docs (codes, tables, cross-refs)
- ✅ Minimal cost difference (vs pure vector or keyword)
Recommendation: Always use hybrid search unless you have a specific reason not to.
Try it yourself: Start Free Trial