Hybrid Search vs Vector Search - Which is Better for AEC Documents?

By AECOS Engineering Team · 6 min read


When searching construction documents, accuracy matters. Missing a contract clause or specification requirement can cost thousands (or millions) of rupees. So which search technology should you use: vector search or hybrid search?

TL;DR: Hybrid search (vector + keyword) was the most accurate mode in our tests on construction documents: 91% precision, versus 72% for vector-only and 68% for keyword-only. Here's why.

The Contenders

Vector Search (Semantic)

  • Converts text to numerical vectors (embeddings)
  • Finds semantically similar content
  • Understands meaning, not just words

Keyword Search (Exact Match)

  • Traditional full-text search
  • Finds exact terms
  • Fast, precise

Hybrid Search (Both)

  • Combines vector + keyword
  • Gets benefits of both approaches
  • Slightly more complex, much more accurate

Real-World Test: Construction Contract

Let's search a 250-page EPC contract with all three methods.

Query: "What is the retention money clause?"

Vector Search Results:

✓ Clause 12.4 - Retention Money (Semantic match: 0.89)
✓ Payment Schedule - Retention section (0.87)
✓ Section 8.2 - Security deposit (0.75) ← Not what we wanted
✓ Warranty clause (0.68) ← Also not relevant

Accuracy: 50% (2/4 relevant)

Keyword Search Results:

✓ Clause 12.4 - "retention money" (Exact match)
✓ Annexure B - "retention" in payment table
X Missed: Clause 12.6 - talks about retention but uses "security withholding"
X Missed: GCC reference - uses "holdback amount"

Accuracy: 50% (2/4 relevant, 2 missed due to terminology variation)

Hybrid Search Results:

✓ Clause 12.4 - Retention Money (Keyword: 1.0, Vector: 0.89)
✓ Clause 12.6 - Security withholding (Vector: 0.82)
✓ Payment Schedule - Retention section (Keyword: 0.95, Vector: 0.87)
✓ GCC Reference - Holdback amount (Vector: 0.79)

Accuracy: 100% (4/4 relevant)

Why Hybrid Wins for AEC

1. Construction Has Multiple Terms for Same Concept

| Concept | Variations |
|---------|------------|
| Scope of Work | SOW, Detailed Scope, Work Breakdown, Project Scope, Services |
| Bill of Quantities | BOQ, BoQ, Bill, Quantities Schedule, Measurement Sheet |
| Liquidated Damages | LD, Delay Damages, Penalty, Damages for Delay |
| Completion | Substantial Completion, Final Completion, Handover, Commissioning |

Vector search catches variations via semantics
Keyword search catches exact acronyms
Hybrid catches everything

2. Technical Codes Need Exact Matching

Indian Standards: IS 456:2000, IS 1893:2016, NBC 2016

Vector search: Might confuse IS 456 with IS 800 (both are design codes)
Keyword search: Finds exact "IS 456:2000" ✅
Hybrid: Uses keyword for codes, vector for context around them ✅✅
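One way to get this behaviour is to pull code-like tokens out of the query and send them to keyword search, leaving the rest for vector search. The sketch below is illustrative only: `CODE_PATTERN`, `extract_code_terms`, and `split_query` are hypothetical names, and the regex covers just the `IS <number>[:<year>]` and `NBC <year>` forms listed above. It is not a description of how AECOS Insights does it.

```python
import re
from typing import Dict, List

# Hypothetical pattern: matches references such as "IS 456", "IS 456:2000", "NBC 2016".
CODE_PATTERN = re.compile(r"\b(?:IS\s?\d{3,5}(?::\d{4})?|NBC\s?\d{4})\b")


def extract_code_terms(query: str) -> List[str]:
    """Return the standard-code references found in the query, in order of appearance."""
    return CODE_PATTERN.findall(query)


def split_query(query: str) -> Dict[str, object]:
    """Split a query into exact-match terms (keyword side) and free text (vector side)."""
    codes = extract_code_terms(query)
    # Remove the codes, then collapse the leftover whitespace.
    free_text = " ".join(CODE_PATTERN.sub(" ", query).split())
    return {"exact_terms": codes, "semantic_text": free_text}


parts = split_query("cover requirements as per IS 456:2000 for slabs")
print(parts["exact_terms"])    # codes to match exactly
print(parts["semantic_text"])  # remaining text to embed
```

With this split, "IS 456:2000" can never be swapped for "IS 800" by the embedding model, because the code is matched as a literal string.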

3. Tables Have Mixed Content

BOQ tables contain:

  • Structured data (item codes: "A.1.2.3")
  • Descriptions ("Excavation in ordinary soil")
  • Units ("cum", "sqm", "mt")
  • Numbers (quantities, rates)

Vector search: Struggles with item codes
Keyword search: Misses semantic relationships
Hybrid: Handles both ✅

4. Cross-Referencing is Common

Contracts reference other documents:

  • "As per Drawing No. ST-12"
  • "Refer Clause 3.2 of GCC"
  • "See Technical Specification Vol-II"

Hybrid search understands these references via vector similarity while maintaining exact clause/drawing number matching via keywords.

Performance Comparison

Based on tests with 1,000+ construction document queries:

| Metric | Vector Only | Keyword Only | Hybrid |
|--------|-------------|--------------|--------|
| Precision | 72% | 68% | 91% |
| Recall | 81% | 65% | 89% |
| F1 Score | 0.76 | 0.66 | 0.90 |
| Speed | Fast | Very Fast | Fast |
| Credit Cost | Low | Low | Low |

Winner: Hybrid search (91% precision, 89% recall)
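The F1 scores follow directly from the precision and recall figures, since F1 is their harmonic mean: F1 = 2PR / (P + R). A short check, using only the numbers reported above (the function and variable names are ours):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per search mode, taken from the comparison table.
results = {
    "Vector Only": (0.72, 0.81),
    "Keyword Only": (0.68, 0.65),
    "Hybrid": (0.91, 0.89),
}

f1_by_mode = {mode: round(f1_score(p, r), 2) for mode, (p, r) in results.items()}
print(f1_by_mode)  # → {'Vector Only': 0.76, 'Keyword Only': 0.66, 'Hybrid': 0.9}
```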

When to Use Each Mode

AECOS Insights lets you choose search mode per query.

Use Vector Search When:

  • Conceptual queries ("sustainability requirements")
  • Exploring unfamiliar documents
  • Finding related sections
  • Broad discovery

Use Keyword Search When:

  • Looking for specific codes/IDs ("IS 456", "Drawing ST-12")
  • Searching for unique terms ("Phoenix Project")
  • When you know exact phrasing
  • Speed is critical

Use Hybrid Search When:

  • Most queries (it's the best default)
  • Searching technical documents
  • Need both accuracy and coverage
  • Don't know exact terminology
  • Recommended for 90% of construction queries

Recommended settings in the AECOS Insights Settings sidebar:

RAG Search Mode: Hybrid Search ⭐
Source Weightage: 70% RAG, 30% Document
Max Chunks: 25
Similarity Threshold: 0.30

Why these settings?

  • Hybrid: Best accuracy for AEC
  • 70/30: Prioritize RAG but keep full doc context
  • 25 chunks: Balance between context and cost
  • 0.30 threshold: Inclusive enough to catch variations
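To see how the last two settings interact, here is a minimal sketch of a filter that drops low-scoring chunks and then caps the result count. `ScoredChunk` and `select_chunks` are hypothetical names used for illustration, and the sketch assumes the threshold is applied before the cap. The source weightage setting is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredChunk:
    chunk_id: str
    score: float  # similarity score, higher is better


def select_chunks(chunks: List[ScoredChunk],
                  threshold: float = 0.30,
                  max_chunks: int = 25) -> List[ScoredChunk]:
    """Keep chunks scoring above the threshold, best first, capped at max_chunks."""
    kept = [c for c in chunks if c.score > threshold]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:max_chunks]


candidates = [ScoredChunk("a", 0.89), ScoredChunk("b", 0.25), ScoredChunk("c", 0.45)]
print([c.chunk_id for c in select_chunks(candidates)])  # → ['a', 'c']
```

A lower threshold admits more loosely related chunks (useful when terminology varies), while the cap bounds how much text is passed on, which is where the cost trade-off comes from.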

Real User Results

Architect Firm (Mumbai)

"Switched from keyword-only to hybrid. Now finding 40% more relevant clauses we used to miss."

Contractor (Delhi)

"Hybrid search found an addendum clause that would've cost us ₹15 lakhs. ROI in one query."

Engineering Consultant (Bangalore)

"We use hybrid for everything except IS code lookups (keyword for those). Perfect combo."

The Technical Details

Vector Search (Under the Hood)

# Convert query to vector
query_vector = embed("retention money clause")
# Shape: [1024] dimensional vector
 
# Find similar chunks
results = cosine_similarity(query_vector, document_chunks)
# Returns: chunks with similarity > 0.30
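The pseudocode above leaves `embed` and `cosine_similarity` undefined. As a rough, self-contained illustration of the scoring and filtering step only, the sketch below uses toy 3-dimensional vectors in place of real 1024-dimensional embeddings. The `vector_search` helper and the chunk IDs are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_search(query_vector: Sequence[float],
                  chunk_vectors: Dict[str, Sequence[float]],
                  threshold: float = 0.30) -> List[Tuple[str, float]]:
    """Score every chunk against the query and return hits above the threshold, best first."""
    scored = [(cid, cosine_similarity(query_vector, vec)) for cid, vec in chunk_vectors.items()]
    hits = [(cid, s) for cid, s in scored if s > threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy vectors standing in for real embeddings.
chunk_vectors = {
    "clause_12_4": [1.0, 0.0, 0.0],
    "clause_12_6": [1.0, 1.0, 0.0],
    "unrelated": [0.0, 1.0, 0.0],
}
print(vector_search([1.0, 0.0, 0.0], chunk_vectors))
```

A production system would normally delegate this comparison to a vector index rather than scanning every chunk, but the scoring rule is the same.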

Keyword Search (Under the Hood)

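-- PostgreSQL full-text search, ranked by relevance
SELECT *,
       ts_rank(to_tsvector('english', content),
               to_tsquery('english', 'retention & money')) AS rank
FROM document_chunks
WHERE to_tsvector('english', content) @@
      to_tsquery('english', 'retention & money')
ORDER BY rank DESC;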

Hybrid Search (Under the Hood)

// Hybrid search: run both retrievers, then fuse their rankings
function hybridSearch(query) {
  // 1. Run both searches
  const vectorResults = vectorSearch(query)
  const keywordResults = keywordSearch(query)

  // 2. Merge with weighted Reciprocal Rank Fusion (RRF)
  const hybridResults = mergeResults(vectorResults, keywordResults, {
    vectorWeight: 0.6,  // 60% weight to vector
    keywordWeight: 0.4, // 40% weight to keyword
  })

  // 3. Return top 25 chunks
  return hybridResults.slice(0, 25)
}
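The merge step above names Reciprocal Rank Fusion but does not show it. In RRF, each result list contributes 1 / (k + rank) for every item it contains, and items are re-ranked by their summed score. The sketch below is one plausible reading of the 60/40 weighting, assuming each list's contribution is simply multiplied by its weight. The `weighted_rrf` name, the chunk IDs, and `k = 60` (a commonly used constant) are our assumptions, not confirmed details of the AECOS implementation.

```python
from typing import Dict, List


def weighted_rrf(vector_ids: List[str],
                 keyword_ids: List[str],
                 vector_weight: float = 0.6,
                 keyword_weight: float = 0.4,
                 k: int = 60,
                 top_n: int = 25) -> List[str]:
    """Fuse two ranked lists of chunk IDs (best first) with weighted RRF."""
    scores: Dict[str, float] = {}
    for weight, ranking in ((vector_weight, vector_ids), (keyword_weight, keyword_ids)):
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list adds weight / (k + rank); chunks found by both lists accumulate.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    ranked = sorted(scores, key=lambda cid: scores[cid], reverse=True)
    return ranked[:top_n]


vector_ids = ["clause_12_4", "payment_schedule", "clause_12_6", "gcc_ref"]
keyword_ids = ["clause_12_4", "annexure_b"]
print(weighted_rrf(vector_ids, keyword_ids))
```

Because RRF works on ranks rather than raw scores, it avoids having to put cosine similarities and full-text rank values on a common scale. A chunk found by both searches, like `clause_12_4` here, rises to the top.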

Conclusion

For construction document intelligence, hybrid search is the clear winner:

  • 91% precision (vs 72% vector, 68% keyword)
  • Handles AEC terminology (variations + exact terms)
  • Best for technical docs (codes, tables, cross-refs)
  • Minimal cost difference (vs pure vector or keyword)

Recommendation: Always use hybrid search unless you have a specific reason not to.


