Vector Search

Hybrid search that combines semantic vector search with keyword matching for legal documents.

Overview

The HOLE Foundation Vector Search API provides:

  • Hybrid Search: Combines vector embeddings with keyword matching
  • Legal Document Focus: Optimized for statutes, case law, regulations
  • Rich Metadata: Full context including citations, jurisdiction, dates
  • Fast Results: Sub-second response times with relevance scoring

Search Endpoint

import { HoleFoundationClient } from '@hole-foundation/sdk';

const client = new HoleFoundationClient({
  environment: 'https://api.theholefoundation.org',
  token: 'your-jwt-token'
});

const results = await client.vectorSearch.search({
  query: 'FOIA exemptions for national security',
  limit: 10
});

console.log(`Found ${results.results.length} documents`);
results.results.forEach(result => {
  console.log(`${result.title} (score: ${result.score})`);
  console.log(result.excerpt);
});

Python Example

from hole_foundation_api import HoleFoundationClient

client = HoleFoundationClient(
    environment="https://api.theholefoundation.org",
    token="your-jwt-token"
)

results = client.vector_search.search(
    query="FOIA exemptions for national security",
    limit=10
)

for result in results.results:
    print(f"{result.title} (score: {result.score})")
    print(result.excerpt)

Search Parameters

Query (Required)

The search query in natural language:

// Good queries
await client.vectorSearch.search({
  query: "What are the exemptions to public records laws?"
});

await client.vectorSearch.search({
  query: "California Public Records Act attorney fees"
});

// Works with legal citations
await client.vectorSearch.search({
  query: "5 USC 552(b)(6)"
});

Limit (Optional)

Controls the number of results returned (default: 10, max: 100):

const results = await client.vectorSearch.search({
  query: "FOIA deadlines",
  limit: 25 // Get up to 25 results
});
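
When the limit comes from user input, it can help to normalize it before sending the request. The following is a minimal sketch based on the default and maximum stated above; `clampLimit` and the lower bound of 1 are illustrative assumptions, not part of the SDK, and the API's own behavior for out-of-range values is not described here.

```typescript
// Default and maximum come from the documentation above.
const DEFAULT_LIMIT = 10;
const MAX_LIMIT = 100;
// Assumption: a request should ask for at least one result.
const MIN_LIMIT = 1;

// Hypothetical helper: returns a limit inside the documented range.
function clampLimit(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_LIMIT;
  }
  // Drop any fractional part, then clamp into [MIN_LIMIT, MAX_LIMIT].
  return Math.min(MAX_LIMIT, Math.max(MIN_LIMIT, Math.floor(requested)));
}
```

The result can be passed straight through, for example `limit: clampLimit(userLimit)`.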

Include Context (Optional)

Set include_context to true to return the text immediately before and after each match:

const results = await client.vectorSearch.search({
  query: "privacy exemptions",
  include_context: true
});

results.results.forEach(result => {
  console.log(result.before_context); // Text before match
  console.log(result.excerpt);        // Matched text
  console.log(result.after_context);  // Text after match
});

Filters (Optional)

Filter by jurisdiction, document type, or custom metadata:

const results = await client.vectorSearch.search({
  query: "open meeting laws",
  filters: {
    jurisdiction: "california",
    document_type: "statute",
    year: "2024"
  }
});

Available Filters:

  • jurisdiction: State/federal (e.g., “california”, “federal”, “texas”)
  • document_type: Document category (“statute”, “regulation”, “case_law”)
  • year: Publication year
  • category: Subject matter category
  • Custom metadata fields
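
If filter values are assembled from form fields or query strings, a small builder keeps empty values out of the request. This is a sketch only: the `SearchFilters` interface and `buildFilters` are hypothetical names modeled on the filter list above, and the SDK may expose its own types.

```typescript
// Hypothetical shape based on the filters listed above (custom fields omitted).
interface SearchFilters {
  jurisdiction?: string;
  document_type?: "statute" | "regulation" | "case_law";
  year?: string;
  category?: string;
}

// Hypothetical helper: trims values and drops anything empty or undefined.
function buildFilters(input: SearchFilters): Record<string, string> {
  const filters: Record<string, string> = {};
  for (const key of Object.keys(input)) {
    const value = input[key as keyof SearchFilters];
    if (typeof value !== "string") continue;
    const trimmed = value.trim();
    if (trimmed.length > 0) {
      filters[key] = trimmed;
    }
  }
  return filters;
}
```

The returned object can be supplied as the `filters` value of a search request.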

Response Format

Search Result Structure

interface SearchResponse {
  results: SearchResult[];
  total: number;
  query: string;
  took_ms: number;
}

interface SearchResult {
  id: string;
  title: string;
  excerpt: string;
  score: number;
  document_type: string;
  jurisdiction?: string;
  citation?: string;
  url?: string;
  metadata: Record<string, any>;

  // When include_context: true
  before_context?: string;
  after_context?: string;
}
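
When results arrive from an untyped source, such as a cached JSON file or a raw HTTP call, a runtime check against the structure above can catch malformed entries early. The guard below is a sketch; `isSearchResult` is a hypothetical helper, and it checks only the required fields listed in the interface.

```typescript
// Local copy of the documented shape so the example is self-contained.
interface SearchResult {
  id: string;
  title: string;
  excerpt: string;
  score: number;
  document_type: string;
  jurisdiction?: string;
  citation?: string;
  url?: string;
  metadata: Record<string, unknown>;
}

// Hypothetical type guard: verifies the required fields and their types.
function isSearchResult(value: unknown): value is SearchResult {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const requiredStrings = ["id", "title", "excerpt", "document_type"];
  for (const key of requiredStrings) {
    if (typeof candidate[key] !== "string") return false;
  }
  if (typeof candidate.score !== "number" || Number.isNaN(candidate.score)) {
    return false;
  }
  return typeof candidate.metadata === "object" && candidate.metadata !== null;
}
```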

Understanding Scores

Search scores range from 0.0 to 1.0:

  • 0.9 - 1.0: Highly relevant, exact matches
  • 0.7 - 0.9: Very relevant, strong semantic match
  • 0.5 - 0.7: Relevant, good contextual match
  • < 0.5: Potentially relevant, review carefully

results.results.forEach(result => {
  if (result.score > 0.9) {
    console.log('⭐ Highly relevant:', result.title);
  } else if (result.score > 0.7) {
    console.log('✓ Relevant:', result.title);
  } else {
    console.log('? Review:', result.title);
  }
});
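
The four score bands can also be expressed as a reusable function. This is a sketch under one assumption: the listed ranges share their endpoints, so the lower bound of each band is treated as inclusive here. `relevanceBand` and the label names are illustrative.

```typescript
type RelevanceBand =
  | "highly_relevant"
  | "very_relevant"
  | "relevant"
  | "potentially_relevant";

// Hypothetical helper mapping a score in [0, 1] to the bands listed above.
// Assumption: each band includes its lower bound (0.9, 0.7, 0.5).
function relevanceBand(score: number): RelevanceBand {
  if (score >= 0.9) return "highly_relevant";
  if (score >= 0.7) return "very_relevant";
  if (score >= 0.5) return "relevant";
  return "potentially_relevant";
}
```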

Get Document by ID

Retrieve full document content:

const document = await client.vectorSearch.getDocument('doc-123');

console.log(document.title);
console.log(document.full_text);
console.log(document.metadata);

Advanced Use Cases

Search across multiple states:

const states = ['california', 'texas', 'florida'];
const allResults = await Promise.all(
  states.map(state =>
    client.vectorSearch.search({
      query: "public meeting notice requirements",
      filters: { jurisdiction: state }
    })
  )
);

// Combine and sort by score
const combined = allResults
  .flatMap(r => r.results)
  .sort((a, b) => b.score - a.score);
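
When several searches are combined, the same document id may appear more than once, for example if the queries overlap rather than the jurisdictions. The sketch below merges batches and keeps the best-scoring copy of each id; `mergeByBestScore` is a hypothetical helper, and whether duplicates actually occur depends on how the index assigns ids.

```typescript
// Hypothetical helper: merges result batches, keeping the highest-scoring
// entry per id, and returns them sorted by descending score.
function mergeByBestScore<T extends { id: string; score: number }>(
  batches: T[][]
): T[] {
  const best = new Map<string, T>();
  for (const batch of batches) {
    for (const result of batch) {
      const seen = best.get(result.id);
      if (seen === undefined || result.score > seen.score) {
        best.set(result.id, result);
      }
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```

With the search above, this could be called as `mergeByBestScore(allResults.map(r => r.results))`.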

Start broad, then narrow with filters:

// First: Broad search
const broad = await client.vectorSearch.search({
  query: "privacy exemptions",
  limit: 100
});

// Analyze results to find patterns
const jurisdictions = new Set(
  broad.results.map(r => r.jurisdiction)
);

console.log('Found in jurisdictions:', Array.from(jurisdictions));

// Second: Focused search
const focused = await client.vectorSearch.search({
  query: "privacy exemptions",
  filters: { jurisdiction: "california" }
});
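
Rather than hard-coding the jurisdiction for the focused search, the broad results can be tallied to see where matches concentrate. This is a minimal sketch; `countByJurisdiction` and the `"unknown"` label for results without a jurisdiction are illustrative choices.

```typescript
// Hypothetical helper: counts results per jurisdiction, most frequent first.
// Results with no jurisdiction are grouped under "unknown" (an assumption).
function countByJurisdiction(
  results: { jurisdiction?: string }[]
): [string, number][] {
  const counts = new Map<string, number>();
  for (const result of results) {
    const key = result.jurisdiction ?? "unknown";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}
```

The first entry of `countByJurisdiction(broad.results)` is then a reasonable candidate for the `jurisdiction` filter in the second search.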

Citation Extraction

Extract legal citations from results:

const results = await client.vectorSearch.search({
  query: "attorney general opinions on FOIA"
});

const citations = results.results
  .filter(r => r.citation)
  .map(r => ({
    citation: r.citation,
    title: r.title,
    url: r.url
  }));

console.log('Found citations:', citations);

Error Handling

import { TheholetruthApiError } from '@hole-foundation/sdk';

try {
  const results = await client.vectorSearch.search({
    query: "FOIA exemptions",
    limit: 10
  });
} catch (error) {
  if (error instanceof TheholetruthApiError) {
    if (error.status === 400) {
      console.error('Invalid query:', error.message);
    } else if (error.status === 429) {
      console.error('Rate limit exceeded');
    } else if (error.status === 401) {
      console.error('Authentication failed');
    }
  }
  throw error;
}
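
A 429 response is often worth retrying after a pause, while 400 and 401 are not. The sketch below shows one common approach, exponential backoff, under stated assumptions: `withRetry`, `backoffDelayMs`, and `isRetryable` are hypothetical helpers, the delay values are arbitrary, and the error is assumed to expose a numeric `status` as in the example above. With daily quotas, retrying will not help once the quota itself is exhausted.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling from baseMs and never exceeding capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Assumption: only rate-limit errors (status 429) are retried.
function isRetryable(error: unknown): boolean {
  const status = (error as { status?: unknown } | null)?.status;
  return status === 429;
}

// Runs `operation`, retrying retryable failures up to maxAttempts in total.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 4
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on non-retryable errors or when attempts are used up.
      if (!isRetryable(error) || attempt + 1 >= maxAttempts) throw error;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>(resolve => setTimeout(resolve, delay));
    }
  }
}
```

A search call could then be wrapped as `withRetry(() => client.vectorSearch.search({ query: "FOIA exemptions" }))`.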

Best Practices

1. Use Specific Queries

// ❌ Too vague
await client.vectorSearch.search({ query: "laws" });

// ✅ Specific
await client.vectorSearch.search({
  query: "California Public Records Act response deadlines"
});

2. Set Appropriate Limits

// For quick lookup
await client.vectorSearch.search({
  query: "statute text",
  limit: 5
});

// For comprehensive research
await client.vectorSearch.search({
  query: "privacy exemptions across jurisdictions",
  limit: 50
});

3. Use Filters to Narrow Results

// Instead of this
await client.vectorSearch.search({
  query: "California open meeting laws"
});

// Do this
await client.vectorSearch.search({
  query: "open meeting laws",
  filters: { jurisdiction: "california" }
});

4. Handle Context Appropriately

// Only request context when needed
const quickSearch = await client.vectorSearch.search({
  query: "FOIA",
  limit: 10
  // No include_context
});

// Request context for detailed review
const detailedSearch = await client.vectorSearch.search({
  query: "FOIA exemption (b)(6)",
  limit: 5,
  include_context: true
});

Performance Tips

Caching

Cache the results of frequently repeated queries:

const cache = new Map<string, SearchResponse>();

async function cachedSearch(query: string) {
  if (cache.has(query)) {
    return cache.get(query)!;
  }

  const results = await client.vectorSearch.search({ query });
  cache.set(query, results);
  return results;
}
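
A cache keyed on the query string alone never expires and would return the same entry for requests that differ in limit or filters. One possible refinement, sketched here, is a time-limited cache plus a key built from all request parameters. `TtlCache` and `cacheKey` are hypothetical names, and the right expiry time depends on how often the underlying documents change.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

// Hypothetical in-memory cache whose entries expire after ttlMs.
class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // Evict stale entries lazily.
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Builds a key from every parameter that affects the response.
// Filters are sorted so that key order does not matter.
function cacheKey(params: {
  query: string;
  limit?: number;
  include_context?: boolean;
  filters?: Record<string, string>;
}): string {
  const filters = params.filters ?? {};
  const sortedFilters = Object.keys(filters).sort().map(k => [k, filters[k]]);
  return JSON.stringify([
    params.query,
    params.limit ?? 10, // Documented default limit.
    params.include_context ?? false,
    sortedFilters
  ]);
}
```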

Pagination

For large result sets, page through the results by combining limit with an offset:

async function getAllResults(query: string) {
  const pageSize = 50;
  let offset = 0;
  const allResults = [];

  while (true) {
    const results = await client.vectorSearch.search({
      query,
      limit: pageSize,
      filters: { offset }
    });

    allResults.push(...results.results);

    if (results.results.length < pageSize) break;
    offset += pageSize;
  }

  return allResults;
}
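
If the `total` field of the response reports the number of matches across all pages (an assumption; the response format above does not say), the remaining page requests can be planned up front instead of looping until a short page arrives. `planPages` is a hypothetical helper that only computes offsets and limits.

```typescript
interface PageRequest {
  offset: number;
  limit: number;
}

// Hypothetical helper: lists the page requests needed to cover `total`
// results, starting after `alreadyFetched` results.
function planPages(
  total: number,
  pageSize: number,
  alreadyFetched = 0
): PageRequest[] {
  if (pageSize <= 0) {
    throw new RangeError("pageSize must be positive");
  }
  const pages: PageRequest[] = [];
  for (let offset = alreadyFetched; offset < total; offset += pageSize) {
    // The final page may be shorter than pageSize.
    pages.push({ offset, limit: Math.min(pageSize, total - offset) });
  }
  return pages;
}
```

Knowing the number of pages in advance also makes it easier to estimate how many requests a full export will consume.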

Rate Limits

  • Free tier: 100 requests/day
  • Standard tier: 1,000 requests/day
  • Premium tier: 10,000 requests/day

See Rate Limits for details.
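
A client-side counter can help an application stay within its daily quota before the API starts returning 429 responses. The sketch below uses the tier limits listed above; `DailyQuota` is a hypothetical class, and the reset at midnight UTC is an assumption, since the reset time is not stated here.

```typescript
// Daily request limits per tier, taken from the list above.
const DAILY_LIMITS = { free: 100, standard: 1000, premium: 10000 };
type Tier = keyof typeof DAILY_LIMITS;

// Hypothetical local request counter. Assumption: quota resets at 00:00 UTC.
class DailyQuota {
  private used = 0;
  private day: string;
  private limit: number;
  private now: () => Date;

  // `now` is injectable so day rollover can be tested.
  constructor(tier: Tier, now: () => Date = () => new Date()) {
    this.limit = DAILY_LIMITS[tier];
    this.now = now;
    this.day = this.today();
  }

  private today(): string {
    return this.now().toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  }

  private rollOver(): void {
    const today = this.today();
    if (today !== this.day) {
      this.day = today;
      this.used = 0;
    }
  }

  // Counts one request and returns true, or returns false if none remain.
  tryConsume(): boolean {
    this.rollOver();
    if (this.used >= this.limit) return false;
    this.used += 1;
    return true;
  }

  remaining(): number {
    this.rollOver();
    return this.limit - this.used;
  }
}
```

The counter is only an estimate: it resets when the process restarts and does not see requests made by other clients using the same token.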

Support