Semantic Search

Use semantic search when exact keyword matching is too brittle for filing research. This surface is built for concept retrieval across filing sections and related-filing discovery, not just literal text lookup.

Pick the right mode

| Mode | Best for | Why |
| --- | --- | --- |
| `keyword` | Exact lookups such as `AAPL 10-K 2024` or a known phrase | Uses traditional keyword retrieval and works best when precision depends on literal terms |
| `semantic` | Concept-based discovery such as supply chain stress, AI capex, or pricing pressure | Uses vector similarity to retrieve filings and sections that discuss the idea even when they use different wording |
| `hybrid` | Most real research workflows | Runs keyword and vector retrieval together, merges them, and reranks the combined results for the strongest overall quality |

Start with this workflow

1. Start with `hybrid`

   Use `hybrid` first unless you know you need strict keyword matching. It is the best default when you want both literal matches and concept recall.

2. Add narrow filters second

   Once the result set looks right, add filters like `ticker`, `form`, `filing_year`, or `limit` to focus the search on the issuer or filing family you care about.

3. Pivot from sections to filing-level discovery

   When one filing looks especially relevant, use the similar-filings endpoint to expand outward from that source filing instead of rewriting the original search from scratch.
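
The three steps above can be sketched as URL construction in Python. This is an illustrative sketch, not an official client: the helper names `search_url` and `similar_filings_url` are hypothetical, the base URL and filter names are taken from the examples on this page, and the accepted value formats for each filter (for example an integer `filing_year`) are assumptions.

```python
from urllib.parse import quote, urlencode

# Base URL as it appears in the curl examples on this page.
BASE_URL = "https://api.secapi.ai/v1"
MODES = {"keyword", "semantic", "hybrid"}


def search_url(q, mode="hybrid", **filters):
    """Build a /search/semantic URL (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    params = {"q": q, "mode": mode}
    # Drop filters that were passed as None so they are not sent at all.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}/search/semantic?{urlencode(params)}"


def similar_filings_url(filing_id, **filters):
    """Build a /filings/{filingId}/similar URL (hypothetical helper)."""
    params = {k: v for k, v in filters.items() if v is not None}
    url = f"{BASE_URL}/filings/{quote(filing_id, safe='')}/similar"
    return f"{url}?{urlencode(params)}" if params else url


# Step 1: start broad with the hybrid default.
broad = search_url("supply chain disruptions")

# Step 2: keep the same query and add narrow filters.
narrow = search_url(
    "supply chain disruptions",
    ticker="AAPL",
    form="10-K",
    filing_year=2024,
    limit=10,
)

# Step 3: pivot from one relevant filing to its near neighbors.
neighbors = similar_filings_url("0000320193-25-000079", limit=5, form="10-K")
```

Keeping the query string fixed between steps 1 and 2 makes it easier to see what each filter removes from the result set.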

Common research plays

Find a concept across filings

curl -H "x-api-key: $OMNI_DATASTREAM_API_KEY" \
  "https://api.secapi.ai/v1/search/semantic?q=companies+discussing+supply+chain+disruptions&mode=hybrid&limit=10"

Use this when the concept matters more than exact wording. `hybrid` is usually the right starting point.

Narrow the search to a filing family

curl -H "x-api-key: $OMNI_DATASTREAM_API_KEY" \
  "https://api.secapi.ai/v1/search/semantic?q=filings+about+AI+capital+expenditure&mode=semantic&form=10-K&limit=10"

Use this when you already know the filing type you care about and want higher topical precision inside that family.

Find similar filings

curl -H "x-api-key: $OMNI_DATASTREAM_API_KEY" \
  "https://api.secapi.ai/v1/filings/0000320193-25-000079/similar?limit=5&form=10-K"

Use this when you have found one especially relevant filing and want to branch out to near neighbors with similar content.
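
For scripted use, the same calls can be made from Python's standard library. The sketch below mirrors the curl examples: it sends the key in the `x-api-key` header and reads it from the `OMNI_DATASTREAM_API_KEY` environment variable. The helper names are hypothetical, and it assumes the endpoint returns a JSON body; the response schema is not described on this page, so the parsed value is returned as-is.

```python
import json
import os
import urllib.request


def build_request(url, api_key=None):
    """Create a GET request carrying the x-api-key header (hypothetical helper)."""
    # Fall back to the environment variable used in the curl examples.
    key = api_key or os.environ["OMNI_DATASTREAM_API_KEY"]
    return urllib.request.Request(url, headers={"x-api-key": key}, method="GET")


def fetch_json(url, api_key=None, timeout=30):
    """Send the request and parse the body as JSON (assumed response format)."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as resp:
        return json.load(resp)


# Example (requires network access and a valid key):
# results = fetch_json(
#     "https://api.secapi.ai/v1/filings/0000320193-25-000079/similar?limit=5&form=10-K"
# )
```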

What powers this surface

Semantic search is backed by Voyage AI voyage-4-large embeddings, Pinecone vector retrieval, and Voyage rerank-2 reranking. voyage-4-large is the primary finance retrieval model for this surface. In hybrid mode, Datastream runs keyword retrieval and vector retrieval, merges the two candidate lists with Reciprocal Rank Fusion, and then reranks the merged list.
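
To make the merge step concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It is a generic illustration of the technique, not Datastream's implementation: the constant `k=60` is the value commonly used in the RRF literature, the service's actual constant is not documented here, and the filing identifiers are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids: each list adds 1 / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical candidate lists from the two retrievers, best match first.
keyword_hits = ["filing-a", "filing-b", "filing-c"]
vector_hits = ["filing-b", "filing-d", "filing-a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # ['filing-b', 'filing-a', 'filing-d', 'filing-c']
```

Items that appear in both lists accumulate score from each, which is why `filing-b` and `filing-a` rise above the items found by only one retriever. The fused list would then be passed to the reranker.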

Cost note

  • Pay As You Go price: $0.04 per call, metered under semantic_search
  • Plan discounts apply to the same meter family
  • The Voyage 4 models share one embedding space, which lets Datastream pair a higher-quality index path with a cheaper query path when cost efficiency matters
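
As a quick worked example of the Pay As You Go rate, the sketch below estimates spend for a batch of searches. It assumes each request counts as one call on the meter and ignores plan discounts, whose values are not listed here; the function name is hypothetical.

```python
from decimal import Decimal

# Pay As You Go rate quoted above, kept as a Decimal to avoid float rounding.
PAYG_PRICE_PER_CALL = Decimal("0.04")


def payg_cost(calls):
    """Estimated USD cost for a number of calls at the Pay As You Go rate."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return PAYG_PRICE_PER_CALL * calls


print(payg_cost(250))  # 10.00
```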

GET /v1/search/semantic

Full API reference for semantic and hybrid section search.

GET /v1/filings/{filingId}/similar

Find filings with semantically similar content to a source filing.

Plans and pricing

Review the semantic_search meter family and launch pricing.