Search & Retrieval API
Find relevant documents through the SDK search methods. REST details are included as a contract reference for debugging and advanced integrations.
SDK methods
Most applications should call the SDK. The SDK handles authentication, URL construction, retries, and response parsing.
| Operation | Python | TypeScript | .NET | Go |
|---|---|---|---|---|
| Search for metadata and passages | search | search | SearchAsync | Search |
| Retrieve full content for RAG | retrieve | retrieve | RetrieveAsync | Retrieve |
| Search with your own query vector | search_by_vector | searchByVector | SearchByVectorAsync | SearchByVector |
| Run multiple searches in one call | batch_search | batchSearch | BatchSearchAsync | BatchSearch |
Search hits expose score: a calibrated relevance integer from 0 to 100, where higher is better. Treat any cutoff as application-specific and validate it against your own documents.
Search
Use search when you need ranked document IDs, titles, content types, and matched passages without downloading full document content.
results = client.search("deployment best practices", k=10)
for res in results:
    print(res.doc_id, res.score, res.passage)
Each SearchResult has this shape:
interface SearchResult {
  doc_id: string;
  score: number; // 0-100, higher is more relevant
  title?: string;
  content_type: string;
  content?: string; // present only when inline content is requested
  passage?: string; // matched chunk when available
}
Parameters
| Parameter | Type | Default | Applies to | Notes |
|---|---|---|---|---|
| query / q | string | required | search, retrieve, batch query | Natural-language query text. |
| k | int | 10 for search, 5 for retrieve | all search methods | Upper bound on returned results. |
| tags | string[] | none | search, retrieve, search_by_vector | AND filter: every listed tag must be present. |
| include_content / includeContent | bool | false | search, search_by_vector, batch query | Requests full document content inline. retrieve sets this for you. |
| embedding | float[] | required | search_by_vector | Pre-computed query vector for BYOE workflows. |
Retrieve for RAG
Use retrieve when you want text to pass to an LLM. It performs search, asks for inline content when the server supports it, deduplicates by doc_id, and falls back to downloading document text if needed.
results = client.retrieve("deployment best practices", k=5)
for res in results:
    print(res.doc_id, res.title, res.score)
    print(res.content[:200])
RetrievalResult is a SearchResult with content guaranteed:
interface RetrievalResult extends SearchResult {
  content: string;
}
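As an illustration of passing retrieved text to an LLM, the sketch below joins results into a single context string under a character budget. The `Doc` dataclass is a stand-in for the SDK result type, and `build_context` and `max_chars` are hypothetical names, not part of the SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    """Stand-in carrying the RetrievalResult fields used below."""
    doc_id: str
    score: int
    content: str
    title: Optional[str] = None


def build_context(results, max_chars=4000):
    """Join retrieved documents into one prompt section.

    The budget counts document blocks only, not the blank-line separators.
    """
    parts, used = [], 0
    for r in results:
        block = f"[{r.doc_id}] {r.title or 'Untitled'}\n{r.content.strip()}"
        if used + len(block) > max_chars:
            break  # stop at the first document that would exceed the budget
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)
```

In an application you would pass the list returned by `client.retrieve(...)` instead of `Doc` instances. Keeping the `doc_id` in each block makes it easier to cite sources in the model's answer.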
Filtering
Partition scoping
For multi-tenant apps, scope a search to a single end-client with a partition. Unlike tags (a post-filter), a partition is a hard boundary that the server applies before the search runs. A scoped query never considers another partition's documents, and a partition that holds only a small share of your documents still gets full recall. Scope a client once and search through it; there is no per-call partition argument:
acme = client.partition("client_acme")
results = acme.retrieve("billing preferences", k=10) # only ever Acme's docs
To prove that a scoped search stays in its partition, use search_trace / searchTrace, which returns the partitions a query touched, or the one-line verify_isolation / verifyIsolation self-test. See Provable isolation.
Tags
Tags are the supported post-filter today. Pass tags when inserting documents, then pass the same tags to search or retrieve. A result must match every requested tag.
client.insert_text(
    "Acme prefers invoices in EUR, billed quarterly.",
    filename="acme-billing.txt",
    tags=["customer:acme", "kind:memory"],
)
results = client.retrieve(
    "billing preferences",
    k=10,
    tags=["customer:acme", "kind:memory"],
)
There is no rich metadata query DSL yet: no OR groups, range operators, nested predicates, or arbitrary JSON metadata filters. Use stable tag strings such as customer:acme, user:42, and kind:policy, and keep your own metadata table when you need to audit or enumerate tags.
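One way to keep tag strings stable is to build them through a small helper rather than by hand. The sketch below is a local convention, not an SDK feature: `make_tag` and `tags_for` are hypothetical names, and the comma check reflects the REST rule that tag values must not contain commas.

```python
def make_tag(key: str, value: object) -> str:
    """Build a 'key:value' tag string such as 'customer:acme'."""
    if not key or str(value) == "":
        # Local convention (an assumption): reject empty parts early.
        raise ValueError("tag key and value must be non-empty")
    text = f"{key}:{value}"
    # Tags are sent to GET /search as a comma-separated string, so a comma
    # inside a tag would be read as a separator.
    if "," in text:
        raise ValueError(f"tag must not contain a comma: {text!r}")
    return text


def tags_for(customer: str, kind: str) -> list[str]:
    """Return the tag pair used for both inserts and searches."""
    return [make_tag("customer", customer), make_tag("kind", kind)]
```

Using the same helper at insert time and at query time avoids mismatches such as `customer:Acme` versus `customer:acme`, which an AND filter would treat as different tags.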
When a tag matches only a small slice of your documents, a filtered search can return fewer than k results even though more matching documents exist. Request a larger k and filter weak matches by score in your application:
results = client.retrieve("billing preferences", k=10, tags=["customer:acme"])
strong = [r for r in results if r.score >= 60]
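The over-retrieval pattern can be wrapped in a helper, sketched here under assumptions. `retrieve_filtered`, `strongest`, `overfetch`, and `min_score` are hypothetical names. The cutoff of 60 is only the example value used above, and the `Hit` dataclass is a stand-in for the SDK result type.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Stand-in for an SDK result; only the fields used here."""
    doc_id: str
    score: int


def strongest(results, k, min_score=60):
    """Keep results at or above min_score, best first, capped at k."""
    kept = [r for r in results if r.score >= min_score]
    kept.sort(key=lambda r: r.score, reverse=True)
    return kept[:k]


def retrieve_filtered(client, query, tags, k=10, overfetch=4, min_score=60):
    """Request k * overfetch candidates, then trim to k on the client."""
    # Assumption: the server accepts the larger k; lower overfetch if not.
    results = client.retrieve(query, k=k * overfetch, tags=tags)
    return strongest(results, k, min_score)
```

Validate `min_score` and `overfetch` against your own documents, since a useful threshold depends on the corpus and the queries.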
The hosted REST API also accepts a max_distance parameter for advanced distance-threshold filtering before scores are returned. Prefer SDK-level over-retrieval plus client-side score filtering unless you are deliberately working at the REST contract layer.
Batch search
Use batch search when you have several independent queries and want one network round trip.
from aether import BatchSearchQuery
responses = client.batch_search([
    BatchSearchQuery(q="deployment", k=3),
    BatchSearchQuery(q="billing preferences", k=3),
])
for response in responses:
    print(response.query, [hit.doc_id for hit in response.results])
Batch responses are returned in the same order as the input queries. For filtered searches, prefer search or retrieve with tags until the batch tag encoding is aligned across the SDK models and the REST handler.
Search by vector
Use search_by_vector / searchByVector / SearchByVectorAsync / SearchByVector when you generate the query embedding yourself.
results = client.search_by_vector([0.1, 0.2, 0.3, ...], k=5)
for res in results:
    print(res.doc_id, res.score)
Your vector length must match the active embedding index. The default hosted configuration uses minilm-l6-v2; the node detects the model output dimension and defaults to 384 dimensions for the MiniLM path. A mismatched vector returns 400 Bad Request.
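To catch a mismatch before it becomes a 400 response, you can check the vector locally. This is a minimal sketch: `validate_embedding` is a hypothetical helper, 384 is the MiniLM default mentioned above, and the finite-value check is an extra precaution rather than a documented server rule.

```python
import math

EXPECTED_DIM = 384  # MiniLM default per the text; change it for other indexes


def validate_embedding(vector, expected_dim=EXPECTED_DIM):
    """Return the vector as a list of floats, or raise ValueError."""
    values = [float(x) for x in vector]
    if len(values) != expected_dim:
        raise ValueError(
            f"embedding has {len(values)} dimensions, expected {expected_dim}"
        )
    # Assumption: NaN and infinity are never valid query components.
    if not all(math.isfinite(x) for x in values):
        raise ValueError("embedding contains NaN or infinite values")
    return values
```

Call it just before `client.search_by_vector(...)` so that dimension bugs surface in your own code with a clear message.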
REST contract
The SDKs call these routes internally. Use them directly only for debugging, custom clients, or advanced integrations that cannot use an SDK.
| Method | Path | Purpose |
|---|---|---|
| GET | /search | Search by natural-language query. |
| POST | /search/embed | Search by caller-provided embedding vector. |
| POST | /search/batch | Run multiple natural-language searches. |
GET /search
| Query parameter | Type | Required | Notes |
|---|---|---|---|
| q | string | yes | Natural-language query. |
| k | int | no | Defaults to 10. |
| include_content | bool | no | Adds content to each result when possible. |
| tags | comma-separated string | no | AND filter. Tag values must not contain commas. |
| max_distance | float | no | Advanced distance threshold. Results outside the threshold are dropped before the response is scored. |
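For a custom client, the query string for the parameters above might be assembled as follows. This is a sketch under assumptions: the host in `BASE_URL` is a placeholder, `build_search_url` is a hypothetical helper, and encoding booleans as `true` / `false` is a guess that should be checked against the server.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # placeholder host, not the real endpoint


def build_search_url(q, k=None, include_content=None, tags=None,
                     max_distance=None, base_url=BASE_URL):
    """Build a GET /search URL, omitting parameters left at None."""
    params = {"q": q}
    if k is not None:
        params["k"] = k
    if include_content is not None:
        # Assumption: the server accepts lowercase 'true' / 'false'.
        params["include_content"] = "true" if include_content else "false"
    if tags:
        for tag in tags:
            if "," in tag:
                raise ValueError(f"tag must not contain a comma: {tag!r}")
        params["tags"] = ",".join(tags)  # AND filter, comma-separated
    if max_distance is not None:
        params["max_distance"] = max_distance
    return f"{base_url}/search?{urlencode(params)}"
```

`urlencode` percent-encodes the colons and commas in tag values, so `customer:acme` is sent as `customer%3Aacme`.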
POST /search/embed
{
  "embedding": [0.1, 0.2, 0.3],
  "k": 5,
  "include_content": false,
  "tags": ["customer:acme"],
  "max_distance": 0.4
}
POST /search/batch
{
  "queries": [
    {
      "q": "deployment",
      "k": 3,
      "include_content": false,
      "tags": "customer:acme",
      "max_distance": 0.4
    }
  ]
}
In the current REST handler, batch-query tags are a comma-separated string. The SDK batch models expose tag arrays, so avoid filtered batch search until that contract is reconciled; use individual SDK search / retrieve calls for filtered queries.
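Until the batch tag encoding is reconciled, a filtered multi-query workflow can fall back to one SDK call per query. The helper below is a minimal sketch: `search_each` is a hypothetical name, and it assumes only the `search(query, k=..., tags=...)` call shown earlier.

```python
def search_each(client, queries, tags, k=3):
    """Run one filtered search per query and keep the input order."""
    results = []
    for query in queries:
        # One SDK call per query, so tags travel as a list rather than
        # through the batch encoding.
        hits = client.search(query, k=k, tags=tags)
        results.append((query, hits))
    return results
```

This trades the single round trip of batch search for one request per query, which is usually acceptable for a handful of queries.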
Response shape
{
  "query": "deployment",
  "results": [
    {
      "doc_id": "doc_123",
      "score": 87,
      "title": "Production setup",
      "content_type": "text/plain",
      "passage": "Deploy from a protected branch...",
      "content": "Deploy from a protected branch..."
    }
  ]
}
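A custom client could map this response onto typed objects roughly as follows. The dataclass mirrors the SearchResult shape described earlier, and `parse_search_response` is a hypothetical helper, not an SDK function.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchResult:
    doc_id: str
    score: int
    content_type: str
    title: Optional[str] = None
    passage: Optional[str] = None
    content: Optional[str] = None


def parse_search_response(body: str):
    """Parse a search response body into a list of SearchResult."""
    payload = json.loads(body)
    return [
        SearchResult(
            doc_id=item["doc_id"],
            score=item["score"],
            content_type=item["content_type"],
            # Optional fields may be absent, so read them with .get().
            title=item.get("title"),
            passage=item.get("passage"),
            content=item.get("content"),
        )
        for item in payload.get("results", [])
    ]
```

Required fields are read with indexing so that a malformed response fails with a `KeyError` instead of producing half-empty results.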
Batch search wraps one response per query:
{
  "results": [
    {
      "query": "deployment",
      "results": []
    }
  ]
}
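Because batch responses come back in input order, a custom client can pair them with the original queries by position. The sketch below uses the hypothetical helper `pair_batch_results`. It assumes the `query` field echoes the input text verbatim and uses it only as a sanity check.

```python
import json


def pair_batch_results(queries, body):
    """Pair each input query with the doc IDs from its batch response."""
    responses = json.loads(body)["results"]
    if len(responses) != len(queries):
        raise ValueError(
            f"expected {len(queries)} responses, got {len(responses)}"
        )
    paired = []
    for query, response in zip(queries, responses):
        # Assumption: the server echoes the query text unchanged.
        if response.get("query") != query:
            raise ValueError(f"response order mismatch for {query!r}")
        paired.append((query, [hit["doc_id"] for hit in response["results"]]))
    return paired
```

If the server normalizes query text before echoing it, drop the equality check and rely on position alone.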
Errors
Search endpoints use the shared API error shape:
{
  "error": "Embedding dimension mismatch: got 3, expected 384",
  "code": null,
  "request_id": "req_..."
}
Common statuses are 400 for invalid input, 401 for missing or invalid authentication, 402 for plan limits, 429 for rate limits, and 500 / 503 for transient server errors. See Errors for retry guidance.
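When working at the REST layer, it can help to normalize error responses into one structure. In this sketch, `describe_error` is a hypothetical helper. Marking 500 and 503 as retryable follows the "transient" wording above, while treating 429 as retryable is an assumption to confirm against the Errors page.

```python
import json

TRANSIENT_STATUSES = {500, 503}  # described above as transient server errors
RATE_LIMITED = 429               # assumption: retry after backing off


def describe_error(status: int, body: str) -> dict:
    """Summarize an error response using the shared error shape."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}  # body was not JSON, for example a proxy error page
    if not isinstance(payload, dict):
        payload = {}
    return {
        "status": status,
        "message": payload.get("error") or f"HTTP {status}",
        "code": payload.get("code"),
        "request_id": payload.get("request_id"),
        "retryable": status in TRANSIENT_STATUSES or status == RATE_LIMITED,
    }
```

Logging the `request_id` alongside the message gives support a way to locate the failing request.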