Aether Docs

How to keep each of your customers' documents and memory separate on Aether — server-enforced isolation with partitions, hard isolation with workspaces, and tags for organizing the rest.

If you're building a product on Aether, your customers' data shares your account. Aether gives you three levels of separation, and picking the right one up front saves a painful migration later:

Partitions are a server-enforced boundary inside one workspace. You scope a client to a partition — one line, client.partition("<end-client-id>") — and every read and write on that handle is confined to it, enforced by the server before any search runs. Partitions need no setup (the first write creates one) and scale to as many end-clients as you have. This is the right tool for almost every multi-tenant app: one API key, one partition per end-client.
Workspaces are the strongest isolation: a fully separate store with its own API keys and its own usage tracking. Use one per organization when you want separate billing and separate keys; data never crosses a workspace.
Tags are labels for organizing documents within a partition (by project, kind, or status). They are not a tenant boundary — see Tags below.

Rule of thumb: a partition per end-client is the default. Reach for a workspace per customer organization only when you need separate keys or separate invoices; use tags to organize inside a partition.

Scope multi-tenant operations to a partition

If one API key serves more than one of your end-clients, every read and write must name a partition: go through client.partition(id), never the bare client. A key configured as multi-tenant enforces this: an unscoped call fails loudly with a clear error instead of silently spanning end-clients. When you (or your AI pair-programmer) write Aether code for a multi-tenant app, scope it to a partition by default.

Partition per end-client

A partition is a hard boundary the server applies before a search even looks at your vectors, so a scoped query can never see, rank, or leak another partition's documents. Because the filter runs before the search rather than after it, even a small partition keeps full search quality. You don't pre-declare partitions; the first scoped write creates one.

Scope a client to a partition once, then use it like any other client — there is no per-call partition argument to forget:

import os

from aether import AetherClient

client = AetherClient(api_key=os.environ["AETHER_API_KEY"])

# One line scopes everything below to this end-client.
acme = client.partition("client_acme")
acme.insert_text("Acme's onboarding notes…", filename="onboarding.txt")
hits = acme.retrieve("what's our refund window?", k=5)  # only ever Acme's docs

To serve a different end-client, scope a different handle (client.partition("client_globex")). Crossing into another partition always takes a separate, explicit handle, so you can't reach another end-client's data by forgetting an argument. A partition id is any string you choose, 1–256 characters; your own end-client id is the natural choice.
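Because a partition id is a free-form string and the first write creates the partition, it can help to validate ids in one place before they reach the client. The following is a minimal sketch, not part of the Aether SDK: partition_id_for is a hypothetical helper, and the whitespace check is an extra rule of our own on top of the documented 1–256 character limit.

```python
def partition_id_for(end_client_id: str) -> str:
    """Return a usable partition id for one end-client, or raise ValueError.

    Hypothetical helper: the only documented rule is a 1-256 character
    string; the whitespace check below is an additional local policy.
    """
    if not isinstance(end_client_id, str):
        raise ValueError("partition id must be a string")
    if end_client_id != end_client_id.strip():
        # "acme " and "acme" would silently become two different partitions.
        raise ValueError(f"partition id has stray whitespace: {end_client_id!r}")
    if not 1 <= len(end_client_id) <= 256:
        raise ValueError("partition id must be 1-256 characters")
    return end_client_id


# Intended use (assumes an existing client):
#   acme = client.partition(partition_id_for("client_acme"))
```

Funnelling every id through one function like this keeps a typo from quietly creating a second partition for the same end-client.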

Secure by default

Mark the API key multi-tenant (in the Dashboard when you mint it) and the server refuses any unscoped read or write — a missing partition is a loud 400, not a silent cross-client leak:

from aether import PartitionRequiredError

try:
    client.search("anything")          # unscoped, on a multi-tenant key
except PartitionRequiredError:
    pass                                # → use client.partition(id).search(...)

A single-tenant key (the default for hello-world and personal projects) treats an unscoped call as one default partition, so simple apps stay frictionless. Flip a key to multi-tenant the moment it serves more than one of your end-clients.

Managing partitions

Partitions are create-on-write, but you can still enumerate and tear them down:

# List every partition, with its document count and any likely-typo warnings.
listing = client.list_partitions()
for p in listing.partitions:
    print(p.id, p.document_count)
for w in listing.warnings:      # e.g. "Acme" vs "acme", or a one-document ghost
    print(w.detail)

# Offboard an end-client / honor a GDPR erasure — one call shreds the whole partition.
deleted = client.delete_partition("client_acme")

delete_partition is a hard delete: it shreds every document in the partition (nothing is left to restore) and is the cheap teardown for client offboarding. It's idempotent — deleting a partition that doesn't exist returns 0 and is never an error. The warnings on list_partitions flag the create-on-write footgun: a fat-fingered id ("acme" vs "acme ") silently splits an end-client's data, so near-duplicate ids and lone single-document partitions are surfaced for you to reconcile.
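If you want a similar check in your own tooling, near-duplicate ids can be grouped client-side. This is a rough sketch of one possible heuristic (ids that differ only by case or surrounding whitespace); it is an assumption of ours and not a description of how Aether computes its warnings.

```python
from collections import defaultdict


def near_duplicate_ids(partition_ids):
    """Group ids that differ only by letter case or surrounding whitespace."""
    groups = defaultdict(list)
    for pid in partition_ids:
        groups[pid.strip().casefold()].append(pid)
    # Keep only the groups where more than one spelling exists.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


print(near_duplicate_ids(["Acme", "acme", "acme ", "globex"]))
# [['Acme', 'acme', 'acme ']]
```

The input could come from the ids returned by list_partitions, for example [p.id for p in listing.partitions].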

Provable isolation

Isolation isn't just a claim you make to your customers — it's one you can verify. Any search can return a trace of which partition(s) it actually touched, and the SDKs ship a one-line self-test:

# Drop this into your own test suite — it proves a scoped search never leaks.
check = client.partition("client_acme").verify_isolation("refund window")
assert check.ok                 # nothing left client_acme
assert check.leaked == []       # offending partitions, if any

# Or read the raw trace.
traced = client.partition("client_acme").search_trace("refund window")
print(traced.trace.partitions_touched)   # → ["client_acme"]
print(traced.trace.boundary)             # → "partition"

The isolation guarantee

Aether guarantees that data written under a partition is only ever returned to that partition. A scoped query's candidate set is the partition's own documents and nothing else, enforced before ranking — so cross-partition content can't surface even when it's a closer semantic match.

The boundary you own is the request → tenant mapping: deciding which partition id a given end-user's request belongs to, and passing it. Aether keeps the data apart; your app decides whose request is whose.
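One way to keep that mapping safe is to derive the partition id only from the authenticated session, never from anything the end user sends. The sketch below assumes a hypothetical Session type and a hypothetical in-memory lookup table standing in for your own users table; neither is part of Aether.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical authenticated session produced by your own auth layer."""
    user_id: int


# Hypothetical stand-in for your users table: user id -> end-client id.
USER_TO_END_CLIENT = {42: "client_acme", 7: "client_globex"}


def partition_for(session: Session) -> str:
    """Resolve the partition id from the authenticated session only.

    The id is never read from a request body, header, or query string,
    because those are under the end user's control.
    """
    try:
        return USER_TO_END_CLIENT[session.user_id]
    except KeyError:
        raise PermissionError(f"user {session.user_id} has no end-client") from None


# Intended use (assumes an existing client):
#   scoped = client.partition(partition_for(session))
```

Failing closed with an error, rather than falling back to some default partition, means an unmapped user can never land in another end-client's data.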

Workspace per customer

This is the pattern to use when each customer organization needs its own API keys and its own billing on top of isolated data: each customer gets their own workspace, and your backend holds one API key per customer.

Create a workspace in the Aether Dashboard, named after the customer. One account can hold as many workspaces as you need, and you can invite teammates to a workspace by email with a role (owner, admin, or member).
Mint an API key inside that workspace. Owners and admins can create keys.
Store the key in your backend, keyed by your own customer ID — a database column, a secrets manager entry, whatever your stack already trusts. It's a customer secret; treat it like one.

From then on, scoping a request to a customer is just picking the right client:

from aether import AetherClient

_clients: dict[str, AetherClient] = {}

def client_for(customer_id: str) -> AetherClient:
    """One client per customer, authenticated with that customer's workspace key."""
    if customer_id not in _clients:
        api_key = load_workspace_key(customer_id)  # your key store
        _clients[customer_id] = AetherClient(api_key=api_key)
    return _clients[customer_id]

# Every call through this client lands in Acme's workspace — and only Acme's
client_for("acme").insert_text("Acme's onboarding notes…")

There is no tenant ID to pass on each request and no filter to forget: a request authenticated with Acme's key physically cannot read or write another workspace.

Rotation and revocation

Because each customer has their own key, key hygiene is per customer. To rotate: mint a new key in the customer's workspace, swap it into your key store, then revoke the old one in the Dashboard. To offboard a customer, revoke their workspace's keys — every other customer is untouched.
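If your backend caches one client per customer, as in the client_for example above, a cached client keeps using the old key after a rotation until it is rebuilt. Here is a minimal sketch of a cache that can forget a customer's client; ClientCache, load_key, and make_client are illustrative names of our own, not SDK features.

```python
from typing import Callable, Dict


class ClientCache:
    """Caches one client per customer and forgets it when the key rotates."""

    def __init__(self, load_key: Callable[[str], str],
                 make_client: Callable[[str], object]):
        self._load_key = load_key        # reads the key from your key store
        self._make_client = make_client  # builds a client from an API key
        self._clients: Dict[str, object] = {}

    def client_for(self, customer_id: str):
        if customer_id not in self._clients:
            api_key = self._load_key(customer_id)
            self._clients[customer_id] = self._make_client(api_key)
        return self._clients[customer_id]

    def key_rotated(self, customer_id: str) -> None:
        """Call after the new key is in your key store, before revoking the old one."""
        self._clients.pop(customer_id, None)


# Intended wiring (assumes the names from the example above):
#   cache = ClientCache(load_workspace_key, lambda key: AetherClient(api_key=key))
```

Evicting before you revoke the old key in the Dashboard avoids a window in which cached clients send requests with a dead key.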

Billing your customers

The Dashboard tracks usage per workspace, with daily history for storage, queries, and inserts. With a workspace per customer, that page is your metering: read off each customer's workspace to invoice them, spot heavy users, or set internal alerts. You don't have to count requests in your own code.

Workspaces are created in the Dashboard

There is no API to create workspaces programmatically today: a human creates each one in the Dashboard and mints its keys. That makes a workspace per organization a great fit for managed services, agencies, and B2B products where customers onboard at human speed. If your product signs up end-clients at machine speed (self-serve, thousands of users), don't create a workspace for each one; use one workspace with a partition per end-client. The boundary is still server-enforced, and partitions need no provisioning step.

Tags: organizing within a partition

Tags are labels for organizing documents inside a partition — by project, kind, or status — not a tenant boundary. Use a partition to keep end-clients apart (the server enforces it); use tags to slice one end-client's documents further. Every insert variant — insert, insert_text, insert_async, insert_stream, insert_with_embeddings, and batch insert — accepts a flat list of string tags, and search, retrieve, search_by_vector, and batch search accept a tag filter.

# Write: stamp every document with its owner
client.insert_text(
    "Q3 kickoff notes…",
    filename="kickoff.md",
    tags=["project:apollo", "user:42"],
)

# Read: every listed tag must match (AND)
results = client.retrieve(
    "what did we decide about launch dates?",
    k=10,
    tags=["project:apollo", "user:42"],
)

Use a key:value slug convention, such as project:apollo, user:42, or kind:memory. It keeps tags self-describing and composable: filter on ["user:42"] for everything a user owns, or ["user:42", "kind:memory"] for one kind of thing they own.
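A small builder can enforce the convention at the point where tags are created. This is a sketch under our own assumptions: tag is a hypothetical helper, and it rejects commas up front because tags travel comma-joined on the wire.

```python
def tag(key: str, value: object) -> str:
    """Build a key:value tag, rejecting input that would corrupt it."""
    text = f"{key}:{value}"
    if "," in text:
        # A comma would split this into two tags on the wire.
        raise ValueError(f"tags cannot contain commas: {text!r}")
    if not key or str(value) == "":
        raise ValueError("tag key and value must be non-empty")
    return text


tags = [tag("user", 42), tag("kind", "memory")]
print(tags)  # ['user:42', 'kind:memory']
```

The resulting list can be passed as the tags argument on insert and on retrieve, so both sides always agree on spelling.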

Know the caveats before you build on tags:

Filters are AND'd. A document must carry every tag in the filter to match. There is no OR — to search two projects, run two queries (or one batch search).
No commas in tag values. Tags travel comma-joined on the wire, so a comma inside a value will split it into two tags. The key:value slug convention sidesteps this.
Tags are write-only. They are not returned on document records or search results, so you cannot read back, enumerate, or audit a document's tags through the API. If you need that — and most multi-tenant apps do — keep your own doc_id → tags mapping in your database at insert time.
Narrow tags can under-fill k. Tag filters are applied to a candidate set retrieved from the whole store, not during the index search itself. When a tag matches only a small slice of your documents, a filtered query can return fewer than k results even though more matching documents exist. Request a larger k than you need when filtering on a narrow tag, then drop weak matches by score.
Tags are not a tenant boundary. Any key for the workspace can query with no tag filter and see every tag. To keep end-clients apart, scope to a partition — that boundary the server enforces. Tags organize within a partition; they don't replace it.
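Two of these caveats, AND-only filters and under-filled k, can be handled together by running one query per tag set and merging the results. The sketch below is one possible approach, not an SDK feature: Hit is a hypothetical result shape (a doc_id field on search results is an assumption), and run_query is whatever callable wraps your own retrieve call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Sequence


@dataclass
class Hit:
    """Hypothetical result shape; doc_id on a search result is an assumption."""
    doc_id: str
    content: str
    score: float


def retrieve_any(
    run_query: Callable[[Sequence[str]], Iterable[Hit]],
    tag_sets: Sequence[Sequence[str]],
    k: int,
    min_score: float,
) -> List[Hit]:
    """Emulate OR: one AND-filtered query per tag set, merged by doc_id."""
    best: Dict[str, Hit] = {}
    for tags in tag_sets:
        for hit in run_query(tags):
            if hit.score < min_score:
                continue  # drop weak matches pulled in by over-requesting
            current = best.get(hit.doc_id)
            if current is None or hit.score > current.score:
                best[hit.doc_id] = hit
    # Highest score first, trimmed to the k the caller actually wants.
    return sorted(best.values(), key=lambda h: h.score, reverse=True)[:k]


# Intended wiring (assumes an existing client and question):
#   run_query = lambda tags: client.retrieve(question, k=3 * k, tags=list(tags))
```

Over-requesting inside run_query (3 × k here is an arbitrary choice) and trimming after the merge keeps the final list full even when one tag matches only a few documents.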

Per-user agent memory

A common shape for the tags pattern: an agent or chatbot that remembers facts about each end user. One workspace holds all users' memories; every memory carries user:<id> and kind:memory tags.

Store a fact when the conversation surfaces one, and keep the returned doc_id in your own database — remember, tags can't be read back, so this mapping is how you'll find the fact again to update or delete it:

record = client.insert_text(
    "Prefers answers in French. Works in Lyon, UTC+1.",
    filename="memory.txt",
    tags=["user:42", "kind:memory"],
)
save_memory_ref(user_id=42, doc_id=record.doc_id)  # your table

Recall at session start by retrieving against the user's first message, filtered to their memories. Over-request k (narrow tags can under-fill) and filter by score so an off-topic session doesn't drag in unrelated facts:

results = client.retrieve(
    first_message,
    k=15,
    tags=["user:42", "kind:memory"],
)
memories = [r.content for r in results if r.score >= 60]

Evolve a fact instead of accumulating contradictions: keep the doc_id and call update, which re-embeds the new content under the same ID. update replaces the document's tags, so always re-send them:

from pathlib import Path

# Python's update() takes a file path — write the revised fact first
revised = Path("memory.txt")
revised.write_text("Prefers answers in French. Moved to Montreal, UTC-5.")

client.update(
    doc_id,
    revised,
    tags=["user:42", "kind:memory"],  # update replaces tags — re-send them
)

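Clean up when a user deletes their account or asks to be forgotten: walk your doc_id mapping and delete each memory. Note that delete is a soft delete that restore can undo, unlike delete_partition, which shreds data permanently: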

client.delete(doc_id)  # soft delete — restore(doc_id) undoes it

There is no TTL

Documents never expire on their own — there's no TTL or automatic cleanup. If memories should age out (say, anything untouched for 90 days), that's your job: track timestamps alongside your doc_id mapping and run your own sweep that calls delete.
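A sweep like that can be a small function run on a schedule. The following is a sketch under stated assumptions: last_touched is your own doc_id to timestamp mapping (timezone-aware datetimes), and delete is any callable that removes one document, such as the client's delete method.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List, Optional


def sweep_expired(
    last_touched: Dict[str, datetime],
    delete: Callable[[str], object],
    max_age: timedelta = timedelta(days=90),
    now: Optional[datetime] = None,
) -> List[str]:
    """Delete every doc_id not touched within max_age; return the deleted ids."""
    now = now or datetime.now(timezone.utc)
    expired = [doc_id for doc_id, ts in last_touched.items() if now - ts > max_age]
    for doc_id in expired:
        delete(doc_id)            # remove the document from the store
        del last_touched[doc_id]  # and forget it in your own mapping
    return expired


# Intended use (assumes an existing client and mapping):
#   sweep_expired(memory_timestamps, client.delete)
```

Passing now explicitly makes the sweep easy to test, and returning the deleted ids gives you something to log.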

What this doesn't do

Honest limits to design around:

You own the request → partition mapping. Aether keeps each partition's data apart and proves it; deciding which partition a given request belongs to (and passing that id) is your app's job. Get the mapping right and the boundary holds.
No cross-partition search. That's the point of a partition, not a gap — but a "search across all my end-clients" admin view can't be one scoped query. Fan out per partition and merge, or run it unscoped under a single-tenant/admin key.
Usage is per-workspace, not per-partition. The Dashboard meters the whole workspace; it can't split usage across the partitions inside it. If you need per-end-client billing, count inserts and queries in your own application, keyed by partition id.
No workspace-provisioning API. Workspaces and their keys are created by a human in the Dashboard. Partitions, by contrast, need no provisioning — the first scoped write creates one.
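For the per-partition usage gap, counting in your own application can start as simply as the sketch below. PartitionMeter is a hypothetical class of our own; a real deployment would presumably persist the counts in a database rather than in memory.

```python
from collections import Counter


class PartitionMeter:
    """Counts inserts and queries per partition id in your own application."""

    def __init__(self):
        self._counts = Counter()

    def record(self, partition_id: str, operation: str) -> None:
        if operation not in ("insert", "query"):
            raise ValueError(f"unknown operation: {operation!r}")
        self._counts[(partition_id, operation)] += 1

    def usage(self, partition_id: str) -> dict:
        # Counter returns 0 for keys it has never seen.
        return {
            "inserts": self._counts[(partition_id, "insert")],
            "queries": self._counts[(partition_id, "query")],
        }


# Intended use: call meter.record("client_acme", "query") next to each
# scoped retrieve, and meter.record("client_acme", "insert") next to each insert.
```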

Next steps

Authentication — API keys, environment variables, client setup
Tuning retrieval — picking k and score thresholds
Documents API — full reference for insert, update, and delete
Limits & quotas — per-workspace limits to plan around