Quickstart: 5-minute RAG
Go from zero to a working RAG pipeline in under five minutes.
This guide walks you through installing the SDK, inserting documents, retrieving relevant passages, and connecting them to an LLM for grounded answers.
1. Install the SDK
pip install aether-ai
Get an API key from the Aether Dashboard and make it available to your app — for example as an environment variable (a .env file or your secrets manager works too):
export AETHER_API_KEY="your-api-key"
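If you keep the key in a .env file instead, one way to read it is sketched below. The helper name load_api_key and the fallback order (environment first, then the file) are assumptions made for illustration and are not part of the Aether SDK. The parser handles only simple KEY=value lines.

```python
import os


def load_api_key(env_file=".env", var="AETHER_API_KEY"):
    """Return the key from the environment, falling back to a simple .env file.

    Hypothetical helper for illustration; not part of the Aether SDK.
    """
    key = os.environ.get(var)
    if key:
        return key
    # Fallback: scan KEY=value lines (no interpolation, no multiline values).
    if os.path.exists(env_file):
        with open(env_file, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                name, _, value = line.partition("=")
                name = name.strip()
                if name.startswith("export "):
                    name = name[len("export "):].strip()
                if name == var:
                    # Drop surrounding whitespace and quotes around the value.
                    return value.strip().strip("\"'")
    raise RuntimeError(f"{var} is not set; export it or add it to {env_file}")
```

You could then pass load_api_key() as the api_key argument when constructing the client in the next step. Failing early with a clear message tends to be easier to debug than a bare KeyError.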
2. Insert documents
import os
from aether import AetherClient
client = AetherClient(api_key=os.environ["AETHER_API_KEY"])
# Insert a few knowledge base articles
client.insert_text("Employees accrue 20 days of PTO per year. Unused days roll over up to a maximum of 10.")
client.insert_text("The company matches 401(k) contributions up to 4% of base salary.")
client.insert_text("Remote work is available three days per week. Fridays are designated in-office collaboration days.")
client.insert_text("Health insurance covers the employee and dependents. Dental and vision are included at no extra cost.")
client.insert_text("The annual performance review cycle runs from January to March, with mid-year check-ins in July.")
print("Inserted 5 documents")
3. Retrieve relevant passages
retrieve() is the primary RAG method: it runs semantic search and returns the matching passages — content included — in a single call.
results = client.retrieve("How many vacation days do I get?", k=3)
print(f"Found {len(results)} passages")
for r in results:
    print(f" - {r.content}")
Aether returns the most semantically similar passages first. For most RAG use cases this is all you need: pass r.content straight to your LLM. To inspect ranking scores, filter out weak matches, or use the lower-level search() primitive, see Tuning retrieval.
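As a rough illustration of filtering weak matches, the sketch below drops low-scoring passages before they reach the LLM. It assumes each result exposes a numeric score attribute where higher means more similar. That attribute name, the 0.5 threshold, and the sample values are assumptions, so check Tuning retrieval for the actual fields. The stand-in objects only mimic the shape of a result.

```python
from types import SimpleNamespace


def filter_weak_matches(results, min_score=0.5):
    """Keep passages scoring at least min_score (assumes higher = better).

    `score` is an assumed attribute name; results without it are dropped.
    """
    return [
        r for r in results
        if getattr(r, "score", None) is not None and r.score >= min_score
    ]


# Stand-in objects with made-up scores, mimicking what retrieve() might return.
sample = [
    SimpleNamespace(content="Employees accrue 20 days of PTO per year.", score=0.82),
    SimpleNamespace(content="The company matches 401(k) contributions.", score=0.31),
]

strong = filter_weak_matches(sample, min_score=0.5)
for r in strong:
    print(f"{r.score:.2f}  {r.content}")
```

A sensible threshold depends on your data, so it is worth printing scores for a handful of real queries before fixing a value.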
4. Connect to an LLM
Pass the retrieved passages to your LLM as context. Here's an example with Anthropic's Claude:
import anthropic
results = client.retrieve("How many vacation days do I get?", k=3)
context = "\n\n".join(f"[Source {i+1}]\n{r.content}" for i, r in enumerate(results))
claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
message = claude.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=300,
    system=f"Answer the user's question using only the following context. Cite the source number.\n\n{context}",
    messages=[{"role": "user", "content": "How many vacation days do I get?"}],
)
print(message.content[0].text)
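If retrieval returns nothing, the prompt above ends with an empty context and the model has nothing to ground its answer on. A minimal sketch of one way to guard against that is shown below. The helper name build_system_prompt and the fallback message are assumptions for illustration, and the stand-in objects only mimic results that carry a content attribute.

```python
from types import SimpleNamespace

# Hypothetical fallback text; adjust the wording for your application.
NO_CONTEXT_REPLY = "I couldn't find anything relevant in the knowledge base."


def build_system_prompt(results):
    """Return a system prompt with numbered sources, or None if nothing was retrieved."""
    if not results:
        return None
    context = "\n\n".join(
        f"[Source {i + 1}]\n{r.content}" for i, r in enumerate(results)
    )
    return (
        "Answer the user's question using only the following context. "
        "Cite the source number.\n\n" + context
    )


# Stand-in for retrieve() output, so the sketch runs without network access.
demo = [SimpleNamespace(content="Employees accrue 20 days of PTO per year.")]

prompt = build_system_prompt(demo)
print(prompt if prompt is not None else NO_CONTEXT_REPLY)
```

When the function returns None, you can skip the LLM call and show the fallback message directly, which also saves a request.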
Other LLMs
The same pattern works with any LLM provider. See the Integrations section for examples with OpenAI, Azure, Vercel AI SDK, and xAI Grok.
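One way to keep the pipeline independent of the provider is to pass the LLM call in as a function. The sketch below is an illustration under stated assumptions: the names answer and echo_provider and the (system, user) callable signature are hypothetical and are not part of the Aether SDK or any provider SDK.

```python
from typing import Callable, Sequence


def answer(
    question: str,
    passages: Sequence[str],
    complete: Callable[[str, str], str],
) -> str:
    """Build a grounded prompt and hand it to any provider via `complete`.

    `complete(system, user)` is an assumed signature: it takes the system
    prompt and the user message and returns the model's text reply.
    """
    context = "\n\n".join(
        f"[Source {i + 1}]\n{p}" for i, p in enumerate(passages)
    )
    system = (
        "Answer the user's question using only the following context. "
        "Cite the source number.\n\n" + context
    )
    return complete(system, question)


def echo_provider(system: str, user: str) -> str:
    """Stand-in for a real provider call; reports what it received."""
    return f"{user} | sources: {system.count('[Source ')}"


reply = answer(
    "How many vacation days do I get?",
    ["Employees accrue 20 days of PTO per year.", "Remote work is available three days per week."],
    echo_provider,
)
print(reply)
```

For Claude, complete would wrap the claude.messages.create(...) call from step 4 and return message.content[0].text. For other providers, it would wrap the equivalent call shown in the Integrations section.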
Next steps
- Tuning retrieval: pick k, interpret score, filter weak matches
- Authentication: manage API keys
- API Reference: full endpoint documentation
- Integrations: connect to your preferred LLM provider