Vercel AI SDK
Build streaming chat and agentic RAG in Next.js by pairing Aether's vector search with the Vercel AI SDK.
The Vercel AI SDK (v5) gives you a unified interface over language model providers, streaming UI primitives for React, and a tool-calling loop for agents. Aether handles document storage and semantic retrieval. Together they cover the full stack of a knowledge-grounded chat app: ingest documents into Aether, retrieve relevant passages per request, and stream a grounded answer to the browser.
This guide builds up in four steps: a one-shot generateText call, a streaming chat route with useChat, an agent that decides when to search, and per-user scoping with tags.
Install and setup
Install the AI SDK core, a provider package, the React hooks, the Aether SDK, and Zod (used for tool schemas later):
npm install ai @ai-sdk/anthropic @ai-sdk/react @aether-ai/sdk zod
The examples use Anthropic Claude, but every snippet works with any AI SDK provider — swap @ai-sdk/anthropic for @ai-sdk/openai, @ai-sdk/google, and so on, and change the model line.
Set two environment variables (in .env.local for local Next.js development):
- AETHER_API_KEY: your Aether API key. The AetherClient constructor reads it automatically, so you never have to pass it in code.
- ANTHROPIC_API_KEY: your provider key. The anthropic() model factory reads it the same way.
Server-side only
Both keys are secrets. Create the AetherClient and call the AI SDK only in route handlers, Server Components, or server actions — never in client components.
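If you would rather fail at startup than on the first request, a small check along these lines can help. This is only a sketch: missingEnv and assertEnv are hypothetical helpers, not part of the Aether SDK or the AI SDK, and the environment is passed in as a plain object so the functions stay easy to test.

```typescript
// Hypothetical helpers (not part of either SDK): report or reject missing secrets.
type Env = Record<string, string | undefined>;

// Returns the names that are unset or blank in the given environment.
function missingEnv(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws one error listing every missing name, so a misconfigured
// deployment fails loudly instead of on the first chat request.
function assertEnv(env: Env, required: string[]): void {
  const missing = missingEnv(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}

// Usage in a server-only module (assumes a Node.js runtime):
// assertEnv(process.env, ["AETHER_API_KEY", "ANTHROPIC_API_KEY"]);
```

Call it only from server code, for the same reason the clients themselves stay on the server.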
Quick RAG with generateText
The simplest pattern: retrieve relevant documents, format them as context, and pass them to the model in the system prompt. The formatContext() helper ships with the Aether SDK and turns retrieve() results into a numbered, LLM-ready context block so you don't have to hand-roll the string assembly.
import { AetherClient, formatContext } from "@aether-ai/sdk";
import { anthropic } from "@ai-sdk/anthropic";
import { generateText } from "ai";

const aether = new AetherClient(); // reads AETHER_API_KEY from the environment
const question = "What's the company match for 401k?";

// Find the most relevant documents and build a context block
const results = await aether.retrieve(question, 3);
const context = formatContext(results);

const { text } = await generateText({
  model: anthropic("claude-sonnet-4-20250514"),
  system: `Answer using only this context:\n\n${context}`,
  prompt: question,
});

console.log(text);
By default formatContext() renders each result as [Source N] followed by the matched passage. Pass a template to change the shape — placeholders include {i}, {title}, {doc_id}, {text}, and {score}:
const context = formatContext(results, {
  template: "[{title}] (score {score})\n{text}",
});
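To make the placeholder behavior concrete, here is a rough sketch of how this kind of substitution could work. It is not the SDK's implementation: RetrievedPassage, renderTemplate, and renderContext are hypothetical names, the result fields are inferred from the placeholder list, and the 1-based {i} and the blank line between entries are assumptions.

```typescript
// Hypothetical result shape; the real SDK's result type may differ.
interface RetrievedPassage {
  doc_id: string;
  title: string;
  text: string;
  score: number;
}

// Replace each known {placeholder} with the matching field.
// Unknown placeholders are left in the output untouched.
function renderTemplate(
  template: string,
  result: RetrievedPassage,
  index: number,
): string {
  const values = new Map<string, string>([
    ["i", String(index)],
    ["title", result.title],
    ["doc_id", result.doc_id],
    ["text", result.text],
    ["score", String(result.score)],
  ]);
  return template.replace(
    /\{(\w+)\}/g,
    (match: string, key: string) => values.get(key) ?? match,
  );
}

// Render every result (numbered from 1, an assumption) and join
// the entries with a blank line (also an assumption).
function renderContext(results: RetrievedPassage[], template: string): string {
  return results
    .map((result, idx) => renderTemplate(template, result, idx + 1))
    .join("\n\n");
}
```

Reading a template as "one entry per result" makes it easier to judge how much of the model's context window a given shape will consume.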
This assumes you've already inserted documents — see the Documents API if your store is empty.
Streaming chat route
For a chat UI you want streaming. The server side is a Next.js route handler at app/api/chat/route.ts: take the conversation, retrieve context for the latest user message, and return a UI message stream.
// app/api/chat/route.ts
import { AetherClient, formatContext } from "@aether-ai/sdk";
import { anthropic } from "@ai-sdk/anthropic";
import { convertToModelMessages, streamText, type UIMessage } from "ai";

const aether = new AetherClient();

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  // Use the latest user message as the retrieval query
  const lastMessage = messages[messages.length - 1];
  const query = lastMessage.parts
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("\n");

  const results = await aether.retrieve(query, 5);
  const context = formatContext(results);

  const result = streamText({
    model: anthropic("claude-sonnet-4-20250514"),
    system: `You are a helpful assistant. Ground your answers in this context:\n\n${context}`,
    messages: convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
}
Two AI SDK v5 details worth noting: UI messages and model messages are separate types, so the conversation from the browser passes through convertToModelMessages() before reaching the model, and toUIMessageStreamResponse() streams the reply back in the format the useChat hook expects.
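The route above treats the final array element as the user's question, which holds for a typical useChat request. If your client can ever send a conversation that ends on another role, a more defensive variant might scan backward for the last user message. The sketch below uses minimal stand-in types (SketchPart and SketchMessage are hypothetical, not the AI SDK's UIMessage) so it runs on its own.

```typescript
// Minimal stand-ins for the message shape, just enough for this sketch.
interface SketchPart {
  type: string;
  text?: string;
}

interface SketchMessage {
  role: "system" | "user" | "assistant";
  parts: SketchPart[];
}

// Walk backward to the most recent user message and join its text parts.
// Returns an empty string when the conversation has no user message.
function latestUserText(messages: SketchMessage[]): string {
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    if (message.role !== "user") continue;
    return message.parts
      .filter((part) => part.type === "text" && typeof part.text === "string")
      .map((part) => part.text as string)
      .join("\n");
  }
  return "";
}
```

An empty result is a reasonable signal to skip retrieval for that request rather than search with a blank query.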
On the client, useChat from @ai-sdk/react manages the conversation and posts to /api/chat by default. Message content lives in a parts array:
// app/page.tsx
"use client";

import { useChat } from "@ai-sdk/react";
import { useState } from "react";

export default function Chat() {
  const { messages, sendMessage, status } = useChat();
  const [input, setInput] = useState("");

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>
          <strong>{message.role === "user" ? "You: " : "AI: "}</strong>
          {message.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null,
          )}
        </div>
      ))}
      <form
        onSubmit={(e) => {
          e.preventDefault();
          if (!input.trim()) return;
          sendMessage({ text: input });
          setInput("");
        }}
      >
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask a question..."
          disabled={status !== "ready"}
        />
      </form>
    </div>
  );
}
This route retrieves on every message, whether or not the question needs it. That's fine for a knowledge-base assistant where most questions do. When retrieval should be the model's decision, use a tool instead.
Agentic retrieval with tools
Instead of always stuffing context into the system prompt, expose Aether as tools and let the model decide when to search — and what to search for. The model often writes better retrieval queries than the raw user message (it strips filler, splits compound questions, and can search more than once).
In AI SDK v5, tools take an inputSchema (a Zod schema — not parameters, which was the v4 name), and stopWhen: stepCountIs(n) lets the model take multiple tool-call steps before producing its final answer.
// app/api/chat/route.ts
import { AetherClient, formatContext } from "@aether-ai/sdk";
import { anthropic } from "@ai-sdk/anthropic";
import {
  convertToModelMessages,
  stepCountIs,
  streamText,
  tool,
  type UIMessage,
} from "ai";
import { z } from "zod";

const aether = new AetherClient();

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: anthropic("claude-sonnet-4-20250514"),
    system:
      "You are a helpful assistant. Search the knowledge base before " +
      "answering factual questions. Save durable facts the user shares.",
    messages: convertToModelMessages(messages),
    tools: {
      searchKnowledge: tool({
        description: "Search the knowledge base for relevant documents.",
        inputSchema: z.object({
          query: z.string().describe("A natural-language search query"),
        }),
        execute: async ({ query }) => {
          const results = await aether.retrieve(query, 5);
          if (results.length === 0) return "No matching documents found.";
          return formatContext(results);
        },
      }),
      saveNote: tool({
        description: "Save a note to the knowledge base for later retrieval.",
        inputSchema: z.object({
          note: z.string().describe("The note content to save"),
        }),
        execute: async ({ note }) => {
          const doc = await aether.insertText(note, {
            filename: `note-${Date.now()}.txt`,
            tags: ["kind:note"],
          });
          return { saved: true, docId: doc.doc_id };
        },
      }),
    },
    stopWhen: stepCountIs(5),
  });

  return result.toUIMessageStreamResponse();
}
The same client component works unchanged — tool calls stream to the browser as their own message parts (typed tool-searchKnowledge, tool-saveNote), which you can render for a "searching..." indicator or ignore.
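One way to drive a "searching..." indicator is to map each tool part to a short status label and render the label where the part appears. The sketch below assumes tool parts carry a state field with the values input-streaming, input-available, output-available, and output-error, which matches AI SDK v5 as far as we know; check this against the version you have installed. ToolPartLike and toolStatusLabel are hypothetical names, and the label text is only an example.

```typescript
// Minimal stand-in for a message part; only the fields this sketch reads.
interface ToolPartLike {
  type: string;
  state?: string;
}

// Returns a label while a tool call is in flight or has failed,
// and null for non-tool parts and finished calls (render nothing).
function toolStatusLabel(part: ToolPartLike): string | null {
  if (!part.type.startsWith("tool-")) return null;
  const toolName = part.type.slice("tool-".length);

  switch (part.state) {
    case "input-streaming":
    case "input-available":
      return toolName === "searchKnowledge"
        ? "Searching the knowledge base..."
        : `Running ${toolName}...`;
    case "output-error":
      return `${toolName} failed`;
    default:
      // "output-available" or an unrecognized state
      return null;
  }
}
```

In the client component, this would slot into the parts loop next to the existing text branch.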
saveNote turns the chat into a memory loop: the model writes facts with insertText() and finds them again on later turns with retrieve(). Returning the formatted string (rather than raw result objects) keeps the tool result compact, which matters because tool results are fed back into the model's context.
Scoping documents per user
A shared chat app must not leak one user's notes into another's retrieval. Tag every document with its owner on write, and pass the same tag as a filter on read:
// Writing: tag the document with its owner
await aether.insertText(note, {
  filename: `note-${Date.now()}.txt`,
  tags: ["user:" + userId, "kind:note"],
});

// Reading: only this user's documents can match
const results = (await aether.retrieve(query, 10, {
  tags: ["user:" + userId],
})).filter((r) => r.score >= 60);
In the agentic route above, derive userId from your session on the server and close over it in the tools' execute functions — never accept it from the model or the request body.
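"Closing over" the user ID can look like the following sketch: build the scoped operations once per request from the server-side session, then call them from the tools' execute functions. ScopedClient is a hypothetical, minimal view of the client covering only the two calls this guide uses, and createUserScopedOps is an illustrative name, not an SDK function.

```typescript
// Hypothetical minimal view of the client: only the calls used in this guide.
interface ScopedClient {
  retrieve(
    query: string,
    k: number,
    options?: { tags?: string[] },
  ): Promise<{ text: string; score: number }[]>;
  insertText(
    text: string,
    options: { filename: string; tags: string[] },
  ): Promise<{ doc_id: string }>;
}

// The owner tag is fixed when the operations are created, so neither the
// model nor the request body can choose which user's documents are touched.
function createUserScopedOps(client: ScopedClient, userId: string) {
  const ownerTag = `user:${userId}`;
  return {
    search: (query: string) =>
      client.retrieve(query, 10, { tags: [ownerTag] }),
    saveNote: (note: string) =>
      client.insertText(note, {
        filename: `note-${Date.now()}.txt`,
        tags: [ownerTag, "kind:note"],
      }),
  };
}
```

The tool input schemas then only need query and note; there is no user field for the model to fill in.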
A few rules to know when using tags this way:
- Tag filters are AND-ed. A document matches only if it carries every tag in the filter, so tags: ["user:42", "kind:note"] returns only user 42's notes.
- Tag values must not contain commas. Stick to a key:value slug convention, such as user:42, customer:acme, or kind:note.
- Tags can't be read back. retrieve() and search() results don't include a document's tags, and neither does get(). If you need to enumerate or audit tags, keep your own doc_id → tags mapping in your database.
- Narrow tags can return fewer than k results. Filtering happens on a candidate set drawn from the whole store, so when a tag matches only a small slice of your documents, a filtered search can come back short even though more matches exist. Request a larger k when filtering to a narrow tag (that's why the example asks for 10), and filter weak matches by score rather than padding the context with noise. See Tuning retrieval for picking a cutoff.
- update() replaces tags. If you update a document, pass the full tag list again, including the user: tag, or the scoping is lost.
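The rules above translate into a few small helpers you might keep next to your Aether calls. Everything here is a sketch with hypothetical names: matchesAllTags is a local model of the AND semantics for reasoning and tests, not the SDK's filter, and tagsForUpdate assumes the existing tags come from your own doc_id → tags mapping, since they can't be read back from Aether.

```typescript
// Build a key:value tag and reject commas, which tag values must not contain.
function makeTag(key: string, value: string): string {
  const tag = `${key}:${value}`;
  if (tag.includes(",")) {
    throw new Error(`Tag must not contain a comma: ${tag}`);
  }
  return tag;
}

// Local model of AND-ed filters: every filter tag must be on the document.
function matchesAllTags(docTags: string[], filter: string[]): boolean {
  return filter.every((tag) => docTags.includes(tag));
}

// After over-fetching with a larger k, drop weak matches and trim to
// the number of passages you actually want in the prompt.
function keepStrongMatches<T extends { score: number }>(
  results: T[],
  minScore: number,
  limit: number,
): T[] {
  return results.filter((r) => r.score >= minScore).slice(0, limit);
}

// update() replaces tags, so compute the complete list before calling it.
// `existing` comes from your own records, not from Aether.
function tagsForUpdate(
  existing: string[],
  changes: { add?: string[]; remove?: string[] },
): string[] {
  const remove = new Set(changes.remove ?? []);
  const merged = [...existing, ...(changes.add ?? [])].filter(
    (tag) => !remove.has(tag),
  );
  return Array.from(new Set(merged)); // de-duplicate, keep first-seen order
}
```

Because tagsForUpdate starts from the stored list, the user: tag survives any update unless you remove it on purpose.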
For the broader design space — per-user tags versus a store per tenant, and when to choose which — see Multi-tenant patterns.