# Understanding Streaming in AI Applications
Streaming allows AI responses to be delivered incrementally as they're generated, rather than waiting for the complete response. This creates a more responsive user experience, especially for longer responses that might take several seconds to generate.
## Benefits of Streaming
- **Faster Time-to-First-Token**: Users see content immediately.
- **Better UX**: The real-time typing effect feels more natural.
- **Lower Perceived Latency**: Users engage while content loads.
- **Memory Efficient**: Process data as it arrives.
## The `streamText` Function
The `streamText` function is the primary way to stream AI responses:

```ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = streamText({
  model: openai('gpt-4-turbo'),
  prompt: 'Write a poem about coding.',
});

// The result provides multiple ways to consume the stream
```
## Consuming Streams
### 1. Using `toDataStreamResponse` (Recommended for API Routes)
```ts
// app/api/chat/route.ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4-turbo'),
    messages,
  });

  // Returns a Response with proper streaming headers
  return result.toDataStreamResponse();
}
```
### 2. Using `toTextStreamResponse`
```ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

// Returns a plain text stream (simpler, but without tool-call or usage metadata)
export async function POST(req: Request) {
  const { prompt } = await req.json();

  const result = streamText({
    model: openai('gpt-4-turbo'),
    prompt,
  });

  return result.toTextStreamResponse();
}
```
### 3. Using an Async Iterator
```ts
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = streamText({
  model: openai('gpt-4-turbo'),
  prompt: 'Count from 1 to 10.',
});

// Process each chunk as it arrives
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

// Or collect all text
const fullText = await result.text;
```
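The iterator pattern itself is independent of any model call; a minimal sketch using a mock async generator in place of `result.textStream` (no API key needed) shows the same consumption loop:

```typescript
// Mock async generator standing in for result.textStream.
async function* mockTextStream(): AsyncGenerator<string> {
  for (const chunk of ['1 ', '2 ', '3']) {
    yield chunk;
  }
}

// Accumulate chunks exactly as the for-await loop above does.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    full += chunk; // each chunk is a text delta
  }
  return full;
}
```

Swapping `mockTextStream()` for a real `result.textStream` leaves `collect` unchanged, since both are async iterables of strings.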
## Stream Data Protocol
The AI SDK uses a special protocol for streaming that includes metadata:
```ts
// The data stream includes:
// - Text deltas (the actual content)
// - Tool calls and results
// - Finish reasons
// - Usage information

const result = streamText({
  model: openai('gpt-4-turbo'),
  messages,
  onFinish: async ({ text, finishReason, usage }) => {
    console.log('Finished:', finishReason);
    console.log('Tokens used:', usage);

    // Save to database, log analytics, etc.
    await saveToDatabase(text);
  },
});
```
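As an illustration of how that metadata travels over the wire, here is a sketch of pulling text deltas out of a raw data stream. It assumes the line-oriented `PREFIX:JSON` framing used by the SDK's stream protocol (text deltas prefixed with `0:`); check the protocol reference for the authoritative format:

```typescript
// Sketch: extract only the text deltas from raw data-stream lines.
// Assumes `PREFIX:JSON` line framing, with "0:" marking text parts.
function extractTextDeltas(raw: string): string {
  return raw
    .split('\n')
    .filter((line) => line.startsWith('0:'))
    .map((line) => JSON.parse(line.slice(2)) as string)
    .join('');
}
```

In practice the SDK's client hooks parse this framing for you; a hand-rolled parser like this is only needed when consuming the data stream without the SDK.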
## Streaming with System Messages
```ts
const result = streamText({
  model: openai('gpt-4-turbo'),
  system: 'You are a helpful coding assistant. Be concise.',
  messages: [
    { role: 'user', content: 'How do I sort an array in JavaScript?' },
  ],
});

return result.toDataStreamResponse();
```
## Streaming with Options
```ts
const result = streamText({
  model: openai('gpt-4-turbo'),
  messages,

  // Model parameters
  temperature: 0.7,
  maxTokens: 1000,
  topP: 0.9,

  // Callbacks
  onChunk: ({ chunk }) => {
    // Called for each chunk
    console.log('Chunk:', chunk);
  },
  onFinish: ({ text, usage }) => {
    // Called when the stream completes
    console.log('Total tokens:', usage.totalTokens);
  },

  // Abort signal for cancellation (controller is an AbortController
  // created elsewhere, e.g. one per request)
  abortSignal: controller.signal,
});
```
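The `abortSignal` above comes from a standard `AbortController`; a sketch of the wiring, using a hypothetical 10-second timeout as the cancellation trigger (a "Stop" button handler could call `controller.abort()` the same way):

```typescript
// Standard AbortController wiring for stream cancellation.
const controller = new AbortController();

// Abort automatically after 10 seconds if the stream is still running.
const timeout = setTimeout(() => controller.abort(), 10_000);

// Pass controller.signal as the abortSignal option to streamText.
// Once the stream finishes normally, clear the timer:
//   clearTimeout(timeout);
```

When the signal fires, the in-flight request is cancelled and the abort surfaces as an `AbortError`, which the error-handling section below checks for.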
## Client-Side Stream Handling
```tsx
'use client';

import { useState } from 'react';

export default function StreamDemo() {
  const [response, setResponse] = useState('');
  const [isLoading, setIsLoading] = useState(false);

  const handleStream = async () => {
    setIsLoading(true);
    setResponse('');

    // Note: this raw read loop works best against a plain text stream
    // (toTextStreamResponse); a data stream also carries protocol framing.
    const res = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: 'Tell me a joke' }],
      }),
    });

    const reader = res.body?.getReader();
    const decoder = new TextDecoder();

    if (reader) {
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        const text = decoder.decode(value, { stream: true });
        setResponse((prev) => prev + text);
      }
    }

    setIsLoading(false);
  };

  return (
    <div>
      <button onClick={handleStream} disabled={isLoading}>
        {isLoading ? 'Streaming...' : 'Get Response'}
      </button>
      <div>{response}</div>
    </div>
  );
}
```
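The read loop can also be factored into a small reusable helper. A sketch (the `readTextStream` and `onDelta` names are illustrative) that works with any `ReadableStream<Uint8Array>`, such as a fetch response body:

```typescript
// Read a binary stream chunk by chunk, decoding each piece as UTF-8
// and forwarding it to a callback (e.g. a setState updater).
async function readTextStream(
  stream: ReadableStream<Uint8Array>,
  onDelta: (text: string) => void,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let full = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } keeps multi-byte characters split across
    // chunk boundaries from being mangled.
    const text = decoder.decode(value, { stream: true });
    full += text;
    onDelta(text);
  }

  return full;
}
```

In the component above this would collapse the fetch handler to `await readTextStream(res.body!, (t) => setResponse((prev) => prev + t))`.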
## Use `useChat` Instead

While manual stream handling works, the `useChat` hook handles all of this complexity for you. Prefer it over manual stream parsing when building chat interfaces.
## Error Handling in Streams
```ts
try {
  const result = streamText({
    model: openai('gpt-4-turbo'),
    messages,
  });

  return result.toDataStreamResponse();
} catch (error) {
  // Note: this catches errors thrown while setting up the stream;
  // errors that occur mid-stream surface through the stream itself.
  if (error instanceof Error && error.name === 'AbortError') {
    return new Response('Stream cancelled', { status: 499 });
  }

  console.error('Stream error:', error);
  return new Response('Error generating response', { status: 500 });
}
```
## Key Takeaways

- Streaming delivers responses incrementally for better UX.
- Use `streamText` for server-side streaming.
- `toDataStreamResponse()` handles headers and protocol automatically.
- The `onFinish` callback is useful for logging and persistence.
- Use the `useChat` hook on the client for easier integration.
## Learn More

- `streamText` Documentation → Complete reference for the `streamText` function.
- Stream Protocol → Understand the data stream format and protocol.