Streaming

AI & LLM Glossary

What is Streaming?

Streaming: Delivering model responses token by token as they are generated, rather than all at once, improving perceived latency for end users.

Streaming Explained

Streaming responses let your application display output as it is generated rather than waiting for the complete response, which dramatically improves user experience for long outputs. Most LLM APIs stream over Server-Sent Events (SSE): the server holds the HTTP connection open and pushes incremental chunks as they become available. With streaming, users see the first token quickly, keeping time to first token low even when the full response takes several seconds to finish.
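The consumer side of this can be sketched as a small SSE parser: read `data:` lines from the open connection, decode each chunk, and render tokens as they arrive. The event payload shape and the simulated stream below are illustrative assumptions, not any specific provider's wire format.

```python
import json
from typing import Iterator

def iter_sse_tokens(lines: Iterator[str]) -> Iterator[str]:
    """Yield token text from SSE 'data:' lines, stopping at a [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # common end-of-stream marker (assumed here)
            break
        yield json.loads(payload)["token"]

# Simulated SSE stream standing in for an HTTP response body.
stream = [
    'data: {"token": "Hello"}',
    'data: {"token": ","}',
    'data: {"token": " world"}',
    "data: [DONE]",
]

for token in iter_sse_tokens(iter(stream)):
    print(token, end="", flush=True)  # render each token the moment it arrives
print()
```

In a real client the `stream` list would be replaced by the line iterator of a chunked HTTP response, but the parsing and incremental-rendering loop stays the same.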
