Burnwise SDK Documentation

Quick Start

Get started with Burnwise in under 5 minutes. Track your LLM costs with just a few lines of code.

Need an API key? Sign up for free to get your Burnwise API key.

Compatibility: Node.js 18+, Vercel Edge Runtime, Cloudflare Workers

1. Install the SDK

npm install @burnwise/sdk

2. Initialize Burnwise

import { burnwise } from "@burnwise/sdk";

// Initialize once at app startup
burnwise.init({
  apiKey: process.env.BURNWISE_API_KEY!, // Get this from burnwise.io/onboarding
  debug: true, // Optional: shows init confirmation
});

// Check if SDK is ready (useful for conditional environments)
if (burnwise.isInitialized()) {
  // Safe to use burnwise.trace(), wrappers, etc.
}

3. Wrap your AI client

import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";

// Wrap your OpenAI client
const openai = burnwise.openai.wrap(new OpenAI(), {
  feature: "chat-support",
});

// Use normally - costs are tracked automatically!
const response = await openai.chat.completions.create({
  model: "gpt-5.2",
  messages: [{ role: "user", content: "Hello!" }],
});

That's it! Your LLM costs are now being tracked. Check your dashboard to see real-time cost analytics.

Supported Providers

Burnwise supports all major LLM providers out of the box.

OpenAI

Full support for GPT-5.2, GPT-5.2-mini, GPT-4.1, o3, o4-mini, and all OpenAI models.

import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";

// Wrap your OpenAI client
const openai = burnwise.openai.wrap(new OpenAI(), {
  feature: "chat-support",
});

// Use normally - costs are tracked automatically!
const response = await openai.chat.completions.create({
  model: "gpt-5.2",
  messages: [{ role: "user", content: "Hello!" }],
});

Anthropic

Track costs for Claude 4.5 Opus, Claude 4.5 Sonnet, and Claude 4.5 Haiku.

import Anthropic from "@anthropic-ai/sdk";
import { burnwise } from "@burnwise/sdk";

const anthropic = burnwise.anthropic.wrap(new Anthropic(), {
  feature: "analysis",
});

const message = await anthropic.messages.create({
  model: "claude-4.5-sonnet",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

Google Gemini

Support for Gemini 3.0 Pro, Gemini 3.0 Flash, and all Google AI models.

import { GoogleGenerativeAI } from "@google/generative-ai";
import { burnwise } from "@burnwise/sdk";

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
const model = burnwise.google.wrapModel(
  genAI.getGenerativeModel({ model: "gemini-3.0-flash" }),
  { feature: "summarization" }
);

const result = await model.generateContent("Hello!");

Mistral AI

Track Mistral Large 3, Mistral Medium 3, Mistral Small 3, and Devstral models.

import { Mistral } from "@mistralai/mistralai";
import { burnwise } from "@burnwise/sdk";

const mistral = burnwise.mistral.wrap(new Mistral(), {
  feature: "code-completion",
});

const response = await mistral.chat.complete({
  model: "mistral-large-3",
  messages: [{ role: "user", content: "Hello!" }],
});

xAI (Grok)

Support for Grok 4.1, Grok 4, and all Grok models via the OpenAI-compatible API.

import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";

const xai = burnwise.xai.wrap(
  new OpenAI({
    baseURL: "https://api.x.ai/v1",
    apiKey: process.env.XAI_API_KEY!,
  }),
  { feature: "reasoning" }
);

const response = await xai.chat.completions.create({
  model: "grok-4.1",
  messages: [{ role: "user", content: "Hello!" }],
});

DeepSeek

Support for DeepSeek V3.2, DeepSeek R1, and all DeepSeek models.

import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";

const deepseek = burnwise.deepseek.wrap(
  new OpenAI({
    baseURL: "https://api.deepseek.com/v1",
    apiKey: process.env.DEEPSEEK_API_KEY!,
  }),
  { feature: "coding" }
);

const response = await deepseek.chat.completions.create({
  model: "deepseek-v3.2",
  messages: [{ role: "user", content: "Hello!" }],
});

Perplexity

Support for Sonar Pro, Sonar Reasoning, and all Perplexity models.

import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";

const perplexity = burnwise.perplexity.wrap(
  new OpenAI({
    baseURL: "https://api.perplexity.ai",
    apiKey: process.env.PERPLEXITY_API_KEY!,
  }),
  { feature: "research" }
);

const response = await perplexity.chat.completions.create({
  model: "sonar-pro",
  messages: [{ role: "user", content: "What is Burnwise?" }],
});

Streaming Support

All provider wrappers support streaming responses with automatic token tracking. The SDK intercepts the stream, captures usage data from stream events, and tracks costs when the stream completes.

How It Works

  • OpenAI-compatible APIs (OpenAI, xAI, DeepSeek, Perplexity): The SDK automatically adds stream_options.include_usage = true
  • Anthropic: Usage is extracted from message_start and message_delta events
  • Google Gemini: Both generateContent() and generateContentStream() are wrapped
  • Mistral: The chat.stream() method is wrapped to capture usage (see the Mistral streaming sketch below)

OpenAI Streaming

import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";

const openai = burnwise.openai.wrap(new OpenAI(), {
  feature: "chat-support",
});

// Streaming - usage is tracked automatically when stream completes
const stream = await openai.chat.completions.create({
  model: "gpt-5.2",
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
// Usage tracked automatically when loop completes
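
For reference, the usage the wrapper records comes from the final stream chunk: with stream_options.include_usage enabled (which the wrapper sets for you), OpenAI sends one last chunk with an empty choices array and a usage object. If you want to inspect it yourself, the loop above could equally read it (a sketch; not required when using the wrapper):

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");

  // The final chunk carries token usage when include_usage is enabled
  if (chunk.usage) {
    console.log(
      `prompt: ${chunk.usage.prompt_tokens}, completion: ${chunk.usage.completion_tokens}`
    );
  }
}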

Anthropic Streaming

import Anthropic from "@anthropic-ai/sdk";
import { burnwise } from "@burnwise/sdk";

const anthropic = burnwise.anthropic.wrap(new Anthropic(), {
  feature: "analysis",
});

// Streaming - usage is tracked automatically
const stream = await anthropic.messages.create({
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Write a poem" }],
  stream: true,
});

for await (const event of stream) {
  if (event.type === "content_block_delta") {
    process.stdout.write(event.delta.text || "");
  }
}
// Usage tracked automatically after stream completes

Google Gemini Streaming

import { GoogleGenerativeAI } from "@google/generative-ai";
import { burnwise } from "@burnwise/sdk";

const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY!);
const model = burnwise.google.wrapModel(
  genAI.getGenerativeModel({ model: "gemini-3.0-flash" }),
  { feature: "summarization" }
);

// Streaming
const result = await model.generateContentStream("Explain quantum computing");
for await (const chunk of result.stream) {
  process.stdout.write(chunk.text());
}
// Usage tracked automatically
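
Mistral Streaming

The How It Works list above notes that the Mistral wrapper covers chat.stream(). A minimal sketch, assuming the wrapped client streams like the plain @mistralai/mistralai SDK (each chunk exposes data.choices[0].delta.content):

import { Mistral } from "@mistralai/mistralai";
import { burnwise } from "@burnwise/sdk";

const mistral = burnwise.mistral.wrap(new Mistral(), {
  feature: "code-completion",
});

// Streaming - the wrapper captures usage from the stream
const stream = await mistral.chat.stream({
  model: "mistral-large-3",
  messages: [{ role: "user", content: "Explain event loops" }],
});

for await (const chunk of stream) {
  process.stdout.write(String(chunk.data.choices[0]?.delta?.content ?? ""));
}
// Usage tracked automatically after the stream completes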

Feature Tracking

Understand where your AI costs are going by tagging calls with features.

What are Features?

Features are labels you attach to your AI calls to track costs by use case. For example, you might have features like "chat-support", "document-analysis", or "auto-summary".

How to Use Features

// Track different features separately
const chatClient = burnwise.openai.wrap(new OpenAI(), {
  feature: "chat-support",
});

const analysisClient = burnwise.openai.wrap(new OpenAI(), {
  feature: "document-analysis",
});

const summaryClient = burnwise.openai.wrap(new OpenAI(), {
  feature: "auto-summary",
});

// Now you can see costs broken down by feature in the dashboard

Pro Tip

Use consistent feature names across your codebase. This makes it easier to track costs and identify optimization opportunities in the dashboard.
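
One way to keep names consistent (purely illustrative; the SDK does not require this pattern) is to define them once as constants and import them wherever clients are wrapped:

// features.ts - single source of truth for feature names (illustrative)
export const FEATURES = {
  chatSupport: "chat-support",
  documentAnalysis: "document-analysis",
  autoSummary: "auto-summary",
} as const;

// elsewhere in your app
import OpenAI from "openai";
import { burnwise } from "@burnwise/sdk";
import { FEATURES } from "./features";

const chatClient = burnwise.openai.wrap(new OpenAI(), {
  feature: FEATURES.chatSupport,
});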

Hierarchical Agent Tracing

Track costs for multi-agent systems with parent-child relationships. See both individual sub-agent costs AND total orchestration costs.

Perfect for agent orchestration: When your main agent calls 10+ sub-agents, you can see the cost breakdown for each sub-agent and the total cost for the entire execution tree.

Basic Usage

Wrap your agent functions with burnwise.trace() to create hierarchical spans. Context propagates automatically via AsyncLocalStorage.

import { burnwise } from "@burnwise/sdk";

// Wrap agent execution to create a trace span
await burnwise.trace("idea-analysis", async () => {
  // All LLM calls inside are automatically tagged with:
  // - traceId: unique ID for the entire execution tree
  // - spanId: unique ID for this specific span
  // - spanName: "idea-analysis"
  // - traceDepth: 0 (root level)

  const market = await burnwise.trace("market-scan", async () => {
    // Nested span - same traceId, own spanId, parentSpanId points to parent
    return await marketAgent.run(idea);
  });

  const competitors = await burnwise.trace("competitor-analysis", async () => {
    return await competitorAgent.run(idea);
  });

  return { market, competitors };
});

How It Works

1. Automatic Context Propagation

When you call burnwise.trace(), it creates a trace context using Node.js AsyncLocalStorage. All LLM calls made within that function automatically inherit the trace context.
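
For intuition, this is roughly how AsyncLocalStorage-based propagation works in Node.js. It is a simplified sketch of the mechanism, not the SDK's internal code:

import { AsyncLocalStorage } from "node:async_hooks";

// Holds the current trace context for an async call chain
const storage = new AsyncLocalStorage<{ traceId: string; spanId: string }>();

async function withSpan<T>(
  ctx: { traceId: string; spanId: string },
  fn: () => Promise<T>
): Promise<T> {
  // Everything awaited inside fn() can see ctx via storage.getStore()
  return storage.run(ctx, fn);
}

function currentContext() {
  // Returns the nearest enclosing context, or undefined outside a span
  return storage.getStore();
}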

2. Tree Structure

Each span has the following fields:

  • traceId: UUID shared by all spans in the same execution tree
  • spanId: UUID unique to this specific span
  • parentSpanId: UUID of the parent span (undefined for root)
  • spanName: Human-readable name (e.g., "market-scan")
  • traceDepth: Level in the tree (0 = root, max 3)
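
Written as a TypeScript shape, inferred from the fields above (an illustrative sketch, not a type exported by the SDK):

// Illustrative only - field names taken from the list above
interface TraceSpanContext {
  traceId: string;        // shared by all spans in one execution tree
  spanId: string;         // unique to this span
  parentSpanId?: string;  // undefined for the root span
  spanName: string;       // e.g. "market-scan"
  traceDepth: number;     // 0 = root, max 3
}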

3. Depth Limit

Maximum 3 levels of nesting. If you exceed this, a warning is logged and the function runs without creating a new span.

Full Example: Multi-Agent Analysis

A complete example showing how to track an "idea-analysis" agent that orchestrates multiple sub-agents.

import { burnwise } from "@burnwise/sdk";
import Anthropic from "@anthropic-ai/sdk";

burnwise.init({ apiKey: process.env.BURNWISE_API_KEY! });

const anthropic = burnwise.anthropic.wrap(new Anthropic(), {
  feature: "idea-analysis",
});

async function analyzeIdea(idea: string) {
  return burnwise.trace("idea-analysis", async () => {
    // Market analysis sub-agent
    const market = await burnwise.trace("market-scan", async () => {
      const response = await anthropic.messages.create({
        model: "claude-sonnet-4-5-20250929",
        max_tokens: 2000,
        messages: [{ role: "user", content: `Analyze market for: ${idea}` }],
      });
      return response.content[0].text;
    });

    // Competitor analysis sub-agent
    const competitors = await burnwise.trace("competitor-analysis", async () => {
      const response = await anthropic.messages.create({
        model: "claude-sonnet-4-5-20250929",
        max_tokens: 2000,
        messages: [{ role: "user", content: `Find competitors for: ${idea}` }],
      });
      return response.content[0].text;
    });

    // Final synthesis with more powerful model
    const synthesis = await burnwise.trace("synthesis", async () => {
      const response = await anthropic.messages.create({
        model: "claude-opus-4-5-20251101",
        max_tokens: 4000,
        messages: [{
          role: "user",
          content: `Synthesize:\nMarket: ${market}\nCompetitors: ${competitors}`,
        }],
      });
      return response.content[0].text;
    });

    return { market, competitors, synthesis };
  });
}

// All 4 LLM calls tracked with same traceId
const analysis = await analyzeIdea("AI-powered recipe generator");

Tracing API Reference

// Async trace (most common)
const result = await burnwise.trace("span-name", async () => {
  return await doSomething();
});

// Sync trace for synchronous functions
const result = burnwise.traceSync("span-name", () => {
  return doSomethingSync();
});

// Trace with detailed result info
const { result, spanId, traceId, durationMs } = await burnwise.traceWithResult(
  "span-name",
  async () => await doSomething()
);

// Check if currently inside a trace
if (burnwise.isInTrace()) {
  console.log("Currently in a trace");
}

// Get current trace context
const context = burnwise.getTraceContext();
if (context) {
  console.log(`Trace: ${context.traceId}, Span: ${context.spanId}`);
}

Dashboard Features

  • View all spans belonging to a trace grouped together
  • See the total cost of an agent orchestration (sum of all spans)
  • See individual sub-agent costs
  • Visualize the call tree timeline

API Reference

Complete reference for all SDK methods and configuration options.

burnwise.init(config)

Initialize the Burnwise SDK with your configuration.

burnwise.init({
  // Required: Your Burnwise API key
  apiKey: "bw_live_xxx",

  // Optional: Base URL (for self-hosted)
  baseUrl: "https://api.burnwise.io",

  // Optional: Enable debug logging (shows init confirmation)
  debug: true,

  // Optional: Batch events (default: true)
  batchEvents: true,

  // Optional: Batch flush interval in ms (default: 5000)
  batchFlushInterval: 5000,

  // Optional: Maximum batch size (default: 100)
  maxBatchSize: 100,

  // Optional: Environment
  environment: "production", // "production" | "staging" | "development"
});

// With debug: true, you'll see:
// [Burnwise] ✅ Initialized (production)
// [Burnwise]    API Key: bw_live_xx...
// [Burnwise]    Endpoint: https://api.burnwise.io
// [Burnwise]    Batching: enabled (5000ms)

// Check if SDK is initialized
if (burnwise.isInitialized()) {
  // SDK is ready to use
}

Option              Type     Description
apiKey              string   Your Burnwise API key (required)
baseUrl             string   Custom API endpoint (default: https://api.burnwise.io)
debug               boolean  Enable debug logging (default: false)
batchEvents         boolean  Batch events before sending (default: true)
batchFlushInterval  number   Flush interval in ms (default: 5000)
maxBatchSize        number   Maximum batch size (default: 100)
environment         string   Environment: "production" | "staging" | "development"

burnwise.isInitialized()

Check whether the SDK has been initialized. Useful when the SDK may not be initialized in every environment.

if (burnwise.isInitialized()) {
  // SDK is ready - safe to use burnwise.trace(), wrappers, etc.
}

track(event)

Manually track an LLM event. Useful for custom integrations or providers not directly supported.

import { track } from "@burnwise/sdk";

// For advanced use cases, track events manually
await track({
  provider: "openai",
  model: "gpt-5.2",
  feature: "custom-feature",
  promptTokens: 100,
  completionTokens: 50,
  latencyMs: 1200,
  costUsd: 0.002,
  status: "success",
});
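
For context, track() can be wrapped around a provider that has no built-in wrapper, measuring latency yourself. A sketch only: the endpoint, model name, and response shape below are hypothetical placeholders.

import { track } from "@burnwise/sdk";

async function callCustomModel(prompt: string) {
  const start = Date.now();

  // Hypothetical provider call - replace with your actual client
  const res = await fetch("https://api.example-llm.com/v1/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const data = await res.json();

  await track({
    provider: "custom",
    model: "example-model-1", // hypothetical model name
    feature: "custom-feature",
    promptTokens: data.usage.input_tokens,      // hypothetical response shape
    completionTokens: data.usage.output_tokens, // hypothetical response shape
    latencyMs: Date.now() - start,
    costUsd: 0, // supply the cost here if you compute it yourself
    status: res.ok ? "success" : "error",
  });

  return data;
}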

Privacy

Burnwise is designed with privacy as a core principle.

What We Track

  • Token counts (input and output)
  • Model name and provider
  • Cost (calculated from tokens)
  • Latency
  • Feature tags you define

What We NEVER Track

  • Prompt content
  • Completion content
  • User data within prompts
  • System prompts
  • Function/tool definitions

Compliance

  • GDPR compliant
  • SOC 2 Type II (in progress)
  • All data encrypted in transit (TLS 1.3)
  • All data encrypted at rest (AES-256)
  • EU data residency available