Models

google/gemini-3-flash-preview

Gemini 3 Flash is Google's faster, lower-cost Gemini 3 model for multimodal reasoning across text, image, video, audio, and code, with a 1M-token context window.

1050K Context | $0.5/M Input tokens | $3.0/M Output tokens

google/gemini-3-pro-preview

Gemini 3 Pro is Google's flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window.

1050K Context | $2.0/M Input tokens | $12.0/M Output tokens

openai/gpt-5.1

Smaller, cheaper GPT-5 variant retaining strong quality with tool-use support for practical apps.

400K Context | $1.3/M Input tokens | $1.0/M Output tokens

minimaxai/minimax-m2

A mini model built for Max coding & agentic workflows with just 10 billion activated parameters.

262K Context | $0.3/M Input tokens | $1.1/M Output tokens

anthropic/claude-haiku-4-5

Anthropic's fastest model with near-frontier intelligence.

200K Context | $1.0/M Input tokens | $5.0/M Output tokens

anthropic/claude-sonnet-4-5

Anthropic's smartest model for complex agents and coding.

200K Context | $3.0/M Input tokens | $15.0/M Output tokens

qwen/qwen3-235b-a22b-instruct-2507

Qwen3 235B A22B Instruct (2507) for premium general-purpose assistance with strong multilingual depth.

262K Context | $0.1/M Input tokens | $0.5/M Output tokens

qwen/qwen3-235b-a22b-thinking-2507

Reasoning-focused Qwen3 235B A22B (2507) tuned for multi-step problem solving and analysis.

262K Context | $0.2/M Input tokens | $2.4/M Output tokens

moonshotai/kimi-k2-instruct

Moonshot Kimi K2 general-purpose instruct model for high-quality chat, analysis, and content writing.

131K Context | $0.5/M Input tokens | $2.0/M Output tokens

openai/gpt-5

Flagship GPT-5 general model for advanced reasoning, long-context understanding, and high-fidelity generation.

400K Context | $1.3/M Input tokens | $100.0/M Output tokens

openai/gpt-5-mini

Smaller, cheaper GPT-5 variant retaining strong quality with tool-use support for practical apps.

400K Context | $0.3/M Input tokens | $2.0/M Output tokens

openai/gpt-5-nano

Ultra-efficient GPT-5 tier for very low-cost, low-latency tasks and on-device style workloads.

400K Context | $0.1/M Input tokens | $0.4/M Output tokens

anthropic/claude-4.1-opus

Claude Opus 4.1 is Anthropic’s most capable flagship model, with the highest reasoning depth, vision, extended thinking, and a long 200K-token context window.

200K Context | $15.0/M Input tokens | $75.0/M Output tokens

qwen/qwen3-coder-480b-a35b-instruct

Ultra-large code LLM geared for complex generation, repository-scale refactors, and multi-language support.

262K Context | $0.4/M Input tokens | $1.6/M Output tokens

black-forest-labs/flux.2-max

The new top-tier image model from Black Forest Labs, significantly pushing image quality and editing consistency.

$0.0700 per image

black-forest-labs/flux.2-dev

Brand-new FLUX.2 Dev introduces a faster, more modular architecture for next-generation image generation pipelines.

$0.0013 - $0.3125 per image

black-forest-labs/flux.2-pro

Multi-reference visual intelligence with unprecedented detail, color precision, and spatial reasoning. The most advanced image generation and editing model.

$0.0150 per image

meta/llama-3.1-8b-instruct-turbo

Llama 3.1 8B Instruct Turbo is a mid-size assistant for robust chat, reasoning, and code with tool use and long context.

131K Context | $0.0/M Input tokens | $0.1/M Output tokens

meta/llama-3.1-70b-instruct-turbo

Llama 3.1 70B Instruct Turbo is an updated high-capacity model with improved instruction following, tool use, and longer context.

131K Context | $0.1/M Input tokens | $0.3/M Output tokens

meta/llama-3.1-405b-instruct-turbo

Llama 3.1 405B Instruct Turbo is an ultra-scale model for top-tier accuracy, complex reasoning, and high-quality code generation.

131K Context | $0.8/M Input tokens | $0.8/M Output tokens
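
The per-million-token prices listed above can be turned into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any provider API, and the token counts are made up. The listed prices appear to be rounded to one decimal place, so treat the result as an approximation.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the estimated cost in USD for one request.

    Prices are expressed in USD per one million tokens, as in the listing.
    This helper is a hypothetical illustration, not a provider API.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Example using the prices listed for anthropic/claude-sonnet-4-5
# ($3.0/M input, $15.0/M output) and assumed token counts.
cost = estimate_cost(input_tokens=10_000, output_tokens=2_000,
                     input_price_per_m=3.0, output_price_per_m=15.0)
print(f"${cost:.4f}")  # prints "$0.0600"
```

Image models in the listing are priced per image rather than per token, so this formula does not apply to them.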