List of Top models, Vision & Audio APIs

Explore the Best AI Models

xai/grok-2-1212

X.AI grok 2 1212 language model for text generation and understanding.

131K Context$2.0/M Input tokens | $10.0/M Output tokens

xai/grok-imagine-image-pro

X.AI grok imagine image pro model for generating high-quality images and visual content.

Formula: 0.07

xai/grok-imagine-image

Advanced image generation model with support for multiple resolutions and styles.

Formula: 0.02

xai/grok-2-image-1212

X.AI grok 2 image 1212 model for generating high-quality images and visual content.

Formula: 0.07

xai/grok-4-0709

Grok 4 with July 9th 2024 training cutoff, featuring enhanced reasoning capabilities.

256K Context$3.0/M Input tokens | $15.0/M Output tokens

xai/grok-4-1-fast-non-reasoning

X.AI grok 4 1 fast non reasoning language model for text generation and understanding.

2000K Context$0.2/M Input tokens | $0.5/M Output tokens

xai/grok-2-vision-1212

X.AI grok 2 vision 1212 multimodal vision model for understanding and generating content from visual inputs.

33K Context$2.0/M Input tokens | $10.0/M Output tokens

xai/grok-4-1-fast-reasoning

X.AI grok 4 1 fast reasoning language model for text generation and understanding.

2000K Context$0.2/M Input tokens | $0.5/M Output tokens

xai/grok-4-fast-non-reasoning

X.AI grok 4 fast non reasoning language model for text generation and understanding.

2000K Context$0.2/M Input tokens | $0.5/M Output tokens

xai/grok-3

Grok 3 is a powerful language model with advanced capabilities for text generation and understanding.

131K Context$3.0/M Input tokens | $15.0/M Output tokens

xai/grok-4-fast-reasoning

X.AI grok 4 fast reasoning language model for text generation and understanding.

2000K Context$0.2/M Input tokens | $0.5/M Output tokens

xai/grok-3-mini

Lightweight version of Grok 3 optimized for cost-effective applications.

131K Context$0.3/M Input tokens | $0.5/M Output tokens

xai/grok-code-fast-1

X.AI grok code fast 1 specialized model for code generation and understanding.

256K Context$0.2/M Input tokens | $1.5/M Output tokens

google/gemini-3-flash-preview

Gemini 3 Pro is Google's flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window.

1050K Context$0.5/M Input tokens | $3.0/M Output tokens

openai/gpt-5.2

Flagship model for coding and agentic tasks across industries. Supports reasoning with effort levels: none, low, medium, high, xhigh.

400K Context$1.8/M Input tokens | $14.0/M Output tokens

arcee-ai/trinity-mini

Vision-capable model for image understanding, OCR, captioning, and multimodal Q&A.

0K ContextNo pricing info

google/gemini-3-pro-preview

Gemini 3 Pro is Google's flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window.

1050K Context$2.0/M Input tokens | $12.0/M Output tokens

openai/gpt-5.1

Smaller, cheaper GPT-5 variant retaining strong quality with tool-use support for practical apps.

400K Context$1.3/M Input tokens | $1.0/M Output tokens

minimaxai/minimax-m2

A mini model built for Max coding & agentic workflows with just 10 billion activated parameters.

262K Context$0.3/M Input tokens | $1.1/M Output tokens

anthropic/claude-haiku-4-5

Our fastest model with near-frontier intelligence.

200K Context$1.0/M Input tokens | $5.0/M Output tokens