Models

Runtime registry with capability flags, context limits, pricing, and route state.

37 models indexed

Family

All anthropic cohere deepseek elevenlabs gemma general google llama mistral openai qwen runway stability xai

Capability

All tools vision json_mode streaming completion

Runtime state

All warm loading cold

Anthropic Claude 3.7 Sonnet

anthropic-anthropic-claude-37-sonnet • anthropic • Anthropic

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.37/0.84 per 1M

Class: closed

Model detail + compatibility

Anthropic Claude 3.7 Sonnet listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"anthropic-anthropic-claude-37-sonnet"}'

AWS Nova Pro

bedrock-aws-nova-pro • general • AWS Bedrock

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.3/1.7 per 1M

Class: closed

Model detail + compatibility

AWS Nova Pro listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"bedrock-aws-nova-pro"}'

Claude Opus 4.1

anthropic-claude-opus-41 • anthropic • Anthropic

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.7/2.2 per 1M

Class: closed

Model detail + compatibility

Claude Opus 4.1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"anthropic-claude-opus-41"}'

Claude Opus 4.6

claude-opus-4-6 • anthropic • Anthropic

warm

toolsjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 3/15 per 1M

Class: closed

Model detail + compatibility

Strong long-form reasoning and coding model with robust instruction quality.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: not supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"claude-opus-4-6"}'

Claude Sonnet 4

anthropic-claude-sonnet-4 • anthropic • Anthropic

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.9/2.7 per 1M

Class: closed

Model detail + compatibility

Claude Sonnet 4 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"anthropic-claude-sonnet-4"}'

Cohere Command A

cohere-cohere-command-a • cohere • Cohere

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.61/1.32 per 1M

Class: closed

Model detail + compatibility

Cohere Command A listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"cohere-cohere-command-a"}'

Cohere Command R+

cohere-cohere-command-r • cohere • Cohere

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.1/2.2 per 1M

Class: closed

Model detail + compatibility

Cohere Command R+ listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"cohere-cohere-command-r"}'

DeepSeek R1

deepseek-deepseek-r1 • deepseek • DeepSeek

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.5/4.2 per 1M

Class: open

Model detail + compatibility

DeepSeek R1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"deepseek-deepseek-r1"}'

DeepSeek V3.1

deepseek-deepseek-v31 • deepseek • DeepSeek

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.7/4.7 per 1M

Class: open

Model detail + compatibility

DeepSeek V3.1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"deepseek-deepseek-v31"}'

Gemini 2.5 Flash

google-gemini-25-flash • google • Google

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.45/1 per 1M

Class: closed

Model detail + compatibility

Gemini 2.5 Flash listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"google-gemini-25-flash"}'

Gemini 2.5 Pro

google-gemini-25-pro • google • Google

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.1/3.2 per 1M

Class: closed

Model detail + compatibility

Gemini 2.5 Pro listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"google-gemini-25-pro"}'

Gemma 3 27B

google-gemma-3-27b • gemma • Google

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.7/6.7 per 1M

Class: closed

Model detail + compatibility

Gemma 3 27B listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"google-gemma-3-27b"}'

Gemma 3 27B Instruct

google-gemma-3-27b-it • gemma • Google

warm

toolsjson_modestreamingcompletion

Context: 131,072

Max output: 16,384

Pricing: 0.16/0.5 per 1M

Class: open

Model detail + compatibility

Open-weight instruction model optimized for cost-efficient coding and multilingual tasks.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: not supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"google-gemma-3-27b-it"}'

GLM-5

zhipu-glm-5 • general • Zhipu AI

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.7/3.7 per 1M

Class: closed

Model detail + compatibility

GLM-5 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"zhipu-glm-5"}'

Grok 4

xai-grok-4 • xai • xAI

cold

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.9/1.7 per 1M

Class: closed

Model detail + compatibility

Grok 4 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"xai-grok-4"}'

Hugging Face Zephyr 2

huggingface-hugging-face-zephyr-2 • general • Hugging Face

cold

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.21/0.52 per 1M

Class: open

Model detail + compatibility

Hugging Face Zephyr 2 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"huggingface-hugging-face-zephyr-2"}'

Jamba 1.6

nvidia-jamba-16 • general • Nvidia

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.69/1.48 per 1M

Class: closed

Model detail + compatibility

Jamba 1.6 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"nvidia-jamba-16"}'

Kimi K2

qwen-kimi-k2 • qwen • Qwen

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 2.3/5.2 per 1M

Class: open

Model detail + compatibility

Kimi K2 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"qwen-kimi-k2"}'

Llama 3.3 70B Instruct

meta-llama-3-3-70b-instruct • llama • Meta

warm

toolsjson_modestreamingcompletion

Context: 128,000

Max output: 16,384

Pricing: 0.59/0.79 per 1M

Class: open

Model detail + compatibility

Strong open foundation model for agentic pipelines and coding assistance.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: not supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"meta-llama-3-3-70b-instruct"}'

Llama 4 Maverick

meta-llama-4-maverick • llama • Meta

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 2.3/6.2 per 1M

Class: open

Model detail + compatibility

Llama 4 Maverick listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"meta-llama-4-maverick"}'

Llama 4 Scout

meta-llama-4-scout • llama • Meta

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.21/0.52 per 1M

Class: open

Model detail + compatibility

Llama 4 Scout listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"meta-llama-4-scout"}'

MiniMax M2.5

minimax-minimax-m25 • general • Minimax

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.13/0.36 per 1M

Class: closed

Model detail + compatibility

MiniMax M2.5 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"minimax-minimax-m25"}'

Mistral Large 2

mistral-mistral-large-2 • mistral • Mistral

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.5/7.2 per 1M

Class: open

Model detail + compatibility

Mistral Large 2 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"mistral-mistral-large-2"}'

Mixtral 8x22B

mistral-mixtral-8x22b • mistral • Mistral

cold

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.37/0.84 per 1M

Class: open

Model detail + compatibility

Mixtral 8x22B listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"mistral-mixtral-8x22b"}'

Nemotron Ultra

nvidia-nemotron-ultra • general • Nvidia

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.37/0.84 per 1M

Class: closed

Model detail + compatibility

Nemotron Ultra listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"nvidia-nemotron-ultra"}'

OLMo 2 32B

huggingface-olmo-2-32b • general • Hugging Face

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.61/1.32 per 1M

Class: open

Model detail + compatibility

OLMo 2 32B listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"huggingface-olmo-2-32b"}'

OpenAI GPT-4.1

openai-openai-gpt-41 • openai • OpenAI

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.13/0.36 per 1M

Class: open

Model detail + compatibility

OpenAI GPT-4.1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"openai-openai-gpt-41"}'

OpenAI GPT-5

openai-openai-gpt-5 • openai • OpenAI

cold

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.3/1.2 per 1M

Class: open

Model detail + compatibility

OpenAI GPT-5 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"openai-openai-gpt-5"}'

OpenAI GPT-5 (OpenRouter)

openai-gpt-5-openrouter • openai • OpenAI

warm

toolsvisionjson_modestreamingcompletion

Context: 400,000

Max output: 50,000

Pricing: 1.25/10 per 1M

Class: open

Model detail + compatibility

Frontier model on OpenRouter with multimodal inputs and strong reasoning depth for complex tasks.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"openai-gpt-5-openrouter"}'

OpenAI GPT-5.3 Codex

gpt-5-3-codex • openai • OpenAI

warm

toolsjson_modestreamingcompletion

Context: 400,000

Max output: 50,000

Pricing: 1.75/14 per 1M

Class: open

Model detail + compatibility

High-end coding and reasoning model with large context and tool support.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: not supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"gpt-5-3-codex"}'

OpenRouter Quasar

openrouter-openrouter-quasar • general • OpenRouter

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.69/1.48 per 1M

Class: open

Model detail + compatibility

OpenRouter Quasar listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"openrouter-openrouter-quasar"}'

Perplexity Sonar Large

perplexity-perplexity-sonar-large • general • Perplexity

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.13/0.36 per 1M

Class: closed

Model detail + compatibility

Perplexity Sonar Large listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"perplexity-perplexity-sonar-large"}'

Phi-4

microsoft-phi-4 • general • Microsoft

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.45/1 per 1M

Class: closed

Model detail + compatibility

Phi-4 listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"microsoft-phi-4"}'

Qwen3 235B

qwen-qwen3-235b • qwen • Qwen

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 2.1/5.7 per 1M

Class: open

Model detail + compatibility

Qwen3 235B listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"qwen-qwen3-235b"}'

Qwen3 Max

qwen-qwen3-max • qwen • Qwen

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.9/5.2 per 1M

Class: open

Model detail + compatibility

Qwen3 Max listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"qwen-qwen3-max"}'

xAI Grok Mini

xai-xai-grok-mini • xai • xAI

warm

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 1.9/3.2 per 1M

Class: closed

Model detail + compatibility

xAI Grok Mini listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"xai-xai-grok-mini"}'

Yi Large

qwen-yi-large • qwen • Qwen

toolsvisionjson_modestreamingcompletion

Context: 200,000

Max output: 25,000

Pricing: 0.21/0.52 per 1M

Class: open

Model detail + compatibility

Yi Large listed under llm foundation models for AI Bazaar discovery and comparison workflows.

temperature: supported

top_p: supported

top_k: supported

min_p: supported

max_tokens: supported

frequency_penalty: supported

presence_penalty: supported

stop: supported

seed: supported

tools: supported

vision: supported

stream: supported

response_format_json: supported

curl -X POST /api/v1/chat/completions -d '{"model":"qwen-yi-large"}'