Anthropic Claude 3.7 Sonnet
anthropic-anthropic-claude-37-sonnet • anthropic • Anthropic
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.37/0.84 per 1M
Class: closed
Model detail + compatibility
Anthropic Claude 3.7 Sonnet listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"anthropic-anthropic-claude-37-sonnet"}'
AWS Nova Pro
bedrock-aws-nova-pro • general • AWS Bedrock
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.3/1.7 per 1M
Class: closed
Model detail + compatibility
AWS Nova Pro listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"bedrock-aws-nova-pro"}'
Claude Opus 4.1
anthropic-claude-opus-41 • anthropic • Anthropic
loading
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.7/2.2 per 1M
Class: closed
Model detail + compatibility
Claude Opus 4.1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"anthropic-claude-opus-41"}'
Claude Opus 4.6
claude-opus-4-6 • anthropic • Anthropic
warm
toolsjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 3/15 per 1M
Class: closed
Model detail + compatibility
Strong long-form reasoning and coding model with robust instruction quality.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: not supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"claude-opus-4-6"}'
Claude Sonnet 4
anthropic-claude-sonnet-4 • anthropic • Anthropic
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.9/2.7 per 1M
Class: closed
Model detail + compatibility
Claude Sonnet 4 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"anthropic-claude-sonnet-4"}'
Cohere Command A
cohere-cohere-command-a • cohere • Cohere
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.61/1.32 per 1M
Class: closed
Model detail + compatibility
Cohere Command A listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"cohere-cohere-command-a"}'
Cohere Command R+
cohere-cohere-command-r • cohere • Cohere
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.1/2.2 per 1M
Class: closed
Model detail + compatibility
Cohere Command R+ listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"cohere-cohere-command-r"}'
DeepSeek R1
deepseek-deepseek-r1 • deepseek • DeepSeek
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.5/4.2 per 1M
Class: open
Model detail + compatibility
DeepSeek R1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"deepseek-deepseek-r1"}'
DeepSeek V3.1
deepseek-deepseek-v31 • deepseek • DeepSeek
loading
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.7/4.7 per 1M
Class: open
Model detail + compatibility
DeepSeek V3.1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"deepseek-deepseek-v31"}'
Gemini 2.5 Flash
google-gemini-25-flash • google • Google
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.45/1 per 1M
Class: closed
Model detail + compatibility
Gemini 2.5 Flash listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"google-gemini-25-flash"}'
Gemini 2.5 Pro
google-gemini-25-pro • google • Google
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.1/3.2 per 1M
Class: closed
Model detail + compatibility
Gemini 2.5 Pro listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"google-gemini-25-pro"}'
Gemma 3 27B
google-gemma-3-27b • gemma • Google
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.7/6.7 per 1M
Class: closed
Model detail + compatibility
Gemma 3 27B listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"google-gemma-3-27b"}'
Gemma 3 27B Instruct
google-gemma-3-27b-it • gemma • Google
warm
toolsjson_modestreamingcompletion
Context: 131,072
Max output: 16,384
Pricing: 0.16/0.5 per 1M
Class: open
Model detail + compatibility
Open-weight instruction model optimized for cost-efficient coding and multilingual tasks.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: not supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"google-gemma-3-27b-it"}'
GLM-5
zhipu-glm-5 • general • Zhipu AI
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.7/3.7 per 1M
Class: closed
Model detail + compatibility
GLM-5 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"zhipu-glm-5"}'
Grok 4
xai-grok-4 • xai • xAI
cold
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.9/1.7 per 1M
Class: closed
Model detail + compatibility
Grok 4 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"xai-grok-4"}'
Hugging Face Zephyr 2
huggingface-hugging-face-zephyr-2 • general • Hugging Face
cold
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.21/0.52 per 1M
Class: open
Model detail + compatibility
Hugging Face Zephyr 2 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"huggingface-hugging-face-zephyr-2"}'
Jamba 1.6
nvidia-jamba-16 • general • Nvidia
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.69/1.48 per 1M
Class: closed
Model detail + compatibility
Jamba 1.6 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"nvidia-jamba-16"}'
Kimi K2
qwen-kimi-k2 • qwen • Qwen
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 2.3/5.2 per 1M
Class: open
Model detail + compatibility
Kimi K2 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"qwen-kimi-k2"}'
Llama 3.3 70B Instruct
meta-llama-3-3-70b-instruct • llama • Meta
warm
toolsjson_modestreamingcompletion
Context: 128,000
Max output: 16,384
Pricing: 0.59/0.79 per 1M
Class: open
Model detail + compatibility
Strong open foundation model for agentic pipelines and coding assistance.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: not supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"meta-llama-3-3-70b-instruct"}'
Llama 4 Maverick
meta-llama-4-maverick • llama • Meta
loading
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 2.3/6.2 per 1M
Class: open
Model detail + compatibility
Llama 4 Maverick listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"meta-llama-4-maverick"}'
Llama 4 Scout
meta-llama-4-scout • llama • Meta
loading
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.21/0.52 per 1M
Class: open
Model detail + compatibility
Llama 4 Scout listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"meta-llama-4-scout"}'
MiniMax M2.5
minimax-minimax-m25 • general • Minimax
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.13/0.36 per 1M
Class: closed
Model detail + compatibility
MiniMax M2.5 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"minimax-minimax-m25"}'
Mistral Large 2
mistral-mistral-large-2 • mistral • Mistral
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.5/7.2 per 1M
Class: open
Model detail + compatibility
Mistral Large 2 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"mistral-mistral-large-2"}'
Mixtral 8x22B
mistral-mixtral-8x22b • mistral • Mistral
cold
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.37/0.84 per 1M
Class: open
Model detail + compatibility
Mixtral 8x22B listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"mistral-mixtral-8x22b"}'
Nemotron Ultra
nvidia-nemotron-ultra • general • Nvidia
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.37/0.84 per 1M
Class: closed
Model detail + compatibility
Nemotron Ultra listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"nvidia-nemotron-ultra"}'
OLMo 2 32B
huggingface-olmo-2-32b • general • Hugging Face
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.61/1.32 per 1M
Class: open
Model detail + compatibility
OLMo 2 32B listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"huggingface-olmo-2-32b"}'
OpenAI GPT-4.1
openai-openai-gpt-41 • openai • OpenAI
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.13/0.36 per 1M
Class: open
Model detail + compatibility
OpenAI GPT-4.1 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"openai-openai-gpt-41"}'
OpenAI GPT-5
openai-openai-gpt-5 • openai • OpenAI
cold
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.3/1.2 per 1M
Class: open
Model detail + compatibility
OpenAI GPT-5 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"openai-openai-gpt-5"}'
OpenAI GPT-5 (OpenRouter)
openai-gpt-5-openrouter • openai • OpenAI
warm
toolsvisionjson_modestreamingcompletion
Context: 400,000
Max output: 50,000
Pricing: 1.25/10 per 1M
Class: open
Model detail + compatibility
Frontier model on OpenRouter with multimodal inputs and strong reasoning depth for complex tasks.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"openai-gpt-5-openrouter"}'
OpenAI GPT-5.3 Codex
gpt-5-3-codex • openai • OpenAI
warm
toolsjson_modestreamingcompletion
Context: 400,000
Max output: 50,000
Pricing: 1.75/14 per 1M
Class: open
Model detail + compatibility
High-end coding and reasoning model with large context and tool support.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: not supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"gpt-5-3-codex"}'
OpenRouter Quasar
openrouter-openrouter-quasar • general • OpenRouter
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.69/1.48 per 1M
Class: open
Model detail + compatibility
OpenRouter Quasar listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"openrouter-openrouter-quasar"}'
Perplexity Sonar Large
perplexity-perplexity-sonar-large • general • Perplexity
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.13/0.36 per 1M
Class: closed
Model detail + compatibility
Perplexity Sonar Large listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"perplexity-perplexity-sonar-large"}'
Phi-4
microsoft-phi-4 • general • Microsoft
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.45/1 per 1M
Class: closed
Model detail + compatibility
Phi-4 listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"microsoft-phi-4"}'
Qwen3 235B
qwen-qwen3-235b • qwen • Qwen
loading
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 2.1/5.7 per 1M
Class: open
Model detail + compatibility
Qwen3 235B listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"qwen-qwen3-235b"}'
Qwen3 Max
qwen-qwen3-max • qwen • Qwen
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.9/5.2 per 1M
Class: open
Model detail + compatibility
Qwen3 Max listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"qwen-qwen3-max"}'
xAI Grok Mini
xai-xai-grok-mini • xai • xAI
warm
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 1.9/3.2 per 1M
Class: closed
Model detail + compatibility
xAI Grok Mini listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"xai-xai-grok-mini"}'
Yi Large
qwen-yi-large • qwen • Qwen
loading
toolsvisionjson_modestreamingcompletion
Context: 200,000
Max output: 25,000
Pricing: 0.21/0.52 per 1M
Class: open
Model detail + compatibility
Yi Large listed under llm foundation models for AI Bazaar discovery and comparison workflows.
temperature: supported
top_p: supported
top_k: supported
min_p: supported
max_tokens: supported
frequency_penalty: supported
presence_penalty: supported
stop: supported
seed: supported
tools: supported
vision: supported
stream: supported
response_format_json: supported
curl -X POST /api/v1/chat/completions -d '{"model":"qwen-yi-large"}'