OPTIMIZED FOR AMD
01
AMD Instinct™ GPUs
High memory and compute capacity for long context inference
02
ROCm-Optimized
Tuned kernels, parallelism schemes, and libraries for maximum throughput
03
High-Performance Networking
Parallelism tuned on AMD Infinity Fabric for low latency and high throughput
Built to excel in what agents need most.
Long Context
Built on AMD Instinct™ GPUs with high memory capacity to handle extremely long contexts without degradation.
Long Horizon Workloads
Maintain coherence across thousands of steps, tool calls, and decisions in agentic workflows.
Throughput at Scale
High-throughput inference stack optimized for real-time multi-agent systems.
Model Library

MiMo-V2.5-Pro
Flagship agentic MoE model with a 1M-token context for complex software engineering and long-horizon autonomous tasks.

Kimi-K2.6
Multimodal agentic model for long-horizon coding and agent swarms.

GLM 5.1
Refined post-training for coding and agentic engineering workflows.

DeepSeek-V3.2
Combines high computational efficiency with superior reasoning and agent performance.
Pricing
Simple, transparent pricing.
Model | Provider | Input Price | Cached Input Price | Output Price


