Inference built for agentic intelligence.

Zyphra Inference is purpose-built to serve long horizon agentic workloads where context length and multi-step reasoning define performance.

Read Technical Blog

Inference built for agentic intelligence.

Zyphra Inference is purpose-built to serve long horizon agentic workloads where context length and multi-step reasoning define performance.

Read Technical Blog

OPTIMIZED FOR AMD

Every layer, optimized for performance.

Deep Integration across the AMD stack to deliver the best performance and efficiency for long-context, agentic workloads.

01 AMD Instinct™ GPUs

High memory and compute capacity for long context inference

02 ROCm-Optimized

Tuned kernels, parallelism schemes, and libraries for maximum throughput

03 High-Performance Networking

Parallelism tuned on AMD Infinity Fabric for low latency and high throughput

Built to excel in what agents need most.

Long Context

Built on AMD Instinct™ GPUs with high memory capacity to handle extremely long contexts without degradation.

Long Horizon Workloads

Maintain coherence across thousands of steps, tool calls, and decision in agentic workflows.

Throughput at Scale

High-throughput inference stack optimized for real-time multi-agent systems.

Model Library

Open-source Third-party Models

Kimi-K2.6

Multimodal agentic model with long-horizon coding and agent swarm.

Learn more

GLM 5.1

Refined post-training for coding and agentic engineering workflows.

Learn more

DeepSeek-V3.2

DeepSeek-V3.2, a model that harmonizes high computational efficiency with superior reasoning and agent performance.

Learn more

Pricing

Simple, transparent pricing.

Serverless Inference Pricing per Model

price per 1M tokens

Model

Provider

Input Price

Cached Input Price

Output Price

GLM 5.1

ZAI

$1.20

$0.20

$4.00

DeepSeek-V3.2

DeepSeek

$0.25

$0.15

$0.50

Kimi-K2.6

Moonshot

$0.80

$0.15

$3.50

Pricing

Simple, transparent pricing.

Serverless Inference Pricing per Model

price per 1M tokens

Model

Provider

Input Price

Cached Input Price

Output Price

Context length

GLM 5.1

ZAI

$1.20

$0.20

$4.00

200k

DeepSeek-V3.2

DeepSeek

$0.25

$0.15

$0.50

163.8K

Kimi-K2.6

Moonshot

$0.80

$0.15

$3.50

256K

The inference platform for what's next.

View Docs

Status