Last updated: June 23, 2026

Fugu Ultra is Sakana AI’s quality-first version of Sakana Fugu, a multi-agent model orchestration system exposed through one OpenAI-compatible API. If standard Fugu is the lower-latency default, Fugu Ultra is the variant Sakana positions for harder multi-step tasks where answer depth matters more than response time.

Quick answer: Fugu Ultra is not just a bigger single chatbot. It is part of Sakana Fugu, a learned orchestration system that can coordinate a pool of expert AI agents. Sakana lists fugu-ultra-20260615 pricing at $5 per 1M input tokens, $30 per 1M output tokens, and $0.50 per 1M cached input tokens, with higher rates for context above 272K tokens. Sakana’s published benchmark table shows strong results, but those are vendor-published results and should be validated on your own workload.

For the broader launch and Claude Fable 5 comparison, read the main Sakana AI Fugu review. This page focuses only on the Fugu Ultra query cluster: model type, pricing, benchmarks, use cases, and evaluation checklist.

What is Fugu Ultra?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu Ultra is the performance-oriented variant of Sakana Fugu. Sakana describes Fugu as a multi-agent system delivered as one model: you call one API, while Fugu handles model selection, routing, coordination, and synthesis behind the scenes.

The difference is operating mode:

Model Practical role Tradeoff
Fugu Everyday coding, code review, chatbots, and interactive work Designed to balance latency and quality
Fugu Ultra Hard research, long coding tasks, paper reproduction, patent/literature analysis, and complex reasoning Prioritizes answer quality and deeper orchestration, with more latency and potentially more usage

Sakana’s Fugu product page says Fugu and Fugu Ultra are both available through one OpenAI-compatible API. That matters for developers because you can test the two modes without rebuilding your whole integration.

Fugu Ultra vs Fugu

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

The simplest way to choose is to start with the task shape.

Use Fugu first when:

  • the user is waiting interactively;
  • latency matters;
  • the task is routine coding, editing, or summarization;
  • you need configurable agent opt-outs for data, privacy, or compliance;
  • a cheaper or simpler model can solve the task reliably.

Evaluate Fugu Ultra when:

  • the task is multi-step and failure is expensive;
  • the work benefits from planning, verification, and synthesis;
  • the prompt contains a paper, codebase, dataset, patent set, or long document set;
  • answer depth matters more than immediate response time;
  • you are comparing against premium models such as Claude Fable 5, Opus 4.8, GPT-5.5, or Gemini 3.1 Pro.

Sakana’s own examples for Fugu Ultra include Kaggle competitions, paper reproduction, cybersecurity analysis, and literature or patent investigations. Treat those as starting points for evaluation rather than fixed expectations.

Fugu Ultra pricing

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana lists fixed pay-as-you-go pricing for fugu-ultra-20260615:

Token type Price per 1M tokens Price when context is over 272K
Input $5 $10
Output $30 $45
Cached input $0.50 $1.00

Sakana also lists monthly subscription tiers that include access to both Fugu and Fugu Ultra:

Plan Price Positioning
Standard $20/month Lightweight daily usage
Pro $100/month Focused coding, review, research, and analysis sessions
Max $200/month Heavier long-running workloads

The important evaluation metric is cost per completed task, not only price per token. Fugu Ultra may use hidden orchestration work behind the final answer. Sakana’s pricing page explains that orchestration tokens can count toward usage. For a fair test, log input, output, cached-input, orchestration usage, latency, retry rate, and human review time.

Fugu Ultra benchmark results

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana’s published Fugu table reports the following Fugu Ultra scores:

Benchmark Fugu Ultra score Notes
SWE Bench Pro 73.7 Software engineering benchmark in Sakana’s table
TerminalBench 2.1 82.1 Terminal/task execution benchmark
LiveCodeBench 93.2 Competitive/programming benchmark
LiveCodeBench Pro 90.8 Harder coding variant
Humanity’s Last Exam 50.0 Broad hard reasoning/knowledge benchmark
CharXiv Reasoning 86.6 Scientific/visual reasoning benchmark
GPQA-D 95.5 Graduate-level science QA benchmark
SciCode 58.7 Scientific coding benchmark
MRCRv2 93.6 Long-context/retrieval-style benchmark

These numbers are useful, but they should be read with the caveat Sakana itself gives: baseline scores are provider-reported, and the Fugu results are from Sakana’s June 2026 evaluation. Teams should not make production routing decisions from the table alone.

A better internal test is to choose 20-50 representative tasks and compare:

  • success rate;
  • answer quality after human review;
  • cost per successful task;
  • latency;
  • retry count;
  • policy or compliance failures;
  • whether the final answer is easier to audit.

Fugu Ultra vs Claude Fable 5

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu Ultra and Claude Fable 5 are often compared because both target difficult long-horizon work. The comparison is useful, but the products are structurally different.

Comparison Fugu Ultra Claude Fable 5
Type Multi-agent orchestrator exposed as one model Single Anthropic Mythos-class model
Provider Sakana AI Anthropic
API style OpenAI-compatible Sakana API Claude API
Pricing headline $5 input / $30 output per 1M tokens $10 input / $50 output per 1M tokens when available
Transparency Less visible worker routing Clear model identity, with documented safeguards
Access context Sakana says Fugu is unavailable in EU/EEA while compliance work continues Anthropic’s Fable page listed access as unavailable in the June 23 source check

The practical question is not “which one wins?” It is: which option fits your workload, compliance constraints, latency tolerance, and provider-risk model? If Fable access is unavailable or your team wants a provider-flexible orchestration layer, Fugu Ultra may be worth evaluating. If you need a single known model identity and Anthropic access is available, Fable 5 may be simpler to govern.

Is Fugu Ultra open source?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

No — not in the sense of “the full hosted model orchestration service is open source.” Sakana has a public SakanaAI/fugu GitHub repository, and that repository includes setup docs, configuration materials, the README, assets, and the technical report. But the hosted Fugu and Fugu Ultra service is an API product.

A procurement-friendly distinction:

  • public GitHub repo: yes;
  • public technical report: yes;
  • open-source hosted orchestration service: no;
  • exact worker routing visible to users: no;
  • API product with documented pricing: yes.

When should you test Fugu Ultra?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Test Fugu Ultra when the work is hard enough that orchestration may matter. Good candidates include:

  • long codebase review;
  • migration planning;
  • paper reproduction;
  • patent and literature analysis;
  • Kaggle-style data science;
  • research synthesis;
  • scoped security review with evidence requirements;
  • complex tasks where a single-model answer often needs multiple retries.

Do not start with Fugu Ultra for every task. For short writing, simple summarization, quick Q&A, or basic code snippets, a lower-latency and lower-cost model may be enough.

Sources checked

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

FAQ

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.
What is Fugu Ultra?
Fugu Ultra is Sakana AI’s quality-first version of Sakana Fugu. It is designed for complex multi-step tasks where answer depth matters more than latency, such as hard coding, research, paper reproduction, and literature or patent analysis.
Is Fugu Ultra different from Sakana Fugu?
Fugu Ultra is one of the two Sakana Fugu variants. Standard Fugu is the balanced, lower-latency option. Fugu Ultra uses deeper orchestration and is positioned for harder tasks.
How much does Fugu Ultra cost?
Sakana lists fugu-ultra-20260615 at $5 per 1M input tokens, $30 per 1M output tokens, and $0.50 per 1M cached input tokens. For context above 272K tokens, Sakana lists $10 input, $45 output, and $1.00 cached input per 1M tokens.
Is Fugu Ultra open source?
The full hosted Fugu Ultra orchestration service is not open source. Sakana has a public GitHub repository with setup materials and the technical report, but the API service and exact orchestration behavior are proprietary.
Should I use Fugu or Fugu Ultra?
Use standard Fugu first for everyday interactive work where latency matters. Evaluate Fugu Ultra for harder multi-step tasks where planning, verification, and synthesis may justify higher cost or latency.