Sakana AI Fugu Review: Fugu Ultra vs Fable 5

Q: Does Sakana Fugu work with Codex or Cursor?

Sakana’s GitHub README lists a codex-fugu launcher for Codex after installation. Cursor usage depends on whether your Cursor setup supports the needed custom OpenAI-compatible endpoint and model ID; teams should verify with a small non-sensitive test before using it on real repositories.

Last updated: June 22, 2026

Sakana AI is getting attention because it is taking a different approach from simply training another large chatbot. Its latest launch, Sakana Fugu, is a model interface designed to coordinate other models. That makes the natural comparison Fugu Ultra vs Claude Fable 5: one is a multi-model orchestrator, the other is Anthropic’s high-end Mythos-class model for long-horizon work.

On June 22, 2026, the Tokyo AI lab released Fugu and Fugu Ultra, described by Sakana as a multi-agent orchestration system delivered through a single model API. Instead of asking developers to manually pick between Claude, GPT, Gemini, open models, custom agents, verification passes, and retry loops, Fugu tries to decide that internally.

Quick answer: Sakana Fugu is an OpenAI-compatible API that behaves like one model, but behind the scenes it can route, delegate, verify, and synthesize work across a pool of specialist AI agents. The standard Fugu model is built for everyday latency-sensitive work. Fugu Ultra is positioned for harder tasks where quality matters more than speed. Sakana reports strong Fugu Ultra results on coding, reasoning, science, and agentic benchmarks — including comparisons to Claude Fable 5 and Mythos Preview — but those results should be read as vendor-published evidence, not independent proof.

Review verdict: Fugu may be worth evaluating if your team needs a multi-model AI layer for coding, research, or difficult analysis. It should not be treated as a one-to-one substitute for Claude Fable 5, because Fable 5 is a single premium base model with clearer model identity, while Fugu is an orchestration system with less visible routing. The main reason to test Fugu Ultra is not to declare a universal winner; it is to see whether learned orchestration changes outcomes on your own workload while reducing dependence on a single model provider.

If you are choosing AI tools for coding, research, product analysis, or long-horizon agent work, Fugu matters because it points to a broader shift: the product may no longer be one model. The product may be the orchestrator that knows which model, agent, tool, and verification path to use.

What is Sakana AI?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana AI is a Tokyo-based frontier AI research and product company founded in 2023 by David Ha, Ren Ito, and Llion Jones. The company describes itself as focused on nature-inspired intelligence, collective intelligence, evolutionary optimization, automated science, and AI systems tailored to Japan.

That founder mix is part of why the lab gets attention:

David Ha, CEO, previously worked at Google Brain and is known for research on self-organizing systems and evolutionary approaches to AI.
Ren Ito, chairman, has a background spanning Japan’s Ministry of Foreign Affairs, Mercari, and Stability AI.
Llion Jones, CTO, is one of the co-authors of the 2017 Transformer paper, Attention Is All You Need.

Sakana has also raised substantial capital for that strategy. In November 2025, TechCrunch reported that Sakana closed a ¥20 billion, approximately $135 million, Series B round at a $2.65 billion post-money valuation. That funding context matters because Sakana is not just publishing research demos; it is trying to turn orchestration, automated research, and Japan-focused enterprise AI into products.

Sakana has been building a research story around models that collaborate, evolve, and improve rather than relying only on brute-force scale. Notable examples include:

The AI Scientist, an agentic research system that can generate research ideas, run experiments, write papers, and review outputs.
AI Scientist-v2, which Sakana says produced the first fully AI-generated paper to pass a human peer-review process at a top AI workshop.
The Darwin Gödel Machine, a self-improving coding-agent system that rewrites its own code and validates changes on benchmarks.
Sakana Marlin, a long-horizon autonomous research product positioned as a “virtual CSO” for deep strategy research.
Sakana Fugu, the new multi-agent model-orchestration API.

That is the through-line: Sakana is less interested in a single chat interface and more interested in systems where many agents search, critique, combine, and improve.

Why Sakana AI is getting attention right now

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

The current spike in attention is mostly about Sakana Fugu, launched publicly on June 22, 2026. It follows several months of Sakana announcements that all point in the same direction.

The timing also matters for search demand. Claude Fable 5 launched on June 9, then Anthropic announced a government-directed access suspension on June 12. That made “Fable 5 alternative,” “Fugu Ultra vs Fable 5,” and “AI model access restrictions” more commercially relevant than a normal model benchmark story.

Date	Sakana AI milestone	Why it mattered
March 25, 2026	Mitsubishi Electric announced an investment in Sakana AI	Signaled demand for enterprise AI in manufacturing, infrastructure, and Japan-specific industrial use cases.
March 26, 2026	The AI Scientist work was published in Nature	Put automated AI research into a mainstream scientific venue.
April 24, 2026	Sakana opened Fugu beta	Previewed Fugu as a commercial multi-agent orchestration system.
June 5, 2026	Sakana announced its Recursive Self-Improvement Lab	Connected AI Scientist, Darwin Gödel Machine, and self-improving agents into one research agenda.
June 15, 2026	Sakana Marlin launched	Turned long-running autonomous research into a paid product for strategy work.
June 22, 2026	Sakana Fugu launched	Productized learned model orchestration as a single API.

The headline is not just “new model.” The headline is that Sakana is packaging multi-agent coordination as something developers can call like a normal model.

For broader context on agent tools, compare this with AI agents for business and AI agents for coding. Fugu sits in a different category: it is not just an app, and not exactly a raw model. It is an orchestration layer sold as a model endpoint.

What is Sakana Fugu?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana Fugu is a multi-agent system that behaves like a single model.

You send a request to one endpoint. Fugu decides whether to answer directly, route the task to one underlying worker, or coordinate multiple agents. The system can handle model selection, delegation, verification, and final synthesis internally.

Sakana’s product page describes the pitch in three parts:

One API for multiple models. Developers call one OpenAI-compatible API instead of writing their own router or agent workflow.
Complex-task support. Fugu is aimed at coding, reasoning, research, and other work where a single model pass may be insufficient.
Configurable agent selection. For the standard Fugu model, teams can opt out of specific models or providers for privacy, compliance, or organizational constraints.

The important nuance: Fugu is not simply another LLM competing head-to-head with Claude or GPT as a standalone model. It is closer to a learned conductor. The intelligence is in deciding how to use the model pool.

That distinction matters because benchmark results from an orchestrator are not the same thing as benchmark results from a single base model. If Fugu calls other frontier models under the hood, then part of the capability comes from those workers. Sakana’s claim is that the coordination layer adds value beyond simply calling one worker directly.

Is Fugu AI an LLM?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Searchers often use phrases like Fugu AI, Fugu LLM, or Fugu AI model. The clean answer is: Sakana Fugu is presented as a model endpoint, but it is better described as a learned orchestration model rather than a single standalone LLM.

From the developer’s point of view, you call Fugu like a model. From the system-design point of view, Fugu can select or coordinate workers from a model pool. That is why the wording matters:

Query wording	Practical meaning
Fugu AI	The product name people use for Sakana’s Fugu system
Fugu LLM	A reasonable shorthand, but incomplete because Fugu is an orchestrator
Fugu AI model	Useful for API buyers comparing model endpoints
Sakana Fugu	The official product name
Sakana Fugu Ultra	The higher-quality, slower variant for harder tasks

For a narrower explanation of the Ultra variant, see the dedicated Fugu Ultra guide.

Sakana Fugu vs Claude Fable 5: quick comparison

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

One important comparison is Sakana Fugu vs Claude Fable 5, because both products target similar high-difficulty workflows: long-horizon coding, agentic work, research, analysis, and multi-step reasoning.

They are not the same kind of product. Claude Fable 5 is Anthropic’s premium Mythos-class model for difficult coding, agents, knowledge work, vision, and long-running tasks. Sakana Fugu is a model orchestration layer that can coordinate multiple models behind one API. A simple analogy: Fable 5 is a single model choice, while Fugu is a system for deciding how multiple model choices should be used.

As of June 22, 2026, this comparison also has an availability and compliance angle. Anthropic’s Claude Fable 5 page lists Fable 5 access as unavailable, and Anthropic’s June 12 statement says a U.S. government directive led it to suspend Fable 5 and Mythos 5 access for foreign nationals and then disable access more broadly to ensure compliance. Sakana positions Fugu around multi-provider flexibility, which may help some teams reduce reliance on a single restricted model.

Comparison	Sakana Fugu / Fugu Ultra	Claude Fable 5
Product type	Learned multi-agent orchestrator exposed as one API	Single Anthropic Mythos-class model
Main promise	Coordinate multiple models to improve difficult workflows and reduce single-vendor dependency	Premium long-horizon reasoning, coding, agents, and knowledge work
API style	OpenAI-compatible API; Chat Completions and Responses support listed in the Fugu repo	Claude API model ID `claude-fable-5`
Current access context	Sakana says Fugu is available outside the EU/EEA, subject to account access and local restrictions	Anthropic’s Fable page listed access as unavailable in the June 22 source check
Pricing headline	Fugu Ultra: $5 input / $30 output per 1M tokens; orchestration tokens can also count	$10 input / $50 output per 1M tokens when available
Transparency	Exact worker models and routing are not exposed	Clear model identity, but safeguards can route flagged domains to Opus 4.8
Likely fit	Teams that want a model-agnostic agent layer, code review, research, and provider flexibility	Teams that want one Anthropic model for hard work, if/when access is restored
Watch-outs	Opaque routing, hidden orchestration cost, vendor-published benchmarks	Access changes, regulatory restrictions, 30-day retention, safeguard behavior

Review take: Fugu Ultra should not be framed as the universal winner against Fable 5. It is more relevant when your real problem is vendor dependency, access variability, or the operational burden of building your own multi-agent router. Fable 5 is simpler to reason about when you want a single known model and Anthropic access is available. Fugu is more relevant when you want the model-selection problem abstracted away.

For the dedicated Anthropic model guide, see our Claude Fable 5 overview and Claude Fable 5 vs Opus 4.8 comparison.

Fugu vs Fugu Ultra

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana launched two Fugu variants.

Model	Good fit	Tradeoff	What it does
Fugu	Everyday coding, code review, chatbots, interactive workflows	Lower latency, less deep coordination	Balances performance and response time. In the technical report, Sakana says this version is optimized for speed and selects a single worker per input.
Fugu Ultra	Hard multi-step reasoning, research, security analysis, Kaggle-style tasks, paper reproduction, patent/literature work	Slower and more expensive	Uses deeper orchestration over a larger expert-agent pool to maximize answer quality.

The simplest mental model:

Fugu = learned low-latency router/conductor for everyday work.
Fugu Ultra = deeper multi-agent orchestration for hard tasks.

If you are writing a customer-support chatbot or doing routine code review, the base Fugu model may be more practical. If you are asking the system to reproduce a paper, analyze a security target within scope, or solve a difficult engineering problem, Fugu Ultra is the one Sakana is positioning for that job.

What is Fugu Ultra?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu Ultra is Sakana’s quality-first Fugu variant. Sakana says it coordinates a deeper pool of expert agents and is aimed at complex, multi-step work where response time is less important than answer depth. On the official product page, Sakana lists example uses such as Kaggle competitions, paper reproduction, cybersecurity analysis, and literature or patent investigations.

That does not mean every team should default to Fugu Ultra. For most production systems, the choice is closer to this:

Use case	Better starting point	Why
Interactive coding help	Fugu	Lower-latency default is usually easier to use repeatedly
Long code review or migration planning	Fugu or Fugu Ultra	Test both; longer tasks may benefit from deeper orchestration
Paper reproduction	Fugu Ultra	Sakana positions Ultra for long multi-step research workflows
Patent/literature analysis	Fugu Ultra	More synthesis and verification may be useful
Chatbot or support assistant	Fugu	Latency and cost usually matter more than maximum depth
Simple summaries	Neither may be necessary	A single cheaper model may be enough

For the standalone SEO page targeting this query cluster, read Fugu Ultra: model, pricing, benchmarks, and use cases.

How Fugu works under the hood

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana says Fugu is grounded in two ICLR 2026 research papers: TRINITY and Conductor.

TRINITY: an evolved coordinator

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

The TRINITY paper describes a lightweight coordinator that orchestrates multiple LLMs over multiple turns. It assigns roles such as Thinker, Worker, and Verifier, then delegates work across coding, math, reasoning, and knowledge tasks.

The key idea is that the coordinator does not need to be the smartest model in the room. It needs to know who should do what, when to ask for verification, and how to use the group.

Conductor: learned natural-language orchestration

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

The Conductor paper focuses on training a model with reinforcement learning to discover agent coordination strategies. Instead of hard-coding a fixed agent workflow, Conductor learns communication patterns, worker prompts, and recursive structures that help a pool of models outperform individual workers on difficult reasoning benchmarks.

This is why Sakana calls Fugu an orchestration model rather than a classic router. A router typically picks a model. A conductor can design a mini-workflow.

Why this is different from a normal multi-agent app

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Traditional multi-agent systems often require developers to write the system by hand:

choose the models;
write agent roles;
decide when agents talk to each other;
add critique and retry loops;
build verification steps;
manage tool use;
choose a final answer.

Fugu tries to make that invisible from the outside. You call one API, and the orchestration happens inside the model product.

That is the part developers are reacting to. If it works reliably, it reduces the engineering burden of building agent systems from scratch.

Sakana Fugu benchmark claims

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana published benchmark results comparing Fugu and Fugu Ultra with frontier baselines including Opus 4.8, Gemini 3.1 Pro, and GPT-5.5. Sakana also compares against Fable 5 and Mythos Preview in some charts, while saying those models are not in Fugu’s agent pool because they are not publicly accessible.

Here are the main scores from the Sakana Fugu product page and technical report:

Benchmark	Fugu	Fugu Ultra	Opus 4.8	Gemini 3.1 Pro	GPT-5.5
SWE Bench Pro	59.0	73.7	69.2	54.2	58.6
TerminalBench 2.1	80.2	82.1	74.6	70.3	78.2
LiveCodeBench	92.9	93.2	87.8	88.5	85.3
LiveCodeBench Pro	87.8	90.8	84.8	82.9	88.4
Humanity’s Last Exam	47.2	50.0	49.8	44.4	41.4
CharXiv Reasoning	85.1	86.6	84.2	83.3	84.1
GPQA-D	95.5	95.5	92.0	94.3	93.6
SciCode	60.1	58.7	53.5	58.9	56.1
τ³ Banking	21.7	20.6	20.6	8.4	20.6
Long Context Reasoning	74.7	73.3	67.7	72.7	74.3
MRCRv2	86.6	93.6	87.9	84.9	94.8

Read this table carefully. It is useful early signal, but it is not a final verdict.

Sakana notes that the non-Fugu baseline scores are provider-reported. The Fugu results are Sakana’s own evaluation. That does not make them useless, but it does mean buyers should wait for independent replication, especially for production decisions.

The more interesting signal is not any single score. It is that learned orchestration may add value on tasks where planning, verification, coding loops, and multi-step decomposition matter.

Why Fugu matters

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu is interesting because it turns an engineering pattern into a product.

For the past two years, many advanced AI teams have been building internal systems that look like this:

send the prompt to one strong model;
ask another model to critique or verify;
route coding subtasks to a coding-specialized model;
use a cheaper model for summarization;
run tools or tests;
ask a final model to synthesize the answer.

That works, but it can be expensive to build and maintain. Every time a new model launches, teams often need to retest routing logic. Every time a provider changes pricing, latency, policy, or availability, the workflow may need tuning.

Fugu says: let the orchestration model learn that.

If this approach keeps improving, the AI stack may split into three layers:

Layer	What it does	Example
Base models	Generate, reason, code, classify, search	Claude, GPT, Gemini, open models
Orchestrators	Decide which model or agent should do which work	Sakana Fugu, learned routers, agent conductors
Apps and workflows	Package AI into a user-facing job	coding agents, research assistants, analysts, customer-support systems

That is why Fugu is not only a model launch. It is a bet on where value moves next.

Pricing: how much does Sakana Fugu cost?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana lists both pay-as-you-go and subscription pricing.

Fugu pay-as-you-go pricing

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

For the standard Fugu model, pricing depends on the active agent pool:

If one agent is active, you pay the standard rate for that specific underlying model.
If multiple agents are active, Sakana says it does not stack model fees. You pay a single rate based on the top-tier model involved.

That is useful because naive multi-agent systems can become expensive fast. If every internal agent call is billed separately at full price, costs can explode. Sakana’s stated pricing model is meant to make orchestration more predictable.

Fugu Ultra pricing

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

For fugu-ultra-20260615, Sakana lists fixed pricing per 1 million tokens:

Token type	Standard price	Context over 272K
Input	$5	$10
Output	$30	$45
Cached input	$0.50	$1.00

One caveat: Sakana’s pricing page says Fugu Ultra usage fields separate visible model work from orchestration work, and orchestration tokens still count toward final pricing. In plain English: the final answer is not the only token cost. The hidden orchestration can add billable usage.

Subscription plans

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana’s product page also lists monthly subscription plans:

Plan	Price	Fit
Standard	$20/month	Occasional API calls, small experiments, personal workflows
Pro	$100/month	Regular coding, review, research, and analysis sessions
Max	$200/month	Heavier long-running workloads

Sakana says every subscription tier includes both Fugu and Fugu Ultra, with higher tiers offering more usage. It also lists a promotion for users subscribing before the end of July 2026.

Pricing and cost checklist

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

If you are comparing Sakana Fugu price, Sakana Fugu cost, or Fugu Ultra pricing, do not look only at the final answer length. Track four things in a real test:

Input tokens — especially for long codebases, papers, or document sets.
Output tokens — reports and code reviews can become long.
Cached input tokens — useful if your workflow reuses the same context.
Orchestration usage — Fugu Ultra may do work behind the visible response, and Sakana says orchestration tokens can count toward pricing.

A safe evaluation should compare cost per completed task, not just price per million tokens. For example, if one model is cheaper per token but fails more often, the apparent savings may disappear after retries and human review. Conversely, if Fugu Ultra is slower or uses more orchestration than expected, a cheaper single-model workflow may be easier to justify.

How to use Sakana Fugu

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana says Fugu is available through an OpenAI-compatible API, so existing clients can point to the Fugu endpoint without a full SDK migration. The GitHub README says the API supports both Chat Completions and Responses endpoints.

For Codex users, Sakana also provides a one-line installer in the Fugu repository:

curl -fsSL https://sakana.ai/fugu/install | bash

Then:

codex-fugu

As with any curl | bash installer, you should review the script and repository before running it on a machine with sensitive credentials. The convenience is real, but so is the security responsibility.

Sakana Fugu with Codex, Cursor, and coding tools

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

The search data already shows interest in Codex Fugu, codex-fugu, and Sakana Fugu Cursor. These are related but not identical workflows:

Tool/query	What to know
`codex-fugu`	Sakana’s GitHub README lists this launcher after the one-line Codex install.
Sakana Fugu Codex	Good fit if you want Fugu inside a terminal coding-agent workflow.
Sakana Fugu Cursor	Possible only if your Cursor setup supports the required custom OpenAI-compatible endpoint and model ID; validate with a small test first.
Sakana Fugu Claude Code	Not a direct Claude Code model swap. Claude Code is Anthropic-oriented, while Fugu is exposed through Sakana’s OpenAI-compatible API.

For the practical setup-focused page, see Sakana Fugu with Codex and Cursor.

Is Sakana Fugu open source?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana has a public SakanaAI/fugu GitHub repository, but that does not mean the full Fugu service or all underlying model orchestration is open source. The repository includes the README, installer/config materials, documentation, assets, and the technical report. The hosted Fugu system itself is a Sakana API product.

A practical way to phrase it:

GitHub repo: public.
Technical report: public.
Installer/config tooling: public in the repo.
Hosted Fugu model/orchestration service: proprietary API product.
Exact worker routing: not exposed to users in normal API usage.

This distinction matters for procurement and compliance. A public GitHub repo can make setup easier to inspect, but it does not provide full visibility into the hosted model pool or routing decisions.

What does “Fugu” mean?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

“Fugu” commonly refers to pufferfish in Japanese. In this product context, Sakana Fugu is simply the name of Sakana AI’s multi-agent model orchestration product. It should not be confused with unrelated uses of “Fugu” in food, biology, or other software projects.

Where Fugu could be useful

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu is most compelling when the task benefits from multiple viewpoints or staged verification.

Good use cases include:

code review, where one model can inspect architecture while another checks bugs and edge cases;
software engineering agents, where the system needs to plan, edit, test, and revise;
research reports, where retrieval, contradiction checking, and synthesis matter;
paper reproduction, where models need to read, implement, run, debug, and interpret;
cybersecurity assessment, when work is scoped, authorized, and evidence-driven;
patent and literature analysis, where recall, structure, and judgment all matter;
Kaggle-style data science, where search, experiments, and validation loops are useful;
long-context reasoning, where a system may need to break the problem into smaller passes.

Fugu is less necessary for simple tasks:

rewriting one email;
summarizing a short article;
translating a paragraph;
generating a basic list;
answering an easy factual question.

For those jobs, a single fast model is usually cheaper and simpler.

The caveats: what to watch before trusting Fugu

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu is interesting, but it introduces new review questions.

1. It is not fully transparent

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana says users cannot see which exact underlying models Fugu used for each query. The routing and coordination are proprietary by design.

That may be acceptable for many workflows. But regulated teams may need detailed logs, model provenance, and data-flow visibility. If you need to prove which model saw which data, hidden orchestration is a serious governance question.

2. Fugu Ultra’s pool is fixed

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana says standard Fugu lets users opt out of specific models from the console settings. Fugu Ultra, however, relies on the full agent pool to deliver maximum performance, so its pool is fixed.

That means privacy and compliance teams may prefer standard Fugu even if Ultra scores higher.

3. Cost can include orchestration usage

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu Ultra’s visible answer may be short while the internal orchestration is long. Sakana says orchestration tokens count toward billing. For long, hard tasks, you need usage monitoring rather than assuming cost from the final output alone.

4. Benchmark results need independent validation

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana’s published numbers are impressive. They are also vendor-published. Treat them as a reason to test Fugu, not as a reason to skip your own evaluation.

A good internal benchmark should include:

your real prompts;
your real documents or repos;
latency tracking;
token-cost tracking;
failure-mode review;
human evaluation;
comparison with your current model or agent stack.

5. EU/EEA availability is restricted

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana says Fugu is not currently available in the EU/EEA while the company works toward GDPR and EU-specific compliance. It is available from outside Japan, but local regulations and network conditions may affect access.

Sakana Fugu review scorecard

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

This is a source-based launch review, not a private hands-on benchmark. The verdict below is based on Sakana’s official product pages, technical report, pricing page, GitHub materials, and the current Anthropic Fable 5 access context.

Review category	Assessment	Why
Performance potential	Promising, needs validation	Sakana’s published Fugu Ultra numbers are strong across coding, reasoning, science, and agentic benchmarks, but they still need independent replication.
Ease of integration	Favorable	OpenAI-compatible API support and the Codex installer may lower migration friction for developers.
Fable 5 alternative value	Context-dependent	Fugu is not the same as Fable 5, but it may be worth evaluating when Fable 5 access is unavailable or when single-provider dependency is a concern.
Cost clarity	Needs workload testing	Token pricing is published, but orchestration tokens can add billable usage, so teams need real workload tests.
Transparency	Limited	Fugu hides exact model routing and coordination details, which is convenient for developers but may be difficult for audit-heavy teams.
Governance and compliance	Mixed	Standard Fugu offers some agent opt-out controls, but Fugu Ultra uses a fixed pool and EU/EEA access is restricted.
Provider flexibility	Potential advantage	The main strategic argument for Fugu is the possibility of reducing exposure to changes in model availability, policy, pricing, or provider access.

Overall review: Sakana Fugu is one of the more notable AI infrastructure launches of 2026 because it addresses the model-selection problem directly. If your team only needs one high-quality model call, Fable 5, Opus 4.8, GPT-5.5, or Gemini may be simpler. If your team is already building routers, critic loops, multi-agent scaffolds, or model fallback logic, Fugu deserves a controlled evaluation.

Sakana Marlin: the other product to watch

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu is the developer-facing orchestration story. Sakana Marlin is the business-research story.

Sakana Marlin is positioned as a “virtual CSO” for ultra-deep strategy research. Instead of generating a quick chat response, it can run up to eight hours of autonomous reasoning: forming hypotheses, gathering information, browsing the web, checking contradictions, and producing detailed reports and slides.

The product is aimed at executive research and strategy work: market maps, policy changes, financial scenarios, regulation, industrial shifts, and similar topics where the value is not just summarization but decision-ready structure.

The connection to Fugu is obvious. Sakana is turning long-horizon agent research into products:

Marlin packages autonomous research for business users.
Fugu packages multi-agent orchestration for developers.

Both assume that serious AI work is not a single chat turn. It is a process.

The AI Scientist and Darwin Gödel Machine: why Sakana’s research matters

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Fugu also makes more sense if you look at Sakana’s earlier research.

The AI Scientist

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana’s AI Scientist-v2 paper describes an end-to-end agentic system that can formulate hypotheses, design and execute experiments, analyze data, create figures, and write manuscripts. Sakana says one fully autonomous manuscript exceeded the average human acceptance threshold in an ICLR workshop review process.

On March 26, 2026, Sakana announced that the broader AI Scientist work was published in Nature. The company also openly lists limitations: weak ideas, methodological gaps, hallucinated citations, mistakes, and the broader risk of overwhelming scientific review systems.

That balance is important. The system is impressive, but it is not magic. It automates parts of the scientific workflow and exposes both the promise and the governance problem.

Darwin Gödel Machine

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

The Darwin Gödel Machine paper describes a self-improving coding-agent system that iteratively modifies its own code and empirically validates changes on coding benchmarks. The reported results improved from 20.0% to 50.0% on SWE-bench and from 14.2% to 30.7% on Polyglot, with safety precautions such as sandboxing and human oversight.

This fits Sakana’s broader theme: use evolution, search, feedback, and agent loops to improve AI systems without relying only on bigger base-model training runs.

How Sakana AI could affect OpenAI, Anthropic, and Google

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Not in a simple “new chatbot displaces old chatbot” sense.

Sakana’s strategy is different. It depends on the existence of many strong models. Fugu becomes more valuable if the model ecosystem remains diverse: some models may be stronger at coding, some at math, some at long context, some at cost efficiency, some at regional availability, and some at Japanese language or enterprise constraints.

The competitive implication is more subtle:

If orchestration becomes a common interface, users may care less which base model is underneath.
If a learned conductor can swap providers, single-vendor lock-in becomes weaker.
If agent systems outperform single calls, foundation-model companies may need to compete at the workflow layer, not just the base-model layer.
If sovereign AI buyers want resilience, a model-agnostic orchestrator becomes strategically attractive.

That does not mean base models stop mattering. Fugu still needs strong workers. But it suggests that part of the next AI platform competition may shift toward who controls the coordination layer.

When should you evaluate Sakana Fugu?

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Evaluate Fugu if you already spend time manually comparing models, building agent workflows, or routing hard tasks between tools.

Fugu is especially relevant to test if you:

run code reviews or coding agents;
need more reliable results on hard reasoning tasks;
want one API that can use multiple models;
care about reducing single-vendor dependency;
have long research or analysis workflows;
are building products where model choice changes by task.

Wait or test cautiously if you:

need full model-level audit logs;
operate in the EU/EEA;
cannot tolerate opaque routing;
need deterministic latency;
have strict data-residency or model-provider rules;
already have a mature internal router that works well.

The safest practical move is to benchmark it against your own workflow. Do not ask whether Fugu is universally superior in the abstract. Ask whether it changes outcomes on your hardest 20 prompts enough to justify cost, latency, and governance tradeoffs.

Bottom line

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

Sakana AI is trending because Fugu turns a real pattern in advanced AI use into a product: do not bet everything on one model; coordinate many models intelligently.

That may sound obvious, but productizing it as a single OpenAI-compatible API is the important step. If Fugu works as advertised, developers get a simpler path to multi-agent intelligence without hand-building the whole orchestration layer.

The caveat is that this is still an early, proprietary, vendor-evaluated system. The benchmark claims are promising, not conclusive. The hidden routing is convenient, but it creates governance questions. Fugu Ultra may be useful for hard workflows, but its orchestration tokens and latency need real testing.

Still, the direction is worth watching. Sakana’s bet is that future AI capability may come not only from bigger models, but also from models that know how to use other models.

Sources checked

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

FAQ

Try it in practice Make this section actionable Practice the workflow instead of only comparing tools.

What is Sakana AI Fugu?

Sakana AI Fugu is a multi-agent orchestration system delivered through a single OpenAI-compatible model API. You send one request, and Fugu can route, delegate, verify, and synthesize work across a pool of AI agents behind the scenes.

Is Fugu a single AI model or a router?

Fugu behaves like one model from the user’s perspective, but it is more accurately understood as a learned orchestrator. The standard Fugu model is optimized for lower-latency routing, while Fugu Ultra can coordinate multiple agents for harder tasks.

What is the difference between Fugu and Fugu Ultra?

Fugu is designed for everyday work where latency matters. Fugu Ultra is designed for maximum answer quality on complex, multi-step tasks and uses deeper orchestration over a larger expert-agent pool.

What is Fugu Ultra?

Fugu Ultra is Sakana’s quality-first Fugu variant for complex, multi-step tasks. Sakana positions it for harder research, coding, paper reproduction, patent/literature analysis, and similar workflows where answer depth matters more than latency.

Is Sakana Fugu open source?

Sakana has a public GitHub repository for Fugu materials, including setup docs and the technical report, but the hosted Fugu orchestration service is a proprietary API product. The exact worker routing and full model pool are not exposed as open-source code.

Does Sakana Fugu work with Codex or Cursor?

Sakana’s GitHub README lists a codex-fugu launcher for Codex after installation. Cursor usage depends on whether your Cursor setup supports the needed custom OpenAI-compatible endpoint and model ID; teams should verify with a small non-sensitive test before using it on real repositories.

How does Sakana Fugu compare with Claude Fable 5?

Sakana Fugu is an orchestrator that can coordinate multiple models, while Claude Fable 5 is a single Anthropic Mythos-class model. Fugu may be more relevant when you want multi-provider flexibility or automated routing. Fable 5 may be simpler when you want one known Anthropic model and access is available.

Can Fugu Ultra be an alternative to Claude Fable 5?

Fugu Ultra may be an alternative for some coding, research, and reasoning workflows, especially when Fable 5 access is limited or unavailable. But it should not be treated as a direct one-to-one substitute because Fugu Ultra hides its underlying model routing and bills orchestration tokens, while Fable 5 has a clearer single-model identity.

How much does Sakana Fugu cost?

Sakana lists Fugu Ultra at $5 per 1M input tokens, $30 per 1M output tokens, and $0.50 per 1M cached input tokens, with higher rates for context above 272K. Subscription plans are listed at $20, $100, and $200 per month. Standard Fugu pay-as-you-go pricing depends on the active underlying agent pool.

Can I use Sakana Fugu with OpenAI-compatible tools?

Yes. Sakana says Fugu is available through an OpenAI-compatible API and supports Chat Completions and Responses endpoints. Sakana also provides a Codex installer in its GitHub repository.

Is Sakana Fugu available in the EU?

Sakana says Fugu is not available in the EU/EEA while it works toward GDPR and EU-specific compliance. Teams should check the current terms and availability before planning a rollout.

Are Sakana Fugu’s benchmark results independently verified?

The published benchmark numbers are Sakana’s own June 2026 evaluation, and the baseline scores are described as provider-reported. Treat them as useful early evidence, but validate Fugu on your own workloads before making production decisions.

Why is Sakana AI important?

Sakana AI is important because it is pushing a different path from pure scale: automated research, self-improving agents, learned model orchestration, and AI products built around collective intelligence. Fugu is the clearest commercial version of that strategy so far.

What is Sakana AI?#

Why Sakana AI is getting attention right now#

What is Sakana Fugu?#

Is Fugu AI an LLM?#

Sakana Fugu vs Claude Fable 5: quick comparison#

Fugu vs Fugu Ultra#

What is Fugu Ultra?#

How Fugu works under the hood#

TRINITY: an evolved coordinator#

Conductor: learned natural-language orchestration#

Why this is different from a normal multi-agent app#

Sakana Fugu benchmark claims#

Why Fugu matters#

Pricing: how much does Sakana Fugu cost?#

Fugu pay-as-you-go pricing#

Fugu Ultra pricing#

Subscription plans#

Pricing and cost checklist#

How to use Sakana Fugu#

Sakana Fugu with Codex, Cursor, and coding tools#

Is Sakana Fugu open source?#

What does “Fugu” mean?#

Where Fugu could be useful#

The caveats: what to watch before trusting Fugu#

1. It is not fully transparent#

2. Fugu Ultra’s pool is fixed#

3. Cost can include orchestration usage#

4. Benchmark results need independent validation#

5. EU/EEA availability is restricted#

Sakana Fugu review scorecard#

Sakana Marlin: the other product to watch#

The AI Scientist and Darwin Gödel Machine: why Sakana’s research matters#

The AI Scientist#

Darwin Gödel Machine#

How Sakana AI could affect OpenAI, Anthropic, and Google#

When should you evaluate Sakana Fugu?#

Bottom line#

Sources checked#

FAQ#

What is Sakana AI Fugu?

Is Fugu a single AI model or a router?

What is the difference between Fugu and Fugu Ultra?

What is Fugu Ultra?

Is Sakana Fugu open source?

Does Sakana Fugu work with Codex or Cursor?

How does Sakana Fugu compare with Claude Fable 5?

Can Fugu Ultra be an alternative to Claude Fable 5?

How much does Sakana Fugu cost?

Can I use Sakana Fugu with OpenAI-compatible tools?

Is Sakana Fugu available in the EU?

Are Sakana Fugu’s benchmark results independently verified?

Why is Sakana AI important?

Coursiv

Related AI guides

Best AI Agents for Business in 2026: Top Tools by Use Case

Best AI Coding Agents in 2026: Top Tools by Use Case

Claude Fable 5: Price, API, Access, Safeguards & Use Cases

Claude Fable 5 vs Opus 4.8: Benchmarks, Pricing & When to Switch

What is Sakana AI?

Why Sakana AI is getting attention right now

What is Sakana Fugu?

Is Fugu AI an LLM?

Sakana Fugu vs Claude Fable 5: quick comparison

Fugu vs Fugu Ultra

What is Fugu Ultra?

How Fugu works under the hood

TRINITY: an evolved coordinator

Conductor: learned natural-language orchestration

Why this is different from a normal multi-agent app

Sakana Fugu benchmark claims

Why Fugu matters

Pricing: how much does Sakana Fugu cost?

Fugu pay-as-you-go pricing

Fugu Ultra pricing

Subscription plans

Pricing and cost checklist

How to use Sakana Fugu

Sakana Fugu with Codex, Cursor, and coding tools

Is Sakana Fugu open source?

What does “Fugu” mean?

Where Fugu could be useful

The caveats: what to watch before trusting Fugu

1. It is not fully transparent

2. Fugu Ultra’s pool is fixed

3. Cost can include orchestration usage

4. Benchmark results need independent validation

5. EU/EEA availability is restricted

Sakana Fugu review scorecard

Sakana Marlin: the other product to watch

The AI Scientist and Darwin Gödel Machine: why Sakana’s research matters

The AI Scientist

Darwin Gödel Machine

How Sakana AI could affect OpenAI, Anthropic, and Google

When should you evaluate Sakana Fugu?

Bottom line

Sources checked

FAQ