Last week a colleague dropped a link in our team Slack: "Have you seen this? A chip the size of a dinner plate." My first reaction was a laugh. A chip the size of a dinner plate? Then I looked at the specs and stopped laughing. 4 trillion transistors. 900,000 AI cores. The H100 has about 18,000 cores — this thing claims 50x that count.
This is Cerebras's WSE-3 (Wafer-Scale Engine 3). And the company is targeting an IPO in Q2 2026, potentially as early as April.
TL;DR
- Cerebras WSE-3 is the world's largest AI chip — 5nm process, 4 trillion transistors, 900,000 AI cores
- Claims 21 PB/s memory bandwidth, roughly 2,600x the B200's 8 TB/s HBM bandwidth (a caveated comparison; see below)
- IPO targeting ~$7–8B valuation, NASDAQ ticker CBRS
- Still difficult to compete with NVIDIA in training, but emerging as a genuine competitor in the inference market
- Revenue concentration risk: 80%+ from a single customer (G42)
What Is Wafer-Scale, Exactly?
Photo by Daniel Pantu on Unsplash | The WSE-3 isn't one chip on a wafer — the entire wafer IS the chip
Conventional semiconductors are manufactured by printing hundreds of chips onto a silicon wafer, then cutting them apart. The H100, AMD's MI300 — all follow this process. Cerebras broke with that convention: they made the entire wafer into a single massive chip.
The result is WSE-3. Manufactured on TSMC's 5nm process, roughly 21cm on each side — literally dinner plate dimensions.
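One way to sanity-check the headline numbers: if the whole wafer really is the chip, the silicon area ratio against an H100 die should roughly match the transistor ratio. A quick back-of-envelope sketch (the H100 die area is NVIDIA's published ~814 mm²; the WSE-3 side length is approximate):

```python
# Rough sanity check that "one chip = one wafer" explains the transistor count.
WSE3_AREA_MM2 = 215 * 215   # ~21.5 cm per side -> ~46,225 mm^2 (approximate)
H100_DIE_MM2 = 814          # NVIDIA's published GH100 die area

print(f"Area ratio:       ~{WSE3_AREA_MM2 / H100_DIE_MM2:.0f}x")  # ~57x
print(f"Transistor ratio: ~{4e12 / 80e9:.0f}x")                   # 50x
```

Roughly 57x the silicon at a comparable process node lines up with 50x the transistors: the count follows directly from the form factor, not from a denser process.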
| Spec | Cerebras WSE-3 | NVIDIA H100 | NVIDIA B200 |
|---|---|---|---|
| Transistors | 4 trillion | 80 billion | 208 billion |
| Cores | 900,000 (AI cores) | 18,432 (CUDA cores) | ~20,000 (CUDA cores) |
| Memory | 44GB on-chip SRAM | 80GB HBM3 | 192GB HBM3e |
| Memory bandwidth | 21 PB/s | 3.35 TB/s | 8 TB/s |
| Process | 5nm | 4nm | 4nm |
The numbers look overwhelming. But there's an important caveat.
"Fast" and "Better" Aren't the Same Thing
I covered NVIDIA's Vera Rubin announcement previously, and the same lesson applies here: chip spec comparisons are full of traps.
WSE-3's 21 PB/s is on-chip SRAM bandwidth; NVIDIA's headline numbers measure HBM, a fundamentally different kind of memory. SRAM is small but extremely fast; HBM is large but relatively slower. Comparing the two figures directly is apples to oranges.
That said, Cerebras's advantage in inference speed is real and significant. Its cloud inference service claims 2,000+ tokens per second. For reference, the ChatGPT or Claude experience most people have feels like roughly 50–100 tokens per second. That's a 20–40x difference.
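Why does memory bandwidth translate into token speed at all? Autoregressive decoding is memory-bound: each new token requires streaming roughly the full weight set past the compute units, so per-stream tokens/sec is capped near bandwidth divided by model size. A minimal sketch of that ceiling, using a hypothetical 70B-parameter FP16 model and treating the headline bandwidth figures as ideal:

```python
# Ideal single-stream decode ceiling: tokens/sec <= bandwidth / bytes_per_token.
# All inputs are illustrative assumptions, not measured numbers.

MODEL_BYTES = 70e9 * 2  # hypothetical 70B-parameter model in FP16 (2 bytes/param)

chips = {
    "WSE-3 SRAM (21 PB/s)":   21e15,
    "B200 HBM3e (8 TB/s)":    8e12,
    "H100 HBM3 (3.35 TB/s)":  3.35e12,
}

for name, bw in chips.items():
    ceiling = bw / MODEL_BYTES  # upper bound for one decode stream
    print(f"{name}: ~{ceiling:,.0f} tokens/s ceiling")
```

Real deployments land well below these ceilings (KV-cache traffic, batching overhead, and the fact that a 140GB model exceeds 44GB of SRAM and must be sharded across wafers all intervene), but the ratio shows why an SRAM-first design can post four-digit per-user token rates while GPUs lean on batching for aggregate throughput.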
The IPO: Why Now?
Photo by Bill Fairs on Unsplash | AI chip market competition is reshaping IPO market dynamics
The Cerebras IPO timeline has been complicated:
- September 2024: Filed confidential S-1 with SEC
- October 2024: CFIUS (Committee on Foreign Investment in the United States) review puts it on hold
- December 2025: Reuters confirms IPO plans are back on track
- Q2 2026: Current target, potentially April
The 2024 delay was caused by G42 — a UAE-based AI firm that was both Cerebras's largest customer and an investor. The problem: G42 had prior relationships with Huawei and other Chinese tech companies. The US government raised national security concerns, CFIUS launched a review, and the IPO stalled.
By late 2025, G42 had divested its stake, CFIUS approval came through, and the path cleared.
Estimated valuation: $7–8 billion (roughly 10 trillion KRW). Total funding raised to date: $720 million. The last round (Series F) valued the company at $4.7 billion; an IPO at the top of that range would nearly double it.
What This Means for Developers
Honestly, for most developers Cerebras is still somewhat abstract. We rent NVIDIA GPU instances on AWS, Azure, or GCP — that's the day-to-day reality. But here's why it matters:
1. Inference costs could shift
Today's LLM API pricing is ultimately anchored to the cost of running NVIDIA GPU clusters. When inference-specialized chips like Cerebras enter the market at scale, API price competition follows. I covered OpenAI's infrastructure costs in an IPO risk analysis — compute cost is a core profitability variable.
2. Hardware diversity = more developer options
AI development has been locked into NVIDIA's CUDA ecosystem. AMD ROCm, Intel Gaudi, and now Cerebras are slowly opening alternatives. Competition improves tools and frameworks.
3. On-device vs. cloud inference calculus changes
If 2,000 tokens/second inference becomes commercially available in the cloud, the economics of on-device AI investment versus cloud API usage shift meaningfully; a toy break-even sketch follows this list.
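To make point 3 concrete, here is that toy break-even model. Every number in it (API price, hardware cost, device lifetime) is a hypothetical assumption, not a quote:

```python
# Toy break-even: cumulative cloud API spend vs. a one-time on-device
# hardware investment. All prices and rates are hypothetical assumptions.

CLOUD_PRICE_PER_MTOK = 0.60   # $ per 1M output tokens (assumed API price)
DEVICE_COST = 500.0           # $ incremental hardware for local inference
LIFETIME_DAYS = 3 * 365       # assumed useful life of the device

breakeven_tokens = DEVICE_COST / CLOUD_PRICE_PER_MTOK * 1e6
per_day = breakeven_tokens / LIFETIME_DAYS

print(f"Break-even: ~{breakeven_tokens / 1e9:.2f}B tokens total")
print(f"          = ~{per_day / 1e6:.2f}M tokens/day over the device's life")
```

The direction of the sensitivity is the point: if inference-specialized silicon pushes the cloud price per token down, the break-even volume climbs and the case for buying local hardware weakens.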
The Risks — Assessed Honestly
Photo by Ryan on Unsplash | Semiconductor markets aren't decided by technology alone
Revenue concentration is a serious problem. As of 2023, over 80% of hardware revenue came from a single customer: G42. For a company seeking public market confidence, that level of customer concentration is a material risk. A single contract change could crater revenue.
The CUDA moat is deep. NVIDIA's strength isn't just chip performance — it's the hundreds of thousands of developers who write CUDA code and the entire PyTorch/TensorFlow ecosystem built on top of it. No matter how fast WSE-3 is, software ecosystem lag will slow adoption.
Training market remains NVIDIA's domain. WSE-3's advantages show up in inference. Large-scale model training is still standardized on NVIDIA GPU clusters, and that's not changing soon.
| Dimension | Strength | Risk |
|---|---|---|
| Technology | Dominant inference speed, unique wafer-scale design | Training market penetration is limited |
| Market | AI inference demand is exploding | 80%+ revenue from G42 |
| Ecosystem | CS-3 systems + cloud service | Developer ecosystem thin vs. CUDA |
| IPO | CFIUS cleared, timeline defined | Tech IPO market uncertainty |
| Valuation | $8B reasonable relative to AI chip market | May be high relative to revenue scale |
DARPA Is Paying Attention
One notable signal: in April 2025, Cerebras received a $45 million DARPA contract, partnering with Ranovus on optical interconnect-based AI acceleration research. The US government is seriously looking for NVIDIA alternatives — and funding them.
This isn't just a startup-with-an-impressive-chip story. It's one piece of a larger strategic effort to diversify AI infrastructure supply chains.
So — Is It Worth Watching?
Short answer: interested, not excited yet.
WSE-3 is genuinely impressive technology. It's one of the few chips capable of making a real dent in NVIDIA's inference market dominance. A successful IPO could begin shifting the competitive structure of AI chips.
But the G42 revenue dependence, the CUDA ecosystem gap, and the absence of large-scale commercial deployment track record make "NVIDIA killer" premature. My read: watch the first 2–3 post-IPO earnings reports before forming a strong view.
For developers, one thing is clear: the more competitive the AI chip market becomes, the lower the API prices we pay. Whether it's Cerebras, AMD, or someone else, any serious challenge to NVIDIA's near-monopoly is worth rooting for.
Have you used any non-NVIDIA AI chip solutions in your work? Drop your experience in the comments.
References
- AI chipmaker Cerebras Systems rekindles IPO plans, targeting early 2026 listing — SiliconANGLE, December 21, 2025
- Cerebras IPO Countdown: Prepare to Invest in April 2026 — AccessIPOs, March 2026
- Cerebras IPO: Why This Nvidia Rival Could Go Public in 2026 — MarketWise, March 2026
- When Is the Cerebras IPO? Date, Valuation, and More — EBC Financial Group, September 2025
Related reading:
- OpenAI IPO Risk Filing Breakdown: The Paradox of 'Your Biggest Partner Is Your Biggest Risk' - OpenAI's IPO risks and the reality of AI company listings
- AI Data Centers Get $1 Trillion: What NVIDIA Vera Rubin and the Infrastructure Investment Boom Mean for Developers - NVIDIA's next-gen chips and the AI infrastructure investment wave