OpenAI and Broadcom Launch Jalapeño — A Custom AI Chip Built From the Ground Up for LLM Inference

OpenAI has taken a major step toward building its own technology stack by unveiling Jalapeño, its first custom-designed AI accelerator. Developed in close collaboration with semiconductor company Broadcom and manufacturing partner Celestica, Jalapeño is designed specifically for large language model (LLM) inference — the process that powers every response a user gets from tools like ChatGPT, Codex, and the OpenAI API.

What Is Jalapeño and Why Does It Matter

Unlike general-purpose AI chips that have been adapted over time for machine learning tasks, Jalapeño was built from a blank slate with one specific goal: to serve LLM inference workloads as efficiently as possible. OpenAI refers to it as an “Intelligence Processor” rather than a standard AI accelerator, signaling how purpose-specific this chip really is.

The chip’s architecture is designed to reduce unnecessary data movement while carefully balancing compute, memory, and networking resources. This allows Jalapeño to reach utilization rates much closer to the hardware’s theoretical peak — something that even current high-end AI chips struggle to achieve consistently. Early testing indicates the chip will deliver performance per watt substantially better than existing state-of-the-art solutions, though OpenAI says detailed benchmark figures will be published in the coming months.
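To make the two metrics in this section concrete, here is a minimal Python sketch of how utilization (sustained throughput as a fraction of theoretical peak) and performance per watt are typically computed. Every number and name below is hypothetical and chosen only to show the arithmetic; none of it comes from OpenAI or Broadcom, which have not yet published benchmark figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorSample:
    """One measurement of an accelerator under load (all values hypothetical)."""
    name: str
    peak_tflops: float      # theoretical peak compute
    achieved_tflops: float  # compute sustained on a real workload
    power_watts: float      # power drawn during the measurement


def utilization(sample: AcceleratorSample) -> float:
    """Fraction of theoretical peak actually sustained (0.0 to 1.0)."""
    return sample.achieved_tflops / sample.peak_tflops


def perf_per_watt(sample: AcceleratorSample) -> float:
    """Sustained TFLOPS delivered per watt of power drawn."""
    return sample.achieved_tflops / sample.power_watts


# Made-up numbers: these are NOT measurements of any real chip.
general = AcceleratorSample("general-purpose", peak_tflops=1000.0,
                            achieved_tflops=400.0, power_watts=800.0)
specialized = AcceleratorSample("inference-specific", peak_tflops=800.0,
                                achieved_tflops=600.0, power_watts=600.0)

for s in (general, specialized):
    print(f"{s.name}: utilization={utilization(s):.2f}, "
          f"TFLOPS/W={perf_per_watt(s):.2f}")
```

In this made-up comparison the specialized part has a lower theoretical peak but sustains more of it, which is the kind of trade-off the article describes: less idle hardware and less energy spent moving data can matter more than headline peak numbers.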

Nine Months From Design to Tape-Out — With AI’s Help

One of the most striking aspects of Jalapeño’s development is how quickly it was completed. The chip moved from initial design to manufacturing tape-out in just nine months, which OpenAI and Broadcom describe as one of the fastest ASIC development cycles ever achieved for a high-performance AI semiconductor.

That speed was partly the result of OpenAI using its own AI models to accelerate portions of the chip design and optimization process. Engineering samples are already running machine learning workloads, including the company’s GPT-5.3-Codex-Spark model, in OpenAI’s labs at production target frequencies and power levels.

Broadcom contributed its silicon implementation expertise alongside high-speed networking technologies — including its Tomahawk networking silicon — while Celestica handled board design, rack integration, and production systems.

How Jalapeño Fits Into OpenAI’s Full-Stack Strategy

For years, OpenAI — like most AI companies — depended almost entirely on third-party hardware, primarily from NVIDIA. Jalapeño signals a clear shift in that approach. OpenAI now intends to design and control more of the infrastructure that sits beneath its models and products.

Greg Brockman, President and Co-Founder of OpenAI, explained the reasoning directly:

“The world is moving to a compute-powered economy. Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems.”

Richard Ho, who leads OpenAI’s hardware program, added that the chip was designed around the specific kernels, memory movement patterns, networking requirements, and serving behaviors that matter most for frontier AI models.

The chip is planned for gigawatt-scale deployment across data centers operated by partners including Microsoft, with rollout set to begin by the end of 2026.

What Jalapeño Means for ChatGPT and OpenAI Users

The practical impact of Jalapeño will show up in how people experience OpenAI’s products. Inference is the stage where AI actually generates responses for real users — every message sent to ChatGPT, every task run through Codex, and every API call made by developers. Improving inference efficiency directly translates to:

  • Faster response times in ChatGPT and other AI tools
  • Lower costs for developers using the OpenAI API
  • More reliable performance during periods of high demand
  • Greater ability to support future agentic AI products that handle multi-step tasks
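The link between efficiency and cost in the list above can be illustrated with a rough back-of-the-envelope calculation. The Python sketch below estimates only the electricity cost of generating one million tokens; the function name, power draw, throughput, and electricity price are all hypothetical, and real serving costs also include hardware, cooling, networking, and staffing.

```python
def energy_cost_per_million_tokens(power_watts: float,
                                   tokens_per_second: float,
                                   usd_per_kwh: float) -> float:
    """Electricity cost (USD) to generate one million tokens.

    Covers energy only; every input is an illustrative assumption.
    """
    seconds = 1_000_000 / tokens_per_second
    # watts * seconds gives joules; 3.6 million joules make one kWh.
    kilowatt_hours = power_watts * seconds / 3_600_000
    return kilowatt_hours * usd_per_kwh


# Hypothetical server: 1 kW draw, 10,000 tokens/s, $0.10 per kWh.
baseline = energy_cost_per_million_tokens(1000.0, 10_000.0, 0.10)

# Same power and price, but twice the throughput.
improved = energy_cost_per_million_tokens(1000.0, 20_000.0, 0.10)

print(f"baseline: ${baseline:.5f} per million tokens")
print(f"improved: ${improved:.5f} per million tokens")
```

Doubling tokens per second at the same power draw halves the energy cost per token, which is why performance per watt feeds directly into the price of serving a model.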

Hock Tan, CEO of Broadcom, described the collaboration as a long-term commitment, saying the partnership is aimed at enabling gigawatt-scale data center deployment across multiple generations of AI infrastructure.

A Comparison: Jalapeño vs. General-Purpose AI Accelerators

Feature | Jalapeño (OpenAI) | General-Purpose AI Accelerators
Primary Use | LLM inference only | Training and inference
Architecture | Purpose-built from scratch | Adapted from general designs
Performance Per Watt | Substantially better (early tests) | Current state-of-the-art baseline
Development Time | 9 months to tape-out | Typically multi-year cycles
Deployment Scale | Gigawatt-scale data centers | Varies by vendor

The Road Ahead for OpenAI’s Hardware Platform

Jalapeño is positioned as the first chip in a multi-generation compute platform. OpenAI and Broadcom plan to continue developing successive generations of chips, each built to improve on the last in terms of efficiency, capability, and scale.

By owning more of the stack — from model architecture down to the silicon that runs it — OpenAI aims to create a tighter feedback loop. Better infrastructure enables more efficient AI training and serving, which leads to more capable models, which drives more useful products, which generates the revenue needed to fund the next round of infrastructure investment. The company frames this as a key mechanism for making advanced AI more broadly accessible over time.

For students, developers, small businesses, and researchers, the long-term promise is straightforward: faster, cheaper, and more reliable access to powerful AI tools.

Frequently Asked Questions

What exactly is OpenAI's Jalapeño chip designed to do?

Jalapeño is OpenAI's first custom AI chip, built specifically for large language model (LLM) inference. Unlike general-purpose AI accelerators, it was designed from scratch to power services like ChatGPT, Codex, and the OpenAI API, with the goal of delivering better performance per watt and lower operational costs.

Who built Jalapeño and how long did it take?

Jalapeño was developed jointly by OpenAI, Broadcom, and Celestica. It went from initial design to manufacturing tape-out in just nine months — a remarkably fast timeline for a high-performance AI semiconductor. OpenAI's own AI models were used to accelerate parts of the chip design process.

When will users see the benefits of Jalapeño in OpenAI products?

OpenAI plans to begin deploying Jalapeño at scale by the end of 2026. Once deployed, users can expect faster ChatGPT responses, lower API costs for developers, and more reliable AI performance during high-demand periods.
