Multi-Chip AI Inference Startup Gimlet Labs Raises $80M to Break Hardware Bottleneck
Gimlet Labs, a startup building software that distributes AI workloads across different chip architectures, has closed an $80 million Series A funding round. The round was led by Menlo Ventures, with participation from Eclipse Ventures, Prosperity7, and Triatomic. This brings the company's total funding to $92 million.
The company emerged from stealth five months ago in October 2025 and immediately achieved eight-figure revenues. Its customer base has since tripled, now serving top frontier AI labs and major cloud providers running proprietary models.
How It Works
Gimlet's proprietary software stack automatically maps AI workloads—including multi-step agents—across heterogeneous hardware. The system intelligently slices models so that each portion runs on the chip best suited to it: compute-bound stages such as prefill run on GPUs, while memory-bound decode runs on high-memory systems.
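Gimlet's actual stack is proprietary, but the core idea of routing each stage by its compute-to-memory profile can be sketched in a toy scheduler. Everything below—the pool names, the arithmetic-intensity threshold, and the example numbers—is illustrative, not Gimlet's real configuration:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    flops: float        # total floating-point operations for the stage
    bytes_moved: float  # total memory traffic in bytes

# Hypothetical device pools for illustration only.
GPU_POOL = "gpu-pool"            # high-FLOP accelerators for compute-bound work
HIGH_MEM_POOL = "high-mem-pool"  # bandwidth-rich systems for memory-bound decode

def route(stage: Stage, threshold: float = 100.0) -> str:
    """Route a stage by arithmetic intensity (FLOPs per byte moved).

    High-intensity stages (e.g. prefill, which processes the whole prompt
    in large matrix multiplies) are compute-bound and go to GPUs;
    low-intensity stages (e.g. token-by-token decode, which re-reads the
    weights for every generated token) are memory-bound and go to
    high-memory systems. The threshold is an arbitrary illustrative cutoff.
    """
    intensity = stage.flops / stage.bytes_moved
    return GPU_POOL if intensity >= threshold else HIGH_MEM_POOL

# Example numbers chosen to show the contrast, not measured values.
prefill = Stage("prefill", flops=4e12, bytes_moved=2e9)  # ~2000 FLOPs/byte
decode = Stage("decode", flops=8e9, bytes_moved=2e9)     # ~4 FLOPs/byte

print(route(prefill))  # gpu-pool
print(route(decode))   # high-mem-pool
```

The same intensity heuristic underlies "disaggregated serving" designs in open-source inference systems, where prefill and decode run on separate machine pools.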
The startup supports NVIDIA and AMD GPUs, Intel and ARM CPUs, Cerebras accelerators, and d-Matrix SRAM-centric silicon. The company is also building custom datacenters to interconnect these diverse accelerators over high-speed networks, addressing the thermal and connectivity challenges that come with mixing different chip architectures.
The result: 3-10x faster inference for the same cost and power, including on frontier models with over 1 trillion parameters and large context windows.
Why This Matters
The AI industry is facing an inference bottleneck. As models grow larger and agentic workloads become more common, the demand for compute is outstripping what single-vendor hardware can efficiently deliver. Gimlet Labs is attempting to solve this by treating diverse silicon not as a problem to be avoided, but as a feature to be exploited.
"Frontier AI labs and hyperscalers are demanding faster, more efficient inference on agentic workloads," Menlo Ventures wrote in its investment thesis. "Gimlet is positioned to become the foundational infrastructure for the next generation of AI computing."
The startup was founded by Zain Asgar (CEO and adjunct professor at Stanford), Michelle Nguyen, Omid Azizi, and Natalie Serrino. The funding will be used to expand the team and build out infrastructure to meet surging demand.