
Making AI run fast on any hardware.

Luminal builds an ML framework and compiler that generates GPU code. Our stack 10x's model speed while simplifying deployment and cutting idle GPU costs.
GitHub: https://github.com/luminal-ai/luminal
Discord: https://discord.gg/APjuwHAbGy
Active Founders
Joe Fioti
Founder / CEO
Generating GPU kernels automatically to speed up ML models. Ex-Intel, worked on CPU microcode and ML accelerators.
Matthew Gunton
Founder
Co-founder at Luminal AI. Ex-Amazon engineer; built globally deployed systems that automatically found issues in the Amazon fulfillment network and fixed them cost-effectively.
Jake Stevens
Founder
Cofounder at Luminal: generating GPU kernels automatically to speed up ML models. Ex-Apple. Talk to me about donuts or compilers or both :)
Company Launches
Luminal - PyTorch for Production

Hey, we’re Joe, Jake, and Matthew, cofounders of Luminal.

Luminal is an open-source ML compiler that generates blazingly fast CUDA kernels and makes deploying AI to production one line of code. We’re already powering research at Yale, production workloads at VC-backed startups, and several research labs.

https://youtu.be/Rgo15NY9K-8

TLDR:

  • Our drop-in upgrade to PyTorch automatically generates complex GPU kernel code, like Flash Attention, with zero hand-engineering (see the sketch after this list).
  • We provide truly serverless deployments; no idle costs or cold starts.
  • By not relying on pretrained LLMs or hand-written optimizations, we can reliably speed up any model.
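
For a concrete picture of what "drop-in" means, the sketch below shows the intended workflow: the PyTorch model stays unchanged, the compiler is invoked once, and the result is called exactly like the original module. The `luminal.compile` entry point and its arguments are hypothetical placeholders for illustration only; see the GitHub repo above for the real interface.

```python
# Hypothetical workflow sketch -- `luminal.compile` is a placeholder name,
# not Luminal's actual API. The model definition itself needs no changes;
# the compile step would be the only new line.
import torch
import torch.nn as nn

class TinyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)
        )

    def forward(self, x):
        return self.net(x)

model = TinyMLP().eval()
example_input = torch.randn(8, 512)
baseline_out = model(example_input)

# Hypothetical one-line compile step: trace the model, search for fast kernels
# for the target GPU, and return a callable with the same signature.
# import luminal
# fast_model = luminal.compile(model, example_inputs=(example_input,))
# assert torch.allclose(fast_model(example_input), baseline_out, atol=1e-4)
```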

The problem:

Most people running their own AI models are lighting money on fire and don’t even know it. AI teams are frustrated when moving their model from dev to production. They either fall into dependency hell or kill their model’s speed. Today, companies waste millions of dollars on GPU engineering teams to optimize new models before they can be served to users.

How Luminal solves it:

Luminal automates a process that companies pay GPU engineers $300k+ a year to do by hand.

Our key insight is that by treating optimization as a search problem, we’re able to automatically discover extremely complex optimizations in minutes that would take a GPU expert weeks. We use a series of ‘rewrite rules’ to generate millions of equivalent graphs describing your model, generate kernel code for each, and then search for the fastest one based on runtime.
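
To make the search idea concrete, here is a toy, self-contained sketch (illustrative only, not Luminal’s rewrite engine or rule set). One rewrite rule, matrix-multiplication reassociativity, produces two equivalent evaluation plans; each plan is timed, and the fastest wins. Luminal applies the same idea over entire model graphs, with far more rules and generated GPU kernels as the candidates.

```python
# Toy illustration of optimization-as-search (not Luminal's implementation):
# use a rewrite rule to enumerate equivalent plans, time each one, and keep
# the fastest. The rewrite here is matmul reassociativity: (A @ B) @ C == A @ (B @ C).
import time
import numpy as np

A = np.random.rand(8, 1024).astype(np.float32)
B = np.random.rand(1024, 1024).astype(np.float32)
C = np.random.rand(1024, 1024).astype(np.float32)

# Two algebraically equivalent plans produced by the rewrite rule.
candidates = {
    "(A @ B) @ C": lambda: (A @ B) @ C,
    "A @ (B @ C)": lambda: A @ (B @ C),
}

def benchmark(fn, warmup=2, iters=5):
    """Average wall-clock runtime of fn over `iters` runs after warmup."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# The "search": measure every candidate and pick the lowest runtime.
timings = {name: benchmark(fn) for name, fn in candidates.items()}
for name, t in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {t * 1e3:.2f} ms")
print("fastest plan:", min(timings, key=timings.get))
```

Because every candidate is provably equivalent to the original expression, picking the fastest one never changes the model’s results, only how quickly they are computed.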

Example of a complicated kernel (left) that Luminal automatically sped up (right).

Luminal is:

  1. Closing the research → engineering gap. Because we auto-generate optimized GPU code directly from your PyTorch code, a brand-new model can go from research notebook to production in hours instead of weeks.
  2. Writing blazing-fast models. Luminal generates kernels that would take a seasoned GPU engineer weeks of painstaking profiling and tuning to write (a small illustration follows this list).
  3. Future-proofing AI for new chips. We can rerun our compiler on fresh hardware to get new optimal kernels, so you're never out of date.
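
As a small illustration of the kind of rewrite those kernels rely on (a NumPy stand-in, not generated GPU code): a naive softmax makes several full passes over its input and materializes intermediates, while the "online" formulation keeps a running max and running normalizer in a single streaming pass. That streaming trick is the building block behind fused kernels like Flash Attention.

```python
# Illustrative only -- NumPy stand-in for a rewrite a compiler can discover.
import numpy as np

def softmax_naive(x):
    # Pass 1: row max. Pass 2: exponentials. Pass 3: normalize.
    m = x.max(axis=-1, keepdims=True)
    e = np.exp(x - m)
    return e / e.sum(axis=-1, keepdims=True)

def softmax_online(x):
    # Compute each row's max and normalizer in one streaming pass, rescaling
    # the running sum whenever a new maximum appears, then normalize.
    out = np.empty_like(x)
    for r in range(x.shape[0]):
        m, d = -np.inf, 0.0
        for v in x[r]:
            m_new = max(m, v)
            d = d * np.exp(m - m_new) + np.exp(v - m_new)
            m = m_new
        out[r] = np.exp(x[r] - m) / d
    return out

x = np.random.randn(4, 8)
assert np.allclose(softmax_naive(x), softmax_online(x))
print("online softmax matches the naive version")
```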

You should reach out if:

  • You’re an AI researcher or ML engineer
  • You’d like to lower your compute bill by tens of thousands of dollars a month
  • You work at a company that runs its own custom models

If you want your team spending more time making the world’s best models and less time optimizing hardware and cloud infrastructure, we’re for you!

The Team:

Joe - ex-Intel; every Intel chip sold includes the AI accelerator Joe worked on. He has extensive experience optimizing performance at the silicon level.

Matt - ex-Amazon, where his software ran 24/7 finding and automatically fixing issues across the global inventory network.

Jake - ex-Apple, where he worked on iPhone imaging. He’s been a founder (with an exit) and head of growth at another startup, which he grew to ~$5M ARR.

Send us an email: contact@luminalai.com

Book a call with us

Hear from the founders

What is the core problem you are solving? Why is this a big problem? What made you decide to work on it?

Most people running their own AI models are lighting money on fire and don’t even know it. The rush to get to market first means nearly every AI company is leaving thousands of dollars on the table every MONTH by not optimizing their models. That’s why we made Luminal.

If you want your team spending more time making the world’s best models and less time optimizing hardware and cloud infrastructure, we’re for you!

What is your long-term vision? If you truly succeed, what will be different about the world?

At Luminal, we’re taking a bold swing at redefining how machine learning engineers interact with the core compute layer powering AI. When we succeed, the world’s intelligence stacks will be faster, simpler, and more robust. The hardware is ready for AI performance; the low-level code is not.

Solving the software layer destroys NVIDIA’s moat and democratizes compute for AI.

Jobs at Luminal
  • San Francisco, CA, US · $150K - $350K · 0.50% - 2.00% · Any (new grads ok)
  • San Francisco, CA, US · $150K - $350K · 0.50% - 2.00% · Any (new grads ok)
Luminal
Founded: 2025
Batch: Summer 2025
Team Size: 3
Status: Active
Primary Partner: Jared Friedman