
Making AI run fast on any hardware.

Luminal builds an ML framework and compiler that generates GPU code. Our stack 10x's model speed while simplifying deployment and cutting idle GPU costs.
GitHub: https://github.com/luminal-ai/luminal
Discord: https://discord.gg/APjuwHAbGy
Active Founders
Joe Fioti
Founder / CEO
Generating GPU kernels automatically to speed up ML models. Ex-Intel, worked on CPU microcode and ML accelerators.
Matthew Gunton
Founder
Co-founder at Luminal AI. Ex-Amazon engineer; built globally deployed systems that automatically found issues in the Amazon fulfillment network and fixed them cost-effectively.
Jake Stevens
Founder
Cofounder at Luminal: generating GPU kernels automatically to speed up ML models. Ex-Apple. Talk to me about donuts or compilers or both :)
Company Launches
Luminal - PyTorch for Production

Hey, we’re Joe, Jake, and Matthew, cofounders of Luminal.

Luminal is an open-source ML compiler that generates blazingly fast CUDA kernels and makes deploying AI to production one line of code. We’re already powering research at Yale, production workloads at VC-backed startups, and several research labs.

https://youtu.be/Rgo15NY9K-8

TLDR:

  • Our drop-in upgrade to PyTorch automatically generates complex GPU kernel code, like Flash Attention, with zero hand-engineering (see the sketch after this list).
  • We provide truly serverless deployments; no idle costs or cold starts.
  • By not relying on pretrained LLMs or hand-written optimizations, we can reliably speed up any model.
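
For a concrete picture of what "drop-in" means, the sketch below shows the intended workflow: the PyTorch model stays unchanged, the compiler is invoked once, and the result is called exactly like the original module. The `luminal.compile` entry point and its arguments are hypothetical placeholders for illustration only; see the GitHub repo above for the real interface.

```python
# Hypothetical workflow sketch -- `luminal.compile` is a placeholder name,
# not Luminal's actual API. The model definition itself needs no changes;
# the compile step would be the only new line.
import torch
import torch.nn as nn

class TinyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)
        )

    def forward(self, x):
        return self.net(x)

model = TinyMLP().eval()
example_input = torch.randn(8, 512)
baseline_out = model(example_input)

# Hypothetical one-line compile step: trace the model, search for fast kernels
# for the target GPU, and return a callable with the same signature.
# import luminal
# fast_model = luminal.compile(model, example_inputs=(example_input,))
# assert torch.allclose(fast_model(example_input), baseline_out, atol=1e-4)
```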

The problem:

Most people running their own AI models are lighting money on fire and don’t even know it. AI teams are frustrated when moving their model from dev to production. They either fall into dependency hell or kill their model’s speed. Today, companies waste millions of dollars on GPU engineering teams to optimize new models before they can be served to users.

How Luminal solves it:

Luminal automates a process that companies pay GPU engineers $300k+ a year to do by hand.

Our key insight is that by treating optimization as a search problem, we’re able to automatically discover extremely complex optimizations in minutes that would take a GPU expert weeks. We use a series of ‘rewrite rules’ to generate millions of equivalent graphs describing your model, generate kernel code for each, and then search for the fastest one based on runtime.
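
To make the search idea concrete, here is a toy, self-contained sketch (illustrative only, not Luminal’s rewrite engine or rule set). One rewrite rule, matrix-multiplication reassociativity, produces two equivalent evaluation plans; each plan is timed, and the fastest wins. Luminal applies the same idea over entire model graphs, with far more rules and generated GPU kernels as the candidates.

```python
# Toy illustration of optimization-as-search (not Luminal's implementation):
# use a rewrite rule to enumerate equivalent plans, time each one, and keep
# the fastest. The rewrite here is matmul reassociativity: (A @ B) @ C == A @ (B @ C).
import time
import numpy as np

A = np.random.rand(8, 1024).astype(np.float32)
B = np.random.rand(1024, 1024).astype(np.float32)
C = np.random.rand(1024, 1024).astype(np.float32)

# Two algebraically equivalent plans produced by the rewrite rule.
candidates = {
    "(A @ B) @ C": lambda: (A @ B) @ C,
    "A @ (B @ C)": lambda: A @ (B @ C),
}

def benchmark(fn, warmup=2, iters=5):
    """Average wall-clock runtime of fn over `iters` runs after warmup."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# The "search": measure every candidate and pick the lowest runtime.
timings = {name: benchmark(fn) for name, fn in candidates.items()}
for name, t in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {t * 1e3:.2f} ms")
print("fastest plan:", min(timings, key=timings.get))
```

Because every candidate is provably equivalent to the original expression, picking the fastest one never changes the model’s results, only how quickly they are computed.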

Example of a complicated kernel (left) that Luminal automatically sped up (right).

Luminal is:

  1. Closing the research → engineering gap. Because we auto-generate optimized GPU code directly from your PyTorch code, a brand-new model can go from research notebook to production in hours instead of weeks.
  2. Writing blazing-fast models. Luminal generates kernels that would take a seasoned GPU engineer weeks of painstaking profiling and tuning to write (a small illustration follows this list).
  3. Future-proofing AI for new chips. We can rerun our compiler on fresh hardware to get new optimal kernels, so you're never out of date.
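
As a small illustration of the kind of rewrite those kernels rely on (a NumPy stand-in, not generated GPU code): a naive softmax makes several full passes over its input and materializes intermediates, while the "online" formulation keeps a running max and running normalizer in a single streaming pass. That streaming trick is the building block behind fused kernels like Flash Attention.

```python
# Illustrative only -- NumPy stand-in for a rewrite a compiler can discover.
import numpy as np

def softmax_naive(x):
    # Pass 1: row max. Pass 2: exponentials. Pass 3: normalize.
    m = x.max(axis=-1, keepdims=True)
    e = np.exp(x - m)
    return e / e.sum(axis=-1, keepdims=True)

def softmax_online(x):
    # Compute each row's max and normalizer in one streaming pass, rescaling
    # the running sum whenever a new maximum appears, then normalize.
    out = np.empty_like(x)
    for r in range(x.shape[0]):
        m, d = -np.inf, 0.0
        for v in x[r]:
            m_new = max(m, v)
            d = d * np.exp(m - m_new) + np.exp(v - m_new)
            m = m_new
        out[r] = np.exp(x[r] - m) / d
    return out

x = np.random.randn(4, 8)
assert np.allclose(softmax_naive(x), softmax_online(x))
print("online softmax matches the naive version")
```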

You should reach out if:

  • You’re an AI researcher or ML engineer
  • You’d like to lower your compute bill by tens of thousands of dollars a month
  • You work at a company that runs its own custom models

If you want your team spending more time making the world’s best models and less time optimizing hardware and cloud infrastructure, we’re for you!

The Team:

Joe - ex-Intel; every Intel chip sold includes the AI accelerator Joe worked on. He has extensive experience optimizing performance at the silicon level.

Matt - ex-Amazon, where his software ran 24/7 finding and automatically fixing issues across the global inventory network.

Jake - ex-Apple, where he worked on iPhone imaging. He’s been a founder (with an exit) and head of growth at another startup, which he grew to ~$5M ARR.

Send us an email: contact@luminalai.com

Book a call with us

Hear from the founders

What is the core problem you are solving? Why is this a big problem? What made you decide to work on it?

Most people running their own AI models are lighting money on fire and don’t even know it. The rush to get to market first means nearly every AI company is leaving thousands of dollars on the table every MONTH by not optimizing their models. That’s why we made Luminal.

If you want your team spending more time making the world’s best models and less time optimizing hardware and cloud infrastructure, we’re for you!

What is your long-term vision? If you truly succeed, what will be different about the world?

At Luminal, we’re taking a bold swing at redefining how machine learning engineers interact with the core compute layer powering AI. When we succeed, the world’s intelligence stacks will be faster, simpler, and more robust. The hardware is ready for AI performance; the low-level code is not.

Solving the software layer destroys NVIDIA’s moat and democratizes compute for AI.

Jobs at Luminal
  • San Francisco, CA, US · $150K - $350K · 0.50% - 2.00% · Any (new grads ok)
  • San Francisco, CA, US · $150K - $350K · 0.50% - 2.00% · Any (new grads ok)
Luminal
Founded: 2025
Batch: Summer 2025
Team Size: 3
Status: Active
Primary Partner: Jared Friedman