
SF Tensor

Infrastructure for AI labs to focus on research.

AI researchers should be pushing the boundaries of what's possible with new architectures and training methods. Instead, they waste weeks configuring cloud infrastructure, debugging distributed systems, and optimizing their GPU code. We know because we lived it: while training our own models across thousands of GPUs earlier this year, we spent more time fighting our infrastructure than doing actual research.

That's why we're building two things. First, Elastic Cloud: a managed platform that automatically finds the cheapest GPUs across all providers, handles spot instance preemption, and cuts compute costs by up to 80%. Second, automatic kernel optimization that makes training code run faster by modeling hardware topology, often beating hand-tuned implementations.

Getting high performance across different hardware is genuinely hard. NVIDIA's CUDA moat exists because writing fast kernels requires deep expertise. Most teams either accept vendor lock-in or hire expensive kernel engineers. Our goal is to break the CUDA moat.

The compute bottleneck is the biggest constraint on AI progress. NVIDIA can't manufacture enough GPUs, and their monopoly keeps prices astronomical. Meanwhile, AMD, Google, and Amazon are shipping capable alternative hardware that nobody uses because the software is too hard. We're breaking that moat. If we succeed, anyone will be able to train state-of-the-art models without thinking past their PyTorch code.
Active Founders
Ben Koska
Founder
Optimizing GPU Kernels @ SF Tensor.
Luk Koska
Founder
Optimizing AI infrastructure @ SF Tensor (F25). Set to graduate with a BSc in CS at age 16.
Tom Koska
Founder
Optimizing AI infrastructure @ SF Tensor (F25). Set to graduate with a BSc in CS at age 16.
Company Launches
SF Tensor - Infrastructure for the Era of Large-Scale AI Training ⚡
See original launch post

Hey everyone 👋

We’re Ben, Tom and Luk - co-founders of SF Tensor.

TL;DR: We let AI researchers forget about the infrastructure layer and focus on their research.

We automatically optimize kernels to run faster, find the cheapest GPUs across every provider and migrate your jobs when spot instances fail. Training AI should be about AI, not DevOps.

Ask: Know anyone training or fine-tuning AI models? We’d be grateful for an intro! Reach out to us at founders@sf-tensor.com.

The Problem

Training AI should mean developing smarter architectures and finding better data. But right now, it doesn’t. Teams waste their time on everything but actual research:

  • Optimizing code so that training runs don’t drain the bank
  • Fighting cloud providers and scrambling for GPU availability
  • Making distributed training work at reasonable MFU (Model FLOPs Utilization, i.e. how close a run gets to the hardware's peak throughput, and thus its cost efficiency).
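For a sense of what that last bullet measures: MFU compares the model FLOP/s you actually achieve against the hardware's rated peak. A common back-of-the-envelope uses 6 × parameters FLOPs per trained token (forward plus backward pass). The GPU count, throughput, and the 989 TFLOP/s peak below are illustrative numbers, not SF Tensor benchmarks:

```python
def mfu(params: float, tokens_per_sec: float, num_gpus: int,
        peak_flops_per_gpu: float) -> float:
    """Model FLOPs Utilization: achieved model FLOP/s over peak hardware FLOP/s.

    Uses the common 6 * params FLOPs-per-token approximation for
    transformer training (forward + backward, ignoring attention terms).
    """
    achieved = 6 * params * tokens_per_sec
    peak = num_gpus * peak_flops_per_gpu
    return achieved / peak

# Example: a 7B-parameter model training at 1.2M tokens/s on 256 GPUs
# with an assumed 989 TFLOP/s bf16 peak per GPU.
print(round(mfu(7e9, 1.2e6, 256, 989e12), 3))  # → 0.199
```

An MFU around 0.2 like this means roughly 80% of the compute you're paying for is idle, which is exactly why it's worth optimizing.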

This drives up costs, frustrates everyone and kills velocity. Infrastructure has inadvertently turned into the limiting factor for AI research labs, and it’s killing progress.

We experienced this first-hand developing our own foundation models – what we expected to be AI research, experimentation and iterative improvement turned out to be an ugly mix of writing CUDA, debugging driver mismatches and optimizing inter-GPU collective operations. That's why we decided to solve the infrastructure layer, so other researchers can focus on research, not infrastructure.
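Those inter-GPU collectives are largely a question of communication volume. In the standard ring all-reduce (the algorithm NCCL and similar libraries commonly use for gradient averaging), each GPU sends roughly 2 × (N−1)/N of the buffer size per step. A small sketch with illustrative numbers:

```python
def ring_allreduce_bytes(buffer_bytes: int, num_gpus: int) -> int:
    """Bytes each GPU transmits in a ring all-reduce:
    2 * (N - 1) / N * buffer size (a reduce-scatter followed by an
    all-gather, each phase moving (N - 1) / N of the buffer)."""
    n = num_gpus
    return 2 * (n - 1) * buffer_bytes // n

# Gradients for a 7B-parameter model in bf16 (2 bytes/param) across 8 GPUs:
grad_bytes = 7_000_000_000 * 2
per_gpu = ring_allreduce_bytes(grad_bytes, 8)
print(per_gpu / 1e9)  # → 24.5 (GB sent per GPU, every optimizer step)
```

Moving tens of gigabytes per step is why interconnect topology, not just raw compute, ends up dominating large-scale training performance.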

Our Solution

SF Tensor is the "set it and forget it" infrastructure layer for anyone training or fine-tuning AI models. Hook up your repo, pick your GPU count and budget, and we deal with the rest:

  • Our automatic kernel optimizer analyzes your architecture and tunes execution for any hardware (NVIDIA, AMD or TPUs). No more having to drop down into custom CUDA because PyTorch doesn’t understand memory topology.
  • We find the cheapest available compute across all clouds for your specific requirements and launch your training run.
  • Automatic distributed training lets you scale from 1 to 10,000 GPUs without changing your code or killing your MFU.
  • Everything else that you shouldn’t have to think about: Spot instance migration? Handled. Monitoring? Baked in. Logs and artifacts? Done.
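The spot-migration point above reduces to a checkpoint-and-resume loop: save state periodically, and on every (re)start resume from the last checkpoint instead of step zero. This is a minimal sketch of that generic pattern, not SF Tensor's actual API; all names here are hypothetical:

```python
import os
import pickle

def train(total_steps: int, ckpt_path: str, ckpt_every: int = 10):
    """Preemption-tolerant training loop (generic sketch).

    On start, resume from ckpt_path if it exists; checkpoint every
    `ckpt_every` steps so a spot preemption loses little work.
    """
    step, state = 0, 0.0
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            step, state = pickle.load(f)
    while step < total_steps:
        state += 1.0  # stand-in for one optimizer step
        step += 1
        if step % ckpt_every == 0 or step == total_steps:
            # Write to a temp file, then rename: os.replace is atomic,
            # so a preemption mid-write can't corrupt the checkpoint.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump((step, state), f)
            os.replace(tmp, ckpt_path)
    return step, state
```

Rerunning the same command after a preemption picks up from the last checkpoint; the atomic rename is what makes it safe to be killed at any moment.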

The Team

We’re 3 brothers who have been working on artificial intelligence together for years, most recently training our own foundational world models. SF Tensor was born out of our own needs as AI researchers scaling training runs to thousands of concurrent GPUs.

Ben has been publishing AI research since high school and, as co-PI on a 6-figure grant, has solo-trained models across 4,000 GPUs.

Tom and Luk (twins btw) have been doing AI research for years, starting college in parallel with high school at 14 and finishing their BSc in CS at 16.

Compute should be boring. Let’s make it boring.

Try us right now at sf-tensor.com or contact us at hello@sf-tensor.com to see how we can help with your infra pains.

Hear from the founders

What is the core problem you are solving? Why is this a big problem? What made you decide to work on it?

AI labs and startups should focus on breakthrough research—new architectures, training methods, the stuff that ends up in papers. Instead, they burn countless hours configuring cloud infrastructure, debugging distributed training, and negotiating GPU deals.

We know because we lived it. While training our own models, we realized we were spending 60% of our time on infrastructure and 40% on actual research. Talking to other teams, we found everyone had this problem. Small labs can't afford dedicated infra teams. Large labs waste research talent on DevOps.

This infrastructure tax is holding back AI progress. We decided to eliminate it.

What is your long-term vision? If you truly succeed, what will be different about the world?

Today, getting real performance means vendor lock-in to NVIDIA. This creates artificial scarcity: limited production capacity, sky-high prices, and compute monopolies. By making all hardware (AMD GPUs, TPUs, Trainium, whatever comes next) equally usable, we'll unlock massive new supply.

In this world, anyone can train state-of-the-art models without ever thinking past their PyTorch code. Startups won't need infrastructure teams. Researchers won't waste time on cloud configuration. CEOs won't negotiate GPU rental deals.

Compute will be abundant, cheap, and boring. And AI research will accelerate because researchers can finally focus on research.

SF Tensor
Founded: 2025
Batch: Fall 2025
Team Size: 3
Status: Active
Location: San Francisco
Primary Partner: Harj Taggar