Hey everyone 👋
We’re Ben, Tom and Luk - co-founders of SF Tensor.
TL;DR: We let AI researchers forget about the infrastructure layer and focus on their research.
We automatically optimize kernels to run faster, find the cheapest GPUs across every provider, and migrate your jobs when spot instances fail. Training AI should be about AI, not DevOps.
Ask: Know anyone training or fine-tuning AI models? We’d be grateful for an intro! Reach out to us at founders@sf-tensor.com.
Training AI should mean developing smarter architectures and finding better data. But right now, it doesn’t. Teams waste their time on everything but actual research: hunting for GPU capacity, babysitting spot instances, debugging driver mismatches, and hand-tuning kernels.
This drives up costs, frustrates everyone, and kills velocity. Infrastructure has quietly become the limiting factor for AI research labs, and it’s holding progress back.
We experienced this first-hand developing our own foundation models: what we expected to be AI research, experimentation, and iterative improvement turned out to be an ugly mix of writing CUDA, debugging driver mismatches, and tuning inter-GPU collective operations. That’s why we decided to solve the infrastructure layer, so other researchers can focus on research, not infrastructure.
SF Tensor is the "set it and forget it" infrastructure layer for anyone training or fine-tuning AI models. Hook up your repo, pick your GPU count and budget, and we deal with the rest: kernel optimization, finding the cheapest GPUs across every provider, and automatic migration when spot instances fail.
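To make that concrete, here’s roughly what kicking off a run looks like (a minimal sketch only; the `sf_tensor` package, class, and field names below are illustrative stand-ins, not our exact API):

```python
# Illustrative sketch -- hypothetical names, not our published SDK.
from sf_tensor import TrainingJob

job = TrainingJob(
    repo="github.com/your-lab/your-model",  # we pull in your training code
    entrypoint="train.py",
    gpus=64,                  # pick your GPU count...
    max_hourly_budget=100.0,  # ...and your budget; we find the cheapest capacity under it
    spot=True,                # we checkpoint and auto-migrate if spot capacity disappears
)
job.submit()  # kernel tuning, provider selection, and failover happen behind the scenes
```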
We’re three brothers who have been working on AI together for years, most recently training our own foundational world models. SF Tensor was born out of our own needs as AI researchers scaling training runs to thousands of concurrent GPUs.
Ben has been publishing AI research since high school and has solo-trained models across 4,000 GPUs as co-PI on a six-figure grant.
Tom and Luk (twins, btw) have been doing AI research for years; they started college alongside high school at age 14 and finished their BSc in CS at age 16.
Try us right now at sf-tensor.com or contact us at hello@sf-tensor.com to see how we can help with your infra pains.
AI labs and startups should focus on breakthrough research—new architectures, training methods, the stuff that ends up in papers. Instead, they burn countless hours configuring cloud infrastructure, debugging distributed training, and negotiating GPU deals.
We know because we lived it. While training our own models, we realized we were spending 60% of our time on infrastructure and 40% on actual research. Talking to other teams, we found everyone had this problem. Small labs can't afford dedicated infra teams. Large labs waste research talent on DevOps.
This infrastructure tax is holding back AI progress. We decided to eliminate it.
Today, getting performance means vendor lock-in to NVIDIA. This creates artificial scarcity: limited production capacity, sky-high prices, and compute monopolies. By making all hardware equally usable, from AMD GPUs to TPUs to Trainium to whatever comes next, we'll unlock massive new supply.
In this world, anyone can train state-of-the-art models without ever thinking past their PyTorch code. Startups won't need infrastructure teams. Researchers won't waste time on cloud configuration. CEOs won't negotiate GPU rental deals.
Compute will be abundant, cheap, and boring. And AI research will accelerate because researchers can finally focus on research.