Cloudglue

Developer APIs to let your AI/LLM understand videos and audio.

Summer 2026 - Full-Stack AI Engineer Intern

$6.5K - $8.5K / monthly
San Francisco, CA, US / Remote (US)
Job type: Internship
Role: Engineering, Full stack
School year: Any
Visa: US citizen/visa only
Amy Xiao
Founder

About the role

About Cloudglue

Cloudglue is building foundational infrastructure that enables AI to understand videos for the first time. With our APIs, developers can add video search, multi-video chat, and structured data extraction from video content, reliably and at scale, in just a few lines of code.

We work at the bleeding edge of multimodal AI: computer vision, audio processing, video understanding, agentic retrieval, and AI-first UI/UX to enable video AI capabilities that don’t exist today.

We are building towards a future where AI agents can see, hear and understand video as natively as text, and reimagining how the world learns from and interacts with video content.

Team

Our team has shipped AI/ML-powered products at Snapchat and Amazon that served billions of users, deployed optimized large-scale ML systems that cut millions in costs, and published frontier research at venues like ICCV, NeurIPS, CVPR, AWS re:Invent, and DEF CON. At Cloudglue, you’ll join a small, fast-moving team where every engineer makes a direct impact on the product and company trajectory.

Velocity

In less than 1 year at Cloudglue, our nimble team of 3 has:

  • Published 5+ frontier papers in video AI at top-tier conferences
  • Outperformed Gemini on cost, fidelity, speed, and features, backed by our research
  • Built and productionized Cloudglue APIs (in use by companies today)
  • Signed our first paying customers and onboarded hundreds of developers, a number that keeps growing

Your Role

We’re looking for a driven, deeply curious student to join us as a Full-Stack AI Engineer Intern. This isn’t just a coding role. You will:

  • Work directly with our CTO (ex-Snap, ex-Amazon, Carnegie Mellon alum) on projects that push the boundaries of multimodal AI.
  • Ship features end-to-end across our stack (React/TypeScript frontend, Node/Python backend).
  • Integrate frontier video/audio AI models into production APIs.
  • Propose new features.
  • Collaborate directly with founders, customers, and researchers to drive real-world impact.

If you want a startup experience where you can wear multiple hats, have a voice, and make visible contributions at the bleeding edge of video AI, Cloudglue is here to provide that experience.

Responsibilities

  • Full-Stack Development: Build and ship features across frontend (React/TypeScript) and backend (Node, Python).
  • AI Integration: Deploy and optimize cutting-edge multimodal AI models for video/audio understanding.
  • Tool & UI Design: Create intuitive developer tools and UIs that bring video/audio insights to life.
  • Collaboration & Ownership: Contribute ideas, own projects, and work closely with founders in a fast-paced startup environment.

What We’re Looking For

Required Skills

  • Strong CS fundamentals (algorithms, data structures).
  • Database proficiency (SQL, query optimization).
  • Excellent communication and collaborative mindset.

Nice to Haves

  • Full-stack web experience (TypeScript/React, Next.js, Supabase, Vercel).
  • Python backend + familiarity with AI orchestration frameworks (LangGraph, LangChain, Temporal, etc.).
  • Experience with vector databases (Pinecone, Weaviate, Milvus, pgvector).
  • UI/UX instincts for building developer-facing tools.
  • Cloud deployment knowledge (AWS/GCP, Docker/Kubernetes).

About Cloudglue

Cloudglue
Founded: 2024
Batch: S24
Team Size: 3
Status: Active
Location: San Francisco
Founders
Amy Xiao
Founder