
Janus – Simulation Testing for AI Agents

Battle-Test your AI Agents with Human Simulation

Hey Everyone! We’re Shivum and Jet, the co-founders of Janus! 👋🏼

TL;DR: Janus battle-tests your AI agents to surface hallucinations, rule violations, and tool-call and performance failures. We run thousands of AI simulations against your chat and voice agents and offer custom evals for further model improvement.


💸 Why this matters

A single broken AI conversation can mean:

  • A PR disaster (Air Canada chatbot inventing refund policies)
  • Users churning after one bad reply
  • Lawsuits or regulatory fines for poor compliance

Yet most teams still test agents manually by pasting prompts into playgrounds.


🤕 The Problem

Manual QA covers maybe 100 scenarios, while real users trigger millions. Generic testing platforms don’t understand your customers and can’t simulate nuanced back‑and‑forths at scale. This leaves companies without actionable insights and with blind spots that only appear after they ship.


💡 Our Solution

Janus automatically:

  • Generates thousands of hyper‑realistic user personas, from angry customers to domain experts, to cover as many edge cases as possible
  • Runs full multi‑turn conversations (text or voice) against your agent, APIs, and function calls
  • Lets you write natural-language rules for what to test your agent against and how you’d like it to perform
  • Detects hallucinations, bias, tool‑call failures, and risky responses using SOTA LLM‑as‑a‑Judge and black-box uncertainty quantification (UQ) techniques
  • Pinpoints root causes and produces actionable recommendations you can plug straight into CI/CD

All in < 10 min.
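
To make the loop above concrete, here is a minimal Python sketch of the general idea: simulated personas talk to an agent over multiple turns, and each reply is checked against rules. This is an illustration only, not the Janus API. Every name in it (Persona, Rule, Violation, toy_agent, run_simulations) is hypothetical, the personas use scripted messages rather than LLM-generated ones, and a simple string predicate stands in for an LLM judge.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration; none of these names come from Janus.
@dataclass
class Persona:
    name: str
    messages: List[str]  # scripted user turns standing in for an LLM-driven simulated user

@dataclass
class Rule:
    description: str
    violated: Callable[[str], bool]  # simple predicate standing in for an LLM judge

@dataclass
class Violation:
    persona: str
    turn: int
    rule: str
    reply: str

def toy_agent(history: List[str]) -> str:
    """Stand-in for the agent under test; replies based on the latest user message."""
    last = history[-1].lower()
    if "refund" in last:
        return "Sure, we guarantee a full refund within 90 days."
    return "Happy to help with that."

def run_simulations(agent: Callable[[List[str]], str],
                    personas: List[Persona],
                    rules: List[Rule]) -> List[Violation]:
    """Run one multi-turn conversation per persona and collect rule violations."""
    violations: List[Violation] = []
    for persona in personas:
        history: List[str] = []
        for turn, message in enumerate(persona.messages, start=1):
            history.append(message)
            reply = agent(history)
            history.append(reply)
            # Check every rule against every agent reply.
            for rule in rules:
                if rule.violated(reply):
                    violations.append(Violation(persona.name, turn, rule.description, reply))
    return violations

personas = [
    Persona("angry customer", ["My order is late.", "I want a refund now!"]),
    Persona("domain expert", ["Which API version do you support?"]),
]
rules = [
    Rule("Never promise a refund", lambda reply: "guarantee a full refund" in reply.lower()),
]

report = run_simulations(toy_agent, personas, rules)
for v in report:
    print(f"[{v.persona}] turn {v.turn}: broke rule '{v.rule}'")
```

In this toy run, only the angry customer's second message triggers the invented refund promise, so the report holds a single violation. A real setup would swap the scripted messages for generated users and the predicate for a judge model, but the shape of the loop stays the same.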


📜 Backstory

We left incoming roles at Anduril and IBM, dropped out of Carnegie Mellon ML, and moved to SF to build Janus full-time. We felt this pain first‑hand while building consumer-facing agents ourselves: every new model or prompt tweak broke something in prod. We built Janus to give ourselves the “crash‑test dummy” we wished had existed from day one.

🚀 Our Ask

Building or piloting an AI agent? Skip manual QA and get started in 15 minutes to see how Janus makes agent eval effortless: cal.com/team/janus/quick-chat.


Shivum & Jet (Founders of Janus)

Check us out at withjanus.com.

Email us at team@withjanus.com