The Right Answer Isn't Enough—with Karthik Narayan (Komodo Health)

About the episode

If an AI agent gives you the correct answer but took the wrong path to get there, can you actually trust it?

In this episode of Futureproof, Prakash Chandran sits down with Karthik Narayan, Director of Product Management at Komodo Health, where he leads Marmot, an enterprise AI product for life sciences. Marmot's promise is that life sciences companies no longer need to send their data to McKinsey and wait months for an answer—they can ask complex healthcare questions and get answers directly. The catch? The answers aren't binary. Together, Karthik and Prakash unpack why grading an agent on whether it got the right answer is only half the story, how Komodo uses parallel critique agents and friction detection to close the gap between AI confidence and analyst rigor, and what changes when AI makes product leaders more powerful than they've ever been.

Topics covered include:

Why the path matters more than the answer: How an agent can arrive at the correct number through the wrong query, pass traditional evals, and then fail catastrophically on the next question—and why trajectory evals are the real measure of trustworthiness.
Steering, not just answering: How Marmot uses research plans, follow-up questions, and full code transparency to give analysts maximum control over subjective healthcare methodology decisions.
Friction detection over thumbs up/down: Why users rarely use explicit feedback mechanisms, how Komodo infers dissatisfaction from behavioral patterns, and how that drove a complete platform rewrite at the six-month mark.
Build vs. buy when AI makes prototypes easy: Why a junior engineer's weekend demo isn't the same as a production system with fallback models, context compaction, token optimization, and continuous evaluation—and how to think about total cost of ownership.
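
To make the first topic concrete, here is a minimal Python sketch of a trajectory eval. It is not Komodo's implementation: the `Trajectory` fields, the `grade` function, the 1% tolerance, and the placeholder dataset, code, and tool names are all assumptions made for illustration. It shows an agent that lands on the expected number with the wrong codes, so it passes a final-answer check and fails the trajectory check.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """What the agent did on one question (all field names are hypothetical)."""
    answer: float
    datasets: set[str] = field(default_factory=set)
    codes: set[str] = field(default_factory=set)
    tools: list[str] = field(default_factory=list)


def grade(actual: Trajectory, expected: Trajectory, tol: float = 0.01) -> dict[str, bool]:
    """Grade the final answer and each part of the path separately."""
    checks = {
        # Final-answer eval: within a relative tolerance of the reference value.
        "answer": abs(actual.answer - expected.answer) <= tol * abs(expected.answer),
        # Trajectory evals: same data, same codes, same tool sequence.
        "datasets": actual.datasets == expected.datasets,
        "codes": actual.codes == expected.codes,
        "tools": actual.tools == expected.tools,
    }
    # Trust the run only if the answer AND every part of the path match.
    checks["trustworthy"] = all(checks.values())
    return checks


# Placeholder reference run and agent run (not real datasets or codes).
expected = Trajectory(1000.0, {"claims"}, {"CODE_A", "CODE_B"}, ["sql_query"])
actual = Trajectory(1000.0, {"claims"}, {"CODE_A", "CODE_X"}, ["sql_query"])

result = grade(actual, expected)
```

Keeping the checks separate means a run that is right for the wrong reason shows up as a failure on the path rather than disappearing behind a passing answer.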

Chapters

00:00

Meet Karthik Narayan
Karthik shares his decade in healthcare across payer, digital health, and life sciences, and why the convergence of unmet need, product-market fit, and technology makes this the most exciting work of his career.

03:05

Why Healthcare Questions Are So Hard
He unpacks the complexity of claims data (procedure codes, drug codes, look-back periods, cohort definitions) and why a seemingly simple question like "How many people have lung cancer?" is actually publication-worthy methodology work.

05:45

Steering the AI, Not Just Trusting It
Karthik explains Marmot's approach: research plans with follow-up questions, full code transparency, decision logs, and maximum steering capacity—because in a world where market size can range from 1,300 to 33,000 patients depending on methodology choices, the user must stay in control.

09:25

Learning from Friction, Not Feedback
A discussion on why thumbs up/down feedback fails, how Komodo detects friction from behavioral patterns instead, and how those meta-patterns drove a complete platform rewrite just six months after launch.

11:50

Parallel Critique Agents
Karthik describes how five independent agents critique every analysis plan before execution, how the main agent synthesizes their feedback—accepting some critiques and rejecting some with reasoning—and how eval scores jumped as a result.

14:55

Trajectory Evals: The Path Matters More Than the Answer
He introduces the concept of trajectory evals—measuring not just whether the AI got the right answer, but whether it used the right data, the right codes, and the right tools to get there—and why passing a final-answer eval means nothing if the path was wrong.

17:50

Managing Expectations and Engagement While AI Thinks
A practical look at user expectations when analysis takes minutes instead of seconds, why showing intermediate artifacts keeps users engaged, and how features like pause-and-continue, parallel chats, and deep research mode address the tension between speed and rigor.

22:55

Build vs. Buy in a World of Easy Prototypes
Karthik addresses the "my junior engineer built this over the weekend" objection, breaks down the real total cost of ownership (fallback models, friction detection, eval suites, token optimization, context compaction), and explains why prototypes and production systems are fundamentally different things.

27:05

Why AI Makes Great Product Managers Even More Powerful
He shares his evolving view: he initially feared AI would replace product managers, then realized the real differentiator was never synthesis—it was knowing where the value is and having the agency to go get it.

29:15

Staying Sharp and Rethinking Interfaces
Karthik discusses how he stays current by working deep inside the problem rather than at the periphery, and why chat may not be the right interface for many AI agents.

33:40

Hiring for Hands-On AI Talent
Why brand-name employers no longer guarantee hands-on AI skills, what Karthik actually looks for in candidates—real experience building harnesses, articulating trade-offs, and thinking about evaluation—and why even small-company experience can outshine a big-tech résumé.

36:00

Advice for Getting Started in High-Stakes AI
Karthik's practical guidance: get your environment set up even if it's painful, work directly in the codebase to rebuild your cost intuition, stay close to the harness because small prompt changes create invisible ripple effects, and accept that your old instincts about what's easy and what's hard no longer apply.
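
The pattern from the Parallel Critique Agents chapter can be sketched roughly as follows. This is an illustrative Python outline, not Marmot's code: the critics are toy keyword checks standing in for model calls, and every name here (`Critique`, `Decision`, `make_keyword_critic`, `collect_critiques`, `synthesize`, the `out_of_scope` rule) is hypothetical. Five critics review the plan concurrently, then the main agent accepts or rejects each critique and records its reasoning.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Critique:
    critic: str
    issue: str


@dataclass
class Decision:
    critique: Critique
    accepted: bool
    reasoning: str


Critic = Callable[[str], Optional[Critique]]


def make_keyword_critic(name: str, keyword: str, issue: str) -> Critic:
    """Toy critic (assumption): flags a plan that never mentions `keyword`."""
    def critic(plan: str) -> Optional[Critique]:
        return None if keyword in plan.lower() else Critique(name, issue)
    return critic


def collect_critiques(plan: str, critics: list[Critic]) -> list[Critique]:
    """Run every critic on the same plan in parallel and keep the objections."""
    with ThreadPoolExecutor(max_workers=len(critics)) as pool:
        results = list(pool.map(lambda c: c(plan), critics))
    return [r for r in results if r is not None]


def synthesize(critiques: list[Critique], out_of_scope: set[str]) -> list[Decision]:
    """Main-agent step: accept or reject each critique, with a stated reason."""
    decisions = []
    for c in critiques:
        if c.critic in out_of_scope:
            decisions.append(Decision(c, False, f"{c.critic} is out of scope for this question"))
        else:
            decisions.append(Decision(c, True, "plan revised to address this issue"))
    return decisions


critics = [
    make_keyword_critic("lookback", "look-back", "no look-back period stated"),
    make_keyword_critic("cohort", "cohort", "cohort definition missing"),
    make_keyword_critic("codes", "codes", "diagnosis codes not listed"),
    make_keyword_critic("data_source", "claims", "data source not named"),
    make_keyword_critic("geography", "region", "no geographic scope"),
]
plan = "Count patients with a lung cancer diagnosis in claims, using a 12-month look-back."
decisions = synthesize(collect_critiques(plan, critics), out_of_scope={"geography"})
```

In a real system each critic and the synthesis step would be a model call; the point of the sketch is only the shape of the flow, where critiques are gathered independently before any one of them can influence the others.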

Hosted by

Prakash Chandran
CEO, Xano
