Taming Non-Determinism: a Framework for Evaluation and Observability in Autonomous Agent Trajectories

Speakers:

Naman Goyal

Date:

Tuesday, May 5, 2026

Time:

2:30 pm

Summary:

Deploying agentic AI in production introduces a unique engineering challenge: debugging non-deterministic execution paths.

Unlike traditional software, an agent’s “code” is a dynamic interplay of prompt context, model weights, and external tool outputs. This talk presents a rigorous engineering methodology for agents, focusing on quantifying agent performance and analyzing trajectories.

Key takeaways:

→ Trajectory evaluations: Analyzing the “Reasoning Trace” using a secondary judge model to detect hallucinated logic steps or tool misuse.
→ Cost-latency trade-offs: Optimizing token usage via dynamic context compression and speculative execution of tool calls.
→ Sandboxing & side-effects: Technical implementation of ephemeral execution environments to safely contain agentic code execution.
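
As a rough illustration of the first takeaway, trajectory evaluation can be sketched as a loop that hands each recorded step to a judge and collects the flagged issues. This is a minimal sketch under stated assumptions, not the speaker's method: the `Step` schema, the `Verdict` record, and `rule_based_judge` are all hypothetical names, and the rule-based function only stands in for the secondary judge model, which in practice would be a model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Step:
    """One step of an agent trajectory (hypothetical schema)."""
    thought: str       # the agent's stated reasoning for this step
    tool: str          # name of the tool called, "" if none
    tool_output: str   # raw output returned by the tool


@dataclass
class Verdict:
    """An issue flagged by the judge for a given step index."""
    step_index: int
    issue: str


# A judge maps (step, allowed tool names) to a list of issue strings.
# Assumption: a real judge would wrap a call to a secondary model.
Judge = Callable[[Step, Set[str]], List[str]]


def rule_based_judge(step: Step, allowed_tools: Set[str]) -> List[str]:
    """Stand-in judge that only checks for two simple kinds of tool misuse."""
    issues = []
    if step.tool and step.tool not in allowed_tools:
        issues.append("tool_misuse: unknown tool " + step.tool)
    if step.tool and not step.tool_output:
        issues.append("tool_misuse: empty tool output")
    return issues


def evaluate_trajectory(
    steps: List[Step],
    allowed_tools: Set[str],
    judge: Judge = rule_based_judge,
) -> List[Verdict]:
    """Run the judge over every step and collect per-step verdicts."""
    verdicts = []
    for i, step in enumerate(steps):
        for issue in judge(step, allowed_tools):
            verdicts.append(Verdict(i, issue))
    return verdicts


trajectory = [
    Step("look up the weather", "search", "sunny"),
    Step("clean up the workspace", "rm_rf", ""),
]
for verdict in evaluate_trajectory(trajectory, {"search"}):
    print(verdict.step_index, verdict.issue)
```

Keeping the judge as a plain callable means the rule-based stand-in can be swapped for a model-backed judge without changing the evaluation loop.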