Speakers:
Taming Non-Determinism: A Framework for Evaluation and Observability in Autonomous Agent Trajectories
Date:
Tuesday, May 5, 2026
Time:
2:30 pm
Summary:
Deploying agentic AI in production introduces a unique engineering challenge: debugging non-deterministic execution paths.
Unlike traditional software, an agent’s “code” is a dynamic interplay of prompt context, model weights, and external tool outputs. This talk presents a rigorous engineering methodology for agents, focusing on quantifying agent performance and analyzing trajectories.
Key takeaways:
→ Trajectory evaluations: Analyzing the “Reasoning Trace” using a secondary judge model to detect hallucinated logic steps or tool misuse.
→ Cost-latency trade-offs: Optimizing token usage via dynamic context compression and speculative execution of tool calls.
→ Sandboxing & side-effects: Technical implementation of ephemeral execution environments to safely contain agentic code execution.
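As a rough illustration of the first takeaway, a trajectory evaluation could be sketched as below. This is a minimal sketch, not the speaker's framework: the names `Step`, `Verdict`, `rule_based_judge`, and `evaluate_trajectory`, as well as the tool names, are all hypothetical, and the judge here is a simple rule-based stand-in for where a secondary judge model would be called.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class Step:
    """One entry in an agent's reasoning trace (hypothetical schema)."""
    thought: str
    tool: Optional[str] = None  # name of the tool the agent called, if any


@dataclass
class Verdict:
    step_index: int
    ok: bool
    reason: str


def rule_based_judge(step: Step, allowed_tools: Set[str]) -> Tuple[bool, str]:
    """Stand-in for a judge model (assumption): flags calls to tools that
    are not registered and steps that carry no reasoning text."""
    if step.tool is not None and step.tool not in allowed_tools:
        return False, f"unknown tool: {step.tool}"
    if not step.thought.strip():
        return False, "empty reasoning step"
    return True, "ok"


def evaluate_trajectory(
    steps: List[Step],
    judge: Callable[[Step, Set[str]], Tuple[bool, str]],
    allowed_tools: Set[str],
) -> List[Verdict]:
    """Run the judge over every step and collect one verdict per step."""
    verdicts = []
    for i, step in enumerate(steps):
        ok, reason = judge(step, allowed_tools)
        verdicts.append(Verdict(step_index=i, ok=ok, reason=reason))
    return verdicts


# Example trace: the second step calls a tool that was never registered.
trajectory = [
    Step("Look up the order status", tool="get_order"),
    Step("Refund the customer", tool="issue_refund_v2"),
]
verdicts = evaluate_trajectory(
    trajectory, rule_based_judge, {"get_order", "issue_refund"}
)
flagged = [v for v in verdicts if not v.ok]
```

In a real setup the `judge` callable would presumably wrap a call to a second model and parse its structured response; keeping it as a plain function argument makes the evaluation loop testable without any model access.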