Thought Tracing

Thought tracing is an AI interpretability technique that reconstructs and analyzes the intermediate reasoning steps (or “thoughts”) that a large language model (LLM) generates while carrying out a task. The aim is to reveal the model's internal decision-making process rather than treat the model as a black box.
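As a rough illustration of the reconstruction step, the sketch below splits a model's written output into an ordered list of thoughts and a final answer. The "Step N:" / "Answer:" output format, the sample text, and the names parse_trace and Trace are all assumptions made for this example, not part of any established thought-tracing tool. It only recovers thoughts the model writes out explicitly.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed (hypothetical) output format: one "Step N: ..." line per
# intermediate thought, followed by a final "Answer: ..." line.
STEP_RE = re.compile(r"^Step\s+\d+:\s*(.+)$")
ANSWER_RE = re.compile(r"^Answer:\s*(.+)$")


@dataclass
class Trace:
    steps: List[str]        # thoughts, in the order they appear
    answer: Optional[str]   # None if no "Answer:" line was found


def parse_trace(output: str) -> Trace:
    """Split raw model output into written-out thoughts and a final answer."""
    steps: List[str] = []
    answer: Optional[str] = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        step_match = STEP_RE.match(line)
        answer_match = ANSWER_RE.match(line)
        if step_match:
            steps.append(step_match.group(1))
        elif answer_match:
            answer = answer_match.group(1)
        # Lines matching neither pattern are ignored in this sketch.
    return Trace(steps, answer)


# Made-up sample output, used only to exercise the parser.
SAMPLE_OUTPUT = """\
Step 1: The train covers 120 km in 2 hours.
Step 2: Speed is distance divided by time, so 120 / 2 = 60.
Answer: 60 km/h
"""

trace = parse_trace(SAMPLE_OUTPUT)
for i, thought in enumerate(trace.steps, start=1):
    print(f"thought {i}: {thought}")
print(f"answer: {trace.answer}")
```

Once the steps are separated like this, each one can be inspected individually, for example to check whether the stated arithmetic in a step actually supports the final answer.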

Key Insights

References