Definition
An AI safety approach that supervises the AI's reasoning process, not just its final outcome.
Detailed Explanation
An AI safety approach that rewards or supervises the *process* by which an AI arrives at an answer, rather than only the final outcome, in order to encourage sound reasoning. For example, in chain-of-thought math reasoning, each intermediate step can be graded for soundness, so a model that reaches a correct answer through flawed reasoning is not rewarded for it.
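The contrast with outcome supervision can be sketched with a toy example. The function names and the step-checking heuristic below are illustrative assumptions, not a real training API; in practice the per-step scores would come from a learned process reward model.

```python
# Toy contrast: outcome supervision vs. process supervision.
# step_is_valid is a stand-in for a learned process reward model.

def outcome_reward(final_answer, correct_answer):
    """Outcome supervision: one reward based solely on the final answer."""
    return 1.0 if final_answer == correct_answer else 0.0

def process_rewards(steps, step_is_valid):
    """Process supervision: a separate reward for each reasoning step."""
    return [1.0 if step_is_valid(s) else 0.0 for s in steps]

def step_is_valid(step):
    """Check a toy step of the form 'expr = value' (eval is fine for a demo)."""
    lhs, rhs = step.split(" = ")
    return eval(lhs) == int(rhs)

# Hypothetical chain of thought for 2 + 3 * 4 in which two arithmetic
# slips cancel out, yielding the right answer via unsound reasoning.
steps = ["3 * 4 = 11", "2 + 11 = 14"]

print(outcome_reward(14, 14))                 # outcome supervision: full reward
print(process_rewards(steps, step_is_valid))  # process supervision: no reward
```

Here outcome supervision gives full reward (the final answer 14 is correct) while process supervision scores both steps as invalid, illustrating how supervising the process can catch flawed reasoning that outcome supervision misses.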
Use Cases
Training AI systems to reason soundly, reducing deceptive-alignment risks, improving AI reliability in complex problem-solving, and broader AI alignment techniques.