Definition
Controlled experimentation comparing two or more versions of an ML model in production.
Detailed Explanation
Statistical method for comparing model versions by routing production traffic between variants. Includes traffic splitting metric collection statistical analysis and significance testing. Used for validating model improvements and measuring business impact.
Use Cases
Model improvement validation Feature impact testing Performance comparison User impact studies