Definition
A training technique for sequence-to-sequence models in which the model is fed the ground-truth previous token at each step instead of its own prediction. This speeds up training but can lead to exposure bias.
Detailed Explanation
Teacher forcing is a training method for recurrent neural networks and sequence-to-sequence models in which the decoder receives the correct previous token as input rather than its own prediction. While this speeds up training and helps with convergence, it creates a discrepancy between training, where the model always conditions on ground-truth tokens, and inference, where it must condition on its own possibly erroneous outputs. This mismatch is known as exposure bias. Scheduled sampling techniques, which randomly mix ground-truth and model-generated tokens during training, have been developed to address this issue.
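The following is a minimal sketch of teacher forcing in a decoder training loop, written in PyTorch. The decoder architecture and names such as `teacher_forcing_ratio` are illustrative assumptions, not part of any particular library's API; a ratio of 1.0 corresponds to pure teacher forcing, while lower values approximate scheduled sampling.

```python
import random
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """A hypothetical GRU decoder used only to illustrate the loop below."""

    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def step(self, token, hidden):
        # One decoding step: embed the previous token, advance the GRU,
        # and project the hidden state to vocabulary logits.
        emb = self.embed(token).unsqueeze(1)      # (batch, 1, hidden)
        output, hidden = self.gru(emb, hidden)
        logits = self.out(output.squeeze(1))      # (batch, vocab)
        return logits, hidden

def decode_with_teacher_forcing(decoder, targets, hidden,
                                teacher_forcing_ratio=1.0):
    """Run the decoder over `targets` (batch, seq_len), feeding the
    ground-truth previous token with probability `teacher_forcing_ratio`
    and the model's own prediction otherwise."""
    batch_size, seq_len = targets.shape
    token = targets[:, 0]                         # e.g. <sos> tokens
    all_logits = []
    for t in range(1, seq_len):
        logits, hidden = decoder.step(token, hidden)
        all_logits.append(logits)
        if random.random() < teacher_forcing_ratio:
            token = targets[:, t]                 # teacher forcing: ground truth
        else:
            token = logits.argmax(dim=-1)         # free running: own prediction
    return torch.stack(all_logits, dim=1)         # (batch, seq_len-1, vocab)
```

In practice, scheduled sampling amounts to annealing `teacher_forcing_ratio` from 1.0 toward a lower value over the course of training, so the model gradually learns to condition on its own outputs before inference.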
Use Cases
Sequence-to-sequence models
Language generation
Speech recognition
Machine translation