Definition
The basic units of text or data that AI models process, typically words, subwords, or characters.
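For example, the same sentence can be tokenized at three different granularities. A minimal Python illustration (the subword split shown is hypothetical; real subword tokenizers learn their splits from data):

```python
text = "Tokenization matters"

# Word-level: split on whitespace.
word_tokens = text.split()   # ['Tokenization', 'matters']

# Character-level: every character (including the space) is a token.
char_tokens = list(text)     # ['T', 'o', 'k', 'e', 'n', ...]

# Subword-level: a learned tokenizer might break the rare word into
# frequent pieces, e.g. ['Token', 'ization', 'matters'] (illustrative only).
```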
Detailed Explanation
Tokens are discrete elements that represent text or other data in a format a model can process. In NLP, tokens may be whole words, subwords, or individual characters, produced by tokenization algorithms such as byte-pair encoding (BPE), WordPiece, or SentencePiece. The choice of tokenization strategy affects vocabulary size, sequence length, and how well a model handles rare or unseen words, and therefore its overall performance and efficiency.
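As an illustration of one such algorithm, here is a minimal sketch of BPE training in the style of its reference implementation: starting from characters, it repeatedly merges the most frequent adjacent symbol pair. The toy corpus and merge count are assumptions chosen purely for demonstration.

```python
import re
from collections import Counter

def pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by each word's frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Merge every occurrence of the pair into a single new symbol."""
    # Match the pair only at symbol boundaries, not inside larger symbols.
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# Toy corpus: each word is a space-separated character sequence with an
# end-of-word marker, mapped to its frequency (assumed for illustration).
vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for _ in range(10):  # learn 10 merges; real vocabularies use tens of thousands
    pairs = pair_counts(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_pair(best, vocab)
    print("merged:", best)
```

Each learned merge becomes an entry in the model's vocabulary, so frequent words end up as single tokens while rare words decompose into several subword pieces.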
Use Cases
Language model processing, Text generation, Machine translation, Code completion
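In language model processing specifically, token counts determine context-window usage and, for hosted models, billing. A small sketch using OpenAI's tiktoken library (assuming the package is installed; cl100k_base is one of its published encodings):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by several OpenAI models

text = "Tokens are the basic units a language model processes."
token_ids = enc.encode(text)

print(len(token_ids), "tokens")         # how much of the context window this uses
print(enc.decode(token_ids) == text)    # round-trips back to the original text
```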