Definition
A centralized repository that allows storing structured and unstructured data at any scale.
Detailed Explanation
Data lakes store raw data in its native format until needed supporting various data types and processing requirements. They typically use distributed storage systems and provide flexible access patterns for different types of analysis.
Use Cases
1. Enterprise data storage 2. Machine learning training 3. Historical analysis 4. Data science exploration