Text to music 2023-01-27

MusicLM by Google

By unverified author.
Music labeling dataset with aspects and moods.
Generated by ChatGPT

MusicLM is Google's text-to-music generation model. Alongside it, Google released MusicCaps, a dataset of 5,521 music clips of 10 seconds each, labeled with an aspect list and a free-text caption written by musicians. An aspect list is a list of adjectives describing how the music sounds, such as “pop, tinny wide hi hats, mellow piano melody, high pitched female vocal melody, sustained pulsating synth lead”.
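Since an aspect list is distributed as a single comma-separated string of descriptors, it is natural to split it into individual aspects before any filtering or analysis. A minimal sketch (the function name is ours, not part of the dataset):

```python
# Hypothetical helper: split a MusicCaps-style aspect string
# (a comma-separated list of descriptors) into individual aspects.
def parse_aspect_list(aspects: str) -> list[str]:
    """Split a comma-separated aspect string into trimmed descriptors."""
    return [a.strip() for a in aspects.split(",") if a.strip()]

aspects = parse_aspect_list(
    "pop, tinny wide hi hats, mellow piano melody, "
    "high pitched female vocal melody, sustained pulsating synth lead"
)
print(aspects[0])    # → pop
print(len(aspects))  # → 5
```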

The free-text caption is a description of how the music sounds, including details like instruments and mood. MusicCaps is sourced from the AudioSet dataset and is divided into an eval and train split.

The dataset is licensed under a Creative Commons BY-SA 4.0 license. Each clip is labeled with metadata: a YT ID (pointing to the YouTube video in which the labeled music segment appears), the segment's start and end positions in that video, labels from the AudioSet dataset, the aspect list, the caption, an author ID (for grouping samples by who wrote them), a flag marking membership in the balanced subset, and a flag marking membership in the AudioSet eval split.
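The metadata above can be pictured as one record per clip. The sketch below is illustrative only: the field names mirror the fields just described but are our assumptions, not the dataset's exact schema, and the sample values are placeholders rather than real dataset entries.

```python
from dataclasses import dataclass

@dataclass
class MusicCapsClip:
    ytid: str                  # YouTube video ID the clip comes from
    start_s: int               # segment start, seconds into the video
    end_s: int                 # segment end, seconds into the video
    audioset_labels: list      # labels inherited from AudioSet
    aspect_list: list          # adjectives describing the sound
    caption: str               # free-text description by a musician
    author_id: int             # groups samples by caption author
    is_balanced_subset: bool   # member of the balanced subset?
    is_audioset_eval: bool     # member of the AudioSet eval split?

def clips_with_aspect(clips, aspect):
    """Return the clips whose aspect list contains the given descriptor."""
    return [c for c in clips if aspect in c.aspect_list]

clip = MusicCapsClip(
    ytid="xxxxxxxxxxx",        # placeholder, not a real entry
    start_s=30, end_s=40,
    audioset_labels=["Music", "Pop music"],
    aspect_list=["pop", "mellow piano melody"],
    caption="A mellow pop track led by piano.",
    author_id=0,
    is_balanced_subset=False,
    is_audioset_eval=True,
)
print(len(clips_with_aspect([clip], "pop")))  # → 1
```

Grouping by `author_id` in the same way supports the per-author filtering mentioned later in the FAQ.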

The dataset is intended to be used for music description tasks.
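Because the dataset points into YouTube videos rather than shipping waveforms, a typical description-task pipeline first extracts the labeled 10-second segment from the decoded audio using the start/end positions. A sketch of that arithmetic, assuming a 44.1 kHz decode (the sample rate is our assumption, not specified by the dataset):

```python
SAMPLE_RATE = 44_100  # Hz; use whatever rate your decoded audio actually has

def clip_sample_range(start_s: float, end_s: float, sr: int = SAMPLE_RATE):
    """Convert a clip's start/end positions (seconds into the source
    video) into (first_sample, last_sample_exclusive) offsets."""
    return int(start_s * sr), int(end_s * sr)

lo, hi = clip_sample_range(30, 40)
print(hi - lo)  # → 441000 samples, i.e. 10 s at 44.1 kHz
# segment = waveform[lo:hi]  # then slice the decoded audio array
```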




MusicLM by Google was manually vetted by our editorial team and was first featured on January 28th 2023.

Pros and Cons

Pros:
Music labeling dataset
Aspects and moods identified
Descriptive free-text caption
Labeled by musicians
Eval and train split
Licensed under Creative Commons
YouTube video metadata
Dataset for describing music
High quality musical clips
Comprehensive aspect list
Designed for music description tasks
Includes caption author ID
Balanced subset
Dataset linked to YouTube
Well-structured music captions
Points to start and end in video
Includes AudioSet eval-split flag

Cons:
Limited to 10 second clips
Low quality sound
Only divisions are 'eval' and 'train'
Requires manual noise filtering
Limited music genres
Depends on YouTube availability
Only English labels
No waveform data
No ongoing support (never updated)


What is MusicLM?
What is the purpose of MusicLM?
What kind of data does MusicCaps provide?
What is the size of the MusicCaps dataset?
What adjectives appear in a MusicCaps aspect list?
What is the eval and train split in MusicCaps?
What type of license does MusicCaps use?
What metadata is included with each music clip in MusicCaps?
What is the intended use of the MusicCaps dataset?
What type of tasks is the MusicCaps dataset suitable for?
How can I access the MusicCaps dataset?
What does a free-text caption in MusicCaps describe?
What is the source of the audio data for MusicCaps?
What does the YT ID metadata refer to in MusicCaps?
Can I use MusicCaps for commercial purposes?
Is it possible to sort or filter the music clips by who wrote them in MusicCaps?
Is the MusicCaps dataset updated regularly?
What was the method for curating and labeling the music clips in MusicCaps?
What is the AudioSet eval split in MusicCaps?
How can MusicCaps help in music description tasks?
