MuCUE Bench

MuCUE Bench - Music Comprehensive Understanding Evaluation

MuCUE Data Distribution

Overview of the MuCUE Dataset.

Multi-dimensional Music Coverage

Cross-genre & Multilingual Music Corpus

8 Music Understanding Tasks

Covering key identification, pitch detection ....

Authoritative Evaluation Framework

Expert-validated and dedicated to addressing sophisticated challenges in music audio comprehension, with intentionally designed scalability for future enhancements.

Core Tasks

Instrument Classification

Instrument Recognition in Music

Q:

Please listen and name the instruments in this clip.

Answer:

A. guitar
B. horn
C. saxophone
D. hi-hat

Genre Classification

Music Genre Classification (e.g., Rock, Classical, Jazz, Electronic)

Q:

What type of music is this track classified as?

Answer:

A. metal
B. disco
C. classical
D. country

Mood/Theme Identification

Mood/Theme Identification In Music

Q:

What’s the mood and thematic focus of this track?

Answer:

A. action
B. dark
C. upbeat
D. deep

Lyrical Reasoning

Lyric Analysis (The process of interpreting a song's underlying story, emotional drive, or deeper meaning by analyzing its word choice, flow, and hidden metaphors)

Q:

What is the main message of the lyrics?

Answer:

A. To give up when things get tough
B. To rely on others for happiness
C. To live freely and embrace one's journey
D. To seek revenge on haters

Comprehensive

Comprehensive Music Audio Understanding

Q:

What is the primary characteristic of the bass line in the song?

Answer:

A. Melodic and flowing
B. Percussive and rhythmic
C. Distorted and chaotic
D. Absent in the excerpt

Chord Identification

A chord, the fundamental harmonic unit formed by simultaneous pitches, plays a critical role in music analysis, AI composition, and MIR systems.

Q:

I need to know the chord being used in this guitar section - can you identify it?

Answer:

A. D#:maj7/1
B. A#:maj(#11)/1
C. F#:min9(*5)/2
D. D:min9(*5)/1
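
The answer options above look like Harte-style chord labels, laid out as root, quality, optional parenthesised modifiers, and an optional bass degree after a slash. The sketch below splits a label into those parts. It assumes that layout; the `Chord` type and `parse_chord` helper are hypothetical names for illustration and are not part of MuCUE Bench.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Assumed layout: ROOT ":" QUALITY ["(" MODIFIERS ")"] ["/" BASS]
# e.g. "D#:maj7/1" or "F#:min9(*5)/2"
CHORD_RE = re.compile(
    r"^(?P<root>[A-G][#b]*)"          # note name plus optional accidentals
    r":(?P<quality>[a-z0-9]+)"        # shorthand such as maj, maj7, min9
    r"(?:\((?P<mods>[^)]*)\))?"       # optional modifiers such as #11 or *5
    r"(?:/(?P<bass>[#b]*\d+))?$"      # optional bass scale degree
)


class Chord(NamedTuple):
    root: str
    quality: str
    modifiers: Tuple[str, ...]
    bass: Optional[str]


def parse_chord(label: str) -> Chord:
    """Split a chord label into root, quality, modifiers, and bass degree."""
    match = CHORD_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised chord label: {label!r}")
    raw_mods = match.group("mods")
    mods = tuple(m.strip() for m in raw_mods.split(",")) if raw_mods else ()
    return Chord(match.group("root"), match.group("quality"), mods, match.group("bass"))


if __name__ == "__main__":
    for option in ["D#:maj7/1", "A#:maj(#11)/1", "F#:min9(*5)/2", "D:min9(*5)/1"]:
        print(option, "->", parse_chord(option))
```

In Harte-style notation a modifier prefixed with `*` conventionally marks an omitted interval, so `*5` would read as "no fifth". Treat that reading as an assumption about how these labels were produced.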

Evaluation And Dataset Viewer

Dataset Viewer

  • Dataset Scale Specifications

    10,000+ music samples; total duration exceeds 100 hours, with clip lengths ranging from 10 seconds to 10 minutes

  • Audio Quality

    44.1kHz sampling rate, 16-bit depth
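
As a rough illustration of the audio quality specification, the sketch below checks whether a clip matches the stated 44.1 kHz sampling rate and 16-bit depth. It assumes the clips are available as PCM WAV files, which this page does not state; `matches_spec` is a hypothetical helper name.

```python
import wave

EXPECTED_RATE_HZ = 44_100          # 44.1 kHz sampling rate
EXPECTED_SAMPLE_WIDTH_BYTES = 2    # 16-bit depth = 2 bytes per sample


def matches_spec(path: str) -> bool:
    """Return True if the WAV file at `path` is 44.1 kHz, 16-bit audio."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
        )
```

The standard-library `wave` module only reads uncompressed WAV, so clips stored in another container would need a different reader.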

Evaluation Metric System

Metric | Tasks | Computation
Accuracy | All | True samples / All samples
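
The accuracy metric can be sketched as follows for the multiple-choice format used by the core tasks. This is a minimal illustration that assumes each prediction and each ground-truth answer is a single option letter (A to D); the `accuracy` function is a hypothetical helper and not the benchmark's official scoring script.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of samples whose predicted option letter matches the answer."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy over zero samples")
    # Compare option letters case-insensitively, ignoring surrounding spaces.
    correct = sum(
        pred.strip().upper() == gold.strip().upper()
        for pred, gold in zip(predictions, answers)
    )
    return correct / len(answers)


if __name__ == "__main__":
    # Three of the four predictions match, so this prints 0.75.
    print(accuracy(["A", "c", "B", "D"], ["A", "C", "D", "D"]))
```

Real model output often contains more than a bare letter, so a practical evaluation would need an answer-extraction step before this comparison.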

Audio Samples with Annotation Demo

Music Segments

Annotation:

tonal center: B major

Popular English Music Segments

Annotation:

primary characteristic of the bass line: Percussive and rhythmic

Datasets Overview

Tasks | Datasets
Key Identification | gs_key, gs_key_30s
Pitch Identification | nsyn_pitch
Chord Identification | guitarset
Instrument Classification | ins_cls, nsyn_ins, mtg_ins
Genre Classification | gtzan, fma-small, fma-medium, mtg_genre
Mood/Theme Identification | mtg_mood
Lyrical Reasoning | ly_m163
Comprehensive | mmau-music, tat, mucho, mcaps, mqa_m163

Leaderboard

Below are the average performance metrics of each model on the MuCUE Bench test set, ranked in descending order by overall score.

Datasets | Gemini-2.0-flash | Qwen2.5-Omni | Kimi-Audio | Qwen2-Audio | Ours
gs key 30s | 33.6 | 23.8 | 26.0 | 18.2 | 50.4
gtzan key | 33.7 | 28.7 | 28.3 | 22.0 | 34.1
nsyn pitch | 30.8 | 36.8 | 31.8 | 31.2 | 77.2
guitarset | 25.2 | 13.2 | 27.2 | 19.2 | 58.8
ballroom tempo | 31.7 | 28.9 | 24.4 | 31.1 | 29.4
gtzan tempo | 41.3 | 32.4 | 22.9 | 27.1 | 40.7
ins cls | 26.0 | 66.8 | 79.4 | 39.8 | 91.2
nsyn ins | 32.4 | 40.6 | 44.4 | 22.4 | 74.0
mtg ins | 19.8 | 55.8 | 51.2 | 24.0 | 68.6
gtzan | 72.2 | 88.6 | 77.8 | 83.9 | 81.3
fma-small | 63.4 | 66.2 | 55.8 | 65.6 | 72.4
fma-medium | 62.8 | 78.0 | 59.8 | 77.0 | 85.2
mtg genre | 57.2 | 61.6 | 55.8 | 46.4 | 81.4
ballroom genres | 57.0 | 45.8 | 44.0 | 35.2 | 52.4
mtg mood | 38.2 | 43.4 | 39.4 | 29.2 | 52.8
md4q | 71.9 | 47.6 | 61.3 | 57.8 | 65.9
salami segd | 40.6 | 18.6 | 27.2 | 19.4 | 64.8
salami pred | 37.6 | 32.2 | 34.6 | 31.2 | 64.8
salami cnt | 49.8 | 36.8 | 37.8 | 30.2 | 43.2
salami overall | 62.1 | 55.8 | 45.3 | 42.6 | 48.7
lyr | 88.2 | 87.4 | 87.0 | 60.0 | 90.8
mmau-music | 67.1 | 63.8 | 66.2 | 57.8 | 66.5
tat | 61.2 | 59.4 | 54.0 | 61.4 | 80.6
mucho | 69.6 | 66.5 | 69.7 | 66.7 | 63.9
mcaps | 62.2 | 65.6 | 68.0 | 74.0 | 80.0
mqa | 58.0 | 76.0 | 79.0 | 60.8 | 88.4
Avg | 49.8 | 50.8 | 49.9 | 43.6 | 65.7
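
The Avg row appears to be an unweighted mean over the 26 per-dataset scores, meaning every dataset counts equally regardless of its size. The sketch below reproduces the figure for the "Ours" column under that assumption; `macro_average` is a hypothetical helper name.

```python
from statistics import fmean
from typing import Sequence

# Per-dataset scores from the "Ours" column, in leaderboard order.
OURS = [
    50.4, 34.1, 77.2, 58.8, 29.4, 40.7, 91.2, 74.0, 68.6,
    81.3, 72.4, 85.2, 81.4, 52.4, 52.8, 65.9, 64.8, 64.8,
    43.2, 48.7, 90.8, 66.5, 80.6, 63.9, 80.0, 88.4,
]


def macro_average(scores: Sequence[float]) -> float:
    """Unweighted mean of per-dataset scores, rounded to one decimal place."""
    return round(fmean(scores), 1)


if __name__ == "__main__":
    print(macro_average(OURS))  # 65.7, matching the Avg row
```

An unweighted mean gives small datasets the same influence as large ones, so it can differ from accuracy pooled over all samples.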

Resources and How to Participate

Contribution Guidelines

We welcome researchers from academia and industry to contribute to the improvement of the MuCUE Bench dataset and model optimization. Contribution methods include: