What Are Audio Features, Anyway?
When you listen to a song, your brain processes dozens of sonic signals simultaneously — the speed, the intensity, whether it feels happy or sad, whether your body wants to move. You do this effortlessly, without thinking about it. But what if you could put numbers on all of that?
That's exactly what audio features are. They're measurable characteristics of a track that describe how the music sounds — not what genre it belongs to, not who the artist is, but the raw sonic properties. Think of them as the DNA of a song, broken down into individual traits.
When I built Orphea, I wanted to go beyond surface-level labels. Saying a track is "pop" or "electronic" tells you almost nothing about how it actually feels. But knowing that a track has 0.85 energy, 0.3 valence, and 140 BPM? That paints a picture — an intense, dark, fast-paced track. Suddenly you can compare it to every other track in your library and find real patterns.
In this guide, I'll break down every major audio feature, explain what the numbers actually mean, give you song examples, and show you how Orphea uses them to build your Music DNA profile.
Energy: How Intense Does the Music Feel?
Energy measures the perceived intensity and activity of a track. It's scored from 0.0 (minimal energy) to 1.0 (maximum energy). High-energy tracks feel fast, loud, and dense. Low-energy tracks feel calm, sparse, and gentle.
What contributes to energy? It's a combination of loudness, dynamic range, timbre, perceived speed, and general "noisiness." A death metal track with distorted guitars, blast beats, and screaming vocals will score close to 1.0. A solo piano ballad recorded in a quiet room will be closer to 0.1.
High-Energy Examples (0.8+)
- Rage Against the Machine — "Killing in the Name" (0.97) — Relentless intensity from start to finish
- The Prodigy — "Firestarter" (0.95) — Pounding electronic aggression
- Beyoncé — "Crazy in Love" (0.86) — Upbeat, dense, driving rhythm
Low-Energy Examples (0.0–0.3)
- Bon Iver — "Skinny Love" (0.18) — Sparse acoustic guitar, fragile vocals
- Billie Eilish — "when the party's over" (0.22) — Minimal, quiet, intimate
- Norah Jones — "Don't Know Why" (0.15) — Laid-back jazz lounge feel
One thing to note: energy isn't the same as loudness or tempo. A loud, slow doom metal track can have high energy. A fast but quiet acoustic guitar piece can have low energy. It's about the overall perceptual intensity — the feeling of "something is happening" versus "space and silence."
Valence: Musical Happiness (or Sadness)
Valence is probably the most misunderstood audio feature. It measures the musical positivity conveyed by a track — how happy, cheerful, or euphoric the music sounds. A score of 1.0 is pure musical bliss. A score of 0.0 is deep sadness, anger, or darkness.
Here's the crucial distinction: valence describes how the music sounds, not what the lyrics say. A song with devastating lyrics can still have high valence if the melody is upbeat and the chords are major. Think "Hey Ya!" by OutKast — the lyrics are about a failing relationship, but the track sounds irresistibly happy. Its valence is around 0.93.
Conversely, an instrumental track with no sad lyrics at all can have very low valence if it uses minor keys, dissonant harmonies, and dark timbres. Valence captures the emotional color of the sound itself.
What Creates High Valence?
- Major keys and bright chord progressions
- Upbeat, bouncy rhythms
- Higher-pitched, clear timbres
- Resolution and harmonic "satisfaction"
What Creates Low Valence?
- Minor keys and dissonant harmonies
- Slow, heavy, or dragging rhythms
- Darker, lower-frequency timbres
- Tension without resolution
In my experience building Orphea, valence is the single best feature for understanding someone's emotional relationship with music. Two people can both love "electronic music," but one might prefer euphoric trance (valence 0.8+) while the other listens to dark techno (valence 0.2). Their emotional worlds are completely different.
Danceability: Can You Move to It?
Danceability describes how suitable a track is for dancing, based on a combination of tempo, rhythm stability, beat strength, and overall regularity. It ranges from 0.0 (impossible to dance to) to 1.0 (designed for the dance floor).
This one might seem straightforward, but it's subtler than you'd think. A track doesn't need to be fast to be danceable — many of the most danceable tracks in history are mid-tempo grooves. What matters is rhythmic consistency, a clear beat, and a groove that your body can lock into.
High Danceability (0.8+)
- Daft Punk — "Get Lucky" (0.86) — Smooth, steady, infectious groove
- Drake — "One Dance" (0.85) — Mid-tempo but rhythmically irresistible
- Bee Gees — "Stayin' Alive" (0.88) — The quintessential dance rhythm
Low Danceability (0.0–0.3)
- Radiohead — "Everything in Its Right Place" (0.27) — Rhythmically shifting, ambient
- Sigur Rós — "Hoppípolla" (0.21) — Builds and swells but no steady beat
- Pink Floyd — "Comfortably Numb" (0.25) — Slow, contemplative, free-form sections
What lowers danceability? Irregular time signatures (think prog rock in 7/8), frequent tempo changes, rubato playing (stretching and compressing time), and ambient textures without a clear pulse. If you can't nod your head to it on a steady count, danceability drops.
Danceability also interacts interestingly with energy. A track can be highly danceable but low energy (think chill house or bossa nova). Or it can be high energy but low danceability (chaotic free jazz or black metal). The combination tells you a lot about how someone experiences rhythm.
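To see how that combination plays out, here's a minimal TypeScript sketch that buckets a track by its energy and danceability together. The 0.5 cut-offs and the labels are made-up illustration values, not thresholds Orphea actually uses:

```typescript
// Hypothetical helper: bucket a track by its energy/danceability combination.
// The 0.5 cut-offs are arbitrary illustration values, not Orphea's thresholds.
type RhythmProfile =
  | "driving dance"     // high energy, high danceability
  | "laid-back groove"  // low energy, high danceability
  | "intense but free"  // high energy, low danceability
  | "ambient / sparse"; // low energy, low danceability

function rhythmProfile(energy: number, danceability: number): RhythmProfile {
  if (danceability >= 0.5) {
    return energy >= 0.5 ? "driving dance" : "laid-back groove";
  }
  return energy >= 0.5 ? "intense but free" : "ambient / sparse";
}

// Chill house sits in the "laid-back groove" corner:
rhythmProfile(0.4, 0.8); // -> "laid-back groove"
// Chaotic free jazz lands in "intense but free":
rhythmProfile(0.9, 0.2); // -> "intense but free"
```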
Tempo (BPM): The Speed of the Music
Tempo is measured in beats per minute (BPM) and represents the speed of the music. Unlike the other features, it's not scored 0.0 to 1.0 — it's an actual number, typically ranging from about 60 BPM (very slow) to 200+ BPM (very fast).
BPM is deceptively simple. You'd think it's just "how fast is the song?" But tempo interacts with everything else in complex ways. A 140 BPM drum and bass track feels frantic. A 140 BPM half-time trap beat feels slow and heavy because the perceived rhythm runs at half speed. Context matters enormously.
Common Tempo Ranges by Genre
- 60–80 BPM — Ballads, downtempo, hip-hop, R&B
- 80–100 BPM — Pop, reggaeton, trap
- 100–120 BPM — Pop, indie, rock
- 120–130 BPM — House, disco, dance pop
- 130–150 BPM — Techno, trance, dubstep
- 160–180 BPM — Drum and bass, jungle, hardcore punk
In Orphea, your average BPM across your library is a surprisingly stable metric. Most people cluster around a preferred tempo range and rarely stray far from it. If your average is around 85 BPM, you probably favor hip-hop, R&B, and downtempo. If it's around 128, house music and dance pop likely dominate. Around 170? You're probably deep in drum and bass or punk territory.
One quirk to be aware of: tempo detection can sometimes "halve" or "double" the actual BPM. A 140 BPM track might be detected as 70 BPM if the algorithm locks onto half-time. Orphea's AI accounts for this by cross-referencing with genre expectations and other audio features, but occasional misdetections happen with all analysis tools.
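To make the idea concrete, here's a simplified sketch of half/double-time correction. The genre ranges and the function are illustrative assumptions, not Orphea's actual correction logic:

```typescript
// Simplified half/double-time correction (an illustration, not Orphea's logic).
// If the detected BPM falls outside a plausible range for the genre, try
// doubling or halving it before accepting the raw value.
function normalizeBpm(detected: number, genreRange: [number, number]): number {
  const [min, max] = genreRange;
  const candidates = [detected, detected * 2, detected / 2];
  const plausible = candidates.find((bpm) => bpm >= min && bpm <= max);
  return plausible ?? detected; // fall back to the raw detection
}

// A drum and bass track detected at 87 BPM is almost certainly half-time:
normalizeBpm(87, [160, 180]); // -> 174
```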
Acousticness, Instrumentalness & Loudness
Beyond the big four (energy, valence, danceability, tempo), several other audio features provide additional depth:
Acousticness (0.0 – 1.0)
A confidence measure of whether the track is acoustic. 1.0 represents high confidence that the track is acoustic — recorded with real instruments, no electronic production. 0.0 means heavily produced or electronic. An unplugged folk recording scores near 1.0. A synthwave track scores near 0.0.
Acousticness is interesting because it reveals production preferences. Some listeners exclusively prefer organic, natural-sounding music. Others are drawn to synthetic textures. Most people fall somewhere in between, and your Orphea DNA captures exactly where.
Instrumentalness (0.0 – 1.0)
Predicts whether a track contains no vocals. The closer to 1.0, the more likely there are no sung words. Values above 0.5 are typically instrumental tracks. This captures everything from classical symphonies to lo-fi hip-hop study beats to post-rock instrumentals.
This feature is useful for understanding whether you're drawn to music for the vocals/lyrics or for the pure sonic experience. High instrumentalness in your DNA suggests you process music more through texture and structure than through linguistic meaning.
Loudness (dB)
The overall average loudness of a track in decibels. Unlike energy, loudness is a purely physical measurement — the amplitude of the audio signal. Modern pop and rock tracks are often mastered very loud (-5 to -3 dB), while classical recordings and acoustic jazz tend to have more dynamic range (-15 to -8 dB).
The "loudness wars" of the 2000s pushed everything to maximum volume, crushing dynamic range. You can spot this in analysis: tracks from 2005–2015 tend to be louder than both older and newer releases, as the industry has slowly moved back toward more dynamic mastering.
How Orphea Analyzes Audio Features Without Spotify
Here's something most people don't realize: the original audio feature definitions come from the Spotify API. When Spotify analyzes a track, their algorithms extract energy, valence, danceability, and everything else directly from the audio signal. It's incredibly accurate — but it's locked to Spotify's ecosystem.
So what happens when you use SoundCloud, TIDAL, or Apple Music? Those platforms don't expose audio feature data the same way. This is where Orphea does something different.
I built an AI inference pipeline using Orphea's AI model that estimates audio features from a track's metadata — title, artist, genre context, and any available acoustic information. It's not analyzing the raw audio waveform (that would require enormous compute), but it's leveraging the model's training on millions of songs to make surprisingly accurate predictions.
How It Works
- Step 1: Orphea fetches your liked tracks from your connected provider (SoundCloud likes, TIDAL favorites, Apple Music library)
- Step 2: For each track, the AI model receives the title and artist name
- Step 3: The model infers energy, valence, danceability, tempo, and other features based on its knowledge of the song (or similar songs by the same artist)
- Step 4: Results are cached in the database so repeat analyses are instant (a simplified code sketch of this flow follows below)
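In code, the flow looks roughly like the sketch below. Every name here (the Pipeline interface, fetchLikedTracks, inferFeatures, the cache methods) is a placeholder for illustration rather than Orphea's real API, and it reuses the AudioFeatures shape sketched earlier:

```typescript
// Rough sketch of the four steps above. All names are placeholders for
// illustration, not Orphea's real API; AudioFeatures is the shape shown earlier.
interface TrackRef {
  id: string;
  title: string;
  artist: string;
}

interface Pipeline {
  fetchLikedTracks(provider: string): Promise<TrackRef[]>;                        // Step 1
  inferFeatures(meta: { title: string; artist: string }): Promise<AudioFeatures>; // Steps 2–3
  cacheGet(trackId: string): Promise<AudioFeatures | null>;                       // Step 4
  cacheSet(trackId: string, features: AudioFeatures): Promise<void>;
}

async function analyzeLibrary(p: Pipeline, provider: string): Promise<AudioFeatures[]> {
  const tracks = await p.fetchLikedTracks(provider);
  const results: AudioFeatures[] = [];

  for (const track of tracks) {
    // Cached results make repeat analyses instant.
    const cached = await p.cacheGet(track.id);
    if (cached) {
      results.push(cached);
      continue;
    }
    // The model only ever sees metadata, never the raw waveform.
    const features = await p.inferFeatures({ title: track.title, artist: track.artist });
    await p.cacheSet(track.id, features);
    results.push(features);
  }
  return results;
}
```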
This approach is why Orphea is one of the few music analysis tools that works across multiple streaming platforms. You don't need Spotify to get detailed audio analysis — SoundCloud and TIDAL users get the same depth of insight. The playing field is level.
I'm continuing to improve the inference model, and future versions will incorporate actual audio analysis for even higher accuracy. But for now, the metadata-based approach delivers results that users consistently find accurate and insightful.
Putting It All Together
Audio features are the vocabulary for describing music in precise, comparable terms. Instead of vague labels like "chill" or "hype," you can say a track has 0.3 energy, 0.7 valence, 0.8 danceability, and 95 BPM — and anyone who understands these features immediately knows exactly what kind of track you're describing.
When Orphea builds your Music DNA profile, it's aggregating these features across your entire listening library, roughly as sketched in code after the list below. The result is a multi-dimensional fingerprint that captures:
- Your energy preferences — Do you lean toward intense or calm?
- Your emotional palette — Happy and bright, or dark and complex?
- Your rhythmic tendencies — Dance-oriented or free-form?
- Your tempo comfort zone — Slow grooves or fast-paced energy?
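As a rough idea of what that aggregation looks like, here's a minimal sketch that averages each feature across a library. Orphea's real profile is richer than a plain mean, so treat this as illustration only:

```typescript
// Illustrative aggregation: average each feature across a library to get a
// simple fingerprint. Orphea's real DNA profile is richer than a plain mean.
function musicDna(library: AudioFeatures[]) {
  if (library.length === 0) {
    throw new Error("Cannot build a profile from an empty library");
  }
  const mean = (pick: (f: AudioFeatures) => number) =>
    library.reduce((sum, f) => sum + pick(f), 0) / library.length;

  return {
    avgEnergy: mean((f) => f.energy),             // intense vs. calm
    avgValence: mean((f) => f.valence),           // bright vs. dark
    avgDanceability: mean((f) => f.danceability), // dance-oriented vs. free-form
    avgTempo: mean((f) => f.tempo),               // tempo comfort zone
  };
}
```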
Understanding these features doesn't just satisfy curiosity — it makes you a more intentional listener. You start noticing why certain songs click and others don't. You build better playlists. You discover music that resonates on a deeper level because you know what you're actually looking for.
Ready to discover your Music DNA?
Connect your streaming account, run your first scan, and see what your music says about you.
Try Orphea — Free