SingingPub

Tuesday, 3 May 2022

[New post] WORLD, Spectral Envelopes, DCT and Mel Space

WORLD, Spectral Envelopes, DCT and Mel Space

synsinger

May 3

I've been looking at the WORLD code, trying to figure out how the spectral envelopes are coded.

WORLD is doing something interesting - it's using DCT coefficients and non-linear Mel space.

The human ear doesn't hear sound linearly - there are some parts of the audio spectrum that it pays a lot of attention to, and some parts that it doesn't pay much attention to at all.

This is useful because it allows the data to be compressed: fewer numbers have to be stored for the parts the ear is relatively insensitive to.

The process involves mapping data from one space (frequency) to another (e.g., non-linear Mel space), and it's pretty straightforward. WORLD appears to perform this mapping using simple linear interpolation.
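Here is a minimal sketch of that kind of mapping, not WORLD's actual code. The Mel formula and its constants (1127 and 700) are the commonly quoted ones and are an assumption here, and the names `hz_to_mel`, `mel_to_hz`, `interp` and `to_mel_axis` are made up for illustration. It assumes the envelope is sampled uniformly from 0 Hz up to half the sample rate.

```python
import math

def hz_to_mel(f_hz):
    # Commonly quoted Mel formula (assumption: WORLD's constants may differ).
    return 1127.0 * math.log(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)

def interp(x, xs, ys):
    # Linear interpolation of the points (xs, ys) at x.
    # xs must be ascending; values outside the range are clamped to the ends.
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:  # binary search for the enclosing segment
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])

def to_mel_axis(envelope, fs, n_mel):
    # Resample an envelope (uniform bins from 0 to fs/2 Hz) at n_mel points
    # that are equally spaced on the Mel axis.
    n = len(envelope)
    freqs = [i * (fs / 2.0) / (n - 1) for i in range(n)]
    mel_max = hz_to_mel(fs / 2.0)
    mel_points = [j * mel_max / (n_mel - 1) for j in range(n_mel)]
    return [interp(mel_to_hz(m), freqs, envelope) for m in mel_points]
```

Because the sample points are equally spaced in Mel rather than in Hz, more of them land in the low frequencies, which is where the ear is most sensitive.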

Once the data has been mapped to Mel space, a DCT (Discrete Cosine Transform) is applied, and the first n coefficients are kept.

So the process of converting the spectral envelope into DCT coefficients seems to be (more or less):

  • Interpolate the spectral envelope from frequency space to Mel space
  • Perform the DCT in Mel space
  • Gather the first n coefficients from the bins of the DCT
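The last two steps above can be sketched like this. It is only an illustration under my own assumptions: it uses a slow, direct orthonormal DCT-II rather than whatever transform WORLD actually runs, it expects an envelope that has already been resampled onto the Mel axis, and the names `dct2` and `encode_envelope` are hypothetical.

```python
import math

def dct2(x):
    # Orthonormal DCT-II computed directly from its definition.
    # O(N^2), which is fine for a sketch but slow for real use.
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def encode_envelope(mel_envelope, n_coeffs):
    # Transform the Mel-space envelope and keep only the first n_coeffs bins.
    return dct2(mel_envelope)[:n_coeffs]
```

The low-numbered bins hold the slowly varying shape of the envelope, so truncating the list throws away fine detail first.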

The process of converting the coefficients back to a spectral envelope is the inverse:

  • Fill a DCT with zeros
  • Put the saved coefficients into the first n bins of the DCT
  • Perform an IDCT to get the spectral envelope in Mel space
  • Interpolate the spectral envelope from Mel space to frequency space
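The zero-fill and IDCT steps above might look something like the sketch below. Again, this is an assumption-laden illustration and not WORLD's code: it uses a direct orthonormal inverse DCT-II, and `idct2` and `decode_envelope` are names I made up. The final interpolation from Mel space back to frequency space is left out.

```python
import math

def idct2(coeffs):
    # Inverse of the orthonormal DCT-II (that is, an orthonormal DCT-III),
    # computed directly from its definition in O(N^2).
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def decode_envelope(saved, n_mel):
    # Put the saved coefficients into the first bins, zero the rest,
    # and invert to get an n_mel-point envelope in Mel space.
    padded = list(saved) + [0.0] * (n_mel - len(saved))
    return idct2(padded)
```

Since the discarded bins are replaced with zeros, the result is a smoothed version of the original Mel-space envelope, not an exact copy.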

This raises the question of why WORLD doesn't simply sample the spectral envelope in Mel space intervals and save the values. I haven't tested it, but I'm guessing the resulting envelope would be quite similar.

I suspect the DCT is used because it's useful for training neural networks. The DCT is well known for being able to de-correlate data, which is a really helpful feature when training a neural network. And this is one of WORLD's stated goals.

In any event, I think I've got a slightly better grasp on how WORLD is using DCT coefficients.

https://synsinger.wordpress.com

Powered by WordPress.com
Download on the App Store Get it on Google Play
at May 03, 2022
Email ThisBlogThis!Share to XShare to FacebookShare to Pinterest

No comments:

Post a Comment

Newer Post Older Post Home
Subscribe to: Post Comments (Atom)

Intracranial Semiquincentennial

not neo-colonial, or an adversarial necro-colonial Semiquincentennial ͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏...

  • You're on the list!
    Hello, ͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­͏     ­...
  • Listening
    ...
  • index left
    Read on blog or  Reader ...

Search This Blog

  • Home

About Me

SingingPub
View my complete profile

Report Abuse

Blog Archive

  • July 2026 (2)
  • June 2026 (27)
  • May 2026 (28)
  • April 2026 (26)
  • March 2026 (25)
  • February 2026 (24)
  • January 2026 (25)
  • December 2025 (24)
  • November 2025 (25)
  • October 2025 (27)
  • September 2025 (18)
  • August 2025 (31)
  • July 2025 (29)
  • June 2025 (32)
  • May 2025 (16)
  • April 2025 (18)
  • March 2025 (21)
  • February 2025 (22)
  • January 2025 (16)
  • December 2024 (22)
  • November 2024 (8)
  • October 2024 (11)
  • September 2024 (11)
  • August 2024 (2722)
  • July 2024 (3200)
  • June 2024 (3080)
  • May 2024 (3199)
  • April 2024 (3101)
  • March 2024 (3214)
  • February 2024 (3014)
  • January 2024 (3244)
  • December 2023 (3192)
  • November 2023 (2685)
  • October 2023 (2042)
  • September 2023 (1758)
  • August 2023 (1539)
  • July 2023 (1533)
  • June 2023 (1380)
  • May 2023 (1397)
  • April 2023 (1335)
  • March 2023 (1392)
  • February 2023 (1320)
  • January 2023 (1600)
  • December 2022 (1555)
  • November 2022 (1389)
  • October 2022 (1230)
  • September 2022 (1023)
  • August 2022 (1109)
  • July 2022 (1122)
  • June 2022 (1141)
  • May 2022 (1120)
  • April 2022 (1178)
  • March 2022 (1085)
  • February 2022 (763)
  • January 2022 (924)
  • December 2021 (1347)
  • November 2021 (2424)
Powered by Blogger.