SingingPub: [New post] Considering WORLD

Considering WORLD

by synsinger

WORLD is an open source library described as "a high-quality speech analysis, manipulation and synthesis system". I've been familiar with it for some time, and a lot of the general design ideas in the core of synSinger reflect those in WORLD.

WORLD uses three items to reconstruct speech:

The fundamental frequency
The spectral envelope
The aperiodicity measure
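To make the three items concrete, here is a minimal sketch of how one analysis frame might be held in memory. The `Frame` class, its field names, and `split_power` are hypothetical and are not WORLD's API. Treating aperiodicity as a 0-to-1 fraction of power per bin is a simplifying assumption for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """One analysis frame (hypothetical layout, not WORLD's own structs)."""
    f0_hz: float               # fundamental frequency; 0.0 marks unvoiced (assumption)
    envelope: List[float]      # spectral envelope: power per frequency bin
    aperiodicity: List[float]  # assumed scale: 0.0 = fully periodic, 1.0 = fully noise


def split_power(frame: Frame) -> Tuple[List[float], List[float]]:
    """Split each bin's power into a periodic part and a noise part."""
    periodic = [p * (1.0 - a) for p, a in zip(frame.envelope, frame.aperiodicity)]
    noise = [p * a for p, a in zip(frame.envelope, frame.aperiodicity)]
    return periodic, noise


frame = Frame(f0_hz=220.0, envelope=[4.0, 2.0], aperiodicity=[0.25, 1.0])
periodic, noise = split_power(frame)
```

Note that nothing in this representation stores phase, which is why the synthesizer has to come up with one.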

What got my attention is that WORLD doesn't retain phase information when reconstructing the vocal. Rather, it generates what it considers to be a reasonable value.
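One standard way to produce a plausible phase from a magnitude-only envelope is minimum-phase reconstruction through the cepstrum. The sketch below shows that general technique in plain Python. Whether WORLD's code does exactly this is an assumption here, and the function names are made up for illustration. The naive DFT is only meant for small frame sizes.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; fine for small N."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, value in enumerate(x):
            acc += value * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def minimum_phase_spectrum(magnitude):
    """Build a complex spectrum with the given magnitude and minimum phase.

    `magnitude` is assumed to be a full, symmetric spectrum of even length.
    """
    n = len(magnitude)
    # Real cepstrum: inverse DFT of the log magnitude (floored to avoid log(0)).
    log_mag = [math.log(max(m, 1e-12)) for m in magnitude]
    cepstrum = [c.real for c in dft(log_mag, inverse=True)]
    # Fold the cepstrum onto non-negative quefrencies, making it causal.
    folded = [0.0] * n
    folded[0] = cepstrum[0]
    for i in range(1, n // 2):
        folded[i] = 2.0 * cepstrum[i]
    folded[n // 2] = cepstrum[n // 2]
    # Exponentiate the transformed cepstrum to get magnitude and phase together.
    return [cmath.exp(v) for v in dft(folded)]


# Usage: discard the phase of a known minimum-phase filter, then rebuild it.
n = 64
h = [1.0, 0.5] + [0.0] * (n - 2)
magnitude = [abs(v) for v in dft(h)]
spectrum = minimum_phase_spectrum(magnitude)
recovered = [v.real for v in dft(spectrum, inverse=True)]
```

The rebuilt spectrum keeps the original magnitude, and for this test filter the recovered impulse response comes back very close to the original. A real voice is not guaranteed to be minimum phase, so this gives a reasonable phase rather than the true one.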

I read through a number of papers on WORLD and watched several videos, but they glossed over the specific details of how the phase was approximated.

So I finally dove into the code, and... Yeesh. I simply don't have the technical background to understand what's going on, and I have no clue who I could turn to for information.

I have a feeling that, at best, it's going to be a long slog to figure out how it's calculating the phase. Hopefully the results will be better than what I got with the Griffin-Lim code. It may even make me revisit that code, to see if I can find out where I went wrong there.

But I'm also considering whether I should simply use the WORLD library. After all, it pretty much already does what I'm trying to do. I could then simply focus on getting the framework to work with WORLD.

synsinger | April 1, 2022 at 7:40 am | Categories: Uncategorized | URL: https://wp.me/p3iI9y-Kq
