One of the things that synSinger needs to do is take multi-syllable words, convert them to phonemes, and then split those phonemes back into syllables.
The good news is that I've got a version of the CMU Pronouncing Dictionary that does exactly that. Here's a portion of it:
SHUPP  SH AH1 P
SHUR  SH ER1
SHURE  SH UH1 R
SHURGARD  SH UH1 R G AA2 R D
SHURLEY  SH ER1 L IY0
SHURR  SH ER1
SHURTLEFF  SH ER1 T L IH0 F
SHURTLIFF  SH ER1 T L IH0 F
SHURTZ  SH ER1 T S
SHUSTER  SH AH1 S T ER0
SHUSTERMAN  SH AH1 S T ER0 M AH0 N
SHUT  SH AH1 T
SHUTDOWN  SH AH1 T D AW2 N
SHUTDOWNS  SH AH1 T D AW2 N Z
SHUTE  SH UW1 T
SHUTES  SH UW1 T S
SHUTOUT  SH AH1 T AW2 T
For words in the CMU Dictionary, it's simply a matter of doing a lookup.
For words not in the dictionary, the process is a bit more complex.
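The lookup-then-fallback flow can be sketched roughly like this. It's only an illustration, not synSinger's actual code: `parse_cmudict`, `to_phonemes`, and the `fallback` callable are hypothetical names, and the parser assumes the plain "WORD PHONEMES" layout shown in the excerpt above, with `;;;` comment lines as in the standard CMU file.

```python
def parse_cmudict(text):
    """Parse 'WORD  PH1 PH2 ...' lines into a dict of word -> phoneme list."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the ';;;' comment lines of the standard file.
        if not line or line.startswith(";;;"):
            continue
        word, *phonemes = line.split()
        entries[word] = phonemes
    return entries


def to_phonemes(word, dictionary, fallback):
    """Use the dictionary when possible, otherwise call the fallback.

    `fallback` stands in for the rule-based converter (hypothetical here):
    any callable that takes an upper-case word and returns a phoneme list.
    """
    key = word.upper()
    if key in dictionary:
        return dictionary[key]
    return fallback(key)


SAMPLE = """\
SHUT  SH AH1 T
SHUTDOWN  SH AH1 T D AW2 N
"""

dictionary = parse_cmudict(SAMPLE)
print(to_phonemes("shutdown", dictionary, fallback=lambda w: []))
```

Passing the fallback in as a callable keeps the dictionary code independent of whatever rule-based converter ends up being used.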
I've got a modified copy of the 1976 rules-based English-to-phonemes program from the Naval Research Laboratory. For example, here are the rules for the /T/ section:
-- T
"(TH)", "TH",
"#:(TED)", "TIXD",
"S(TI)#N", "CH",
"(TI)O", "SH",
"(TI)A", "SH",
"(TIEN)", "SHUN",
"(TUR)#", "CHER",
"(TU)A", "CHUW",
" (TWO)", "TUW",
"&(T)EN ", "",
"(T)", "T",
The program isn't great, but as a fallback for the dictionary, it's adequate.
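Each rule in the /T/ table pairs a pattern with the phonemes it produces. In the pattern, the text inside the parentheses is what gets replaced, and whatever sits to its left and right is the context that has to be present. As a small sketch (not the real program), here is how a pattern could be pulled apart into those three pieces. `split_rule` is a made-up helper, and the context symbols such as `#`, `:` and `&` are deliberately left uninterpreted.

```python
import re

# left context, (text to match), right context
RULE_PATTERN = re.compile(
    r"^(?P<left>[^()]*)\((?P<match>[^()]+)\)(?P<right>[^()]*)$"
)


def split_rule(pattern):
    """Split a rule pattern into (left context, match text, right context)."""
    m = RULE_PATTERN.match(pattern)
    if m is None:
        raise ValueError(f"not a valid rule pattern: {pattern!r}")
    return m.group("left"), m.group("match"), m.group("right")


# A few of the /T/ rules from the table, as (pattern, phonemes) pairs.
T_RULES = [
    ("(TH)", "TH"),
    ("#:(TED)", "TIXD"),
    (" (TWO)", "TUW"),
    ("&(T)EN ", ""),
    ("(T)", "T"),
]

for pattern, phonemes in T_RULES:
    print(split_rule(pattern), "->", repr(phonemes))
```

The bare `"(T)", "T"` rule has no context at all, which is presumably why it sits last as the catch-all.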
But there are no syllabification rules, so the phonemes still need to be split into syllables.
In one respect, this is a non-issue. As a rule of singing, the phoneme at the head of a note is always the vowel. Any consonants that precede a vowel are moved to the end of the prior syllable.
For example, the word "geometry" is rendered phonetically as "JH IY AA M AH T R IY".
Internally, synSinger breaks the syllables at the vowels, so it would be "JH - IY - AA M - AH T R - IY". Note that the leading /JH/ is moved to the note or silence that precedes it.
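A minimal sketch of that vowel-headed split, assuming the phonemes are ARPAbet symbols like the ones in the CMU dictionary (the rule-based program may use a slightly different set). `split_at_vowels` is an illustrative name, not synSinger's own function.

```python
# ARPAbet vowel symbols as used by the CMU dictionary (an assumption here).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def is_vowel(phoneme):
    # Strip a trailing stress digit such as the "1" in "AH1".
    return phoneme.rstrip("012") in VOWELS


def split_at_vowels(phonemes):
    """Start a new group at every vowel; consonants trail the vowel before them."""
    groups = []
    current = []
    for p in phonemes:
        if is_vowel(p) and current:
            groups.append(current)
            current = []
        current.append(p)
    if current:
        groups.append(current)
    return groups


groups = split_at_vowels("JH IY AA M AH T R IY".split())
print(" - ".join(" ".join(g) for g in groups))
# JH - IY - AA M - AH T R - IY
```

The first group holds only the leading consonants, which is the part that gets attached to the preceding note or silence.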
This can look a bit odd to a user, so I've put together some code to convert phonemes generated by the rule-based code into syllables. It renders the example as "JH IY - AA - M AH - T R IY".
It isn't completely correct, but it looks reasonably accurate, which is all it needs to be. After all, it's just the fallback code.
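The exact rule isn't spelled out here, but one simple rule that reproduces the "geometry" example is to attach every run of consonants to the vowel that follows it, and to tack any word-final consonants onto the last syllable. The sketch below does just that. It's a guess at the behavior rather than the actual code, and `display_syllables` is a hypothetical name.

```python
# ARPAbet vowel symbols as used by the CMU dictionary (an assumption here).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def is_vowel(phoneme):
    # Strip a trailing stress digit such as the "1" in "AH1".
    return phoneme.rstrip("012") in VOWELS


def display_syllables(phonemes):
    """Group consonants with the vowel that follows them."""
    syllables = []
    current = []
    for p in phonemes:
        current.append(p)
        if is_vowel(p):
            # A vowel closes the syllable being built.
            syllables.append(current)
            current = []
    if current:
        # Leftover consonants at the end of the word join the last syllable.
        if syllables:
            syllables[-1].extend(current)
        else:
            syllables.append(current)
    return syllables


def render(phonemes):
    return " - ".join(" ".join(s) for s in display_syllables(phonemes))


print(render("JH IY AA M AH T R IY".split()))
# JH IY - AA - M AH - T R IY
```

Feeding it SHUTDOWN from the dictionary excerpt gives "SH AH1 - T D AW2 N", which shows the kind of mistake this shortcut makes: the T clearly belongs with the first syllable.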
Another option would be to write code to scan the phonetic dictionary for the phonemes that fall between vowels, and keep the most common ones as hyphenation rules. This wouldn't be too difficult to do, so I may do that at some point.
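Here's roughly what that scan might look like. It assumes the syllabified dictionary marks syllable breaks with a standalone "-" token, which is a made-up format for illustration, and the two entries at the bottom are toy data rather than real dictionary lines. For each consonant cluster found between two vowels, it records where the break fell and keeps the most common position.

```python
from collections import Counter, defaultdict

# ARPAbet vowel symbols as used by the CMU dictionary (an assumption here).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def is_vowel(phoneme):
    return phoneme.rstrip("012") in VOWELS


def intervocalic_breaks(tokens):
    """Return (cluster, break_index) pairs for consonant runs between vowels.

    break_index is the number of consonants that fall before the "-" marker.
    """
    results = []
    cluster = []
    break_at = None
    seen_vowel = False
    for t in tokens:
        if t == "-":
            break_at = len(cluster)
        elif is_vowel(t):
            if seen_vowel and cluster and break_at is not None:
                results.append((tuple(cluster), break_at))
            cluster = []
            break_at = None
            seen_vowel = True
        else:
            cluster.append(t)
    return results


def learn_break_rules(entries):
    """Map each cluster to its most frequently observed break position."""
    counts = defaultdict(Counter)
    for tokens in entries:
        for cluster, index in intervocalic_breaks(tokens):
            counts[cluster][index] += 1
    return {c: n.most_common(1)[0][0] for c, n in counts.items()}


# Toy entries in the assumed "-"-marked format (not real dictionary data).
toy = ["SH AH1 T - D AW2 N".split(), "SH AH1 - T AW2 T".split()]
print(learn_break_rules(toy))
```

The resulting table could then replace the fixed "all consonants go with the next vowel" rule when splitting phonemes produced by the rule-based code.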
But for now, I just want to integrate the existing English-to-phoneme code so I can move forward with the synthesis code.