9. CVVC Japanese
If you're anything like I was when I first heard of it, you're probably thinking, "tf is this? Japanese doesn't use ending consonants! Why does this exist??" The answer is that the reclist is super small, like 30 recordings. When I first saw a CVVC reclist I actually didn't believe it.
The Stats
Average recordings: ~30
Average size: ~300 lines
Typical Aliasing Method: Hiragana/Romaji combo [a k][か]
The Mechanics
The most important thing to remember here is that the CV and VC sections form two halves of a VCV oto: [a k][か] works pretty much exactly the same as [a か]. The key here is to preserve the natural spacing and fade-out with about a third as many oto lines as VCV. This is easier than you'd think.
The Base Oto - Choosing Base Values
Now, you could auto-generate the base oto with Moresampler like with VCV, but... don't. Moresampler is absolutely useless when it comes to CVVC, and fixing what it gives you is probably harder than just making it yourself. So here's what we're gonna do:
Take a good look at your recordings. 8-mora CVVC (which is what I use) looks something like "ka_ki_ku_ke_ko_ka_n_ka.wav", which would be split into [ka][a k][ki][i k][ku][u k][ke][e k][ko][o k][n k]. However you decide to split yours, make sure you have a CV for all five vowels and a VC for all the vowels and ん. You should also have the CVs in hiragana, or hiragana and romaji if you like. Don't just have romaji. Japanese banks should have the option of using Japanese characters, period.
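If it helps to see that split spelled out, here's a rough Python sketch that turns a recording name into the list of aliases it can supply. The KANA table and the split_recording function are made up for this example (the table only covers the k-row from the filename above), and it assumes morae are separated by underscores and that you only want a VC when the next mora starts with a consonant. Treat it as a way to sanity-check your own split, not as the one true method.

```python
# Hypothetical romaji-to-hiragana table; it only covers the k-row example
# from the text. A real reclist would need the full syllabary.
KANA = {"ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ"}


def split_recording(filename):
    """List the CV and VC aliases one underscore-separated recording supplies."""
    stem = filename.rsplit(".", 1)[0]
    morae = stem.split("_")
    aliases = []

    def add(alias):
        # Keep the first occurrence only, so a repeated "ka" is not listed twice.
        if alias not in aliases:
            aliases.append(alias)

    for i, mora in enumerate(morae):
        if mora != "n":
            add(KANA[mora])  # CV alias in hiragana
        if i + 1 < len(morae):
            nxt = morae[i + 1]
            # Onset consonant of the next mora; empty for a bare "n".
            onset = "" if nxt == "n" else nxt[:-1]
            if onset:
                # VC alias: last sound of this mora plus the next consonant.
                add(f"{mora[-1]} {onset}")
    return aliases


print(split_recording("ka_ki_ku_ke_ko_ka_n_ka.wav"))
```

For the example filename this gives the same eleven aliases as the split above, just with the CVs written in hiragana.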
Your base values don't depend as strongly on the tempo as VCV ones do, so I tend to make them fairly uniform. I use (60,90,120,-400) for mine. Remember, if you set everything to a negative right blank value you have to save, close setParam, and then open it again, or it'll switch back to positive values when you edit.
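Here's a small Python sketch of what one base oto line looks like with uniform values. Big assumption: I'm reading (60,90,120,-400) as overlap, preutterance, consonant, right blank, in that order. The function name base_oto_line is made up for this example. The field order inside the line itself (left blank, consonant, right blank, preutterance, overlap) is the standard oto.ini layout.

```python
def base_oto_line(wav, alias, left_blank,
                  overlap=60, preutterance=90, consonant=120, right_blank=-400):
    """Format one oto.ini line using uniform base values.

    The defaults assume (60,90,120,-400) means
    (overlap, preutterance, consonant, right blank).
    """
    # oto.ini order: file=alias,left blank,consonant,right blank,preutterance,overlap
    return f"{wav}={alias},{left_blank},{consonant},{right_blank},{preutterance},{overlap}"


# The left blank (500 here) is a placeholder you would replace per entry.
print(base_oto_line("ka_ki_ku_ke_ko_ka_n_ka.wav", "a k", 500))
```

Only the left blank changes from line to line at this stage; everything else gets adjusted later by hand.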
As far as the left blank goes... there are two options: just scroll through the recording and deal with it, or oto the first recording and then use those left blank values to estimate the rest of the recordings. Up to you really, one's not particularly easier than the other.
Showtime.
CVs
The CV otos are not the same as the otos in a CV-only bank. The biggest difference is that here, you want the left blank to be just after the end of the previous vowel, and you want to make sure that on hard consonants like d, t, k, etc. the overlap is not behind any part of the consonant. For the first syllable of the recording, look at the spacing of the second syllable and try to get as close as possible.
VCs
It's important to note that VCs in a CVVC Japanese bank are purely transitional. They're there to make up for the lack of VCV. Because of this, some consonants get cut out of the oto entirely. Allow me to explain.
Hard Consonants vs. Soft Consonants
This isn't going to be the most linguistically accurate description in the world, but long story short, a "hard" consonant (like k) has a space between itself and the previous vowel. A "soft" consonant (like s) has no space; the vowel flows straight into it. Soft consonants are either voiced or unvoiced. That's probably not the scientific definition but it is the easiest way to tell them apart.
For any type of VC in a Japanese bank, the overlap should be at the end of the consistent vowel (like in VCV) and the preutterance should be at the end of the vowel. However, since the consonants all transition differently, the consonant and right blank values need to be adjusted to suit each type.
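If you think in terms of positions in the wav rather than oto numbers, this Python sketch shows how those marker positions turn into the values stored in the oto. Everything here is a made-up illustration: the function vc_oto_fields and its parameter names are hypothetical, all inputs are milliseconds from the start of the file, and the right blank is written in the negative form (measured from the left blank).

```python
def vc_oto_fields(left_blank, steady_vowel_end, vowel_end,
                  consonant_marker, right_blank_marker):
    """Convert absolute marker positions (ms) into oto values for one VC."""
    if not left_blank <= steady_vowel_end <= vowel_end:
        raise ValueError("expected left blank <= end of steady vowel <= end of vowel")
    if not consonant_marker < right_blank_marker:
        raise ValueError("consonant marker must sit before the right blank")
    return {
        "left_blank": left_blank,
        # Overlap sits at the end of the consistent part of the vowel.
        "overlap": steady_vowel_end - left_blank,
        # Preutterance sits at the end of the vowel.
        "preutterance": vowel_end - left_blank,
        "consonant": consonant_marker - left_blank,
        # Negative right blank: distance from the left blank, sign flipped.
        "right_blank": -(right_blank_marker - left_blank),
    }


print(vc_oto_fields(1000, 1060, 1090, 1120, 1400))
```

Where the consonant and right blank markers go depends on the consonant type, which is what the next three cases cover.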
Hard: The consonant and right blank should encapsulate the silence between the vowel and the consonant. See Figure 1.
Soft Unvoiced: Unvoiced consonants have no pitch, which means if UTAU tries to stretch/loop the consonant it'll sound like Satan. There's also no silence to stretch/loop. The best way to deal with this is to have the right blank before the next vowel and put the consonant about 10 ms before it. Remember: if the consonant and the right blank touch, UTAU will crash. See Figure 2.
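Since touching markers are easy to miss by eye, here's a quick Python sketch for checking the gap between the consonant and the right blank. It assumes a negative right blank, which is measured from the left blank just like the consonant value is, so the gap is simply the difference. The function names and the min_gap parameter are made up for this example.

```python
def consonant_gap_ms(consonant, right_blank):
    """Gap in ms between the consonant marker and a negative right blank."""
    if right_blank >= 0:
        # Positive values are measured from the end of the file instead,
        # which needs the wav length; this sketch does not handle that.
        raise ValueError("this sketch only handles negative right blank values")
    return abs(right_blank) - consonant


def is_safe_unvoiced_vc(consonant, right_blank, min_gap=1):
    """True if the consonant marker stays at least min_gap ms before the right blank."""
    return consonant_gap_ms(consonant, right_blank) >= min_gap


print(consonant_gap_ms(120, -130))     # the roughly-10-ms spacing from the text
print(is_safe_unvoiced_vc(130, -130))  # markers touching
```

Running something like this over every soft unvoiced VC before opening the bank in UTAU is one way to catch the touching case early.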
Soft Voiced: Voiced consonants carry a pitch, and thus can be looped and stretched to a limited extent - but if the timing in your recordings is consistent there's no need to worry about it sounding off. The consonant and right blank should frame the most consistent part of the consonant. See Figure 3.
Just like before, you should avoid changing the value of the overlap unless absolutely necessary.
Don't move on until you're confident in your abilities, because this is where it gets intense.