3. Basic Theory
To oto properly, you first need to understand what each value and term means and what it does.
What is an "oto?"
"Oto" and "otoing" are derived from the file "oto.ini", which must be present in the voicebank folder with the recordings. This file tells UTAU how to use the recordings when making a UST. The oto acts as a chart with different values that function in different ways.
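To make the "chart" idea concrete, here is a minimal sketch of reading one row of an oto.ini in Python. It assumes the commonly seen layout `file.wav=alias,offset,consonant,cutoff,preutterance,overlap`; this tutorial doesn't spell out the field order, so treat that order, the `OtoEntry` and `parse_oto_line` names, and the sample numbers as assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    """One row of the oto 'chart' (all numbers are in milliseconds)."""
    filename: str
    alias: str
    offset: float        # left blank
    consonant: float
    cutoff: float        # right blank
    preutterance: float
    overlap: float


def _to_ms(text: str) -> float:
    # Assumption: an empty field is treated as 0.
    text = text.strip()
    return float(text) if text else 0.0


def parse_oto_line(line: str) -> OtoEntry:
    # Split on the first "=" only: the recording name is on the left,
    # the comma-separated values are on the right.
    filename, _, rest = line.strip().partition("=")
    alias, offset, consonant, cutoff, preutterance, overlap = rest.split(",")
    return OtoEntry(
        filename=filename,
        alias=alias,
        offset=_to_ms(offset),
        consonant=_to_ms(consonant),
        cutoff=_to_ms(cutoff),
        preutterance=_to_ms(preutterance),
        overlap=_to_ms(overlap),
    )


# Made-up values, only to show the shape of a row.
entry = parse_oto_line("ka.wav=か,100,150,-400,80,30")
print(entry.alias, entry.preutterance)
```

Each field of `OtoEntry` corresponds to one of the columns explained in the next part.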
Columns in a chart
An oto uses six types of values, or six columns in a chart:
- Left Blank/Offset
  - Green/Blue shaded region
  - Anything before the left blank will not be heard in the oto segment.
  - This tells UTAU when to start looking for sound.
  - Measured in milliseconds from the beginning of the recording
- Overlap
  - Green line
  - Measured in milliseconds from the left blank
  - This is where the previous note will stop playing.
  - Everything between the overlap and the left blank will be crossfaded with the previous recording when using p2p3 or p1p4 to adjust the envelopes.
- Preutterance
  - Red line
  - Measured in milliseconds from the left blank
  - This is where the note starts. For example, if you have a quarter note, the beginning of the quarter note will play from the preutterance onward.
  - This is probably the most important value in the oto.
  - Everything between the overlap and the preutterance will NOT be affected by crossfade.
- Consonant
  - Blue/Pink region
  - Measured in milliseconds from the left blank
  - Everything between the consonant and the left blank will not be stretched or looped when you change the note length.
- Right Blank/Cutoff
  - Yellow/Blue region
  - Measured in milliseconds from the end of the recording
  - Alternatively, negative values are measured from the left blank.
  - This is where UTAU stops looking for sound.
  - Everything between the consonant and the right blank will be stretched or looped when you change the length of the note.
- Alias
  - The alias isn't a numeric measure like the others; rather, it's an alternate name for the recording/oto segment.
  - For example, if my recording is named "ka.wav" and I set the alias to "か," either [ka] or [か] will call up that recording when entered into the lyrics.
  - Aliases are also used in multi-syllabic reclists to differentiate between syllables.
  - For example, the VCV recording "ka_ka_ki_ka_ku_ke_ka.wav" would be split into [- ka][a ka][a ki][i ka][a ku][u ke][e ka]
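The VCV split in the last example follows a simple pattern: each alias is the vowel of the previous syllable (or "-" at the start of the recording) plus the current syllable. A rough Python sketch of that pattern is below. The `vcv_aliases` helper is hypothetical, and it assumes the syllables are separated by underscores and that each one ends in a single-letter vowel, which won't hold for every reclist.

```python
from typing import List


def vcv_aliases(filename: str) -> List[str]:
    """Build VCV-style aliases from an underscore-separated recording name."""
    stem = filename.rsplit(".", 1)[0]               # drop the ".wav" extension
    syllables = [s for s in stem.split("_") if s]   # ignore empty pieces

    aliases = []
    previous_vowel = "-"                            # "-" marks the start of the recording
    for syllable in syllables:
        aliases.append(f"{previous_vowel} {syllable}")
        # Assumption: the last letter of the syllable is its vowel.
        previous_vowel = syllable[-1]
    return aliases


print(vcv_aliases("ka_ka_ki_ka_ku_ke_ka.wav"))
# ['- ka', 'a ka', 'a ki', 'i ka', 'a ku', 'u ke', 'e ka']
```

This only produces the alias names. Each alias still needs its own left blank, overlap, preutterance, consonant, and right blank set within the same recording.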
Great, but what do I do with them?
Well, for the most part it depends on what type of bank you're working on, but a few things never change:
- For CV or VCV otos, the preutterance should always be at the end of the consonant and before the vowel.
- The overlap must be less than or equal to half the preutterance, or the envelopes screw up (there are exceptions to this, but keep it in mind for now).
- If the consonant and right blank are touching, UTAU will crash.
- The area between the consonant and the right blank should always be either a vowel, voiced consonant, or silence. You should also try to capture the most regular part of the vowel within this section to avoid strange glitches or noises.
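Two of these rules are plain arithmetic, so they can be checked automatically. The sketch below is only an illustration, not part of UTAU: the `right_blank_position` and `check_oto` helpers are hypothetical, all values are in milliseconds, and "touching" is interpreted as the end of the consonant region reaching or passing the right blank. The other rules (where the preutterance sits, what the stretched area sounds like) still need your ears and eyes.

```python
from typing import List


def right_blank_position(offset: float, cutoff: float, length_ms: float) -> float:
    """Right blank position in ms from the start of the recording."""
    if cutoff < 0:
        # Negative value: measured forward from the left blank.
        return offset + abs(cutoff)
    # Zero or positive value: measured back from the end of the recording.
    return length_ms - cutoff


def check_oto(offset: float, consonant: float, cutoff: float,
              preutterance: float, overlap: float, length_ms: float) -> List[str]:
    """Return a list of rule violations (an empty list means none were found)."""
    problems = []

    if overlap > preutterance / 2:
        problems.append("overlap is more than half the preutterance")

    consonant_end = offset + consonant   # consonant is measured from the left blank
    if consonant_end >= right_blank_position(offset, cutoff, length_ms):
        problems.append("consonant touches or passes the right blank")

    return problems


# Made-up values for a 1000 ms recording.
print(check_oto(100, 150, -400, 80, 30, 1000))  # no problems found
print(check_oto(100, 150, 750, 80, 50, 1000))   # both rules broken
```

In the second call the right blank lands at 250 ms (1000 - 750), exactly where the consonant region ends (100 + 150), and the overlap of 50 is more than half of the preutterance of 80.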