Aural CSS Settings Explained was Re: [emacspeak The Complete Audio Desktop] Emacspeak And Voice Locking Using Aural CSS
- From: "T. V. Raman" <firstname.lastname@example.org>
- Date: Tue, 21 Feb 2006 06:19:38 -0800
Aural CSS Settings Explained:
Here is how the four dimensions average-pitch, pitch-range,
stress and richness work (or are supposed to work).
First a bit about voices:
A speaking voice has a default pitch (its fundamental frequency),
and this changes over the course of a sentence due to
inflection. Speakers also have the ability to "project" their
voice, or alternatively pitch it lower; this is similar to
volume but not quite the same.
The ACSS Dimensions:
average-pitch: Basic voice pitch.
In practice, speakers with smaller heads have
higher pitched voices, so on formant TTS engines,
you need to vary the head-size inversely with the
fundamental frequency; see dectalk-voices.el and
outloud-voices.el, both of which are for formant engines.
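
To make the inverse relation concrete, here is a minimal sketch in Python. It is not taken from dectalk-voices.el or outloud-voices.el: the 0..9 step scale, the Hz limits and the head-size percentages are all illustrative assumptions, and formant_voice_params is a hypothetical name.

```python
def formant_voice_params(average_pitch, min_hz=80.0, max_hz=300.0,
                         max_head=120.0, min_head=80.0):
    """Map an average-pitch step (assumed 0..9) to (pitch in Hz, head size).

    All numeric limits are illustrative assumptions, not engine values.
    """
    if not 0 <= average_pitch <= 9:
        raise ValueError("average_pitch must be in 0..9")
    t = average_pitch / 9.0
    # Pitch rises linearly with the step...
    pitch_hz = min_hz + t * (max_hz - min_hz)
    # ...while head size shrinks, so the two vary inversely.
    head_size = max_head - t * (max_head - min_head)
    return round(pitch_hz, 1), round(head_size, 1)


for step in (0, 5, 9):
    print(step, formant_voice_params(step))
```

The point is only the direction of the two mappings: as the step goes up, pitch goes up and head size comes down.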
Pitch-range: Determines "how excited" the speaker sounds.
If you look at the overall intonation contour, pitch-range
determines how high the peaks get and how deep the valleys get.
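
A rough way to picture this, as a sketch under my own assumptions rather than what any engine actually does: treat the intonation contour as a fixed shape with values between -1 and 1, centred on the average pitch, and let pitch-range scale it. The function name and the Hz values are hypothetical.

```python
def apply_pitch_range(shape, average_hz, range_hz):
    """Scale a normalised contour (values in -1..1) around the average pitch.

    range_hz is how far, in Hz, the highest peak and deepest valley
    sit from average_hz. Both parameters are illustrative assumptions.
    """
    return [average_hz + range_hz * s for s in shape]


# One sentence-level contour shape: rise to a peak, then fall to a valley.
shape = [0.0, 0.6, 1.0, 0.3, -0.4, -1.0]

flat = apply_pitch_range(shape, 120.0, 10.0)     # small pitch-range
excited = apply_pitch_range(shape, 120.0, 40.0)  # large pitch-range

print(max(flat) - min(flat))        # distance from deepest valley to highest peak
print(max(excited) - min(excited))
```

The shape of the contour is the same in both cases; only the height of the peaks and the depth of the valleys change.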
Stress: This is indeed subtle.
Basically pitch-range shapes the overall intonation contour; stress
controls the individual peaks such as primary and secondary
stress. Just increasing pitch-range ends up with a very sing-song
effect; stress and pitch-range together often do better.
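
The difference between the two can be sketched as follows, assuming a simple additive model that is mine and not the engines': pitch-range scales the whole contour, while stress adds an extra bump only on accented syllables. The accent weights and Hz values are illustrative, and contour is a hypothetical helper.

```python
def contour(shape, accents, average_hz, range_hz, stress_hz):
    """Return one f0 value (Hz) per syllable.

    shape:   normalised overall contour, values in -1..1
    accents: 2 = primary stress, 1 = secondary stress, 0 = unstressed
    The additive model and the weights below are assumptions.
    """
    weights = {0: 0.0, 1: 0.5, 2: 1.0}
    return [average_hz + range_hz * s + stress_hz * weights[a]
            for s, a in zip(shape, accents)]


shape = [0.2, 0.5, 0.1, -0.3]
accents = [0, 2, 0, 1]

plain = contour(shape, accents, 120.0, 20.0, 0.0)      # pitch-range only
stressed = contour(shape, accents, 120.0, 20.0, 15.0)  # pitch-range plus stress

for p, s in zip(plain, stressed):
    print(round(p, 1), round(s, 1))
```

Unstressed syllables are unchanged, so raising stress sharpens individual peaks without stretching the whole contour the way a larger pitch-range would.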
Richness: This is the "project your voice" setting. Its inverse
is "smoothness", which is why overlays like voice-smoothen set
richness to be low. The perceived effect of low richness is that
the voice is softer; with higher values of richness, the voice gets
"brighter". If you look at the spectrogram, the "saw-tooth"
patterns you see are much sharper for higher richness values.
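
One hedged way to think about "brighter": an ideal saw-tooth wave has harmonics whose amplitude falls off as 1/k, and a smoother source falls off faster. The sketch below assumes richness is a 0..1 value controlling that roll-off; the scale, the exponent mapping and both function names are my assumptions, not how any particular engine implements richness.

```python
def harmonic_amplitudes(richness, n_harmonics=20):
    """Amplitudes of harmonics 1..n for a saw-tooth-like voice source.

    richness is assumed to lie in 0..1. At 1.0 the roll-off is the
    ideal saw-tooth 1/k; lower values roll off more steeply (1/k**3 at 0).
    """
    if not 0.0 <= richness <= 1.0:
        raise ValueError("richness must be in 0..1")
    rolloff = 1.0 + 2.0 * (1.0 - richness)
    return [1.0 / k ** rolloff for k in range(1, n_harmonics + 1)]


def brightness(amplitudes):
    """Fraction of total energy that lies above the fundamental."""
    energy = [a * a for a in amplitudes]
    return sum(energy[1:]) / sum(energy)


for r in (0.1, 0.5, 0.9):
    print(r, round(brightness(harmonic_amplitudes(r)), 3))
```

Higher richness leaves more energy in the upper harmonics, which is one plausible reading of the sharper saw-tooth patterns and the brighter sound.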
AIM: emacspeak GTalk: email@example.com