
Using emacspeak with speech-dispatcher






>>>>> "Tim" == Tim Cross <tcross@rapttech.com.au> writes:
    Tim> Hi Lukas, Just a couple of points which may help.
    Tim> 
    Tim> With respect to ALSA, I've been using ALSA for some
    Tim> time now with the software dectalk and the ALSA OSS
    Tim> emulator. Raman has put some documentation on how to get
    Tim> direct ALSA support using ViaVoice Outloud. Alsa is
    Tim> certainly the way to go as it is pretty much the

With regard to ALSA: whether alsa-oss works depends on the
sound card; it does not work in all cases.

To add ALSA support to flite, first check whether flite
already implements it; it may well. If it does not, you'll
need to:

A) Get Flite to return wave buffers in a callback instead of
   writing them to the audio device

B) In your callback, write the wave buffers to ALSA --- you can
   see how this is done in the file atcleci.cpp in the Emacspeak
   CVS repository.


    Tim> official sound support layer for 2.6.x kernels onwards.
    
    Tim> If you go to my website at
    Tim> http://www-personal.une.edu.au/~tcross, you will find a
    Tim> tar ball of a patched emacspeak I did to support the

If you would like the above code to remain useful,
you'll need to factor out the SSML bits from the Cepstral bits.
You and I had discussed how to implement SSML support -- I'm not
sure whether that discussion made it to the list archives --
but essentially you will need to contribute two files:

0)  ssml-voices.el  An implementation of the mapping from
    personalities to SSML codes

1)  ssml-tts      An associated TTS server 

In addition, you'll need to write the functions that configure
the speech layer to use these two files -- that code would be
integrated into dtk-speak by me, something I'll do once the
other bits are done.

There is little or no chance of my finding the time to pull the
relevant bits from your tarball, because at present I need
those bits even less than you do:-)

    Tim> Cepstral voices, which includes an SSML
    Tim> interface. Unfortunately, this is for emacspeak 20, but
    Tim> you should be able to update it for emacspeak 23. The
    Tim> cepstral interface is broken and I've not updated it,
    Tim> but the SSML stuff will give you a good starting point
    Tim> for getting emacspeak to generate TTS commands which are
    Tim> based on SSML.
    Tim> 
    Tim> Note that the reason I've not maintained the SSML and
    Tim> Cepstral stuff is because Cepstral has changed its C API
    Tim> and I've not had time to update it and as the SSML stuff
    Tim> has never been integrated into emacspeak, I've just not
    Tim> had the time to re-patch every emacspeak version each
    Tim> time it comes out. Raman was going to have a look at
    Tim> what I've done, but it doesn't look like he has had
    Tim> time. I've never had any feedback on what I've done, so
    Tim> it could all be completely wrong, but it did seem to
    Tim> work. Unfortunately, the SSML support within Cepstral at
    Tim> the time I was using it wasn't great as they only
    Tim> supported a subset of the SSML tags and many of the ones
    Tim> we really wanted to get decent voice locking were not
    Tim> supported.
    Tim> 
    Tim> One of the issues with SSML is that unlike the existing
    Tim> emacspeak TTS, SSML is XML based and therefore requires
    Tim> properly formed start and end tags, while the existing
    Tim> TTS interfaces just use single start tags. This means
    Tim> having to patch the dtk-speak.el file so that TTS
    Tim> commands are given both a start and an end tag. You
    Tim> will also need to create an ssml-voices.el file (see
    Tim> outloud-voices.el for an example).
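
To illustrate the paired-tag point: SSML marks voice and prosody
changes with an explicit open and close, where a dtk-style stream
would emit only an inline start code. The voice name below is
hypothetical:

```xml
<speak>
  <voice name="paul">
    This sentence is spoken in the named voice, and the
    voice element must be closed explicitly.
  </voice>
  <prosody pitch="high" rate="fast">
    This span carries voice-lock styling, again with a
    matching end tag.
  </prosody>
</speak>
```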
    Tim> 
    Tim> You will also find within the tar ball a
    Tim> generic-voices.el. This is a 'do nothing' voices file
    Tim> which can be used to get a quick and dirty interface
    Tim> between emacspeak and any speech synthesizer working
    Tim> quickly. Essentially, it just doesn't add voice lock
    Tim> type commands to the text emacspeak sends to the speech
    Tim> servers. So, instead of solutions which attempt to
    Tim> create a basic interface by having a script which strips
    Tim> out dtk or outloud commands, you can create text streams
    Tim> which just have text and eliminate the need to do any
    Tim> stripping. I was going to use this to create new
    Tim> DoubleTalk and flite interfaces which provided just
    Tim> basic speech.
    Tim> 
    Tim> Once you have emacspeak generating TTS commands which
    Tim> are SSML based, all that probably remains to do is
    Tim> create a tcl script which connects to speech dispatcher
    Tim> and passes the SSML tagged text to speech dispatcher via
    Tim> a socket, plus add support for commands such as changing
    Tim> punctuation and some of the useful but not essential
    Tim> bonus options, like split caps, all caps beep etc. In
    Tim> fact, it wouldn't even need to be a tcl script - I only
    Tim> mention it as all the other helper interface scripts are
    Tim> tcl. It could really be any language you
    Tim> like. Alternatively and possibly better left as a later
    Tim> task, you could bypass the helper scripts completely and
    Tim> create a direct interface to speech dispatcher from
    Tim> within elisp - check out the speech-dispatcher.el file
    Tim> for clues on doing this. However, if you go the direct
    Tim> interface route, you will have to do a fair amount of
    Tim> additional work which has already been done in the tcl
    Tim> tts-lib.tcl file by Raman, which is why I'd probably go
    Tim> the tcl interface helper route initially.
    Tim> 
    Tim> With respect to getting emacspeak to support multiple
    Tim> languages, I think this is a much more difficult
    Tim> task. Raman is the person to provide the best guidance
    Tim> here, but as I understand it, quite a lot of emacspeak
    Tim> would need to be changed. The source of the problem here
    Tim> I think is mainly due to the fact that historically,
    Tim> many hardware synths, like the dectalk, only supported
    Tim> single byte character sets and only handled 7 bit ascii
    Tim> reliably. Therefore, Raman did considerable work to
    Tim> incorporate character mapping and translation into
    Tim> emacspeak to ensure that only 7 bit characters are ever
    Tim> sent to the speech synthesizer.
    Tim> 
    Tim> This means that to get reliable support for multi-byte
    Tim> character sets and even 8 bit character sets, quite a
    Tim> bit of patching would be required. To make matters more
    Tim> complex, although most new software synthesizers (and
    Tim> even some hardware ones) will support at least the full
    Tim> 8 bits and some even multi-byte character sets,
    Tim> emacspeak would need some way of knowing this in order
    Tim> to provide consistent and reliable processing of
    Tim> characters and mapping of 'special' characters to
    Tim> meaningful spoken representations. However, currently,
    Tim> emacspeak doesn't have any facility which dynamically
    Tim> allows it to change its mapping of characters based on
    Tim> the capabilities of the speech synthesizer. While speech
    Tim> dispatcher may be able to handle this to some extent, we
    Tim> need to ensure support for existing synthesizers is not
    Tim> broken.
    Tim> 
    Tim> Although I actually know very little about other
    Tim> character sets, especially multibyte ones, I'd also be a
    Tim> little concerned about how emacs itself is evolving in
    Tim> this respect. From many of the posts on the emacs
    Tim> newsgroups, I get the impression that this is still a
    Tim> rather murky and inconsistent aspect of emacs. It would
    Tim> certainly be important to check out emacs 22 and see
    Tim> what has changed there before making a lot of changes to
    Tim> emacspeak. Definitely make sure you get guidance from
    Tim> Raman as in addition to him knowing more about emacspeak
    Tim> than anyone else, he is also the person who has had
    Tim> probably the most experience in dealing with issues like
    Tim> this.
    Tim> 
    Tim> While my personal time (like everyone else's) is just
    Tim> too scarce at present, especially due to some large work
    Tim> projects I am taking on, I certainly would be prepared
    Tim> to try and provide some support in getting emacspeak
    Tim> support for speech dispatcher. I just don't know how
    Tim> much time I will have and what my response times will be
    Tim> like - probably pretty slow! However, don't hesitate to
    Tim> drop me an e-mail if you need some help and I'll see
    Tim> what I can do, just no promises!
    Tim> 
    Tim> good luck
    Tim> 
    Tim> Tim
    Tim> 
    Tim> Lukas Loehrer writes:
    >> Hi all,
    >> 
    >> using emacspeak with eflite, I aim for the following
    >> improvements:
    >> 
    >> 1. Use alsa for speech playback.
    >> 2. Have languages other than English, especially German
    >> in my case.
    >> 
    >> The first point is the more important one for me. I looked
    >> around and believe that speech-dispatcher is the most
    >> promising way to attack these goals. One advantage of this
    >> solution is that flite can be used for English and
    >> festival for other languages, so one can still benefit
    >> from the good performance of flite.
    >> 
    >> What is the best way to connect emacspeak to
    >> speech-dispatcher? Are there existing solutions? I am
    >> considering writing something similar to eflite that
    >> implements the emacspeak speech server interface and talks
    >> via speech-dispatcher. Another way would be to make emacs
    >> connect to speech-dispatcher directly via SSIP.
    >> 
    >> The advantage of the external solution would be that it
    >> does not require any changes to emacspeak to achieve the
    >> first of the above goals and that it could also be used
    >> with other programs like yasr that support the emacspeak
    >> speech interface.
    >> 
    >> As far as I can tell, multi-language support would require
    >> extensions to emacspeak in both approaches.
    >> 
    >> Does anyone have some thoughts or suggestions? Is
    >> speech-dispatcher the way to go?
    >> 
    >> Lukas
    >> 
    >> -----------------------------------------------------------------------------
    >> To unsubscribe from the emacspeak list or change your
    >> address on the emacspeak list send mail to
    >> "emacspeak-request@cs.vassar.edu" with a subject of
    >> "unsubscribe" or "help"
    >> 

-- 
Best Regards,
--raman

      
Email:  raman@users.sf.net
WWW:    http://emacspeak.sf.net/raman/
AIM:    emacspeak       GTalk: tv.raman.tv@gmail.com
PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
Google: tv+raman 


