
speech dispatcher




Interesting what you said about the festival interface. I've only
played around with speech dispatcher a bit, but I found the festival
interface incredibly clear and responsive (using 16k voices on a
1.3GHz system). I knocked up a very simple swift interface along the
same lines as the speech dispatcher dtk generic interface. It works
OK, with some minor problems such as a missing letter here and there,
but it is a reasonable proof of concept.

If the speech dispatcher interface is compatible with emacspeak, it
could be a very good addition. I've used the speechd.el interface for
speech dispatcher. It's not bad, and I have it configured as a backup
in case I break emacspeak and need some emacs speech interface to get
things working again.

On one level, I like the simplicity and "light-weight" aspects of
speechd.el, but on the other hand, it doesn't have the power of
emacspeak and there are many aspects of emacs which just don't provide
decent speech support using the speechd.el approach - for example,
spell checking, modes with multiple window interaction etc. 

However, a combination of an emacspeak front-end and a
speech-dispatcher backend for the TTS server interaction could be a
promising approach which would give us multiple TTS engine support
with minimal maintenance. I also seem to remember seeing an e-mail
about a group forming which was going to try and define a TTS API
framework which all projects wishing to provide speech support could
use. This I think is a promising direction, but it will take some time
before it achieves any real progress - especially if it's a big
committee!
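
To make the front-end/backend split concrete, here is a minimal sketch
in Python. It is only an illustration of the idea, not emacspeak or
speech-dispatcher code: the names TtsBackend, RecordingBackend,
register_backend, make_backend and FrontEnd are all hypothetical, and
the "engine" here just records calls rather than producing audio. The
point is that the front-end only ever sees the small backend
interface, so a new engine is added by registering it, without
touching the front-end.

```python
from abc import ABC, abstractmethod


class TtsBackend(ABC):
    """Hypothetical minimal interface the front-end talks to."""

    @abstractmethod
    def speak(self, text: str) -> None:
        """Speak the given text."""

    @abstractmethod
    def stop(self) -> None:
        """Interrupt any speech in progress."""


class RecordingBackend(TtsBackend):
    """Stand-in engine: records requests instead of producing audio."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.log = []

    def speak(self, text: str) -> None:
        self.log.append(("speak", text))

    def stop(self) -> None:
        self.log.append(("stop", None))


# Registry mapping an engine name to a factory that builds its backend.
_BACKENDS = {}


def register_backend(name, factory):
    """Make a backend available under the given name."""
    _BACKENDS[name] = factory


def make_backend(name):
    """Build the backend registered under name, or raise ValueError."""
    try:
        factory = _BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown TTS backend: {name}") from None
    return factory()


class FrontEnd:
    """Front-end logic that depends only on the TtsBackend interface."""

    def __init__(self, backend: TtsBackend) -> None:
        self.backend = backend

    def say_line(self, text: str) -> None:
        # Flush whatever is being spoken, then speak the new line.
        self.backend.stop()
        self.backend.speak(text)


# Adding an engine is a registration, not a change to FrontEnd.
register_backend("festival", lambda: RecordingBackend("festival"))
register_backend("flite", lambda: RecordingBackend("flite"))

front_end = FrontEnd(make_backend("festival"))
front_end.say_line("hello world")
```

In a real setup the backend would forward these calls to the TTS
server instead of recording them, and the interface would need to be
much richer (voices, rate, punctuation, auditory icons) to match what
emacspeak expects.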

Tim


>>>>> "Bart" == Bart Bunting <bart@bunting.net.au> writes:

 Bart> Hi, Spent a little time this morning hacking up the beginnings
 Bart> of an interface to speech-dispatcher.

 Bart> Seems to work pretty well with very little work.

 Bart> I just started from the dtk-soft server and hacked it around a
 Bart> bit.

 Bart> Festival sounds quite nice but is still not that responsive.
 Bart> The theta interface is much more snappy.


 Bart> Bart

 >> -----Original Message-----
 >> From: T. V. Raman [mailto:tvraman@comcast.net]
 >> Sent: Saturday, January 08, 2005 11:43 AM
 >> To: Tim Cross
 >> Cc: tvraman@comcast.net; peter.rayner@cea.fr; emacspeak@cs.vassar.edu
 >> Subject: current TTS options for Fedora-core 1
 >> 
 >> 
 >> 
 >> well, investigate it, and tell me what you discover.
 >> 
 >> >>>>> "Tim" == Tim Cross <tcross@rapttech.com.au> writes:
 >> 
 Tim> One other thing I forgot to mention is that I'm not sure if we
 Tim> should totally ignore other TTS interfaces. While it would be
 Tim> necessary to investigate what would be involved, something like
 Tim> the speech-dispatcher approach is probably worth investigating
 Tim> further. I know that it has become a lot more sophisticated,
 Tim> with support for auditory icons, multiple voices and multiple
 Tim> languages plus SSML. While it is possible that the approaches
 Tim> are so different that no true integration can be achieved, it
 Tim> should still be evaluated fully. The benefit of such approaches
 Tim> is that by creating just a single interface, we immediately gain
 Tim> support for a number of different TTS engines, including
 Tim> festival, flite, apollo, software dtk, epos, llia_phon etc - all
 Tim> of which can be maintained with a single interface.
 >>
 Tim> Tim-
 >> >>>>> "tvr" == T V Raman <tvraman@comcast.net> writes:
 >> 
 tvr> Actually there is very little that needs to be done to make
 tvr> option 1 complete, and as you say the other speech server
 tvr> frameworks out there are not sophisticated enough since they've
 tvr> mostly done a least common denominator approach.
 >>
 tvr> As things stand in the design the intent is that you shouldn't
 tvr> have to modify elisp or tcl files; in practice, this can be
 tvr> proven only by writing more servers, discovering where mods are
 tvr> needed, and refactoring code appropriately; discussion in the
 tvr> abstract usually leads to mud slinging and stone throwing,
 tvr> nothing else.
 >>
 tvr> If you examine how the dectalk and viavoice support works today,
 tvr> the dectalk specific code is now in dectalk-voices.el; the
 tvr> viavoice code in outloud-voices.el, and the TCL layer mirrors
 tvr> this, with the common TCL code in tts-lib.tcl.
 >>
 tvr> The name "dtk" is legacy and should be thought of as a synonym
 tvr> for tts --- I made sure of this the last time I refactored the
 tvr> code and named things that were dectalk specific with a dectalk-
 tvr> prefix.
 >>  >>>>> "Tim" == Tim Cross <tcross@rapttech.com.au> writes:
 >> 
 Tim> I think Raman's idea is a good one and I would certainly be
 Tim> willing to participate in a team which worked on speech servers
 Tim> for emacspeak. The current job I'm in means I will not have a
 Tim> lot of time for this project until after August, but am
 Tim> certainly willing to try and contribute when possible.
 >>
 Tim> If the emacspeak community decides this would be a good model to
 Tim> follow for providing speech server support, I think we need to
 Tim> start by looking at how we may be able to slightly modify the
 Tim> architecture of emacspeak so that additional servers do not
 Tim> require modification to the core emacspeak code-base. Currently,
 Tim> if you want to create a new server which is integrated into
 Tim> emacspeak in the same way as existing servers, you need to
 Tim> modify some of the emacspeak source code. I feel that if we are
 Tim> going to introduce another group, as far as possible, we need to
 Tim> have an architecture where Raman (or whoever) can extend
 Tim> emacspeak functionality without reference to the work done by
 Tim> another group which is adding speech servers.
 >>
 Tim> I feel we have a couple of options along these lines -
 >>
 Tim> 1. We could modify the existing code base so that we have a very
 Tim> well defined speech server interface layer. This would be the
 Tim> easiest option in my view as Raman has already got much of the
 Tim> work done - it's really just a bit of cleanup work and moving
 Tim> some processing which currently happens at the TCL server
 Tim> script level into the elisp layer, or vice versa.
 >>
 Tim> 2. Possibly examine modifications to emacspeak so that it can
 Tim> work with other frameworks which have been developed for
 Tim> interfaces to generic speech servers. The speechd project is an
 Tim> example of this sort of approach. I also believe a group has
 Tim> been formed to create a uniform speech interface which KDE, GNOME
 Tim> et al. would use and perhaps we should examine how feasible
 Tim> this might be. The main drawback I can see is that some of these
 Tim> projects don't seem to support the advanced features of
 Tim> emacspeak (e.g. don't handle multiple voices well, auditory
 Tim> icons etc), plus this would require possibly substantial changes
 Tim> to the emacspeak architecture.
 >>
 Tim> Other points of view, comments, concerns etc welcomed and
 Tim> encouraged. We need to contribute if we want emacspeak to
 Tim> evolve. I actually feel we are getting close to a time where
 Tim> emacspeak requires more maintainers, not just for speech servers
 Tim> but also for the emacspeak code base itself. Raman has held it
 Tim> together for a long time now, but he has other interests and
 Tim> responsibilities and it's probably time we as users started
 Tim> taking on some of the responsibility for its maintenance and
 Tim> development.
 >>
 Tim> Tim
 >>  >>>>> "tvr" == T V Raman <tvraman@comcast.net> writes:
 >> 
 tvr> An FAQ would be a good start. The next step would be to put
 tvr> together a small team that took responsibility for creating and
 tvr> maintaining speech servers. The reason I have not bothered
 tvr> updating the Software Dectalk support is that no more than
 tvr> a handful of users out there bothered with even 4.61, and it's
 tvr> just not worth the effort required to maintain multiple speech
 tvr> servers for such a small user base. Under those conditions the only thing
 tvr> that works is if the person who wants it the most puts in the
 tvr> effort. In this case, it's not me, since I already have my needs
 tvr> fully met.
 >>
 tvr> -- Best Regards, --raman
 >>
 tvr> Email: raman@cs.cornell.edu WWW: http://emacspeak.sf.net/raman/
 tvr> AIM: TVRaman PGP:
 tvr> http://emacspeak.sf.net/raman/raman-almaden.asc IRC:
 tvr> irc://irc.gnu.org/emacspeak
 >>
 tvr>
 >> ------------------------------------------------------------------
 >> -----------
 tvr> To unsubscribe from the emacspeak list or change your address on
 tvr> the emacspeak list send mail to
 tvr> "emacspeak-request@cs.vassar.edu" with a subject of
 tvr> "unsubscribe" or "help"
 >>
 tvr> -- Best Regards, --raman
 >>
 tvr> Email: raman@users.sf.net WWW: http://emacspeak.sf.net/raman/
 tvr> AIM: TVRaman PGP:
 tvr> http://emacspeak.sf.net/raman/raman-almaden.asc
 >>

