
TTS backends, documentation, API and openTTS

   Hi Raman (and others), 

   I'm hoping you can assist me with a couple of things related to emacspeak
   and its interface with TTS systems. Part of what I'm hoping to do is help
   set the scene for implementing an openTTS interface for emacspeak and make
   it easier to create interfaces for other TTS systems and possibly increase
   consistency across TTS backends. 

   This has mainly come up because I recently moved to a 64-bit Linux and now
   have to use espeak. On one system, I also still use my old dectalk express.
   This has revealed some issues, mostly minor, but sometimes frustrating,
   with the espeak backend. I've therefore decided to put some of the other
   projects I was working on aside for now and instead concentrate on trying
   to scratch this particular itch as it will make my other projects easier to
   work on in the end.

   Initial analysis of the espeak tcl script and tclespeak.so code has raised
   some questions and identified some weaknesses and/or inconsistencies with
   how things have been implemented. I don't mean this to sound critical of
   the individuals who have worked on and contributed code in these areas.
   Most of what has been done is pretty good, but possibly needs updating and
   some re-factoring. Given that much of this work has been done with little
   formal documentation and to some extent by reverse engineering how things
   work by examining the code, the result is really pretty good.

   What I plan to do is broken into three stages:

   Stage 1.

   1. Document the Emacspeak to TTS interface/protocol 
   2. Document the TCL API 

   Stage 2. 

   1. Identify any areas that might benefit from refactoring or refinement. 
   2. Examine how things may fit in with the openTTS protocol and possibly
      other TTS systems or protocols/standards. For example, maybe supporting
      SSML markup and work out what, if any, additions or modifications are

   Stage 3.

   1. Implement changes and any re-factoring identified in stage 2.
   2. Update documentation and create a basic HOWTO document that explains the
      process to follow when adding a new TTS backend. 
   3. Update TTS drivers to work with the re-factored code. I would expect
      this would be minor for most systems. 
   4. Update or re-implement the espeak TTS driver
   5. Implement an interface for openTTS. 

   Obviously, this represents a considerable amount of work and I'm hoping
   others will be able to assist, particularly with editing and proofreading
   of the documentation and later with testing and hopefully, some

   In particular, Raman, your input in this whole process will be critical.
   Nobody knows this stuff better than you and to some extent, it is your
   vision that is being realised. At the same time, I'm very aware your time
   is limited and don't want to distract you from the important work that you
   do developing the front-end functionality. My hope is that if you can
   provide input in the first two stages, particularly with clarification or
   corrections in my understanding of how the protocol and TCL API work and
   any suggestions or ideas you may have for improvements and refinement, then
   others can work on the last stage. With luck, having documentation and more
   explanation will encourage others to assist and at the same time, reduce
   demands on you for input. 

   While I feel fairly confident in being able to document the basic TCL API,
   I'm less confident that I fully understand the more abstract design and
   underlying philosophy that guided your decisions. I think it would be
   important to include a little about the higher, more abstract objectives as
   they will assist others in understanding how to apply the lower level TCL
   API. I'm also very keen to hear about any ideas or refinements you have
   been thinking about, especially those ones you have wanted to do but have
   not had the time to tackle or just have never risen high enough on the
   priority list. This could be a good chance to refine some of the ideas and
   concepts in the light of experience gained over the last few years and
   developments in other areas.

   I've also deliberately suggested a staged approach as each stage can
   deliver something that will potentially be useful and reduces the likelihood
   of partially completed work that leaves things in a worse state. I'm hoping
   to avoid the mistake of taking on too much and failing to deliver anything
   by doing things in distinct smaller stages. 

   Ultimately, I would aim to deliver the documentation in texinfo format so
   that it can be incorporated into the rest of the main emacspeak docs.
   However, I may use Org mode initially as I find it a very useful mode to
   collect and collate data of this type. 

   Raman, should we maintain this in the main SVN or would you prefer I use a
   separate repository and later, move it into the main repository once we have
   something of real substance? I don't want to add to your load and expect
   initially, there will be a lot of updates occurring. I would like to have
   some way for others to grab the latest drafts, make improvements and
   provide updates. I'm agnostic about whether it is SVN, GIT, BZR etc. We
   could even just use the Emacspeak section on the emacs wiki initially,
   though I don't like it as an interface as much as using Org mode.

   Anyway, comments and feedback are welcome. While I'm prepared to try and
   push this forward and recognise it represents considerable work to do
   properly, I'm hoping that others will be able to help. 


