
TTS backends, documentation, API and openTTS



Tim,

First off, kudos on taking this on, I'll help in every way I can.

I suggest creating a git snapshot, doing the work there,
and emailing things in once a month or thereabouts for integration
into the main emacspeak repository.  If you set up a git repo, I
can sync against it to do reviews.

You're right that a lot of the underlying abstractions are mostly
in my head -- it would be good to get them out and into a pensieve
called a texinfo file. Trouble is, unless someone asks (as you
have), it's hard for me to tell what needs to go into the pensieve
to be useful.

That said, here are some wish-list items:

1. Two-way communication between emacspeak and the TTS server
would make some edge cases work better. I never bothered doing
this because doing it via pipes and a forked subprocess is
painful and not worth the marginal additional benefit one would
get. But with Emacs now coming with DBus support, that would be
something I'd like to do if I had the time (which I sadly don't
think I'll get).

2. SSML support -- again a nice-to-have, but mostly not bubbled
up because there are so few good TTS engines on Linux that
would actually benefit. A standard like SSML as a
bridge/intermediate representation is useful only when you have
multiple engines implementing quality synthesis.

Your work-plan sounds good; start with creating the
server/client communication API. Go ahead and use org-mode; it
can export to info, and is much easier than writing texinfo by hand.
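To make wish-list item 1 concrete, here is a minimal sketch of two-way traffic over pipes to a forked subprocess. It is only an illustration: the stand-in server, the "done ..." reply format and the TwoWayClient name are all assumptions, not anything emacspeak or its speech servers implement today, and the command strings are just sample input.

```python
import subprocess
import sys

# Hypothetical stand-in for a speech server: it reads one command per
# line on stdin and answers each with a status line on stdout.
# This reply format is an assumption made for this sketch only.
FAKE_SERVER = r'''
import sys
for line in sys.stdin:
    cmd = line.strip()
    if cmd:
        print("done " + cmd, flush=True)
'''


class TwoWayClient:
    """Send a command down one pipe and read the reply from the other."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-u", "-c", FAKE_SERVER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line buffered, so each command goes out promptly
        )

    def request(self, command):
        # Write one line, then block until the server answers with one line.
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().strip()

    def close(self):
        # Closing stdin gives the server EOF, so its read loop ends.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


def demo():
    client = TwoWayClient()
    try:
        return [client.request("q {hello world}"), client.request("d")]
    finally:
        client.close()
```

The blocking readline is where the pain shows up: if the server ever fails to answer, the client hangs, so a real design would need timeouts or asynchronous reads.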

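As a rough illustration of SSML acting as a bridge/intermediate representation (wish-list item 2), the sketch below wraps plain text in a small SSML document that could, in principle, be handed to any engine that implements the standard. The function name to_ssml and its parameters are assumptions made for this example; nothing here is existing emacspeak code.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text, rate=None, pitch=None, lang="en-US"):
    """Wrap plain text in a minimal SSML 1.0 document (hypothetical helper)."""
    # Collect optional prosody attributes; quoteattr adds the quotes
    # and escapes the value.
    attrs = ""
    if rate is not None:
        attrs += " rate=" + quoteattr(rate)
    if pitch is not None:
        attrs += " pitch=" + quoteattr(pitch)

    # Escape &, < and > so arbitrary buffer text stays well-formed XML.
    body = escape(text)
    if attrs:
        body = "<prosody" + attrs + ">" + body + "</prosody>"

    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        "xml:lang=" + quoteattr(lang) + ">" + body + "</speak>"
    )
```

For example, to_ssml("a < b", rate="fast") yields a speak element containing a prosody element with rate="fast" and the text escaped as "a &lt; b".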

On 4/15/10, Tim Cross <tcross@rapttech.com.au> wrote:
>    Hi Raman (and others),
>
>    I'm hoping you can assist me with a couple of things related to emacspeak
>    and its interface with TTS systems. Part of what I'm hoping to do is help
>    set the scene for implementing an openTTS interface for emacspeak and make
>    it easier to create interfaces for other TTS systems and possibly increase
>    consistency across TTS backends.
>
>    This has mainly come up because I recently moved to a 64 bit Linux and now
>    have to use espeak. On one system, I also still use my old dectalk express.
>    This has revealed some issues, mostly minor, but sometimes frustrating,
>    with the espeak backend. I've therefore decided to put some of the other
>    projects I was working on aside for now and instead concentrate on trying
>    to scratch this particular itch as it will make my other projects easier to
>    work on in the end.
>
>    Initial analysis of the espeak tcl script and tclespeak.so code has raised
>    some questions and identified some weaknesses and/or inconsistencies with
>    how things have been implemented. I don't mean this to sound critical of
>    the individuals who have worked on and contributed code in these areas.
>    Most of what has been done is pretty good, but possibly needs updating and
>    some re-factoring. Given that much of this work has been done with little
>    formal documentation and to some extent by reverse engineering how things
>    work by examining the code, the result is really pretty good.
>
>    What I plan to do is broken into three stages
>
>    Stage 1:
>
>    1. Document the Emacspeak to TTS interface/protocol
>    2. Document the TCL API
>
>    Stage 2:
>
>    1. Identify any areas that might benefit from refactoring or refinement.
>    2. Examine how things may fit in with the openTTS protocol and possibly
>       other TTS systems or protocols/standards (for example, maybe supporting
>       SSML markup) and work out what, if any, additions or modifications are
>       required.
>
>    Stage 3:
>
>    1. Implement changes and any re-factoring identified in stage 2.
>    2. Update documentation and create a basic HOWTO document that explains the
>       process to follow when adding a new TTS backend.
>    3. Update TTS drivers to work with the re-factored code. I would expect
>       this would be minor for most systems.
>    4. Update or re-implement the espeak TTS driver
>    5. Implement an interface for openTTS.
>
>    Obviously, this represents a considerable amount of work and I'm hoping
>    others will be able to assist, particularly with editing and proofreading
>    of the documentation and later with testing and, hopefully, some
>    development.
>
>    In particular, Raman, your input in this whole process will be critical.
>    Nobody knows this stuff better than you and to some extent, it is your
>    vision that is being realised. At the same time, I'm very aware your time
>    is limited and don't want to distract you from the important work that you
>    do developing the front-end functionality. My hope is that if you can
>    provide input in the first two stages, particularly with clarification or
>    corrections in my understanding of how the protocol and TCL API work and
>    any suggestions or ideas you may have for improvements and refinement, then
>    others can work on the last stage. With luck, having documentation and more
>    explanation will encourage others to assist and at the same time, reduce
>    demands on you for input.
>
>    While I feel fairly confident in being able to document the basic TCL API,
>    I'm less confident that I fully understand the more abstract design and
>    underlying philosophy that guided your decisions. I think it would be
>    important to include a little about the higher, more abstract objectives as
>    they will assist others in understanding how to apply the lower level TCL
>    API. I'm also very keen to hear about any ideas or refinements you have
>    been thinking about, especially those ones you have wanted to do but have
>    not had the time to tackle or just have never risen high enough on the
>    priority list. This could be a good chance to refine some of the ideas and
>    concepts in the light of experience gained over the last few years and
>    developments in other areas.
>
>    I've also deliberately suggested a staged approach as each stage can
>    deliver something that will potentially be useful and reduces the likelihood
>    of partially completed work that leaves things in a worse state. I'm hoping
>    to avoid the mistake of taking on too much and failing to deliver anything
>    by doing things in distinct smaller stages.
>
>    Ultimately, I would aim to deliver the documentation in texinfo format so
>    that it can be incorporated into the rest of the main emacspeak docs.
>    However, I may use Org mode initially as I find it a very useful mode to
>    collect and collate data of this type.
>
>    Raman, should we maintain this in the main SVN or would you prefer I use a
>    separate repository and later, move it into the main repository once we have
>    something of real substance? I don't want to add to your load and expect
>    initially, there will be a lot of updates occurring. I would like to have
>    some way for others to grab the latest drafts, make improvements and
>    provide updates. I'm agnostic about whether it is SVN, GIT, BZR etc. We
>    could even just use the Emacspeak section on the emacs wiki initially,
>    though I don't like it as an interface as much as using Org mode.
>
>    Anyway, comments and feedback are welcome. While I'm prepared to try and
>    push this forward and recognise it represents considerable work to do
>    properly, I'm hoping that others will be able to help.
>
>    Tim
>

-----------------------------------------------------------------------------
To unsubscribe from the emacspeak list or change your address on the
emacspeak list send mail to "emacspeak-request@cs.vassar.edu" with a
subject of "unsubscribe" or "help".


