
more on 8-bit characters and ViaVoice

To recap: characters with the 8th bit set aren't coming out nicely
through ViaVoice.  This is a problem when using non-English European
languages, e.g. those written with the latin-1 alphabet.

I'm sure I'm not going about this in a very intelligent manner but I'm
having fun learning a lot about emacs.

A couple of weeks ago I said I thought the problem wasn't in emacs but
in either ViaVoice or tcl.  It turns out this was unjustified on the
evidence I had then, but it is probably correct anyway.  I had put a
write-region call in the body of dtk-interp-queue in the file
dtk-interp.el to check whether the strings were being modified by
emacspeak code.  They weren't.  However, there turned out to be every
chance that the coding system used for the file i/o in my test
differed from the one used by process-send-string.  And it did.  A bit
of hacking with some octal numbers confirmed that what was being
spoken was the utf-8 encoding of the 8-bit characters, so the coding
system was a reasonable place to look, rather than some internal
translation within emacs.
On my system the form (process-coding-system (get-process
"speaker")) returned (raw-text-unix . iso-latin-1), while the file i/o
was raw-text for both input and output.  (Note that the second value
is for output to the process.)  That looked like the solution.  So I
called set-process-coding-system just before the process-send-string
and ... no luck, still the same behaviour.
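
To make the "utf-8 encoding of the 8-bit characters" point concrete,
here is a minimal sketch in Python (not emacs lisp, and not part of
emacspeak) of what happens to one Latin-1 character under the two
coding systems.  The helper name octal_bytes and the choice of
e-acute as the sample character are my own assumptions for
illustration.

```python
# Hypothetical helper, not part of emacspeak: show the octal value of
# each byte a string turns into under a given coding system.
def octal_bytes(text, encoding):
    return " ".join(format(b, "03o") for b in text.encode(encoding))

ch = "\u00e9"  # e with acute accent, chosen only as a sample Latin-1 character

print(octal_bytes(ch, "latin-1"))  # a single byte with the 8th bit set
print(octal_bytes(ch, "utf-8"))    # two bytes, both with the 8th bit set

# A consumer expecting Latin-1 that is handed the utf-8 bytes sees two
# characters where the sender meant one.
misread = ch.encode("utf-8").decode("latin-1")
print(len(misread))
```

If the receiving end treats every byte as one Latin-1 character, a
utf-8 encoded string is spoken as two odd characters per accented
letter, which would match the symptom described here.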

Change tack: was it ViaVoice perhaps?  Get into a shell *outside*
emacs and run the outloud tcl script directly.  Not much help there:
the characters with the 8th bit set were being mapped to escape
sequences by the keyboard drivers, so tcl was never seeing them.
However, hacking the tiny cmdlinespeak program which comes with
ViaVoice_tts and inserting the problematic 8-bit characters in the
arguments of eciAddText produced correct behaviour.  It looks as if we
can get things working if we can get the characters to ViaVoice.

So back to emacs.  Were the 8-bit characters getting out?  I really
wanted to instrument the tcl script to see just what bytes it was
receiving, but I don't speak tcl.  Instead I wrote a parallel
invocation of an external process in the body of dtk-initialize.  The
process is a 5-line C program which copies its standard input byte by
byte into a file.  With a call to process-send-string added in
dtk-interp-queue, I *hope* we're now seeing exactly the same bytes as
the tcl interpreter is seeing.  Here's some output:
q {ad-handle-definition:  backquote occur-mode-goto-occurrence' got redefined}
q {  Press C-h C-e to get an   overview of emacspeak  16.0  I am  completely operational,  and all my circuits are functioning perfectly! }
q {Preparing diary aw 3 .}
q {Command }
q {this}
q {is}
q {a}
q {Preparing diary aw 3 .}
q {this}
q {this}
q {is}
q {this }
q { is another}
Note the second-to-last line.  As I look at it now, ViaVoice is saying
"q this a circumflex inverted exclamation point" (omitting the
punctuation), and a look at the line shows the character octal 241
really is in the file.  A final check with
od -t o1 
confirms the point.
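
For anyone wanting to repeat the check, the byte-copying helper and
the octal dump can be sketched in Python instead of C.  This is not
the program I actually used: the names copy_bytes and octal_dump and
the sample bytes are assumptions for illustration, and in particular
the sample assumes the byte before octal 241 was octal 302, which I
have not stated above.

```python
import io
import unicodedata

def copy_bytes(src, dst):
    """Copy src to dst one byte at a time; return the number of bytes copied."""
    count = 0
    while True:
        b = src.read(1)
        if not b:
            break
        dst.write(b)
        count += 1
    return count

def octal_dump(data):
    """Format bytes as three-digit octal values, roughly like od -t o1."""
    return " ".join(format(b, "03o") for b in data)

# Hypothetical sample, NOT the real log: assumes octal 302 precedes 241.
sample = b"q {this \xc2\xa1}\n"
log = io.BytesIO()
copied = copy_bytes(io.BytesIO(sample), log)

suspect = log.getvalue()[8:10]  # the two bytes after "q {this "
print(copied, octal_dump(suspect))

# Names of the characters a Latin-1 reader would see in those two bytes.
for c in suspect.decode("latin-1"):
    print(unicodedata.name(c))
```

In real use the source would be sys.stdin.buffer and the destination
a file opened with mode "wb", so that no coding system gets a chance
to touch the bytes on the way through.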

So where are we now?
I think the 8-bit characters are being sent to the speech-server; in
fact I think they are making it as far as its standard input.  But
they are not reaching ViaVoice.  *If* that's right, then the
question might be: what does tcl do with its input?  I'm not a tcl
person at all, so I am getting some local help with that.
I welcome corrections to the logic that's got me here, though.
