
Re: PDF feature form.

Here is a better solution:

Organize a programming project to add support for PDF logical
structure to the pdftotext and pdftohtml utilities. Pdftotext is part
of the Xpdf package, available at http://www.foolabs.com/xpdf/, and
pdftohtml is based on it.

Basically, the task would be to write a program that takes a
structured PDF document (that is, one that includes a PDF structure
tree as defined in PDF versions 1.3 and above) and produces suitable
XML output.
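
As a rough sketch of the idea, and not of how Xpdf does or would do it,
the conversion amounts to a recursive walk over the structure tree. In
the PDF format each structure element carries a structure type under
the /S key and its children under /K. The nested dictionaries below are
a hypothetical stand-in for an already-parsed tree, and the names
sample_tree, struct_to_xml and tree_to_string are made up for this
illustration. In a real file the leaf children are references to marked
content on a page rather than plain strings, so the text would have to
be looked up first.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a parsed structure tree: "S" is the
# structure type, "K" the list of kids (sub-elements or text).
# Real PDF kids are marked-content references, not strings.
sample_tree = {
    "S": "Document",
    "K": [
        {"S": "H1", "K": ["Introduction"]},
        {"S": "P", "K": ["First paragraph."]},
    ],
}

def struct_to_xml(node):
    """Turn one structure element (and its kids) into an XML element."""
    elem = ET.Element(node["S"])
    last_child = None
    for kid in node.get("K", []):
        if isinstance(kid, str):
            # Text before any child goes in .text, text after a child
            # goes in that child's .tail (ElementTree's mixed content).
            if last_child is None:
                elem.text = (elem.text or "") + kid
            else:
                last_child.tail = (last_child.tail or "") + kid
        else:
            last_child = struct_to_xml(kid)
            elem.append(last_child)
    return elem

def tree_to_string(root):
    """Serialize the whole structure tree as an XML string."""
    return ET.tostring(struct_to_xml(root), encoding="unicode")

print(tree_to_string(sample_tree))
```

This uses the structure type directly as the XML element name, which
only works when the type happens to be a valid XML name; a real tool
would need to map or escape the other cases.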

It would require someone with good programming skills and an ability
to read the PDF reference manual, which is available in PDF format
from Adobe's web site and can be converted to text easily using
existing tools if required.

Xpdf can already process the PDF format; you would just need to add
support for the structure tree (and might as well handle PDF
bookmarks too). Xpdf is written in C++, by the way, and is free
software under the GNU General Public License.
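
The bookmarks are a simpler case. In the PDF format each outline item
has a /Title, and may link to its first child through /First and to its
next sibling through /Next. A minimal sketch of walking them, again
over hypothetical dictionaries rather than Xpdf's own objects
(outline_to_lines and the sample items are invented for this example):

```python
def outline_to_lines(first_item, depth=0):
    """Return the outline as indented title lines, depth first."""
    lines = []
    item = first_item
    while item is not None:
        lines.append("  " * depth + item["Title"])
        child = item.get("First")      # first child bookmark, if any
        if child is not None:
            lines.extend(outline_to_lines(child, depth + 1))
        item = item.get("Next")        # next bookmark at this level
    return lines

# Hypothetical sample: two top-level bookmarks, the first with a child.
section_2 = {"Title": "Section 2"}
background = {"Title": "Background"}
section_1 = {"Title": "Section 1", "First": background, "Next": section_2}

for line in outline_to_lines(section_1):
    print(line)
```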

To unsubscribe from the emacspeak list or change your address on the
emacspeak list send mail to "emacspeak-request@cs.vassar.edu" with a
subject of "unsubscribe" or "help"