HTMLDOC is a previously commercially developed
open-source program
Open-source software (OSS) is computer software that is released under a license in which the copyright holder grants users the rights to use, study, change, and distribute the software and its source code to anyone and for any purpose. Open ...
that converts
HTML and
Markdown web pages and files to
EPUB
EPUB is an e-book file format that uses the ".epub" file extension. The term is short for ''electronic publication'' and is sometimes styled ''ePub''. EPUB is supported by many e-readers, and compatible software is available for most smartphones ...
, indexed HTML,
PostScript
PostScript (PS) is a page description language in the electronic publishing and desktop publishing realm. It is a dynamically typed, concatenative programming language. It was created at Adobe Systems by John Warnock, Charles Geschke, Doug Br ...
, and
PDF
Portable Document Format (PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems. ...
files, complete with a table of contents. HTMLDOC can be used from the command line, a simple GUI, or from a web server. Development originally occurred through the author's now-defunct business,
Easy Software Products, and now continues on the author's personal web site.
Features and limitations
HTMLDOC 1.9 supports most of HTML 3.2 with some elements of HTML 4.01, it has limited support for
Unicode and no support for CSS and PDF forms.
[HTMLDOC main page](_blank)
/ref>
HTMLDOC 1.9 supports the following character sets: Windows-874, Windows-1250, Windows-1251, Windows-1252, Windows-1253, Windows-1254, Windows-1255, Windows-1256, Windows-1257, Windows-1258, ISO-8859-1
ISO/IEC 8859-1:1998, ''Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1 ...
, ISO-8859-2, ISO-8859-3
ISO/IEC 8859-3:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 3: Latin alphabet No. 3'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988. I ...
, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7
ISO/IEC 8859-7:2003, ''Information technology — 8-bit single-byte coded graphic character sets — Part 7: Latin/Greek alphabet'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. I ...
, ISO-8859-8, ISO-8859-9
ISO/IEC 8859-9:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1989 ...
, ISO-8859-14
ISO/IEC 8859-14:1998, ''Information technology — 8-bit single-byte coded graphic character sets — Part 14: Latin alphabet No. 8 (Celtic)'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published ...
, ISO-8859-15, KOI8-R; you cannot mix characters from different code page
In computing, a code page is a character encoding and as such it is a specific association of a set of printable characters and control characters with unique numbers. Typically each number represents the binary value in a single byte. (In some co ...
s. There is no support for CJK and Arabic characters, and support for ISO-8859-13 is missing. Support for UTF-8 is limited mainly to Western, Latin-alphabet-based, left-to-right-written languages. HTMLDOC 1.9 uses several proprietary processing instructions for formatting the pdf output, these use the syntax of the HTML comments.[HTMLDOC 1.9 User's Manual](_blank)
/ref>
There are no plans for introducing the CSS support or broader Unicode support.[
]
License and availability
Licensed under the terms of the GNU General Public License version 2. It is legal to compile the sources and distribute the program, and various versions can be found on the Internet. For example, HTMLDOC is included as part of the Debian
Debian (), also known as Debian GNU/Linux, is a Linux distribution composed of free and open-source software, developed by the community-supported Debian Project, which was established by Ian Murdock on August 16, 1993. The first version of D ...
operating systems.Debian – Search Results – htmldoc
/ref>
References
External links
*
Free software programmed in C
Free software programmed in C++
{{web-software-stub