====== HTMLDOC / DOC ======

Converts HTML to PDF. This page covers command-line use on Linux; MS Windows is not covered, but usage is similar.

Documentation: [[http://www.htmldoc.org/documentation.php]]

{{tag>htmldoc pdf-cpnverter}}

Source((http://www.htmldoc.org/htmldoc.html#CMDLINE))

To convert a single web page, type the following and press ENTER:

  htmldoc --webpage -f output.pdf filename.html

==== What Are All These Commands? ====

  * ''%%--webpage%%'' is the document type that specifies unstructured files with page breaks between each file.
  * ''%%-f output.pdf%%'' is the name of the file that all the documents will be saved into; its extension also sets the output type. In this example it is a PDF file.
  * ''%%filename.html%%'' is the name of the file that you want to convert. In this example it is an HTML file.

Try the following exercise: you want to convert the file myhtml.html into a PDF file named mypdf.pdf. How would you do this? (Don't worry, the answer follows. But try first.)

To accomplish this, type:

  htmldoc --webpage -f mypdf.pdf myhtml.html

===== Converting Multiple HTML Files =====

To convert more than one web page with page breaks between each HTML file, type:

  htmldoc --webpage -f output.pdf file1.html file2.html

All we are doing is adding another file. In this example we are converting two files: file1.html and file2.html.

Try this example: convert one.html and two.html into a PDF file named 12pdf.pdf. Again, the answer follows.

Your command line should look like this:

  htmldoc --webpage -f 12pdf.pdf one.html two.html

We've been using HTML files, but you can also use URLs.
For example:

  htmldoc --webpage -f output.pdf http://slashdot.org/

===== Generating Books =====

To generate a book from one or more HTML files, type one of the following commands and press ENTER:

  htmldoc --book -f output.html file1.html file2.html
  htmldoc --book -f output.pdf file1.html file2.html
  htmldoc --book -f output.ps file1.html file2.html

==== What Are All These Commands? ====

  * ''%%htmldoc%%'' is the name of the software.
  * ''%%--book%%'' is a document type that specifies that the input files are structured with headings.
  * ''%%-f output.html%%'' names the file that the converted output goes to. In this case, we requested an HTML file. We could also have made it a PDF (''%%-f output.pdf%%'') or PostScript (''%%-f output.ps%%'') file.
  * ''%%file1.html%%'' and ''%%file2.html%%'' are the files you want to convert.

HTMLDOC will build a table of contents for the book using the heading elements (H1, H2, etc.) in your HTML files. It will also add a title page using the document TITLE text (you're going to learn about title files shortly) and other META information you supply in your HTML files. See Chapter 6 - HTML Reference for more information on the META variables that are supported.

**Note:** When using book mode, HTMLDOC starts rendering with the first H1 element. Any text, images, tables, and other viewable elements that precede the first H1 element are silently ignored. Because of this, make sure you have an H1 element in your HTML file; otherwise HTMLDOC will not convert anything!

===== Setting the Title File =====

The ''%%--titlefile%%'' option sets the HTML file or image to use on the title page:

  htmldoc --titlefile filename.bmp ...
  htmldoc --titlefile filename.gif ...
  htmldoc --titlefile filename.jpg ...
  htmldoc --titlefile filename.png ...
  htmldoc --titlefile filename.html ...

HTMLDOC supports BMP, GIF, JPEG, and PNG images, as well as generic HTML text you supply for the title page(s).
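
Because book mode silently ignores everything before the first H1 element, it can be worth checking the input files before running a book conversion. The following is a minimal sketch, not part of HTMLDOC: the function names ''has_h1'' and ''check_book_inputs'' are made up for this page, and the check is deliberately stricter than HTMLDOC itself, since it flags every input file without an H1 rather than only content before the first one.

```shell
#!/bin/sh
# Hypothetical helpers (not part of HTMLDOC) for checking book-mode inputs.

# has_h1 FILE: succeed if FILE contains an opening <h1> tag.
# The match is case-insensitive and only recognises "<h1>" or "<h1 "
# (a tag name followed by ">" or a space), which is an assumption
# about how the HTML is written.
has_h1() {
    grep -qi '<h1[ >]' "$1"
}

# check_book_inputs FILE...: report every file without an H1 on stderr
# and return non-zero if at least one file lacks one.
check_book_inputs() {
    missing=0
    for f in "$@"; do
        if ! has_h1 "$f"; then
            echo "no H1 heading: $f" >&2
            missing=1
        fi
    done
    return $missing
}
```

A possible use is to run the conversion only when the check passes, for example ''check_book_inputs file1.html file2.html && htmldoc --book -f output.pdf file1.html file2.html''.
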
==== Putting It All Together ====

  htmldoc --book -f 12book.pdf 1book.html 2book.html --titlefile bookcover.jpg

Take a look at the entire command line and dissect it. What is the name of the new file? Which files are being converted? Which file is the title file, and what kind of file is it?

Figured it out? The new file is 12book.pdf. The files converted were 1book.html and 2book.html. A title page was created from the JPEG image file bookcover.jpg.

====== Parameters ======

Source((http://www.htmldoc.org/htmldoc.html#HTMLREF))

===== Comments =====

HTMLDOC supports many special HTML comments to initiate page breaks, set the header and footer text, and control the current media options:

  * ''%%<!-- FOOTER LEFT "foo" -->%%'': Sets the left footer text; the text is applied to the current page if it is still empty, or to the next page otherwise.
  * ''%%<!-- FOOTER CENTER "foo" -->%%'': Sets the center footer text; the text is applied to the current page if it is still empty, or to the next page otherwise.
  * ''%%<!-- FOOTER RIGHT "foo" -->%%'': Sets the right footer text; the text is applied to the current page if it is still empty, or to the next page otherwise.
  * ''%%<!-- HALF PAGE -->%%'': Breaks to the next half page.
  * ''%%<!-- HEADER LEFT "foo" -->%%'': Sets the left header text; the text is applied to the current page if it is still empty, or to the next page otherwise.
  * ''%%<!-- HEADER CENTER "foo" -->%%'': Sets the center header text; the text is applied to the current page if it is still empty, or to the next page otherwise.
  * ''%%<!-- HEADER RIGHT "foo" -->%%'': Sets the right header text; the text is applied to the current page if it is still empty, or to the next page otherwise.
  * ''%%<!-- MEDIA BOTTOM nnn -->%%'': Sets the bottom margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
  * ''%%<!-- MEDIA COLOR "foo" -->%%'': Sets the media color attribute for the page. The "foo" string is any color name that is supported by the printer, e.g. "Blue", "White", etc. Breaks to a new page or sheet if the current page is already marked.
  * ''%%<!-- MEDIA DUPLEX NO -->%%'': Chooses single-sided printing for the page; breaks to a new page or sheet if the current page is already marked.
  * ''%%<!-- MEDIA DUPLEX YES -->%%'': Chooses double-sided printing for the page; breaks to a new sheet if the current page is already marked.
  * ''%%<!-- MEDIA LANDSCAPE NO -->%%'': Chooses portrait orientation for the page; breaks to a new page if the current page is already marked.
  * ''%%<!-- MEDIA LANDSCAPE YES -->%%'': Chooses landscape orientation for the page; breaks to a new page if the current page is already marked.
  * ''%%<!-- MEDIA LEFT nnn -->%%'': Sets the left margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
  * ''%%<!-- MEDIA POSITION nnn -->%%'': Sets the media position attribute (input tray) for the page. The "nnn" string is an integer that usually specifies the tray number. Breaks to a new page or sheet if the current page is already marked.
  * ''%%<!-- MEDIA RIGHT nnn -->%%'': Sets the right margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
  * ''%%<!-- MEDIA SIZE foo -->%%'': Sets the media size to the specified size. The "foo" string can be "Letter", "Legal", "Universal", or "A4" for standard sizes or "WIDTHxHEIGHTunits" for custom sizes, e.g. "8.5x11in". Breaks to a new page or sheet if the current page is already marked.
  * ''%%<!-- MEDIA TOP nnn -->%%'': Sets the top margin of the page. The "nnn" string can be any standard measurement value, e.g. 0.5in, 36, 12mm, etc. Breaks to a new page if the current page is already marked.
  * ''%%<!-- MEDIA TYPE "foo" -->%%'': Sets the media type attribute for the page. The "foo" string is any type name that is supported by the printer, e.g. "Plain", "Glossy", etc. Breaks to a new page or sheet if the current page is already marked.
  * ''%%<!-- NEED length -->%%'': Breaks if less than "length" units are left on the current page. The length value defaults to lines of text but can be suffixed by in, mm, or cm to use the corresponding units.
  * ''%%<!-- NEW PAGE -->%%'': Breaks to the next page.
  * ''%%<!-- NEW SHEET -->%%'': Breaks to the next sheet.
  * ''%%<!-- NUMBER-UP nn -->%%'': Sets the number of pages that are placed on each output page. Valid values are 1, 2, 4, 6, 9, and 16.
  * ''%%<!-- PAGE BREAK -->%%'': Breaks to the next page.
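
To see a few of these comments in context, the sketch below writes a small test page and converts it if ''htmldoc'' is installed. The file names ''comments-demo.html'' and ''comments-demo.pdf'' are made up for this example, and the comment spellings used (''MEDIA SIZE'', ''MEDIA LEFT'', ''HEADER CENTER'', ''NEW PAGE'') are taken from the HTMLDOC HTML reference cited as the source of this section; check them against your HTMLDOC version before relying on them.

```shell
#!/bin/sh
# Hypothetical file names, chosen only for this example.
demo=comments-demo.html

# Write a two-page test document. The quoted 'EOF' keeps the shell
# from expanding anything inside the here-document.
cat > "$demo" <<'EOF'
<html>
<head><title>Comment demo</title></head>
<body>
<!-- MEDIA SIZE A4 -->
<!-- MEDIA LEFT 12mm -->
<!-- HEADER CENTER "Comment demo" -->
<h1>First page</h1>
<p>Text on the first page.</p>
<!-- NEW PAGE -->
<h1>Second page</h1>
<p>Text on the second page.</p>
</body>
</html>
EOF

# Convert the page only if htmldoc is available on this system.
if command -v htmldoc >/dev/null 2>&1; then
    htmldoc --webpage -f comments-demo.pdf "$demo"
else
    echo "htmldoc not found; wrote $demo only" >&2
fi
```

If the comments are honoured, the resulting PDF should have two A4 pages, with the second heading starting on the second page.
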