Tuesday, August 31, 2010

Dynamic HTML to PDF conversion

There are a lot of libraries out there that let you create PDFs from HTML documents. Some of them even let you use external stylesheets, while others support inline styles only. They have one disadvantage in common: the PDF never looks like your original HTML file. Other limitations include, for example:
  • no support for floated elements,
  • no support for ordered lists,
  • no support for nested tables,
  • generally weak table support in all the libraries I tested,
  • only a subset of CSS styles is supported,
  • dynamic content placement is very difficult,
  • problems at page breaks,
  • etc.
The easiest way is to just print the HTML page to a PDF file. The generated files look almost exactly as you expect them to. The best solution I found to accomplish this is wkhtmltopdf. Ask your web hosting provider to install this tool if you don't have the possibility to install it yourself. I have heard of people who managed to install wkhtmltopdf in their home directory successfully.
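If the tool might live either system-wide or in a home directory, a small lookup helper can find it before you try to convert anything. The following is only a sketch in Python (not the PHP class mentioned in this post); the function name find_wkhtmltopdf and the idea of passing extra directories such as ~/bin are my assumptions, not something wkhtmltopdf prescribes.

```python
import os
import shutil
from pathlib import Path


def find_wkhtmltopdf(extra_dirs=()):
    """Return the path to a wkhtmltopdf executable, or None if none is found.

    extra_dirs is a hypothetical list of fallback directories, for example
    a bin directory inside your home directory on shared hosting.
    """
    # First look on the regular PATH (system-wide installation).
    found = shutil.which("wkhtmltopdf")
    if found:
        return found

    # Then check the fallback directories, expanding "~" to the home directory.
    for directory in extra_dirs:
        candidate = Path(directory).expanduser() / "wkhtmltopdf"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)

    return None
```

A call such as find_wkhtmltopdf(["~/bin"]) would then cover both the hosted and the home-directory case; which directories make sense depends on your provider.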

wkhtmltopdf uses the Qt libraries. These libraries include WebKit, the rendering engine also used by Apple's Safari browser, so you can expect all the CSS and HTML features that Safari currently supports to work. To get full functionality without needing a running X server, you have to compile the author's patched Qt libraries. Alternatively, use the static executables.

You can find a PHP class you can use with wkhtmltopdf. It's pretty self-explanatory, and the example application should help you create your first PDF file. I create PDF files by rendering an HTML file from my PHP application, saving it to a writable directory together with the other external files it needs (images, JS and CSS), and then running the wkhtmltopdf converter on it.
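The workflow described above can be sketched roughly as follows. This is a Python illustration rather than the PHP class itself, and the function names (write_html, build_command, render_to_pdf) and the default file name "document" are made up for the example. The only thing taken from wkhtmltopdf is its basic invocation: input file first, output file second.

```python
import subprocess
from pathlib import Path


def write_html(html, workdir, name="document"):
    """Save the rendered HTML into a writable directory and return its path.

    Images, JS and CSS referenced with relative URLs are expected to be
    placed in the same directory by the caller.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    html_path = workdir / f"{name}.html"
    html_path.write_text(html, encoding="utf-8")
    return html_path


def build_command(html_path, pdf_path, binary="wkhtmltopdf"):
    """Build the argument list: wkhtmltopdf <input.html> <output.pdf>."""
    return [binary, str(html_path), str(pdf_path)]


def render_to_pdf(html, workdir, name="document", binary="wkhtmltopdf"):
    """Write the HTML to disk, run the converter, and return the PDF path."""
    html_path = write_html(html, workdir, name)
    pdf_path = html_path.with_suffix(".pdf")
    # check=True raises CalledProcessError if the converter exits with an error.
    subprocess.run(build_command(html_path, pdf_path, binary), check=True)
    return pdf_path
```

Passing the arguments as a list instead of one shell string avoids quoting problems when paths contain spaces.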

It's a great tool and creates small PDF files. I do miss security features such as password protection, though.