
Generating PDF from HTML, fast!



  • Qt 5.7, Linux & Windows. I have in-memory HTML string. I need to convert to PDF, and save to file.

    Up until now I have been using QWebEngineView to achieve this. That involves having to wait to allow it to finish loading & rendering the HTML, and then QWebEnginePage::printToPdf() to export to PDF.

    This is fine where I am allowing the user to preview the HTML and generate the PDF interactively.

    However, the software also has to export hundreds of HTML documents to PDF unattended. Although in that situation I cut out actually displaying the QWebEngineView and just use it non-interactively to generate the PDF, it's way too slow. (The HTML load/render actually takes longer than the print to PDF.) Trust me!

    So I have two questions:

    1. What do you/should I use for HTML->PDF within Qt? I see there is QPdfWriter (http://doc.qt.io/qt-5/qpdfwriter.html) or there is QTextDocument::print() (http://doc.qt.io/qt-5/qtextdocument.html#print) and send to PDF-printer-to-file. Which one to use? Have I missed another one?

    2. My users are "an*lly retentive" about the exact format of their output. I already had problems when I moved them from QWebKit with its PDF-generation-engine over to QWebEngine with its different one. I'm a bit unsure about this: where are the PDF-generation-engines? I believe QWebEngine is using a Chromium one (or at least its own). If I use QPdfWriter and/or QTextDocument::print(QPagedPaintDevice *printer), will I be using the same PDF engine or a different one from QWebEngine?



  • @JonB said in Generating PDF from HTML, fast!:

    If I use QPdfWriter and/or QTextDocument::print(QPagedPaintDevice *printer) will I be using the same PDF engine or a different one from QWebEngine?

    Yes, they are different and QTextDocument supports only a subset of html.

    It really depends on the document. It's pretty easy to try: load the HTML into a QTextDocument with setHtml() and then print it to PDF. If the result is acceptable to the client, you might be done.



  • @VRonin
    Thank you for answering.

    Yes, they are different and QTextDocument supports only a subset of html.

    That's a major blow. I have no real idea what the HTML might contain. All I know so far is that whatever QWebEngineView makes of it is acceptable to the user; if that could differ when I go via QTextDocument, that may be a no-no :(

    What about the other part of the question? Where is the "HTML-to-PDF-converter-driver"? Out of QWebEnginePage::printToPdf(), QTextDocument::print(QPagedPaintDevice *printer) (where printer is PDF) and QPdfWriter, do they share the same code to produce the PDF or are they each quite separate with their own code for that? Then I would understand what choices I have. If you had a completely arbitrary piece of HTML, which one would you use to get to PDF? Thanks!

    [P.S. I'm removing my naughty post elsewhere which encouraged a bad example to get you here...!]



  • @JonB said in Generating PDF from HTML, fast!:

    QWebEnginePage::printToPdf(), QTextDocument::print(QPagedPaintDevice *printer) (where printer is PDF) and QPdfWriter

    The latter two share the same code. QWebEnginePage uses the Chromium engine to do the work.



  • @VRonin
    Thank you, that is great information, and about what I suspected.

    For my application that probably means I shall have to stick with QWebEnginePage::printToPdf() for "unattended" conversion, as the user can also go into an "interactive" session which does show the page there and convert to PDF, and I suspect users will demand 100% compatibility with that one's output. So now I have to investigate whether I can get that to be much faster when it does not actually need to display the HTML to the user but just convert it to PDF.... :(


  • Moderators

    Instead of creating and deleting QDialogs and QWebEnginePages over and over again to render the HTML offscreen, I wonder if it would be faster (and more memory-efficient!) to use a dedicated HTML-to-PDF converter:



  • @JKSH
    I am aware of this. The issue is: that dialog is also used "interactively" to allow the user to "preview" the letter, optionally edit it, and produce the PDF. Then it can be used to "batch" process hundreds of letters, non-interactively. It is vital to the users that the batch-processed outputs be identical to the interactive ones, down to the pixel. So that's why I have to use the same engine/mechanism for non-interactive as interactive, which precludes using something else.


  • Moderators

    @JonB said in Generating PDF from HTML, fast!:

    down to the pixel.

    Stringent requirements indeed!

    All the best

