
[SOLVED] QTextDocument::toHtml("utf-8") and Arabic on Windows



  • While we are on the topic of RTL languages (http://developer.qt.nokia.com/forums/viewthread/2169), I have a question about exporting such text as HTML and viewing it on Windows (XP and/or 7).

    Exporting to HTML and viewing in a browser "just works" on Mac and Linux, but on Windows I only get question marks. I have tried fonts such as Arial and Tahoma, which the system indicates should be able to display Arabic, for example.

    Can someone knowledgeable tell me what I am missing?

    Thanks,
    Peter



  • Do you set a proper code page for the HTML?
    Does it show correctly if you set the code page manually in the browser?
    Have you tried other browsers, or just IE?



  • I specify utf-8 as the title indicates. Do you mean something else?

    IE, Firefox, Chrome: it makes no difference. Viewing a file generated on Windows in a browser under another OS does not work either, so I am pretty sure it has to do with how the file is generated.



  • Did you try toHtml("ASCII")? That should generate numeric entities in the file for all characters above 127. That's known to work.

    Also, can you display the files generated on the other OSes on the Windows box?

    And finally, do the two files show any differences?
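
To make the toHtml("ASCII") suggestion concrete, here is a minimal sketch of what "numeric entities for all characters above 127" means. This is not Qt's implementation: `toAsciiWithEntities` is a hypothetical helper, and the choice of decimal references is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical helper, not Qt's implementation: replaces every code point
// above 127 with a decimal numeric character reference so that the result
// contains only 7-bit ASCII bytes. It does not escape '&', '<' or '>'.
std::string toAsciiWithEntities(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp < 128) {
            // Plain ASCII passes through unchanged.
            out += static_cast<char>(cp);
        } else {
            // Anything else becomes &#NNNN; with NNNN the decimal code point.
            out += "&#" + std::to_string(static_cast<unsigned long>(cp)) + ";";
        }
    }
    return out;
}
```

Because the output is pure ASCII, it survives whatever codec is later used to write the file, which is why this route sidesteps the problem.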



  • "I specify utf-8 as the title indicates. Do you mean something else?"

    I mean: what's inside the generated document? Is an encoding specified?



  • [links no longer available, use zip below]

    The utf-8 file created on the Mac opens fine on Windows, but of course not vice versa. I cannot see any difference between the files except what is supposed to be the actual text.

    The files are generated from exactly the same code, only compiled on the different platforms.



  • Could you please put all the files in one ZIP? I cannot download that many files from Dropbox.



  • http://dl.dropbox.com/u/237357/qt_html.zip

    They are really small files, but here you go. Thanks for looking at this.



  • Both Windows files obviously contain no useful content, just a run of 3F ('?') bytes.

    Another bug found?



  • The encoding you pass to QTextDocument::toHtml() is only written into the header of the generated HTML. It does not influence the encoding of the returned string (that would be useless anyway, as QString is Unicode).

    How do you write the string to a file? I'd suggest using QTextStream (http://doc.trolltech.com/latest/qtextstream.html) with the right encoding, set via setCodec(). By default QTextStream uses a codec suitable for your locale; that might be Windows-1252, which would explain the garbled output on Windows.



  • Aaah, I am relieved it was something simple like that. I was already using QTextStream, and adding setCodec("utf-8") gave the desired result. Thank you.

