Important: Please read the Qt Code of Conduct - https://forum.qt.io/topic/113070/qt-code-of-conduct

Strings and encodings in Qt



  • From your favourite author, a brand new -- work in progress -- wiki article:

    http://developer.qt.nokia.com/wiki/Strings_and_encodings_in_Qt

    Any feedback is welcome.



  • Very nice!
    I don't see there setting encoding for translation
    @QTextCodec::setCodecForTr(QTextCodec::codecForName ("Windows-1251"));@
    or that is another topic?



  • I18n is actually a broader topic -- I briefly mentioned QTextCodec::codecForTr(), CODECFORTR, etc.; do you think it's worth elaborating?



  • Yeah you right
    maybe it's not worth



  • First of all, thanks for putting the effort into writing this wiki page.

    What I miss in the story is the alternative approach to avoid non-ascii characters in the source text completely. Yes, I know there are differences of opinion if that is the best way to do it or not, but it is a valid approach. I just use ascii in my sources, and use Qt Linguist to support any unicode I like. It avoids creating a dependency in the code (the ::fromUtf8 call) to the encoding of of the code (the fact that I must make sure that I, and everybody else in the company) always edit the source file using an editor that uses the UTF8 encoding.



  • Andre is right. That approach could get a more elaborate paragraph in the beginnen.

    I'm a strong advocate for using UTF-8 in your source code. And while this still holds true, I recently made up my mind and would go with the plain text + linguist solution in the future (or UTF-8 + linguist, it depends). That's not caused by UTF-8 hassles, but by the pure fact that the tr() methods just provide more functionality than bridging the encoding gap. Most notably is the support for singular/plural forms that you get for free (without having to hardcode this in your source).

    One caveat using the tr() singular/plural magic: You must create a translation for the language used in the sources too!



  • [quote author="Andre" date="1323244054"]
    What I miss in the story is the alternative approach to avoid non-ascii characters in the source text completely. Yes, I know there are differences of opinion if that is the best way to do it or not, but it is a valid approach. I just use ascii in my sources, and use Qt Linguist to support any unicode I like. [/quote]

    Sounds good. Could you elaborate a bit about your approach? Do you have some (more or less) formal guidelines you follow, that I can add to the wiki page? For instance, what do you put in your code exactly if you have to write "Saint Honoré", both in the cases of a user-visible string or not?



  • That would depend on the language you use as your base, I guess. I think that the approach would only work if the language you want to use for your strings doesn't have or need too many non-ascii characters to produce something readable. I know for instance German has rules about replacing "Güt" with "Guet" and the Ringel-S (sorry, no idea how to type that one on my keyboard here) with a double-s. How that would work on "Saint Honoré", I don't know, I don't often run into strings like that in my applications. I guess I would just leave off the accent from the é. I know I (have to) do that for my name all the time anyway...

    What I do, is that I keep all strings in my sources in English. English doesn't have non-ascii characters, so normally you would not run into these. Then, I make translation files

    • to English (yes, translate English to English to get plurals working)
    • other required language(s) if needed.

    When I need non-ascii characters in my source, for instance when I want to use symbols from the unicode table, I usually use the unicode code directly. That is, instead of writing QChar('∞'), I would use QChar(0x221E); //infinity symbol. That happens only sporadically though, little enough to make that workable.


Log in to reply