QTextCodec canEncode, what is the expected behavior?



  • I am testing a few string conversions for use with an external non-Unicode program on Windows using Qt 5.4.1. I have set the Windows locale for non-Unicode programs to Japanese and am testing two different strings with the code below:

        // The locale codec does not change between iterations, so look it up once.
        QTextCodec* codec = QTextCodec::codecForLocale();
        foreach(const QString& arg, arguments)
        {
            // Convert to the local 8-bit encoding and back; a mismatch means the conversion was lossy.
            QByteArray localizedArg = arg.toLocal8Bit();
            if(QString::fromLocal8Bit(localizedArg) != arg)
            {
                qDebug() << arg << "codec->canEncode" << codec->canEncode(arg) << "QString::fromLocal8Bit(localizedArg) != arg";
            }
            else
            {
                qDebug() << arg << "codec->canEncode" << codec->canEncode(arg);
            }
        }
    

    The variable "arguments" contains the following two strings:

    1. d:/でアヒィン/1.10/Amber/新しいバンク 2.file
    2. d:/1.10/Amber/你你.file

    The first string contains a mixture of English and Japanese characters and the second string contains a mixture of English and Chinese characters.

    Running through the loop above produces the following result in the output window:

    "d:/でアヒィン/1.10/Amber/新しいバンク 2.file" codec->canEncode true
    "d:/1.10/Amber/你你.file" codec->canEncode true QString::fromLocal8Bit(localizedArg) != arg

    If characters are lost or incorrectly converted, so that reversing the operation produces a different string, I would expect QTextCodec::canEncode to return false (i.e. whenever it is going to convert the foreign characters to "?").

    Stepping through the code, I got into QTextCodec::canEncode, whose return value is based on state.invalidChars == 0. Just before that check it invokes QWindowsLocalCodec::convertFromUnicode, which doesn't seem to do anything with the ConverterState passed in. Is this correct, or is it a bug?
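
    To make the suspicion concrete, here is a toy model in plain C++ (no Qt) of what I think is happening. It is only a sketch: the ConverterState struct, the Latin-1 style encoder and all function names are my own stand-ins, not Qt's actual implementation. It shows that a check based on invalidChars == 0 always reports success if the converter never updates the state, while a round-trip comparison still catches the loss.

```cpp
#include <string>

// Hypothetical stand-in for a converter state; only the counter matters here.
struct ConverterState {
    int invalidChars = 0;
};

// Toy encoder: code points up to 0xFF map to one byte (Latin-1 style),
// anything else is replaced by '?'. If `state` is non-null, every
// substitution is counted in state->invalidChars.
std::string encodeLatin1(const std::u32string& text, ConverterState* state) {
    std::string out;
    for (char32_t cp : text) {
        if (cp <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        } else {
            out.push_back('?');
            if (state) ++state->invalidChars;
        }
    }
    return out;
}

// Same conversion, but the state argument is never touched. This models the
// behaviour I believe I am seeing in the Windows locale codec.
std::string encodeLatin1IgnoringState(const std::u32string& text, ConverterState*) {
    return encodeLatin1(text, nullptr);
}

// Decode the toy encoding back to code points (each byte is one code point).
std::u32string decodeLatin1(const std::string& bytes) {
    std::u32string out;
    for (char c : bytes) {
        out.push_back(static_cast<char32_t>(static_cast<unsigned char>(c)));
    }
    return out;
}

using Encoder = std::string (*)(const std::u32string&, ConverterState*);

// canEncode in the style described above: trust the converter's error counter.
bool canEncodeViaState(Encoder encode, const std::u32string& text) {
    ConverterState state;
    encode(text, &state);
    return state.invalidChars == 0;
}

// Alternative check that does not rely on the counter: encode, decode, compare.
bool canEncodeViaRoundTrip(const std::u32string& text) {
    return decodeLatin1(encodeLatin1(text, nullptr)) == text;
}
```

    With the counting encoder, canEncodeViaState returns false for a string containing 你; with the encoder that ignores the state it returns true for the same string, which matches my output. The round-trip check corresponds to the QString::fromLocal8Bit(arg.toLocal8Bit()) == arg comparison in my loop, and that is what I am relying on for now.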


  • Lifetime Qt Champion

    @Thuan_Firelight please search the forum; I remember a similar question from some time ago. I just don't remember the outcome.



  • @aha_1980 I did, and the closest one I found was this: https://forum.qt.io/topic/93921/unexpected-result-from-qtextcodec-canencode-qstring/7.

    However, the "solved" response there seems to be that "US-ASCII" is not recommended, and I am not using "US-ASCII". The last response was from the OP, who reckons the problem is not restricted to "US-ASCII". I wanted to bump that thread, but the forum recommended that I start a new one because that thread is quite old.

    There is also this bug report: https://bugreports.qt.io/browse/QTBUG-6925. Its last comment states that it was closed, so I am checking what the expected behavior is before I comment further on the bug report to bump it up.


  • Lifetime Qt Champion

    @Thuan_Firelight

    I would recommend that you comment and vote on QTBUG-6925, which has been re-opened, by the way.

    If you can break your problem down to a minimal, compilable and testable example, please attach it there as well; it helps with debugging and fixing the problem.

    You should of course also try this with the latest release (which is 5.12-RC by now); maybe there is already some improvement for your case.

