Weird characters in file path

  • I saw this in one of the examples, called tabdialog.

    The constructor of the window class has
    QFileInfo fileInfo(filePath);

    where filePath is "."

    Somehow fileInfo ends up holding the file path to the executable, which it returns as a QString.
    It seems to be encoded as 32-bit characters. I think it is a wchar_t array, but I’m not sure since the page about it doesn’t say much.

    How can I convert that to 8-bit ASCII, which is what I use elsewhere in my program?

    I tried
    QString stringThing=fileInfo.absoluteFilePath();
    strcpy(pstring, (const schar *)pqString.toLocal8Bit());
    strcpy(pstring, (const schar *)pqString.toUtf8());

    It doesn't seem to work.
    What's the difference between these 2 functions?

  • Lifetime Qt Champion


    As explained in the documentation, the first one uses QTextCodec::codecForLocale to convert the text.
    The other converts your text to UTF-8.

    In any case, they both return a QByteArray, so your cast is wrong.

  • I tried
    char pstring[1000];
    QString stringThing=fileInfo.absoluteFilePath();
    QByteArray ztest;
    QTextCodec *codec=QTextCodec::codecForName("UTF-8");
    strcpy(pstring, ztest.constData());

    Instead of getting
    /home/jake.......and so on

    I am getting

    in my char array.

    Can I get a clear example of how to use these classes?

  • Lifetime Qt Champion

    Why did you stop using toUtf8()?
    By the way, what OS are you on?

  • Both functions return the same data.
    I inspected the first 9 elements.
    I am on Kubuntu 18.04.4 LTS.
    I just need to understand what I am getting. I am willing to write my own function to convert the Unicode to ASCII 8 bit.

  • Lifetime Qt Champion

    @stretchthebits said in Weird characters in file path:

    to ASCII 8 bit.

    You don't want ASCII; you probably want UTF-8, which is what toUtf8() returns. toLocal8Bit() returns the text in the current locale's encoding, which is also UTF-8 on most Linux systems.
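
  • To make the difference between a legacy 8-bit locale encoding and UTF-8 concrete, here is a minimal sketch in plain standard C++ (no Qt). The helper names toUtf8Bytes and toLatin1Bytes are hypothetical, Latin-1 is only an assumed example of a non-UTF-8 locale encoding, and the encoder does not validate its input. It is an illustration, not how Qt implements the conversion.

```cpp
#include <string>

// Append the UTF-8 encoding of one code point to `out`.
// Sketch only: surrogates and values above U+10FFFF are not rejected.
void appendUtf8(std::string &out, char32_t cp)
{
    if (cp < 0x80) {                 // 1 byte, same value as ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: encode a sequence of code points as UTF-8.
std::string toUtf8Bytes(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text)
        appendUtf8(out, cp);
    return out;
}

// Hypothetical helper: encode as Latin-1 (ISO 8859-1), one byte per
// character; anything above U+00FF is replaced with '?'.
std::string toLatin1Bytes(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text)
        out += (cp <= 0xFF) ? static_cast<char>(cp) : '?';
    return out;
}
```

    For a made-up path such as U"/home/user/example", both helpers return the same bytes. For U"caf\u00E9", the Latin-1 result is 4 bytes long and the UTF-8 result is 5, because only characters above 127 are encoded differently.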

  • If toUtf8() does return UTF-8, why does a path such as /home/joe/Documents look like corrupted data when I copy it to a char array?

  • @stretchthebits said in Weird characters in file path:

    strcpy(pstring, (const schar *)pqString.toUtf8());

    Because this is bad code. toUtf8() returns a QByteArray (as @SGaist said), not a char*-compatible string. You should avoid old-style casts unless absolutely necessary.
    You should be doing:

    QString pqString = fileInfo.absoluteFilePath();
    QByteArray tempbytes(pqString.toUtf8());
    strcpy(pstring, tempbytes.data());


  • @fcarney said in Weird characters in file path:

    QString pqString = fileInfo.absoluteFilePath();
    QByteArray tempbytes(pqString.toUtf8());
    strcpy(pstring, tempbytes.data());

    That worked, but I would like to understand what is happening under the hood. I thought QByteArray was an array of char. That is what my test showed. What changes does the QByteArray class make to the array returned by toUtf8()?

  • Look at the header source for QByteArray. The class holds much more than just char data. By casting the entire object to a char array pointer you were pointing to other data besides the actual char data. That is why you needed to create a local QByteArray (not a temporary) and then access the char data via the QByteArray's .data() method.
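
  • A pointer obtained from a container's data() stays valid only while that container is alive, which is one reason to hold the result in a named local before copying. The sketch below shows that pattern in plain standard C++, with std::string as an assumed stand-in for QByteArray. pathAsUtf8, copyToBuffer and the path are hypothetical, and the bounded copy is a swapped-in alternative to the strcpy used above so that a long path cannot overflow the destination.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for QString::toUtf8(): returns an owning byte
// container by value.
std::string pathAsUtf8()
{
    return "/home/user/example";   // made-up path for illustration
}

// Copy `bytes` into `dest` without writing more than `destSize` bytes,
// always NUL-terminating. Returns true only if the whole string fit.
bool copyToBuffer(const std::string &bytes, char *dest, std::size_t destSize)
{
    if (dest == nullptr || destSize == 0)
        return false;
    const std::size_t n = std::min(bytes.size(), destSize - 1);
    std::memcpy(dest, bytes.data(), n);   // data() points at the char buffer
    dest[n] = '\0';
    return n == bytes.size();
}
```

    A caller would keep the container in a named local, for example const std::string utf8 = pathAsUtf8();, and then call copyToBuffer(utf8, pstring, sizeof pstring). With Qt the same shape would presumably apply to a QByteArray and its constData() and size() members.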

  • Wow, if I use F11 to jump into the code, I get the sequence
    Q_REQUIRED_RESULT QByteArray toUtf8() const &
    { return toUtf8_helper(*this); }

    QByteArray QString::toUtf8_helper(const QString &str)
    {
        return qt_convert_to_utf8(str);
    }

    inline int size() const { return d->size; }

    template <typename String, if_compatible_qstring_like<String> = true>
    QStringView(const String &str) noexcept
    : QStringView(str.isNull() ? nullptr : str.data(), qsizetype(str.size())) {}

    static QByteArray qt_convert_to_utf8(QStringView str)
    {
        if (str.isNull())
            return QByteArray();

        return QUtf8::convertFromUnicode(str.data(), str.length());
    }


    and so on.
    I guess I am a little behind the times. I'm not sure what all the reinterpret_cast and const_cast were about.
    This is some fancy coding :)

    I thought pqString.toUtf8() was supposed to return UTF-8, which should contain my path to the file,
    /home/john/bla bla, and that I could just pass that to fopen(). In UTF-8, aren't the values 0 to 127 encoded just like ASCII?
    In other words, an array that contains /home/john/bla bla
    would be exactly identical whether encoded in UTF-8 or ASCII.
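
  • For that range, yes: UTF-8 encodes code points 0 to 127 as the same single bytes as ASCII, so a path made only of such characters is byte-for-byte identical in both. A minimal sketch of a check for this, in plain standard C++; isPureAscii is a hypothetical helper, not a Qt function.

```cpp
#include <string>

// True when every byte is in the range 0..127, in which case the UTF-8
// bytes and the ASCII bytes of the text are the same sequence.
bool isPureAscii(const std::string &bytes)
{
    for (unsigned char c : bytes) {
        if (c > 0x7F)
            return false;
    }
    return true;
}
```

    isPureAscii("/home/john/bla bla") is true, while a string containing the two UTF-8 bytes of é (0xC3 0xA9) makes it return false.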
