QString to QByteArray if QString contains unicode chars
-
I have a QString which contains some Unicode chars, and I want to convert the QString to a QByteArray (which is later converted back to a QString).
If I use toLatin1, toUtf8 or toLocal8Bit, the Unicode characters are lost upon conversion.
How do I convert this Unicode-character-filled QString to a QByteArray?
-
When you know which encoding the unicode chars have, you can convert them to a proper QString with QTextDecoder. If your QByteArray is UTF-8 encoded, then use QString::fromUtf8() / toUtf8().
-
@Christian-Ehrlicher I don't think I know enough to understand the details of your response. For example, I might say:
QString s = "Hello there";
s.append(QChar(0x4FF0));
QByteArray a = s.toLatin1();
But I want the Unicode bytes for my Unicode char "俰" to appear in the byte array. I haven't picked any encoding (at least not knowingly).
-
@ocgltd said in QString to QByteArray if QString contains unicode chars:
But I want the unicode bytes for my unicode char "俰" to appear in the byte array.
Then you have to use an encoding which supports this Unicode char. Latin-1 is not one of them, but UTF-8 is.
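To illustrate why Latin-1 cannot carry this character, here is a minimal sketch in plain standard C++ (not Qt code). The helper name `toLatin1Sketch` is hypothetical, and replacing unrepresentable characters with '?' is an assumption for illustration; Qt's documentation only says such characters may be suppressed or replaced with a question mark.

```cpp
#include <string>

// Hypothetical helper (not a Qt API): convert code points to Latin-1 bytes.
// Latin-1 uses exactly one byte per character and only covers
// U+0000..U+00FF, so a code point such as U+4FF0 has no representation.
// Assumption: unrepresentable characters are replaced with '?'.
std::string toLatin1Sketch(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text) {
        out.push_back(cp <= 0xFF ? static_cast<char>(cp) : '?');
    }
    return out;
}
```

Under these assumptions, `toLatin1Sketch(U"Hi\u4FF0")` yields the three bytes `Hi?`, so the original character cannot be recovered when converting back.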
-
@Christian-Ehrlicher I'm not sure how to pick an encoding for my QString (I could not find it in the docs), if that's what you mean. But once that is done, it looks like all of the QString methods that convert to bytes create a single 8-bit char per character (toUtf8, toLatin1, toLocal8Bit).
How would I get the multibyte QChar to become multiple bytes in the QByteArray? From the docs it sounds like toUtf8 might create multiple bytes in the QByteArray, but upon testing it did not.
-
@ocgltd said in QString to QByteArray if QString contains unicode chars:
From the docs it sounds like toUtf8 might create multiple bytes in the QByteArray, but upon testing it did not.
Why not?
QString s = "Hello there";
s.append(QChar(0x4FF0));
QByteArray a = s.toUtf8();
s = QString::fromUtf8(a); // back to QString
qDebug() << s; // prints "Hello there俰"
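For the "back to QString" step, the following is a rough sketch in plain standard C++ of what decoding UTF-8 bytes involves. It is not Qt's implementation; `decodeUtf8` is a hypothetical helper and it assumes the input is well-formed UTF-8 (no validation is done).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Qt API): decode UTF-8 bytes into code points.
// Assumption: the input is well-formed UTF-8; malformed input is not detected.
std::u32string decodeUtf8(const std::string &bytes)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        // The lead byte tells how many continuation bytes follow.
        int extra = 0;
        char32_t cp = lead;
        if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
        else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
        ++i;
        // Each continuation byte contributes its low 6 bits.
        for (int k = 0; k < extra && i < bytes.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

With this sketch, the bytes 0xE4 0xBF 0xB0 decode back to the single code point U+4FF0, which is why the round trip through UTF-8 preserves the character.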
-
@ocgltd said in QString to QByteArray if QString contains unicode chars:
How would I get the multibyte QChar to become multiple bytes in the QByteArray?
According to https://unicodeplus.com/U+4FF0, it is encoded as 0xE4 0xBF 0xB0 when converted to UTF-8.
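The three bytes can be derived by hand from the UTF-8 bit layout. Below is a minimal sketch in plain standard C++ (not Qt's implementation); `appendUtf8` and `encodeUtf8` are hypothetical helper names used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Qt API): append the UTF-8 encoding of one code point.
void appendUtf8(std::string &out, char32_t cp)
{
    // Precondition: a valid Unicode scalar value (no surrogates, <= U+10FFFF).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Hypothetical helper: encode a whole string of code points as UTF-8.
std::string encodeUtf8(const std::u32string &text)
{
    std::string out;
    for (char32_t cp : text) {
        appendUtf8(out, cp);
    }
    return out;
}
```

U+4FF0 falls in the three-byte range, which gives 0xE4 0xBF 0xB0. Encoding "Hello there" followed by U+4FF0 therefore produces 14 bytes: 11 single-byte ASCII characters plus 3 bytes for the last character.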
-
@Christian-Ehrlicher said in QString to QByteArray if QString contains unicode chars:
According to https://unicodeplus.com/U+4FF0, it is encoded as 0xE4 0xBF 0xB0 when converted to UTF-8.
Which is right:
"Hello there\xE4\xBF\xB0"
-
@Christian-Ehrlicher That is what I wanted, so the reason it doesn't work must be something in my code. I'll recheck. Thanks.