QString.append() with a utf8 character not working
-
Hey, I have an issue where I have a std::string A containing "μ" and a QString B. I want to append A to B, but I get the value 65533 ('�') where μ should be. The backstory is that I read A from a binary file using ifstream (not Qt's text stream). According to the watch table, μ is stored properly in A, but after appending, B contains '�'.
// WORKING EXAMPLE
std::string A = u8"μ";
QString B;
B.append(QString("%1").arg(QString::fromStdString(A)));

// WHAT I NEED HELP WITH
short Asize;
std::string A;
QString B;
rf.open(path, std::ios::in | std::ios::binary);
rf.read((char*)&Asize, sizeof(short));
A.resize(Asize);
rf.read((char*)&A[0], Asize);
B.append(QString("%1").arg(QString::fromStdString(A)));
I use Qt 5.12, which is important regarding the fromStdString function. In my Qt version it returns a toUtf8() converted result, so it seems to be the correct Unicode at least? Unless reading from the stream somehow does not return the proper encoding, despite what the watch table says? To add further confusion, in the working example A does not show μ in the watch table, but B will show μ in the watch table after appending. So something is weird with the encoding...
Thanks in advance!
-
@Daddedebad said in QString.append() with a utf8 character not working:
QString::fromStdString
Try https://doc.qt.io/qt-6/qstring.html#fromUtf8-3 instead
-
@jsulm I tried with both of the overloads available for my Qt version:
QString::fromUtf8(QByteArray::fromStdString(A))
// and
QString::fromUtf8((char*)&A, A.size())
Neither works any differently.
-
@Daddedebad said in QString.append() with a utf8 character not working:
(char*)&A
A.c_str() should do
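To illustrate the point above: `(char*)&A` is the address of the `std::string` object itself, while `A.c_str()` (or `A.data()`) is the address of its character buffer, and the two are generally not the same. A minimal sketch in standard C++; the helper name `bufferIsElsewhere` is made up for this illustration.

```cpp
#include <string>

// Hypothetical helper for illustration: reports whether the address of the
// std::string object differs from the address of its character buffer.
bool bufferIsElsewhere(const std::string& s) {
    const void* object = static_cast<const void*>(&s);        // what (char*)&A points at
    const void* buffer = static_cast<const void*>(s.data());  // what A.c_str() points at
    return object != buffer;
}
```

For a string long enough to be stored on the heap, such as `std::string(1000, 'x')`, this returns true, so passing `(char*)&A` to `QString::fromUtf8()` would decode the string's internal bookkeeping instead of its text. `A.data()` together with `A.size()` is the pair to pass.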
-
QString::fromStdString() is the same as QString::fromUtf8() so it doesn't matter.
Please output your byte contents of A:
for (size_t i = 0; i < A.size(); ++i) {
    std::cout << "A[" << i << "] = 0x" << std::hex << int(A.at(i)) << std::endl;
}
To check if it's properly utf-8 encoded which I doubt it is.
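Besides inspecting the bytes by eye, the same check can be done in code. The following is a simplified sketch in standard C++ that only verifies the lead-byte/continuation-byte structure of UTF-8; it does not reject overlong encodings or surrogate code points, and the name `looksLikeUtf8` is an assumption for this example.

```cpp
#include <cstddef>
#include <string>

// Structural UTF-8 check: every lead byte must be followed by the right
// number of continuation bytes (10xxxxxx). Simplified sketch: it does not
// reject overlong encodings or surrogate code points.
bool looksLikeUtf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (b < 0x80)                extra = 0;  // ASCII
        else if ((b & 0xE0) == 0xC0) extra = 1;  // 110xxxxx
        else if ((b & 0xF0) == 0xE0) extra = 2;  // 1110xxxx
        else if ((b & 0xF8) == 0xF0) extra = 3;  // 11110xxx
        else return false;                       // stray continuation or invalid lead byte
        if (s.size() - i - 1 < extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;    // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

A lone byte 0xB5 fails this check because it has the bit pattern of a continuation byte with no lead byte before it, whereas the pair 0xC2 0xB5 passes.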
-
Result is A[0] = 0xffffffb5
Not sure what it should be, I'm pretty new to working with Unicode like this. I noticed that QString stores its characters as unsigned short, while A[0] in the watch table is -75. Also that the UTF-8 encoding should be 0xB5. So I guess it's not encoded properly then?
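A note on the `0xffffffb5` output: it is most likely a sign-extension artifact rather than a four-byte value. Assuming plain `char` is signed on this platform (consistent with the watch table showing -75), converting the byte 0xB5 to `int` gives -75, which prints as `ffffffb5` in hex. Casting through `unsigned char` first shows the real byte. A small sketch in standard C++; the helper name `hexByte` is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Formats one byte as two hex digits. Going through unsigned char first
// avoids sign extension on platforms where plain char is signed.
std::string hexByte(char c) {
    char buf[3];
    std::snprintf(buf, sizeof buf, "%02x",
                  static_cast<unsigned>(static_cast<unsigned char>(c)));
    return std::string(buf);
}
```

With this, the byte that the loop above reports as `0xffffffb5` is formatted as `b5`, a single byte.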
-
Hi, 0xffffffb5 is a "μ" in utf-8 so it should work if you just do:
... B += QChar(A[0]); ...
-
@hskoglund said in QString.append() with a utf8 character not working:
Hi, 0xffffffb5 is a "μ" in utf-8 so it should work if you just do:
This is wrong - your page correctly says that "μ" is encoded as 0xc2, 0xb5 in UTF-8 and 0x00b5 in UTF-16.
So no, this is not proper UTF-8 but Latin-1 --> QString::fromLatin1()
-
This was the correct answer, it works now. Thank you!
To post the entire answer:
short Asize;
std::string A;
QString B;
rf.open(path, std::ios::in | std::ios::binary);
rf.read((char*)&Asize, sizeof(short));
A.resize(Asize);
rf.read((char*)&A[0], Asize);
B.append(QString("%1").arg(QString::fromLatin1(A.c_str())));
-
@Daddedebad said in QString.append() with a utf8 character not working:
B.append(QString("%1").arg(QString::fromLatin1(A.c_str())));
B += QLatin1Char(A[0]);
or, when A is more than one byte
B += QString::fromLatin1(A.c_str());
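For readers wondering why `fromLatin1()` is the right call here: in Latin-1 every byte is its own code point (U+0000 to U+00FF), so the single byte 0xB5 maps directly to U+00B5, while UTF-8 needs two bytes for the same code point. The sketch below shows that relationship in standard C++ without Qt; `latin1ToUtf8` is a hypothetical helper, not a Qt function.

```cpp
#include <string>

// Re-encodes Latin-1 bytes as UTF-8. Each Latin-1 byte is the code point
// U+0000..U+00FF, so values >= 0x80 become a two-byte sequence.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string utf8;
    utf8.reserve(latin1.size() * 2);
    for (char ch : latin1) {
        const unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            utf8 += static_cast<char>(b);                  // ASCII stays one byte
        } else {
            utf8 += static_cast<char>(0xC0 | (b >> 6));    // lead byte 110000xx
            utf8 += static_cast<char>(0x80 | (b & 0x3F));  // continuation 10xxxxxx
        }
    }
    return utf8;
}
```

Applied to the byte 0xB5 from the file, this yields 0xC2 0xB5, the UTF-8 form mentioned earlier in the thread. An alternative, if the file format is under your control, would be to write the strings as UTF-8 in the first place and keep using `QString::fromStdString()`.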