How to solve the problem of garbled characters in characteristic names for lowenergyscanner on the Windows platform?
-
The value you see in Bluetooth LE explorer,
    45 6E 74 65 72 20 53 6C 65 65 70 20 4D 6F 64 65
is indeed the UTF-8 encoding of "Enter Sleep Mode".
The value you see in the descriptor value of your program is the result of mishandling the bytes you see in Bluetooth LE explorer:
    e6 b9 85 e6 95 b4 e2 81 b2 e6 b1 93 e6 95 a5 e2 81 b0 e6 bd 8d e6 95 a4
is the UTF-8 encoding of this set of Unicode code points:
    U+6E45 U+6574 U+2072 U+6C53 U+6565 U+2070 U+6F4D U+6564
    湅 整 汓 敥 ⁰ 潍 敤
(U+2072 is unassigned, so it has no glyph in the list.)
Inspection of each 16-bit code point shows that it corresponds to two bytes of the input treated as a short. It looks as though something has treated the input as a series of QChar/shorts to build a QString, which it then encoded as UTF-8. So we take the two bytes 45 6E, treat them as a QChar/short 0x6E45 (notice the endianness change), which is the character 湅, and that then UTF-8 encodes as e6 b9 85.
Where exactly that mishandling is occurring I am unable to say.
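The byte pairing described above can be reproduced without Qt. The following is a minimal sketch in plain C++, not the actual Qt code path: `encodeUtf8` and `garble` are hypothetical helper names, and the input is assumed to be ASCII so that every 16-bit unit is an ordinary BMP code point.

```cpp
#include <cstddef>
#include <string>

// Encode one BMP code point (not a surrogate) as UTF-8.
std::string encodeUtf8(char16_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical model of the mishandling: read the raw bytes as
// little-endian 16-bit units, then encode each unit as UTF-8.
// A trailing unpaired byte is ignored.
std::string garble(const std::string &raw) {
    std::string out;
    for (std::size_t i = 0; i + 1 < raw.size(); i += 2) {
        const auto lo = static_cast<unsigned char>(raw[i]);
        const auto hi = static_cast<unsigned char>(raw[i + 1]);
        out += encodeUtf8(static_cast<char16_t>(lo | (hi << 8)));
    }
    return out;
}
```

Calling `garble("En")` yields the three bytes e6 b9 85, and `garble("Enter Sleep Mode")` yields the 24-byte sequence quoted above.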
-
@ChrisW67 Based on your response, I used ChatGPT to generate a correction function, which works, but in some cases, the conversion is incomplete. For example, for "e6b985e695b4e281b2e6b193e695a5e281b0e6bd8de695a4", it correctly converts to "Enter Sleep Mode"; however, for "e68584e685b4e58ca0e7a9a9", it converts to "Data Siz", while the correct result should be "Data Size". Similarly, for "e68584e685b4e590a0e685b2e78daee6a5ade78db3e6bda9e281aee79193e791a1", it converts to "Data Transmission Stat", but the correct result should be "Data Transmission State".
    QString getUserDescriptionOnWindows(const QByteArray &value)
    {
        // reference: https://forum.qt.io/post/816621
        // Power by ChatGPT4
        QString wrongText = QString::fromUtf8(value);
        QByteArray restoredBytes;
        for (QChar c : wrongText) {
            ushort unicode = c.unicode();
            restoredBytes.append(static_cast<char>(unicode & 0xFF));
            restoredBytes.append(static_cast<char>((unicode >> 8) & 0xFF));
        }
        QString correctText = QString::fromUtf8(restoredBytes);
        return correctText;
    }
-
Hi, the conversion only supports texts with an even number of characters, "Enter Sleep Mode" is 16 chars but "Data Size" is 9 and "Data Transmission State" is 23.
So for both "Data Size" and "Data Transmission State" there is a missing "65" at the end, for example to fix "Data Siz" to "Data Size" change "e68584e685b4e58ca0e7a9a9" to "e68584e685b4e58ca0e7a9a965".
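To make the even/odd point concrete, here is a Qt-free sketch of the same reverse conversion. The name `restore` is hypothetical, and the code assumes well-formed UTF-8 limited to 1- to 3-byte sequences, which is all the garbled descriptor values in this thread contain. Each decoded 16-bit unit gives back exactly two bytes, so a character that never made it into the garbled data cannot be recovered.

```cpp
#include <cstddef>
#include <string>

// Undo the mishandling: decode the UTF-8 text into 16-bit code units and
// write each unit back as two bytes, low byte first.
// Assumption: well-formed UTF-8 with 1- to 3-byte sequences only (BMP).
std::string restore(const std::string &garbled) {
    std::string out;
    std::size_t i = 0;
    while (i < garbled.size()) {
        const auto b0 = static_cast<unsigned char>(garbled[i]);
        unsigned unit = 0;
        std::size_t len = 1;
        if (b0 < 0x80) {
            unit = b0;
        } else if ((b0 & 0xE0) == 0xC0) {
            unit = b0 & 0x1F;
            len = 2;
        } else {
            unit = b0 & 0x0F;
            len = 3;
        }
        if (i + len > garbled.size())
            break;  // truncated sequence: stop rather than read past the end
        for (std::size_t k = 1; k < len; ++k) {
            const auto bk = static_cast<unsigned char>(garbled[i + k]);
            unit = (unit << 6) | (bk & 0x3F);
        }
        out += static_cast<char>(unit & 0xFF);         // low byte first
        out += static_cast<char>((unit >> 8) & 0xFF);  // then high byte
        i += len;
    }
    return out;
}
```

With the 12-byte value e68584e685b4e58ca0e7a9a9 as input, this returns the 8 bytes "Data Siz": four code units, two bytes each.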
-
I've got a simpler way to do this conversion:
    // QStringEncoder for Qt6, QTextCodec+QTextEncoder for Qt5
    QStringEncoder utf16LE(QStringEncoder::Utf16LE);
    QString wrongText = QString::fromUtf8(value);
    QByteArray restoredBytes = utf16LE(wrongText);
    QString correctText = QString::fromUtf8(restoredBytes);
But this won't help with the missing character. It just has the same effect as your code.
-
@hskoglund Thank you for your reply. However, I am unable to determine what is missing at the end of the text with an odd number of characters, as it is not necessarily 65. These data are read from descriptor values, such as "e68584e685b4e590a0e685b2e78daee6a5ade78db3e6bda9", which should correspond to "Data Transmission".
-
That descriptor value "e68584e685b4e590a0e685b2e78daee6a5ade78db3e6bda9" is missing a final "6e", and that's why you only get "Data Transmissio".
The function that gives you those descriptor values has a bug: there is a 50% chance it will throw away the last character :-(
If you do not have access to that function to fix the bug, maybe you can use a spelling checker to complete the sentence.
-
You should create a bug report about this. Looks like the decoding within qlowenergycontroller_winrt.cpp is wrong / buggy. But I can't see where exactly.
-
(Didn't know the bug was in Qt)
Inside qlowenergycontroller_winrt.cpp there is a special case for handling buffers of the type QBluetoothUuid::DescriptorType::CharacteristicUserDescription: the boolean isWCharString is set to true, which enables this "easy to understand :-)" code:
    if (isWCharString) {
        QString valueString = QString::fromUtf16(reinterpret_cast<char16_t *>(data)).left(size / 2);
        return valueString.toUtf8();
    }
However, Microsoft's documentation says:
    ...
    CharacteristicUserDescription:
    The characteristic value contains a UTF-8 string of variable size that is a user textual description
    ...
So probably the bug is that there is no WCharString in there to reinterpret_cast on.
-
Aha, I think I just spotted the bug, at the end of this line (from prev. post):
    QString valueString = QString::fromUtf16(reinterpret_cast<char16_t *>(data)).left(size / 2);
size / 2 ---> this is what causes the loss of the last character when the text has an odd number of characters
-
@hskoglund Thank you for your reply. I think what you said is correct. I tried making the following modification to the byteArrayFromBuffer function and found that I could correctly retrieve the value of CharacteristicUserDescription without needing to use the getUserDescriptionOnWindows function.
    if (isWCharString) {
        // QString valueString = QString::fromUtf16(reinterpret_cast<char16_t *>(data)).left(size / 2);
        // return valueString.toUtf8();
        return QByteArray(data, size);
    }
The new getName function in characteristicinfo.cpp:
    QString CharacteristicInfo::getName() const
    {
        //! [les-get-descriptors]
        QString name = m_characteristic.name();
        if (!name.isEmpty())
            return name;

        // find descriptor with CharacteristicUserDescription
        const QList<QLowEnergyDescriptor> descriptors = m_characteristic.descriptors();
        for (const QLowEnergyDescriptor &descriptor : descriptors) {
            if (descriptor.type() == QBluetoothUuid::DescriptorType::CharacteristicUserDescription) {
                name = descriptor.value();
                // qDebug() << "original: " <<descriptor.value();
                // name = getUserDescriptionOnWindows(descriptor.value());
                // qDebug() << descriptor.value().toHex('-');
                // qDebug() << "converted: " << name;
                break;
            }
        }
        //! [les-get-descriptors]

        if (name.isEmpty())
            name = u"Unknown"_s;
        return name;
    }
-
Glad to hear it worked out for you.
Maybe if you have time you could try what @Christian-Ehrlicher suggested: file a Qt bug report, something like this:
"Qt bluetooth qlowenergycontroller_winrt.cpp incorrectly assumes CharacteristicUserDescription returns UTF16 when it actually returns UTF8"
-
@hskoglund Regarding the spelling checker, could you provide a simple example or reference link?
-
@hskoglund Thank you again for your help. I have created a bug report for this issue: https://bugreports.qt.io/browse/QTBUG-132202.
-
@swjqq To clarify, there's no need to modify the byteArrayFromBuffer function directly. Instead, the following changes should be made to the QWinRTLowEnergyServiceHandler::obtainCharList function and the QLowEnergyControllerPrivateWinRT::readDescriptorHelper function in the qlowenergycontroller_winrt.cpp file (both of which call byteArrayFromGattResult, which in turn calls byteArrayFromBuffer):
    if (descData.uuid == QBluetoothUuid::DescriptorType::CharacteristicUserDescription)
        // descData.value = byteArrayFromGattResult(descriptorValue, true);
        descData.value = byteArrayFromGattResult(descriptorValue, false);
    else
        descData.value = byteArrayFromGattResult(descriptorValue);
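As a rough illustration of why passing false avoids the truncation, the sketch below models only the length arithmetic of the two paths in plain C++. `survivingBytes` is a hypothetical stand-in, not a Qt function, and it deliberately leaves out the UTF-16 re-encoding so that the effect of `size / 2` is visible on its own.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the size handling in the two code paths.
// It only tracks which input bytes survive, not how they are re-encoded.
std::string survivingBytes(const std::string &raw, bool isWCharString) {
    if (isWCharString) {
        // UTF-16 path: size / 2 code units are kept, and integer division
        // rounds down, so an odd-sized buffer loses its last byte.
        const std::size_t units = raw.size() / 2;
        return raw.substr(0, units * 2);
    }
    // Byte path: every byte of the buffer is returned unchanged.
    return raw;
}
```

For the 9-byte "Data Size", the true branch keeps 8 bytes ("Data Siz") while the false branch keeps all 9. The 16-byte "Enter Sleep Mode" is unaffected either way, which matches the cases reported earlier in the thread.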