Unicode case folding
-
How can I get the Unicode case-folded form "ss" from the character "ẞ" in Qt 6?
-
I'm not aware of any explicit (imperative) Qt API for this. Can you elaborate a bit on your use case?
-
@imironchik said in Unicode case folding:
CommonMark Markdown Spec needs such conversion
So is your use case that you want to compare case-folded strings, so that "ẞ" == "SS"?
This might work if you use QCollator directly and enable the ICU backend...
-
@kkoehne said in Unicode case folding:
So is your use case that you want to compare case-folded strings, so that "ẞ" == "SS"?
Yes, it's so.
This might work if you use QCollator directly and enable the ICU backend...
But I don't want to deal with the locale...
-
Well, per the Unicode standard, the simple case-folded form of "ẞ" (Latin Capital Letter Sharp S, U+1E9E) is "ß" (Latin Small Letter Sharp S, U+00DF); only full case folding expands it to "ss". The simple form "ß" is also what QString::toCaseFolded() returns.
What you can do is either compare via QCollator, which I believe should result in "ß" == "ss", at least for German locales. I don't know whether this also works for other locales, though.
Or you have to do the replacement manually ...
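To illustrate the manual route, here is a minimal sketch in plain C++ on UTF-16 strings (the encoding QString uses internally). The helper name expandSharpS is hypothetical, not a Qt API, and it only covers the single mapping discussed in this thread; it does not perform general case folding, so treat it as a starting point rather than a complete solution.

```cpp
#include <string>

// Code points discussed in this thread.
constexpr char16_t kCapitalSharpS = 0x1E9E; // "ẞ" LATIN CAPITAL LETTER SHARP S
constexpr char16_t kSmallSharpS   = 0x00DF; // "ß" LATIN SMALL LETTER SHARP S

// Hypothetical helper: replaces both sharp-s code points with "ss".
// All other characters are copied unchanged (no lowercasing is done here).
std::u16string expandSharpS(const std::u16string &in)
{
    std::u16string out;
    out.reserve(in.size() + 2); // small headroom for the expansion
    for (char16_t c : in) {
        if (c == kCapitalSharpS || c == kSmallSharpS)
            out += u"ss";
        else
            out += c;
    }
    return out;
}
```

In Qt the same idea would presumably be applied to the result of toCaseFolded(), for example `s.toCaseFolded().replace(QChar(0x00DF), QStringLiteral("ss"))`, since toCaseFolded() has already mapped "ẞ" to "ß" at that point. I have not checked this against every Qt 6 release.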
-
Heh, this actually works :)
QString("ẞ").toCaseFolded().toUpper(); // does return "SS" for me
-
@kkoehne said in Unicode case folding:
Heh, this actually works :)
:) I guess that your locale is German.
OK, guys, I understand, so I'll mark this topic as solved.
-
@imironchik said in Unicode case folding:
:) I guess that your locale is German.
No, actually not. QString::toUpper() always operates in the C locale...
It seems that the Unicode rule that ß becomes SS when upper-cased predates the introduction of the upper-case ẞ, and they didn't want to change this behavior ...
That was an interesting rabbit hole ;) You might still be better off handling this case explicitly in your logic though; who knows whether Unicode will change this at some point.