FTS5 tokenchars
-
Hello,
Is anyone using FTS5 with SQLite? I have tried many syntax variants, but without success. It crashes when I use the tokenchars option. This is the tokenchars example I got from the documentation:
-- Create an FTS5 table that does not remove diacritics from Latin
-- script characters, and that considers hyphens and underscore characters
-- to be part of tokens.
CREATE VIRTUAL TABLE ft USING fts5(a, b,
    tokenize = "unicode61 remove_diacritics 0 tokenchars '-_'"
);
Believe me, I have tried a lot, arranging the apostrophes and quotes in different ways. Without the tokenchars option it works perfectly. Here are a few of my attempts:
CREATE VIRTUAL TABLE IF NOT EXISTS ftsdb USING fts5(id, name, status UNINDEXED, tokenize = 'trigram tokenchars -', content = 'thedb', content_rowid='key')
CREATE VIRTUAL TABLE IF NOT EXISTS ftsdb USING fts5(id, name, status UNINDEXED, tokenize = 'unicode61 tokenchars'-'', content = 'thedb', content_rowid='key')
I hope I have simply missed the one and only correct variant (-:
thanks.
-
Hi,
Are you sure FTS5 support is built in?
-
Thanks @SGaist
When I run "pragma compile_options" in a SQL shell I get this. FTS5 seems enabled. And when I use FTS4 in my query, the shadow tables look different and everything works fine except for the tokenchars option.
COMPILER=msvc-1916
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS5
ENABLE_GEOPOLY
ENABLE_JSON1
ENABLE_RTREE
ENABLE_STAT4
MAX_ATTACHED=125
SOUNDEX
THREADSAFE=1
-
@Jan-Bakker said in FTS5 tokenchars:
It crashes when I use the tokenchars option.
What does this mean? Please provide the backtrace and a minimal, compilable example. What exact Qt version do you use?
-
@Jan-Bakker said in FTS5 tokenchars:
When I run "pragma compile_options" in a SQL shell I get this.
Unless you compiled Qt against the external SQLite library, this tells you nothing. By default Qt uses its own copy of SQLite, compiled into the plugin. But even then, FTS5 has been enabled since at least Qt 5.15.
-
The Qt version is 6.7.1, the compiler is MinGW.
For clarity: the application does not crash, and I can reach the database too. The only problem is that the FTS tables are not created. If I don't use "tokenchars", the tables are created properly, as are the triggers needed to keep the FTS table up to date. Searching with the trigram tokenizer, which is a standard option in FTS5 only, works correctly.
The database is created and filled by the application itself.
Do you think it's still needed to rebuild Qt?
Here's a simplified snippet of my code. The error reported by QSqlQuery is:
[parse error in "tokenize = 'trigram' 'tokenchars '-''" Unable to fetch row]
QSqlQuery kwerie;
kwerie.prepare("CREATE VIRTUAL TABLE IF NOT EXISTS ftsdb USING fts5(id, name, status UNINDEXED, tokenize = 'trigram tokenchars -', content = 'thedb', content_rowid='key')");
if (!kwerie.exec()) {
    qDebug() << kwerie.lastError().text();
}
Thanks.
-
@Jan-Bakker said in FTS5 tokenchars:
Do you think it's still needed to rebuild Qt?
I never said anything about rebuilding Qt.
kwerie.prepare()
Are you sure a "CREATE TABLE" statement can be prepared at all? I would guess no - use a simple QSqlQuery::exec().
-
Indeed, in this simplified code the prepare is not needed. But as I noted earlier, without the "tokenchars" option it works perfectly that way. So I guess it is really related to that option. The question is: is it a syntax problem that I do not understand, or am I missing something else?
And yes, I tried exec() directly on your advice. prepare() always returns true, even when I use the tokenchars option. Only exec() returns false.
-
Why not simply check your statement with sqlite3 executable?
sqlite> CREATE VIRTUAL TABLE IF NOT EXISTS ftsdb USING fts5(id, name, status UNINDEXED, tokenize = 'trigram tokenchars -', content = 'thedb', content_rowid='key');
Runtime error: parse error in tokenize directive
sqlite>
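The "parse error in tokenize directive" message is consistent with the tokenize value being split into whitespace-separated arguments, each of which has to be either a bareword or a single-quoted literal. A bare hyphen is neither. The following is only a sketch of a quoting helper under that assumption: the bareword rule is taken from the FTS5 documentation as I understand it, and the function names isFts5Bareword, quoteTokenizerArg and buildTokenizeOption are hypothetical, not part of SQLite or Qt.

```cpp
#include <string>
#include <utility>
#include <vector>

// True if every character may appear in an FTS5 bareword: ASCII letters,
// digits, underscore, the substitute character (ASCII 26) or any non-ASCII
// byte. Assumption: this mirrors the bareword definition in the FTS5 docs.
bool isFts5Bareword(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s) {
        const bool ok = c >= 128 || c == '_' || c == 26 ||
                        (c >= '0' && c <= '9') ||
                        (c >= 'a' && c <= 'z') ||
                        (c >= 'A' && c <= 'Z');
        if (!ok) return false;
    }
    return true;
}

// Return the argument unchanged if it is a bareword; otherwise wrap it in
// single quotes, doubling any embedded single quote as in SQL literals.
std::string quoteTokenizerArg(const std::string& arg) {
    if (isFts5Bareword(arg)) return arg;
    std::string out = "'";
    for (char c : arg) {
        out += c;
        if (c == '\'') out += '\'';
    }
    out += "'";
    return out;
}

// Build a complete option such as: tokenize = "unicode61 tokenchars '-'"
// Limitation of this sketch: arguments containing a double quote are not
// handled, because the whole value is wrapped in double quotes.
std::string buildTokenizeOption(
    const std::string& tokenizer,
    const std::vector<std::pair<std::string, std::string>>& options) {
    std::string out = "tokenize = \"" + tokenizer;
    for (const auto& opt : options) {
        out += ' ' + opt.first + ' ' + quoteTokenizerArg(opt.second);
    }
    out += '"';
    return out;
}
```

With these helpers, buildTokenizeOption("unicode61", {{"tokenchars", "-"}}) produces the same shape as the documentation example: the hyphen ends up inside single quotes, and the whole value inside double quotes. Whether a given tokenizer accepts a given option is a separate question that this sketch does not check.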
-
A very good idea. I have done that and found that the tokenchars option is accepted and, as I expected, it was a syntax problem on my side. Escaped double quotes, \" \", did the trick:
....... tokenize = \"unicode61 tokenchars '-'\", ...........
And... tokenchars doesn't work with the trigram tokenizer. I have to use the standard unicode61 tokenizer. That's another chapter (-:
Thanks for the help.
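For anyone landing here with the same problem, a minimal sketch of the quoting at the C++ level. It uses the documentation example from the first post rather than the abbreviated statement above, and only the standard library, so it compiles without Qt. The function names escapedLiteral and rawLiteral are made up for illustration.

```cpp
#include <string>

// The documentation example as an ordinary C++ string literal. The double
// quotes around the tokenize value must be written as \" so that they reach
// SQLite as part of the SQL text.
std::string escapedLiteral() {
    return "CREATE VIRTUAL TABLE ft USING fts5(a, b, "
           "tokenize = \"unicode61 remove_diacritics 0 tokenchars '-_'\")";
}

// The same statement as a raw string literal, which needs no escaping.
// The "sql" delimiter is arbitrary; it only marks where the literal ends.
std::string rawLiteral() {
    return R"sql(CREATE VIRTUAL TABLE ft USING fts5(a, b, tokenize = "unicode61 remove_diacritics 0 tokenchars '-_'"))sql";
}
```

Both functions return the same SQL text, so either spelling could be passed to QSqlQuery::exec(). The raw literal is probably easier to compare against what was typed into the sqlite3 shell.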