FTS5_TOKEN_COLOCATED #25

Open
andersjo opened this issue Jun 5, 2021 · 1 comment

andersjo commented Jun 5, 2021

The FTS5 tokenizer API has the concept of "colocated" tokens, where multiple tokens can occupy the same position in a sentence. The main use of this functionality is to implement synonyms (see Section 7.1.1 of the SQLite FTS5 documentation).

Is there any way to mark a token as colocated through the Python API?

macabrus commented Nov 26, 2022

I believe the author has accounted for it, but I haven't tested it.

xToken(pCtx, 0, "i",                      1,  0,  1);
xToken(pCtx, 0, "won",                    3,  2,  5);
xToken(pCtx, 0, "first",                  5,  6, 11);
xToken(pCtx, FTS5_TOKEN_COLOCATED, "1st", 3,  6, 11);
xToken(pCtx, 0, "place",                  5, 12, 17);

It should be possible, or, if it isn't yet, it shouldn't be hard to implement. I'll test whether it works and open a PR if it doesn't.

EDIT: Removed the docs link. Sorry 😅, you had already linked the relevant section of the SQLite docs.
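
To make the xToken sequence above concrete, here is a rough pure-Python sketch of the same token stream. It is not this library's API: `SYNONYMS`, `tokenize_with_synonyms` and `assign_positions` are hypothetical names made up for illustration, the tuples leave out the token-length argument, and the offsets are character offsets, which only match the byte offsets xToken expects for ASCII text. It just shows the effect of the flag: a colocated token shares the position of the token before it.

```python
import re

FTS5_TOKEN_COLOCATED = 0x0001  # same value as the flag in SQLite's FTS5 header

# Hypothetical synonym table, for illustration only.
SYNONYMS = {"first": ["1st"]}


def tokenize_with_synonyms(text):
    """Yield (tflags, token, start, end) tuples, loosely mirroring xToken calls."""
    for match in re.finditer(r"\w+", text):
        token = match.group().lower()
        start, end = match.start(), match.end()
        yield 0, token, start, end
        # Each synonym reuses the offsets of the original token and is
        # flagged as colocated so that it lands on the same position.
        for synonym in SYNONYMS.get(token, []):
            yield FTS5_TOKEN_COLOCATED, synonym, start, end


def assign_positions(tokens):
    """Return (position, token) pairs; colocated tokens do not advance the position."""
    position = -1
    result = []
    for tflags, token, _start, _end in tokens:
        # Assumption of this sketch: a colocated flag on the very first
        # token is ignored, so positions always start at 0.
        if not (tflags & FTS5_TOKEN_COLOCATED) or position < 0:
            position += 1
        result.append((position, token))
    return result


if __name__ == "__main__":
    stream = list(tokenize_with_synonyms("I won first place"))
    for item in stream:
        print(item)
    print(assign_positions(stream))
```

With this model, "first" and "1st" end up at the same position, which is what lets a query for either term match the same document text. Whether the Python wrapper can pass the flag through to xToken is still the open question here.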
