Initialization of TagReader #165
-
This is a valid point. While I don't think that this is a large problem now, it is certainly something to keep in mind in the context of the tag search. This caching mechanism is used to speed up the tag suggestion mechanism when entering tags on the UI side. With the large number of tags available for the datasets used in the recent past, I'd argue for a more powerful mechanism to specify them on the UI side, since simply suggesting every element that contains a matching substring gets unwieldy if you don't know exactly what you are looking for (and if you do, suggestions are unnecessary). The caching mechanism currently also lacks a rebuild trigger, based on the assumption that no new tags will be added during operation. This has usually been the case in past use cases, but is not actually a given.
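To make the two points above concrete (substring suggestions and the missing rebuild trigger), here is a minimal sketch in Python. It is not Cineast's actual `TagReader`: the class `TagCache`, its `load_tags` callback and the `limit` parameter are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, List


class TagCache:
    """Hypothetical in-memory tag cache; NOT Cineast's actual TagReader."""

    def __init__(self, load_tags: Callable[[], Iterable[str]]):
        # load_tags is an assumed callback that fetches all tag names from the DB.
        self._load_tags = load_tags
        self._tags: List[str] = []
        self.rebuild()

    def rebuild(self) -> None:
        """Rebuild trigger: reload every tag from the source, dropping duplicates."""
        self._tags = sorted(set(self._load_tags()), key=str.lower)

    def suggest(self, fragment: str, limit: int = 10) -> List[str]:
        """Return up to `limit` tags containing `fragment`, ignoring case."""
        needle = fragment.lower()
        if not needle:
            return []
        matches = [tag for tag in self._tags if needle in tag.lower()]
        return matches[:limit]
```

Without a call to `rebuild()`, tags added to the source after construction never show up in `suggest()`, which is the gap described above. The `limit` only caps the list; it does not address the usability issue of plain substring matching on large tag sets.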
-
Contrary to @lucaro, I would suggest expanding the existing endpoint in the backend (cineast). That way, any UI would benefit from the improved behaviour and thus, ultimately, be faster. Also, let's not forget that in the current implementation in the UI, we cannot get three-letter tags easily (i.e. |
-
I like the idea and, on the Cottontail DB side, this should not be a huge change. We're already tracking metadata such as the last changes to an entity. All we really need is a way to query this information. I guess from a functionality perspective, the only real question is whether this should be implemented as a "normal" query to some kind of special entity (as in most DBMSs) or whether we should add dedicated endpoints for this.
-
Currently, the `TagReader` reads all available tags at every startup and puts them into the cache. For V3C1, this takes approximately 300 ms when cottontail is available locally and you are on a fast node (Purple Nodes), 2 s when cottontail is available locally and you are on a slow node (`dmi-vitrivr`), and significantly longer when accessing cottontail on another machine (11 seconds for me at home). This is not a bug or anything, but it makes every restart when developing on V3C1 slightly more annoying.
I thought the new Discussions feature of GitHub would be a good place to brainstorm on (a) whether this is even a problem, (b) whether there are any sensible solutions and (c) whether we want to implement those.
One simple solution would be that we cache the available tags cineast-side per db-host in a file and then ask cottontail for the current state of the table at startup (e.g. hash / version / last change timestamp - such a feature does not yet exist AFAIK but would maybe be a good addition to cottontail anyway?).
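A rough sketch of that idea, written in Python only for brevity. Everything here is an assumption: `fetch_version` stands in for the not-yet-existing cottontail feature (hash / version / last change timestamp), `fetch_tags` stands in for the full tag read, and the file naming scheme is made up.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def cache_path(cache_dir: Path, db_host: str) -> Path:
    """One cache file per db-host; unsafe filename characters become '_'."""
    safe = "".join(c if c.isalnum() or c in "-." else "_" for c in db_host)
    return cache_dir / f"tags_{safe}.json"


def load_tags(cache_dir: Path, db_host: str,
              fetch_version: Callable[[], str],
              fetch_tags: Callable[[], Iterable[str]]) -> List[str]:
    """Return the cached tags if the table version is unchanged, else re-read them."""
    path = cache_path(cache_dir, db_host)
    current = fetch_version()  # hypothetical cheap call (hash / version / timestamp)
    try:
        cached = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        cached = None  # missing or corrupt cache file: fall through to a full read
    if (isinstance(cached, dict) and cached.get("version") == current
            and isinstance(cached.get("tags"), list)):
        return cached["tags"]
    tags = list(fetch_tags())  # the expensive full read
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"version": current, "tags": tags}), encoding="utf-8")
    return tags


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Example host string and version value are placeholders.
        print(load_tags(Path(tmp), "localhost:1865", lambda: "v1", lambda: ["a", "b"]))
```

The startup cost then reduces to one cheap version lookup plus a local file read whenever the tag table is unchanged. The slow full read only happens on the first start against a given host or after the table has changed.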
Thoughts @ppanopticon @lucaro @sauterl ?