-
A key can have only one value. If you add a key multiple times with different values, the behavior is unspecified: only one entry will end up in the dictionary. Internally, the FSA generator ignores a key that is added again, but which entry survives is not defined, because the compiler sorts the key/value pairs and the sort algorithm is not guaranteed to be stable (a stable sort keeps insertion order for equal entries). Multi-value dictionaries need to be implemented on the client side; for example, if you use a JSON dictionary, you could wrap your values in an array.
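As a rough illustration of the "wrap your values in an array" idea, here is a minimal sketch that groups values per key before anything is handed to a compiler. It uses only the Python standard library; the helper name `group_values` and the sample data are assumptions made for this example, and the step of adding each pair to a JSON dictionary compiler is left out on purpose.

```python
import json
from collections import defaultdict


def group_values(pairs):
    """Collect every value seen for a key into one list, keeping insertion order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


# Sample data (hypothetical): the same key appears several times.
pairs = [
    ('key1', 'value1'),
    ('key1', 'value2'),
    ('key1', 'value3'),
    ('key2', 'value4'),
]

# One JSON array per key. Each (key, payload) pair would then be added
# exactly once to a JSON dictionary compiler, so no key is added twice.
entries = {key: json.dumps(values) for key, values in group_values(pairs).items()}

for key, payload in entries.items():
    print(key, payload)

# On the reading side, decoding the stored JSON gives back the list of values.
values_for_key1 = json.loads(entries['key1'])
```

Note that this groups everything in memory first, so for large inputs it may be preferable to feed the data pre-sorted by key and emit one array per run of equal keys.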
-
Hey @kamcio181,
Right, all values are added to the value store, but when another entry with the same key is added later, one of them becomes a "dangling" value with no reference to it.
One possible workaround is to make use of prefix matching. You can keep a global counter and append it to each key; that way all keys in your dictionary are unique and value pointers won't be overwritten. Later, when reading from the file, you can do a prefix lookup to find all the matches that belong to your key. Something like this:
import keyvi.compiler
import keyvi.dictionary
import keyvi.completion
KEYVI_FILE_NAME = 'keyvi.kv'

class MultiValueDictionary:
    def __init__(self, filename):
        self._dictionary = keyvi.dictionary.Dictionary(filename)

    def get_values(self, key):
        # Every stored key starts with the original key, so a prefix
        # lookup returns all entries that were added for it.
        prefix_completion = keyvi.completion.PrefixCompletion(self._dictionary)
        for match in prefix_completion.GetCompletions(key):
            yield match.GetValue()


def stream_data():
    return (
        ('key1', 'value1'),
        ('key1', 'value2'),
        ('key1', 'value3'),
        ('key2', 'value4'),
    )


compiler = keyvi.compiler.JsonDictionaryCompiler()
counter = 0
for key, value in stream_data():
    # Append a global counter so that every key added is unique.
    unique_key = f'{key}:{counter}'
    compiler.Add(unique_key, value)
    counter += 1
compiler.Compile()
compiler.WriteToFile(KEYVI_FILE_NAME)

multi_value_dict = MultiValueDictionary(KEYVI_FILE_NAME)
print('----- key1 -----')
values_iter = multi_value_dict.get_values('key1')
print(list(values_iter))
print('----- key2 -----')
values_iter = multi_value_dict.get_values('key2')
print(list(values_iter))
Output:
$ python multi_value_keyvi.py
----- key1 -----
['value1', 'value2', 'value3']
----- key2 -----
['value4']
Sure, this can be improved with some additional checks, but that is the basic idea.
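One such additional check, sketched below under stated assumptions: a bare prefix lookup for `key1` would also return entries stored under `key10`, so the matched key can be filtered against the `key:counter` layout. The helpers `make_unique_key` and `belongs_to` are hypothetical names for this example, and the prefix lookup is simulated with `str.startswith` on a plain list rather than a real keyvi dictionary.

```python
SEPARATOR = ':'  # same separator as in the f'{key}:{counter}' scheme above


def make_unique_key(key, counter):
    """Build the stored key by appending the counter after the separator."""
    return f'{key}{SEPARATOR}{counter}'


def belongs_to(key, unique_key):
    """Return True only if unique_key is exactly key + separator + digits."""
    base, sep, suffix = unique_key.rpartition(SEPARATOR)
    return sep == SEPARATOR and base == key and suffix.isdigit()


# Hypothetical stored keys; 'key10' shares the prefix 'key1'.
original_keys = ['key1', 'key1', 'key10', 'key2']
stored = [make_unique_key(k, i) for i, k in enumerate(original_keys)]

# Stand-in for a prefix lookup: everything that starts with 'key1'.
prefix_matches = [u for u in stored if u.startswith('key1')]

# Keep only the entries that really belong to 'key1'.
exact_matches = [u for u in prefix_matches if belongs_to('key1', u)]
print(exact_matches)
```

Because `rpartition` splits on the last separator, the check still works when the original key itself contains a `:`.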
-
Thank you guys for the prompt response. It is clear that I need to implement the multi-value feature on the client side. P.S. Keyvi is a great piece of code and I can't wait to explore it more deeply.
-
Hello,
I can see that during compilation I can add multiple values for one key, and after compilation the Statistics show that there is one key but many values. However, I cannot figure out how to retrieve more than one value for a given key.
I am using Python.