Maybe very important!!! Fix embedding error #464

Merged: 2 commits, Dec 13, 2024

Conversation

billvsme
Contributor

asyncio.as_completed() does not guarantee that results are returned in input order.

While using this project, I noticed that the recalled entities and relationships were only loosely related to the low-level and high-level keywords. It felt like the vector model's capability was not being fully used, which seemed strange.

I looked at the vector insertion code and found the problem: asyncio.as_completed is used incorrectly. It yields results in the order the tasks complete, not in the order the tasks were passed in. I switched to asyncio.gather so the results stay in input order, and made sure tqdm still works properly.
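A minimal sketch of the ordering difference. `fake_embed` and its delays are hypothetical stand-ins for the real embedding calls, not code from this repository:

```python
import asyncio


async def fake_embed(index: int, delay: float) -> int:
    # Hypothetical stand-in for an embedding request; returns its batch index.
    await asyncio.sleep(delay)
    return index


async def compare_orderings():
    delays = [0.3, 0.1, 0.2]  # batch 0 is the slowest, batch 1 the fastest

    # as_completed yields awaitables as they finish, so fast batches come first.
    pending = [fake_embed(i, d) for i, d in enumerate(delays)]
    completed_order = [await fut for fut in asyncio.as_completed(pending)]

    # gather returns results in the order the awaitables were passed in.
    gathered_order = await asyncio.gather(
        *(fake_embed(i, d) for i, d in enumerate(delays))
    )
    return completed_order, list(gathered_order)


completed_order, gathered_order = asyncio.run(compare_orderings())
```

With these delays, `completed_order` follows finish time while `gathered_order` follows submission order, so only the latter can be indexed against the input list.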

The following code relies on the embeddings being in the same order as list_data:

if len(embeddings) == len(list_data):
    for i, d in enumerate(list_data):
        d["__vector__"] = embeddings[i]
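A rough sketch of the gather-based approach, under stated assumptions: `embed_batch`, `embed_in_order`, the `content` key, and the batch size are illustrative names rather than the project's actual code, and the plain `done` counter stands in for a progress bar such as tqdm.

```python
import asyncio


async def embed_batch(batch):
    # Hypothetical embedding call: one 1-d "vector" per text (its length).
    # Larger batches sleep longer, so later batches can finish first.
    await asyncio.sleep(0.05 * len(batch))
    return [[float(len(text))] for text in batch]


async def embed_in_order(list_data, batch_size=2):
    contents = [d["content"] for d in list_data]
    batches = [
        contents[i:i + batch_size] for i in range(0, len(contents), batch_size)
    ]
    done = 0

    async def tracked(batch):
        nonlocal done
        result = await embed_batch(batch)
        done += 1  # a progress bar could be advanced here instead
        return result

    # gather keeps results aligned with `batches`, whatever the finish order.
    per_batch = await asyncio.gather(*(tracked(b) for b in batches))
    embeddings = [vec for batch_vecs in per_batch for vec in batch_vecs]

    # Same positional assignment as above, now safe because order is preserved.
    if len(embeddings) == len(list_data):
        for d, vec in zip(list_data, embeddings):
            d["__vector__"] = vec
    return list_data, done


records = [{"content": "a"}, {"content": "bb"}, {"content": "ccc"}]
records, finished_batches = asyncio.run(embed_in_order(records))
```

Wrapping each task lets progress advance as individual batches finish, while the final list still comes back in input order.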

LarFii merged commit ae0c43b into HKUDS:main on Dec 13, 2024
1 check failed
Collaborator

LarFii commented Dec 13, 2024

Thank you so much for your valuable contribution. This is indeed an issue that can occur.
