Colbert #456
Conversation
PR Summary
Here's a concise summary of the key changes in this PR that adds ColBERT model support:
Added support for multiple model replicas and device placement across GPUs (a usage sketch follows this list):
- Introduced a new --device-id CLI argument to specify GPU/MPS device placement (e.g. "0,1") for model distribution
- Added a LoadingStrategy class to manage device mapping, dtype selection, and quantization across replicas
- Modified BatchHandler to support multiple ModelWorker instances running parallel model replicas
- Updated AsyncEmbeddingEngine to handle multiple model replicas instead of a single instance
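As a rough illustration of the replica/device configuration this enables, here is a minimal Python sketch. It is not taken from the PR's diff: AsyncEmbeddingEngine, EngineArgs, and from_args are existing infinity_emb names, but the model id is a placeholder and the exact accepted values for device and device_id are assumptions based on the summary above.

from infinity_emb import AsyncEmbeddingEngine, EngineArgs

args = EngineArgs(
    model_name_or_path="BAAI/bge-small-en-v1.5",  # placeholder model id
    device="cuda",       # assumed to coerce to the Device enum
    device_id="0,1",     # new in this PR: one model replica per listed device
)
engine = AsyncEmbeddingEngine.from_args(args)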
Added ColBERT-specific functionality:
- Introduced mode_colbert flag in SentenceTransformerPatched for ColBERT model detection
- Modified embedding processing to handle ColBERT's token-level embeddings and attention masks
- Added normalize_embeddings flag to control embedding normalization behavior
- Added test coverage for ColBERT models with token-level embedding verification
The changes enable efficient parallel processing across devices while maintaining backward compatibility with existing models.
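To make the token-level path concrete, the following is a minimal sketch of how ColBERT-style post-processing generally works: keep one embedding per real token, using the attention mask to drop padding, and optionally L2-normalize. This illustrates the technique only; the tensor names, return type, and exact normalization behavior in the PR may differ.

import torch
import torch.nn.functional as F

def colbert_token_embeddings(token_embeddings: torch.Tensor,
                             attention_mask: torch.Tensor,
                             normalize: bool = True) -> list[torch.Tensor]:
    # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    out = []
    for emb, mask in zip(token_embeddings, attention_mask):
        kept = emb[mask.bool()]                  # drop padded positions
        if normalize:
            kept = F.normalize(kept, p=2, dim=-1)
        out.append(kept.to(torch.float32))       # one (num_tokens, dim) tensor per input
    return out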
25 file(s) reviewed, 9 comment(s)
@@ -53,6 +60,7 @@ class EngineArgs:
    model_warmup: bool = MANAGER.model_warmup[0]
    vector_disk_cache_path: str = ""
    device: Device = Device[MANAGER.device[0]]
    device_id: DeviceID = field(default_factory=lambda: DeviceID(MANAGER.device_id[0]))
logic: device_id is not included in from_env() method's zip_longest parameters, causing potential initialization issues
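For illustration, the failure mode being flagged looks roughly like this. This is not the real from_env() implementation; the function and structure are a hypothetical reconstruction of a zip_longest-based constructor, showing that a field left out of the zip silently keeps its default for every model.

from itertools import zip_longest

def build_args_sketch(model_ids, devices, device_ids):
    # One set of engine args per model, built from parallel env-derived lists.
    # If device_ids were omitted from the zip_longest call below, every model
    # would fall back to the default device_id regardless of the env value.
    return [
        {"model_name_or_path": m, "device": d, "device_id": di}
        for m, d, di in zip_longest(model_ids, devices, device_ids, fillvalue=None)
    ]

# build_args_sketch(["model-a", "model-b"], ["cuda", "cuda"], ["0", "1"])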
embeddings: "Tensor" = out_features.to(torch.float32)
if self.normalize_embeddings:
logic: redundant check for not self.mode_colbert since it's already in an if not self.mode_colbert block
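For context, here is a schematic reconstruction of the pattern being flagged (the surrounding class is not reproduced; names mirror the snippet above):

import torch
import torch.nn.functional as F

def postprocess_sketch(out_features: torch.Tensor, mode_colbert: bool, normalize_embeddings: bool):
    if not mode_colbert:
        embeddings = out_features.to(torch.float32)
        # "not mode_colbert" is already guaranteed by the outer branch, so the
        # second half of this condition is redundant and can be dropped:
        if normalize_embeddings and not mode_colbert:
            embeddings = F.normalize(embeddings, p=2, dim=-1)
        return embeddings
    return out_features  # ColBERT path: token-level embeddings handled separately

Reducing the inner condition to "if normalize_embeddings:" leaves the behavior unchanged.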
Resolved review threads:
- libs/infinity_emb/infinity_emb/transformer/embedder/sentence_transformer.py
- libs/infinity_emb/tests/end_to_end/test_sentence_transformers_colbert.py (three threads)
Codecov Report
Attention: Patch coverage is

@@            Coverage Diff             @@
##             main     #456      +/-   ##
==========================================
+ Coverage   79.12%   79.19%   +0.07%
==========================================
  Files          42       42
  Lines        3367     3379      +12
==========================================
+ Hits         2664     2676      +12
  Misses        703      703

☔ View full report in Codecov by Sentry.
Description
Please provide a clear and concise description of the changes in this PR.
Related Issue
If applicable, link the issue this PR addresses.
Types of Change
Checklist
Additional Notes
Add any other context about the PR here.
License
By submitting this PR, I confirm that my contribution is made under the terms of the MIT license.