add dtype-based loading #461

Merged: 1 commit merged into main on Nov 13, 2024
Conversation

michaelfeil (Owner) commented:
This pull request improves the handling of loading strategies, device placement, and quantization in the infinity_emb library. The most important changes update the SentenceClassifier, CrossEncoder, and SentenceTransformer classes to incorporate the new loading strategies and device placement, and to handle different data types and quantization consistently.

Improvements to handling loading strategies and device placement:
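Roughly, the idea is to resolve the target torch dtype once, up front, and pass it when the model is loaded. A minimal sketch, assuming hypothetical names (`resolve_loading_dtype` and the string-to-dtype map below are illustrative, not the library's actual API):

```python
# Minimal sketch of dtype-based loading; all names are illustrative.
import torch

_DTYPES = {
    "float32": torch.float32,
    "float16": torch.float16,
    "bfloat16": torch.bfloat16,
}

def resolve_loading_dtype(dtype: str, device: str) -> torch.dtype:
    # Half precision only pays off on accelerators; on CPU, fall back
    # to float32 so the model stays numerically safe.
    if device == "cpu" and dtype in ("float16", "bfloat16"):
        return torch.float32
    return _DTYPES.get(dtype, torch.float32)
```

A model class can then pass the resolved dtype at load time (for Hugging Face models, e.g. `torch_dtype=resolve_loading_dtype(dtype, device)`) instead of converting with `.half()` after the weights are already loaded.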

License & CLA

By submitting this PR, I confirm that my contribution is made under the terms of the MIT license.

Related Issue

Checklist

  • I have read the CONTRIBUTING guidelines.
  • I have added tests to cover my changes.
  • I have updated the documentation (docs folder) accordingly.

Additional Notes


@greptile-apps bot (Contributor) left a comment

PR Summary

This PR implements dtype-based loading strategies and device placement across transformer models, replacing manual dtype/device management with a more consistent approach.

  • Added loading strategy support in /libs/infinity_emb/infinity_emb/transformer/embedder/sentence_transformer.py with loading_dtype parameter for model initialization
  • Integrated quantization interface via quant_interface in /libs/infinity_emb/infinity_emb/transformer/classifier/torch.py and /libs/infinity_emb/infinity_emb/transformer/crossencoder/torch.py
  • Added torch.compile support in classifier and crossencoder implementations
  • Standardized float32 numpy output in CrossEncoder's encode_post method
  • Removed manual half-precision conversion in favor of loading_dtype across transformer classes (see the sketch after this list)
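Taken together, the steps above might compose as in the following sketch. `finalize_model` is an invented name, and the `quant_interface` body here is a stand-in built on PyTorch dynamic quantization; the library's actual helper may work differently:

```python
# Hedged sketch of the post-load pipeline described in the bullets above.
import numpy as np
import torch

def quant_interface(model: torch.nn.Module) -> torch.nn.Module:
    # Stand-in for a shared quantization entry point: dynamically
    # quantize Linear layers to int8, one common CPU-side strategy.
    return torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

def finalize_model(
    model: torch.nn.Module, *, quantize: bool, compile_model: bool
) -> torch.nn.Module:
    if quantize:
        model = quant_interface(model)
    if compile_model:
        # torch.compile (PyTorch >= 2.0); guarded so an environment that
        # cannot compile degrades to eager execution instead of failing.
        try:
            model = torch.compile(model, dynamic=True)
        except Exception:
            pass
    return model

def encode_post(scores: torch.Tensor) -> np.ndarray:
    # Standardize CrossEncoder output to float32 numpy, regardless of
    # the dtype the model computed in (e.g. float16 on GPU).
    return scores.detach().cpu().to(torch.float32).numpy()
```

A classifier or cross-encoder could call `finalize_model(model, quantize=..., compile_model=...)` once right after loading with the resolved dtype from the earlier sketch.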

3 file(s) reviewed, 4 comment(s)

codecov-commenter commented Nov 13, 2024

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 87.09677% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.08%. Comparing base (4ab717b) to head (11fc52b).

Files with missing lines                                 Patch %   Lines
...y_emb/infinity_emb/transformer/classifier/torch.py    78.57%    3 Missing ⚠️
...emb/infinity_emb/transformer/crossencoder/torch.py    92.85%    1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #461      +/-   ##
==========================================
+ Coverage   78.97%   79.08%   +0.10%     
==========================================
  Files          42       42              
  Lines        3392     3414      +22     
==========================================
+ Hits         2679     2700      +21     
- Misses        713      714       +1     

☔ View full report in Codecov by Sentry.

michaelfeil merged commit 0a688b6 into main on Nov 13, 2024
36 checks passed
michaelfeil deleted the dtype-based-loading branch on November 13, 2024 at 06:30