feat: Openvino runtime for transformer backend and streaming support for Openvino and CUDA #1892

Merged 14 commits on Mar 26, 2024

Commits on Mar 12, 2024

  1. fixes #1775 and #1774

    Add BitsAndBytes Quantization and fixes embedding on CUDA devices
    fakezeta committed Mar 12, 2024
    826dd94

Commits on Mar 14, 2024

  1. Manage 4bit and 8 bit quantization

    Manage different BitsAndBytes options with the quantization: parameter in yaml
    fakezeta committed Mar 14, 2024
    2f73c8d
  2. 0d96ed6
  3. fbdbc58
  4. d304e33

Commits on Mar 17, 2024

  1. OpenVINO draft

    First draft of OpenVINO integration in transformer backend
    fakezeta committed Mar 17, 2024
    9c1059a

Commits on Mar 20, 2024

  1. first working implementation

    fakezeta committed Mar 20, 2024
    44f2f9e
  2. Streaming working

    fakezeta committed Mar 20, 2024
    30fdc7e

Commits on Mar 25, 2024

  1. d92afe7
  2. 6c67b37
  3. b2ffd14

Commits on Mar 26, 2024

  1. Update backend/python/transformers/transformers_server.py

    Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
    mudler authored Mar 26, 2024
    18747ef
  2. Merge branch 'master' into master

    Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
    mudler authored Mar 26, 2024
    b934741
  3. afa088d