Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(deps): update transformers dep #464

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

renovate[bot]
Copy link
Contributor

@renovate renovate bot commented Dec 15, 2024

This PR contains the following updates:

Package Change Age Adoption Passing Confidence
accelerate ==1.1.1 -> ==1.2.1 age adoption passing confidence
transformers ==4.46.3 -> ==4.47.0 age adoption passing confidence

Release Notes

huggingface/accelerate (accelerate)

v1.2.1: : Patchfix

Compare Source

Full Changelog: huggingface/accelerate@v1.2.0...v1.2.1

v1.2.0

Compare Source

huggingface/transformers (transformers)

v4.47.0: v4.47.0: PaliGemma-2, I-JEPA, OLMo-2, LayerSkip, Tensor Parallel

Compare Source

New models

PaliGemma-2

PaliGemma 2 and PaliGemma are lightweight open vision-language models (VLM) inspired by PaLI-3, and based on open components like the SigLIP vision model and the Gemma language model. PaliGemma takes both images and text as inputs and can answer questions about images with detail and context, meaning that PaliGemma can perform deeper analysis of images and provide useful insights, such as captioning for images and short videos, object detection, and reading text embedded within images.

PaliGemma 2 is available in 3B, 10B, and 28B parameter sizes, which are based on Gemma 2 2B, 9B, and 27B models, respectively. The original PaliGemma models are available in the 3B size. For more information on Gemma model variants, see the Gemma models list. PaliGemma model variants support different pixel resolutions for image inputs, including 224 x 224, 448 x 448, and 896 x 896 pixels.

image
I-JEPA

The I-JEPA model was proposed in Image-based Joint-Embedding Predictive Architecture by Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas. I-JEPA is a self-supervised learning method that predicts the representations of one part of an image based on other parts of the same image. This approach focuses on learning semantic features without relying on pre-defined invariances from hand-crafted data transformations, which can bias specific tasks, or on filling in pixel-level details, which often leads to less meaningful representations.

image
OLMo 2
image

The OLMo2 model is the successor of the OLMo model, which was proposed in OLMo: Accelerating the Science of Language Models.

The architectural changes from the original OLMo model to this model are:

  • RMSNorm is used instead of standard layer norm.
  • Norm is applied to attention queries and keys.
  • Norm is applied after attention/feedforward layers rather than before.

Commits:

Layer-Skip Llama

We add support for Meta's Layer-Skip Llama 3.2 1B model.

The Llama3.2 1B model was continually pretrained with LayerSkip recipe, early exit loss and layer dropout, as presented in Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding and is capable of performing self-speculative decoding: decode with earlier layers and verify with remaining layers.

image

Tensor Parallel implementation

This PR uses the torch.distributed.tensor.parallel subpackage to implement Tensor Parallel for Llama (as an example).

The motivation is multi-fold:

  1. to make modeling code simple as single-worker case:
    all manual TP implementations under if self.config.pretraining_tp > 1 can be removed.

  2. to make tensor parallelism easily accessible by users:
    added a model.tensor_parallel(device_mesh) method that allows users to turn a single-proc model into a parallel model. !- Please guide me to a right place to put this function/method if PreTrainedModel is not a preferred place. -!

This is the first PR of many to simplify and enable Tensor Parallel across models.

Farewell, Python 3.8

Python 3.8 reaches end of life, and, as such, we drop it from our CI.

GGUF improvements

Several improvements have been done to the GGUF support in transformers; notably by adding new architectures to the list of supported architectures.

Fast processors

We continue the work to improve the speed of fast processors as detailed in this roadmap.

We contribute a fast processor to RT-DETR.

New pipelines

A new pipeline has been added to transformers: image-text-to-text!

the pipeline support the following inputs:

  • unbatched images and text - images=image, text=text
  • batched images and text - images = [image, image], text= [text, text]
  • several images per prompt (only for models supporting the use of an image token) - images = [[image, image], [image]] or images=[image, image, image], text = ["... ......", "......"]
  • Chat templates (for models supporting them).
Notable refactors
Separate chat templates into a single file

We have had several issues with chat templates because they're stored as single lines in the JSON config files:

  • Impossible to review diffs
  • Very hard to edit in the web UI (or in general)
  • Differences between processor templates in chat_template.json and tokenizer templates in tokenizer_config.json causing confusion
  • Some models use multiple templates, requiring a template dict, but we're trying to discourage that in future and move those models to single templates with conditional behaviour instead

The solution:

  • Just move chat templates to a single chat_template.jinja file in the repo
  • If multiple templates are required, then they should still be stored in the JSON file. This is not supported for Processor classes, so processors should always be able to save their template as a raw Jinja file. In general, we'll be gently deprecating multiple templates in future.
  • If a chat_template.jinja file is present, it overrides the JSON files. If a tokenizer is loaded with both Jinja and JSON chat templates and resaved, it should save only the Jinja file, and not have any chat_template entry in tokenizer_config.json.

For now, we continue saving in the old format by default. I'll probably keep it this way for several versions before making the new format the default, to ensure that most users are able to load the new format before it becomes common. Until then, the new format should mostly be used for testing, to make sure it's ready for deployment when we do the switch.

Large modular logic refactor

This PR largely rework the logic we use in the modular converter. It is (hopefully) clearer and maintainable. Instead of going in all directions, adding stuff, then deleting it if not needed, we now do the following:

  • visit all the modular file (record imports/functions/classes/assignments nodes)
    • create function dependency mapping
  • for each import coming from another model:
    • visit the corresponding file
    • create function dependency mapping
    • update mapping with function/assignment from the modular (updated/new functions)
    • create the class dependency graph based on merged dependencies
  • update dependency graph of the modular with the functions and assignments imported from the other files
  • for each class recorded in the modular:
    • if inherithing from class in another file:
      • replace call to super
      • find the dependencies after the node was replaced
      • follow (updated with modular defs) dependency mapping to add all nodes
    • else:
      • only add needed imported functions (and their dependencies)
  • determine the needed imports and add them

Community bugfixes and improvements


Configuration

📅 Schedule: Branch creation - "on sunday" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot added the dependencies Pull requests that update a dependency file label Dec 15, 2024
@renovate renovate bot force-pushed the renovate/transformers-dep branch 3 times, most recently from 4cd4925 to c154b8f Compare December 15, 2024 09:20
@renovate renovate bot force-pushed the renovate/transformers-dep branch from c154b8f to 6857942 Compare December 15, 2024 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants