forked from huggingface/transformers
Commit

upgrade other-than-pytorch folders under examples/

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

1 parent c0d0614 · commit 781a215
Showing 248 changed files with 3,515 additions and 2,024 deletions.
@@ -0,0 +1,20 @@
# Using the `diff_converter` linter

`pip install libcst` is required.

Run `sh examples/diff-conversion/convert_examples.sh` to get the converted outputs.

The diff converter is a new linter specific to `transformers`. It allows us to unpack inheritance in Python and convert a modular `diff` file like `diff_gemma.py` into a single, self-contained model file.

Examples of possible usage are available in `examples/diff-conversion`; see `diff_gemma` for a full-model example. To convert a single file, run:

`python utils/diff_model_converter.py --files_to_parse "examples/diff-conversion/diff_my_new_model2.py"`
## How it works

We use the `libcst` parser to produce a concrete syntax tree (CST) representation of the `diff_xxx.py` file. For any imports made from `transformers.models.modeling_xxxx`, we parse the source code of that module and build a class dependency mapping, which lets us resolve the dependencies that the diff file relies on.

The code from the `diff` file and the class dependency mapping are "merged" to produce the single, self-contained model file. We then use `ruff` to automatically remove any duplicate imports.
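For intuition, here is a minimal sketch (not the converter's actual code) of how `libcst` can be used for the first step, collecting the `transformers.models.*` imports a diff file depends on; the `DiffImportCollector` name and the exact filtering are illustrative only:

```python
import libcst as cst


class DiffImportCollector(cst.CSTVisitor):
    """Record every `from transformers.models... import ...` statement."""

    def __init__(self):
        super().__init__()
        self.collected = []  # (module, imported name) pairs

    def visit_ImportFrom(self, node: cst.ImportFrom) -> None:
        if node.module is None or isinstance(node.names, cst.ImportStar):
            return
        # Render the dotted module path back to source, e.g.
        # "transformers.models.llama.modeling_llama".
        module = cst.Module(body=[]).code_for_node(node.module)
        if module.startswith("transformers.models"):
            for alias in node.names:
                name = cst.Module(body=[]).code_for_node(alias.name)
                self.collected.append((module, name))


with open("examples/diff-conversion/diff_dummy.py") as f:
    tree = cst.parse_module(f.read())

collector = DiffImportCollector()
tree.visit(collector)
print(collector.collected)  # [('transformers.models.llama.modeling_llama', 'LlamaModel')]
```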
## Why do we use `libcst` instead of the native AST?

The native AST is powerful, but it does not preserve comments, docstring formatting, or code layout. That is why we went with `libcst`.
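As a quick, standalone illustration (not part of the repo), round-tripping a snippet through the stdlib `ast` drops comments, while `libcst` reproduces the source exactly:

```python
import ast

import libcst as cst

src = "def f(x):  # keep this comment\n    return x  # and this one\n"

# The stdlib ast (unparse needs Python 3.9+) rebuilds code from the abstract
# tree, so comments and the original layout are gone.
print(ast.unparse(ast.parse(src)))

# libcst is a lossless concrete syntax tree: `.code` round-trips byte-for-byte.
assert cst.parse_module(src).code == src
```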
@@ -0,0 +1,10 @@
#!/bin/bash

# Iterate over each diff example file in examples/diff-conversion
for file in examples/diff-conversion/diff_*; do
    # Check if it's a regular file
    if [ -f "$file" ]; then
        # Call the Python converter with the file name as an argument
        python utils/diff_model_converter.py --files_to_parse "$file"
    fi
done
@@ -0,0 +1,44 @@
from math import log
from typing import List, Optional, Tuple, Union

import torch

from transformers import Cache
from transformers.modeling_outputs import CausalLMOutputWithPast
from transformers.models.llama.modeling_llama import LlamaModel


def _pre_process_input(input_ids):
    # Dummy pre-processing step; it uses `log` so the converted file must keep
    # both this helper and the extra `math` import.
    print(log(input_ids))
    return input_ids


# Example where we need some extra dependencies and helper functions
class DummyModel(LlamaModel):
    def forward(
        self,
        input_ids: torch.LongTensor = None,
        attention_mask: Optional[torch.Tensor] = None,
        position_ids: Optional[torch.LongTensor] = None,
        past_key_values: Optional[Union[Cache, List[torch.FloatTensor]]] = None,
        inputs_embeds: Optional[torch.FloatTensor] = None,
        use_cache: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
        cache_position: Optional[torch.LongTensor] = None,
    ) -> Union[Tuple, CausalLMOutputWithPast]:
        input_ids = _pre_process_input(input_ids)

        return super().forward(
            input_ids,
            attention_mask,
            position_ids,
            past_key_values,
            inputs_embeds,
            use_cache,
            output_attentions,
            output_hidden_states,
            return_dict,
            cache_position,
        )
@@ -0,0 +1,14 @@
from transformers.models.llama.configuration_llama import LlamaConfig


# Example where we only want to add a new config argument and document the new arg.
# There is no `Args:` section here, so the parent's argument docs are reused.
class MyNewModelConfig(LlamaConfig):
    r"""
    mlp_bias (`bool`, *optional*, defaults to `True`)
    """

    def __init__(self, mlp_bias=True, new_param=0, **super_kwargs):
        self.mlp_bias = mlp_bias
        self.new_param = new_param
        super().__init__(**super_kwargs)
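As a quick, hypothetical sanity check (not part of the commit), the diff class above can be exercised directly, assuming `transformers` is installed and the snippet is run from `examples/diff-conversion` so the import resolves:

```python
# Hypothetical check of the new arguments added on top of LlamaConfig.
from diff_my_new_model import MyNewModelConfig

config = MyNewModelConfig(mlp_bias=False, new_param=4, vocab_size=1000)
print(config.mlp_bias, config.new_param, config.vocab_size)  # False 4 1000
```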
@@ -0,0 +1,31 @@
from transformers.models.gemma.modeling_gemma import GemmaForSequenceClassification
from transformers.models.llama.configuration_llama import LlamaConfig


# Example where we only want to modify the docstring
class MyNewModel2Config(LlamaConfig):
    r"""
    This is the configuration class to store the configuration of a [`GemmaModel`]. It is used to instantiate a Gemma
    model according to the specified arguments, defining the model architecture. Instantiating a configuration with the
    defaults will yield a similar configuration to that of the Gemma-7B.
    e.g. [google/gemma-7b](https://huggingface.co/google/gemma-7b)
    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
    documentation from [`PretrainedConfig`] for more information.
    Args:
        vocab_size (`int`, *optional*, defaults to 256000):
            Vocabulary size of the Gemma model. Defines the number of different tokens that can be represented by the
            `input_ids` passed when calling [`GemmaModel`]
    ```python
    >>> from transformers import GemmaModel, GemmaConfig
    >>> # Initializing a Gemma gemma-7b style configuration
    >>> configuration = GemmaConfig()
    >>> # Initializing a model from the gemma-7b style configuration
    >>> model = GemmaModel(configuration)
    >>> # Accessing the model configuration
    >>> configuration = model.config
    ```"""


# Example where all the dependencies are fetched just to copy the entire class
class MyNewModel2ForSequenceClassification(GemmaForSequenceClassification):
    pass
@@ -0,0 +1,30 @@
# Example where we only want to overwrite the defaults of an init

from transformers.models.gemma.configuration_gemma import GemmaConfig


class NewModelConfig(GemmaConfig):
    def __init__(
        self,
        vocab_size=256030,
        hidden_size=64,
        intermediate_size=90,
        num_hidden_layers=28,
        num_attention_heads=16,
        num_key_value_heads=16,
        head_dim=256,
        hidden_act="gelu_pytorch_tanh",
        hidden_activation=None,
        max_position_embeddings=1500,
        initializer_range=0.02,
        rms_norm_eps=1e-6,
        use_cache=True,
        pad_token_id=0,
        eos_token_id=1,
        bos_token_id=2,
        tie_word_embeddings=True,
        rope_theta=10000.0,
        attention_bias=False,
        attention_dropout=0.0,
    ):
        # Forward the overridden defaults to GemmaConfig
        super().__init__(
            vocab_size=vocab_size,
            hidden_size=hidden_size,
            intermediate_size=intermediate_size,
            num_hidden_layers=num_hidden_layers,
            num_attention_heads=num_attention_heads,
            num_key_value_heads=num_key_value_heads,
            head_dim=head_dim,
            hidden_act=hidden_act,
            hidden_activation=hidden_activation,
            max_position_embeddings=max_position_embeddings,
            initializer_range=initializer_range,
            rms_norm_eps=rms_norm_eps,
            use_cache=use_cache,
            pad_token_id=pad_token_id,
            eos_token_id=eos_token_id,
            bos_token_id=bos_token_id,
            tie_word_embeddings=tie_word_embeddings,
            rope_theta=rope_theta,
            attention_bias=attention_bias,
            attention_dropout=attention_dropout,
        )
@@ -0,0 +1,38 @@
from typing import List, Optional, Tuple, Union

import torch

from transformers import Cache
from transformers.modeling_outputs import CausalLMOutputWithPast
from transformers.models.llama.modeling_llama import LlamaModel


# Example where we call the parent forward and post-process its output
class SuperModel(LlamaModel):
    def forward(
        self,
        input_ids: torch.LongTensor = None,
        attention_mask: Optional[torch.Tensor] = None,
        position_ids: Optional[torch.LongTensor] = None,
        past_key_values: Optional[Union[Cache, List[torch.FloatTensor]]] = None,
        inputs_embeds: Optional[torch.FloatTensor] = None,
        use_cache: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
        cache_position: Optional[torch.LongTensor] = None,
    ) -> Union[Tuple, CausalLMOutputWithPast]:
        out = super().forward(
            input_ids,
            attention_mask,
            position_ids,
            past_key_values,
            inputs_embeds,
            use_cache,
            output_attentions,
            output_hidden_states,
            return_dict,
            cache_position,
        )
        # Post-process the parent output
        out.logits *= 2**4
        return out
@@ -1,8 +1,10 @@
-datasets >= 1.1.3
-pytest
+datasets >= 1.13.3
+pytest<8.0.1
 conllu
 nltk
 rouge-score
 seqeval
 tensorboard
-evaluate >= 0.2.0
+evaluate >= 0.2.0
+torch
+accelerate