[Bug]: Getting error : AttributeError: RobertaTokenizerFast has no attribute _bos_token #3582

Closed
stsfaroz opened this issue Dec 10, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@stsfaroz

Describe the bug

When I try to run zero-shot NER with TARSTagger, loading the model fails with the error shown below.

I followed this tutorial: https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_10_TRAINING_ZERO_SHOT_MODEL.md

To Reproduce

from flair.models import TARSTagger
from flair.data import Sentence

# 1. Load zero-shot NER tagger
tars = TARSTagger.load('tars-ner')

# 2. Prepare some test sentences
sentences = [
    Sentence("The Humboldt University of Berlin is situated near the Spree in Berlin, Germany"),
    Sentence("Bayern Munich played against Real Madrid"),
    Sentence("I flew with an Airbus A380 to Peru to pick up my Porsche Cayenne"),
    Sentence("Game of Thrones is my favorite series"),
]

# 3. Define some classes of named entities such as "soccer teams", "TV shows" and "rivers"
labels = ["Soccer Team", "University", "Vehicle", "River", "City", "Country", "Person", "Movie", "TV Show"]
tars.add_and_switch_to_new_task('task 1', labels, label_type='ner')

# 4. Predict for these classes and print results
for sentence in sentences:
    tars.predict(sentence)
    print(sentence.to_tagged_string("ner"))

Expected behavior

Sentence: "The Humboldt University of Berlin is situated near the Spree in Berlin , Germany" → ["Humboldt University of Berlin"/University, "Spree"/River, "Berlin"/City, "Germany"/Country]

Sentence: "Bayern Munich played against Real Madrid" → ["Bayern Munich"/Soccer Team, "Real Madrid"/Soccer Team]

Sentence: "I flew with an Airbus A380 to Peru to pick up my Porsche Cayenne" → ["Airbus A380"/Vehicle, "Peru"/Country, "Porsche Cayenne"/Vehicle]

Sentence: "Game of Thrones is my favorite series" → ["Game of Thrones"/TV Show]

Logs and Stack traces

2024-12-10 10:03:03,799 SequenceTagger predicts: Dictionary with 5 tags: O, S-entity, B-entity, E-entity, I-entity
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 5
      2 from flair.data import Sentence
      4 # 1. Load zero-shot NER tagger
----> 5 tars = TARSTagger.load('tars-ner')
      7 # 2. Prepare some test sentences
      8 sentences = [
      9     Sentence("The Humboldt University of Berlin is situated near the Spree in Berlin, Germany"),
     10     Sentence("Bayern Munich played against Real Madrid"),
     11     Sentence("I flew with an Airbus A380 to Peru to pick up my Porsche Cayenne"),
     12     Sentence("Game of Thrones is my favorite series"),
     13 ]

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/models/tars_model.py:654, in TARSTagger.load(cls, model_path)
    650 @classmethod
    651 def load(cls, model_path: Union[str, Path, Dict[str, Any]]) -> "TARSTagger":
    652     from typing import cast
--> 654     return cast("TARSTagger", super().load(model_path=model_path))

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/models/tars_model.py:320, in FewshotClassifier.load(cls, model_path)
    316 @classmethod
    317 def load(cls, model_path: Union[str, Path, Dict[str, Any]]) -> "FewshotClassifier":
    318     from typing import cast
--> 320     return cast("FewshotClassifier", super().load(model_path=model_path))

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/nn/model.py:564, in Classifier.load(cls, model_path)
    560 @classmethod
    561 def load(cls, model_path: Union[str, Path, Dict[str, Any]]) -> "Classifier":
    562     from typing import cast
--> 564     return cast("Classifier", super().load(model_path=model_path))

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/nn/model.py:197, in Model.load(cls, model_path)
    194 if "__cls__" in state:
    195     state.pop("__cls__")
--> 197 model = cls._init_model_with_state_dict(state)
    199 if "model_card" in state:
    200     model.model_card = state["model_card"]

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/models/tars_model.py:454, in TARSTagger._init_model_with_state_dict(cls, state, **kwargs)
    452     tars_embeddings = tars_model.embeddings
    453 # init new TARS classifier
--> 454 model = super()._init_model_with_state_dict(
    455     state,
    456     task_name=state.get("current_task"),
    457     label_dictionary=state.get("tag_dictionary"),
    458     label_type=state.get("tag_type"),
    459     embeddings=tars_embeddings,
    460     num_negative_labels_to_sample=state.get("num_negative_labels_to_sample"),
    461     prefix=state.get("prefix"),
    462     **kwargs,
    463 )
    464 # set all task information
    465 model._task_specific_attributes = state["task_specific_attributes"]

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/nn/model.py:103, in Model._init_model_with_state_dict(cls, state, **kwargs)
    100         embeddings = load_embeddings(embeddings)
    101     kwargs["embeddings"] = embeddings
--> 103 model = cls(**kwargs)
    105 model.load_state_dict(state["state_dict"])
    107 return model

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/flair/models/tars_model.py:386, in TARSTagger.__init__(self, task_name, label_dictionary, label_type, embeddings, num_negative_labels_to_sample, prefix, **tagger_args)
    384 # transformer separator
    385 self.separator = str(self.tars_embeddings.tokenizer.sep_token)
--> 386 if self.tars_embeddings.tokenizer._bos_token:
    387     self.separator += str(self.tars_embeddings.tokenizer.bos_token)
    389 self.prefix = prefix

File ~/miniconda3/envs/mner/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:1104, in SpecialTokensMixin.__getattr__(self, key)
   1101         return self.convert_tokens_to_ids(attr_as_tokens) if attr_as_tokens is not None else None
   1103 if key not in self.__dict__:
-> 1104     raise AttributeError(f"{self.__class__.__name__} has no attribute {key}")
   1105 else:
   1106     return super().__getattr__(key)

AttributeError: RobertaTokenizerFast has no attribute _bos_token

Screenshots

No response

Additional Context

No response

Environment

Versions:

Flair: 0.14.0
PyTorch: 2.5.1+cu124
Transformers: 4.47.0
GPU: True

stsfaroz added the bug label Dec 10, 2024
@helpmefindaname
Collaborator

Hi @stsfaroz,
this is caused by internal changes in the newest version of the transformers library. It is already fixed on master, so you can either install flair from the master branch or use transformers<4.47.0 until a new version of flair is released.
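
For context, the traceback shows flair 0.14.0 reading the tokenizer's private `_bos_token` attribute, which the tokenizer no longer has under transformers 4.47.0. A minimal sketch of the same separator logic that relies only on the public `sep_token` and `bos_token` attributes might look like the following. `StubTokenizer` and `build_separator` are hypothetical names used for illustration only, and this is not necessarily how the fix on master is written.

```python
from typing import Optional


class StubTokenizer:
    """Hypothetical stand-in for a Hugging Face tokenizer (not the real class).

    It exposes only the public special-token attributes and, like the
    traceback in this issue, raises AttributeError for unknown names
    such as the private `_bos_token`.
    """

    def __init__(self, sep_token: str, bos_token: Optional[str] = None):
        self.sep_token = sep_token
        self.bos_token = bos_token

    def __getattr__(self, key: str):
        # Python calls this only when normal attribute lookup fails.
        raise AttributeError(f"{self.__class__.__name__} has no attribute {key}")


def build_separator(tokenizer) -> str:
    """Build the separator string from public tokenizer attributes only."""
    separator = str(tokenizer.sep_token)
    # getattr with a default tolerates tokenizers that define no BOS token.
    bos_token = getattr(tokenizer, "bos_token", None)
    if bos_token:
        separator += str(bos_token)
    return separator


# RoBERTa-style special tokens, assumed here for the example.
roberta_like = StubTokenizer(sep_token="</s>", bos_token="<s>")
print(build_separator(roberta_like))  # </s><s>
```

Until a flair release containing the fix is available, the two options above would typically translate to `pip install "transformers<4.47.0"`, or to installing flair from its GitHub repository, for example with `pip install git+https://github.com/flairNLP/flair`.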
