
Add skip_verbose_naming in add_hook to give an option for skipping the naming #635

Open · wants to merge 21 commits into `main`
Conversation

verlocks

Description

As proposed in the linked issue (#631), running `full_hook.__name__ = hook.__repr__()` in `add_hook` can cause significant overhead in some situations. This PR adds a new parameter, `skip_verbose_naming`, that lets users skip this line.

Fixes #631
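
For context, here is a minimal sketch of where the guard would sit, assuming a simplified `add_hook`; this only illustrates the PR description, and the real method in TransformerLens has a larger signature and more bookkeeping.

```python
# Illustrative sketch, not the actual TransformerLens source: a simplified
# HookPoint.add_hook showing where the skip_verbose_naming guard goes.
def add_hook(self, hook, dir="fwd", skip_verbose_naming=False):
    def full_hook(module, module_input, module_output):
        return hook(module_output, hook=self)

    if not skip_verbose_naming:
        # hook.__repr__() only produces a nicer debug name for the wrapper,
        # but it can be expensive when hooks are registered in a hot loop.
        full_hook.__name__ = hook.__repr__()

    if dir == "fwd":
        handle = self.register_forward_hook(full_hook)
        self.fwd_hooks.append(handle)
```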

Type of change


  • New feature (non-breaking change which adds functionality)

Screenshots

The overhead caused by naming is removed by passing `skip_verbose_naming=True`, e.g. `with model.hooks(fwd_hooks=fwd_hooks, skip_verbose_naming=True):`; a usage sketch follows the screenshots below.

Before: [profiling screenshot showing the naming overhead]
After: [profiling screenshot with the overhead removed]
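
A minimal usage sketch, assuming `skip_verbose_naming` is threaded through `HookedTransformer.hooks()` as described above; the model name and hook point are placeholders.

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # placeholder model
tokens = model.to_tokens("Hello world")

cache = {}
def save_hook(activation, hook):
    # Standard TransformerLens hook signature: (activation, hook).
    cache[hook.name] = activation.detach()

fwd_hooks = [("blocks.0.hook_resid_post", save_hook)]

# skip_verbose_naming=True avoids one hook.__repr__() call per registered
# hook, which adds up when hooks are re-registered on every forward pass.
with model.hooks(fwd_hooks=fwd_hooks, skip_verbose_naming=True):
    logits = model(tokens)
```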

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

There is a problem with the current unit test, shown here: [screenshot of the failing test]. After my changes, `add_hook` is called with a different set of parameters, so the test's expectation no longer matches. Should I do anything about it? (A sketch of one possible fix follows.)
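
If the failing test mocks `add_hook` and asserts on its call arguments, the usual fix is to update the expected call to include the new keyword's default (or match it with `mock.ANY`). A hedged sketch; the actual test in the repo may be structured differently:

```python
from unittest import mock
from transformer_lens.hook_points import HookPoint

def test_hooks_passes_new_default(model):  # `model` is a hypothetical fixture
    hook_fn = lambda activation, hook: activation
    with mock.patch.object(HookPoint, "add_hook") as mocked_add_hook:
        with model.hooks(fwd_hooks=[("blocks.0.hook_resid_post", hook_fn)]):
            pass
    # Pre-PR, the expected call had no skip_verbose_naming; after the
    # signature change, the assertion must account for the new keyword.
    mocked_add_hook.assert_called_with(hook_fn, dir="fwd", skip_verbose_naming=False)
```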

verlocks and others added 21 commits June 11, 2024 10:32
* created colab test notebook

* tested more notebooks

* tested more models

* tested some more models

* tested some more models

* added more tests

* updated model name to be inline with hf names

* updated config check

* updated name check

* finished colab demo

* added demo to ci
* add mistral nemo model

* fix bug

* fix bug

* fix: bug nemo model has a defined d_head

* update colab notebook

* update colab notebook

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
…ransformerLensOrg#758)

* added new block for recent diagram, and colab compatibility notebook

* updated check

* ran format

* added better checking for prop

* fixed format
…merLensOrg#757)

* Generating a warning and stopping execution for incorrect use of the T5 model

* formatting the file using black
…B) (TransformerLensOrg#761)

* Add configurations for Llama 3.1 models(Llama-3.1-8B and Llama-3.1-70B)

* formatted using black

* added models to colab compatibility notebook

* added configurations for Llama-3.1-8B-Instruct and Llama-3.1-70B-Instruct

* cleaned up config a bit

* updated compatibility notebook

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Curt Tigges <curttigges@Curts-MacBook-Pro.local>
…e for Llama-3.1 series (TransformerLensOrg#764)

* Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1-8B

* Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series

* Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series

* Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series

* fix import order

* fix black check

* fix rope settings also for 3.2 models

* fix rope settings also for llama-3 models

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
…erLensOrg#775)

* fix prepend_bos to False by default for bloom model family

* add comment

* edit documentation

* fix wrong expected value for bloom-560m model loss

* fix expected loss value for bloom model computed with google colab

* set prepend_bos to user value, then to value in model config and then default to true

* fix format

* remove log points in test_hooked_transformer

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
…mily produce weird outputs. (TransformerLensOrg#777)

* Fix kv_cache leads to wrong output when used with bloom models

* add test for bloom models when use_past_kv_cache is set to true

* fix max_length for huggingface model in kv_cache test

* set max_length to 13 for huggingface model in kv_cache test

* use max_new_tokens for huggingface model instead of max_length in kv_cache test

* fix format

---------

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
* added typeguard dependency

* updated lock file

* Revert "updated lock file"

This reverts commit 9d65d65.

* fixed lock file
…g#783)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
* call functions on model object instead of model string in run_encoder_decoder_set

* remove generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet

* add testing function for HookedEncoders and stop testing BERT as HookedTransformer

* clear cell output to prevent test from failing

* add comment about bert working with free version of colab

---------

Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
…ion (TransformerLensOrg#781)

Co-authored-by: Bryce Meyer <bryce13950@gmail.com>
Co-authored-by: Fabian Degen <fabian.degen@mytum.de>
Linked issue (may be closed by this PR): [Proposal] Remove the overhead caused by `full_hook.__name__ = (hook.__repr__())`?