Add skip_verbose_naming in add_hook to give an option for skipping the naming #635
Open · verlocks wants to merge 21 commits into TransformerLensOrg:main from verlocks:verlock-dev
Conversation
- …gram add transformer diagram
- created colab test notebook; tested more notebooks; tested more models; tested some more models; tested some more models; added more tests; updated model name to be in line with HF names; updated config check; updated name check; finished colab demo; added demo to CI
- add mistral nemo model; fix bug; fix bug; fix bug: nemo model has a defined d_head; update colab notebook; update colab notebook (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>)
- Upstream update
- …ransformerLensOrg#758: added new block for recent diagram and colab compatibility notebook; updated check; ran format; added better checking for prop; fixed format
- …merLensOrg#757: generating a warning and stopping execution for incorrect use of the T5 model; formatting the file using black
- …B) (TransformerLensOrg#761): add configurations for Llama 3.1 models (Llama-3.1-8B and Llama-3.1-70B); formatted using black; added models to colab compatibility notebook; added configurations for Llama-3.1-8B-Instruct and Llama-3.1-70B-Instruct; cleaned up config a bit; updated compatibility notebook (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>)
- Co-authored-by: Bryce Meyer <bryce13950@gmail.com>; Co-authored-by: Curt Tigges <curttigges@Curts-MacBook-Pro.local>
- …e for Llama-3.1 series (TransformerLensOrg#764): add support for NTK-by-Part Rotary Embedding and set the correct rotary base for Llama-3.1-8B and then the full Llama-3.1 series; fix import order; fix black check; fix rope settings also for 3.2 models; fix rope settings also for llama-3 models (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>)
- TransformerLensOrg#772 (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>)
- …erLensOrg#775: fix prepend_bos to False by default for the bloom model family; add comment; edit documentation; fix wrong expected value for bloom-560m model loss; fix expected loss value for bloom model computed with Google Colab; set prepend_bos to the user value, then to the value in the model config, then default to True; fix format; remove log points in test_hooked_transformer (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>, Fabian Degen <fabian.degen@mytum.de>)
- …mily produce weird outputs (TransformerLensOrg#777): fix kv_cache leading to wrong output when used with bloom models; add test for bloom models when use_past_kv_cache is set to True; fix max_length for the huggingface model in the kv_cache test; set max_length to 13 for the huggingface model; use max_new_tokens for the huggingface model instead of max_length; fix format (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>, Fabian Degen <fabian.degen@mytum.de>)
- added typeguard dependency; updated lock file; Revert "updated lock file" (reverts commit 9d65d65); fixed lock file
- …g#783 (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>, Fabian Degen <fabian.degen@mytum.de>)
- call functions on the model object instead of the model string in run_encoder_decoder_set; remove generate call in run_encoder_decoder_set because HookedEncoderDecoder doesn't support generate yet; add testing function for HookedEncoders and stop testing BERT as HookedTransformer; clear cell output to prevent test from failing; add comment about BERT working with the free version of Colab (Co-authored-by: Fabian Degen <fabian.degen@mytum.de>)
- …ion (TransformerLensOrg#781) (Co-authored-by: Bryce Meyer <bryce13950@gmail.com>, Fabian Degen <fabian.degen@mytum.de>)
Description
As proposed in the linked issue, running

full_hook.__name__ = hook.__repr__()

in add_hook can cause large overhead in some situations. This PR adds a new parameter, skip_verbose_naming, that allows users to skip this line. Fixes #631
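For context, a minimal sketch of what the change looks like, assuming a simplified shape of TransformerLens's HookPoint.add_hook; the real method also handles backward hooks, permanent hooks, and more, and the PR's exact wiring may differ:

```python
import torch.nn as nn

class HookPoint(nn.Module):
    """Stripped-down stand-in for TransformerLens's HookPoint (illustration only)."""

    def add_hook(self, hook, dir="fwd", skip_verbose_naming=False):
        def full_hook(module, module_input, module_output):
            return hook(module_output, hook=self)

        if not skip_verbose_naming:
            # hook.__repr__() can be expensive, e.g. for a functools.partial
            # that closes over a large tensor, since repr() stringifies the
            # captured arguments; skipping this assignment avoids that cost.
            full_hook.__name__ = hook.__repr__()

        if dir == "fwd":
            self.register_forward_hook(full_hook)

    def forward(self, x):
        return x
```

Defaulting skip_verbose_naming to False keeps the existing behavior, so only callers who opt in skip the naming.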
Screenshots
The overhead caused by naming is removed by adding skip_verbose_naming=True, as in with model.hooks(fwd_hooks=fwd_hooks, skip_verbose_naming=True):.
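For illustration, a hedged usage sketch of the context-manager form (it requires this PR's branch; the hook name blocks.0.hook_resid_pre and the large-tensor partial are placeholder choices meant to mimic the kind of overhead described in #631):

```python
from functools import partial

import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")

# repr() of a functools.partial stringifies its captured arguments, so a
# hook closing over a large tensor makes hook.__repr__() costly.
big_patch = torch.zeros(4096, 4096)

def patch_hook(activation, hook, patch):
    return activation  # no-op body; only the naming overhead matters here

fwd_hooks = [("blocks.0.hook_resid_pre", partial(patch_hook, patch=big_patch))]

# With skip_verbose_naming=True, the full_hook.__name__ = hook.__repr__()
# assignment is skipped, so adding the hooks no longer pays for repr()-ing
# the partial.
with model.hooks(fwd_hooks=fwd_hooks, skip_verbose_naming=True):
    model("Hello world")
```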
Checklist:
There is a problem with the current unit tests here: after my changes, the parameters of add_hook are different. Should I do anything about it?
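For what it's worth, one possible direction (assuming the affected tests just exercise add_hook's behavior; the test name and assertions below are hypothetical) would be to parametrize them over both settings so the hooks are verified to fire either way:

```python
import pytest
from transformer_lens import HookedTransformer

@pytest.fixture(scope="module")
def model():
    return HookedTransformer.from_pretrained("gpt2")

@pytest.mark.parametrize("skip_verbose_naming", [False, True])
def test_hooks_fire_with_and_without_naming(model, skip_verbose_naming):
    calls = []

    def counting_hook(activation, hook):
        calls.append(hook.name)
        return activation

    with model.hooks(
        fwd_hooks=[("blocks.0.hook_resid_pre", counting_hook)],
        skip_verbose_naming=skip_verbose_naming,
    ):
        model("Hello")

    # The hook should fire regardless of whether verbose naming is skipped.
    assert calls == ["blocks.0.hook_resid_pre"]
```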