Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extend quick list creation options #286

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

TDulka
Copy link

@TDulka TDulka commented Sep 12, 2024

Description

Extends get_neuronpedia_quick_list with more options:

  • passing a description and text to test across features
  • passing a description of the sae to display instead of having to pass the whole SAE (sometimes you don't want to load the full SAE to just get a link to neuronpedia)
  • giving more flexibility in what features you pick, should be able to make a list with features from different models and/or SAEs and provide a description for each feature

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

You have tested formatting, typing and unit tests (acceptance tests not currently in use)

  • I have run make check-ci to check format and linting. (you can run make format to format code if needed.)

@TDulka
Copy link
Author

TDulka commented Sep 12, 2024

make format wanted to change formatting of lots of old jupyter notebooks, I let it do it since it's in the checklist, but not sure what's your preferred approach. So some of the changed files are just automatically changed, my edits are only in neuronpedia_integration and test_neuronpedia_integration

Copy link
Collaborator

@chanind chanind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is adding a lot of formatting changes to tutorial notebooks which don't seem to have any functionality change. Would it be possible to remove the notebooks from the PR that only have formatting changes?

feature_index: int
description: str
model_name: Optional[str]
neuronpedia_id: Optional[str]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be default None, otherwise users will have to provide model_name = None explicitly

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed!

if sae_id is None:
print(
if isinstance(sae, SAE) and neuronpedia_id is None:
raise ValueError(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

mock_open.assert_called_once_with(expected_url)

# Reset mock
mock_open.reset_mock()
Copy link
Collaborator

@chanind chanind Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are great tests! minor, but this is best split into two separate tests, one to test get_neuronpedia_quick_list with int lists of features, one to test with detailed FeatureInfo. Names could be, e.g. test_get_neuronpedia_quick_list_works_with_int_features and test_get_neuronpedia_quick_list_works_with_detailed_feature_info or something similar

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense, I split it into two.

@chanind
Copy link
Collaborator

chanind commented Sep 12, 2024

We should probably exclude .ipynb files from make format to avoid this issue, either that or explicitly check notebook formatting in CI. Sorry for that! I'll open an issue for this

@hijohnnylin
Copy link
Collaborator

thanks for writing this!
question: these changes are being drafted in preparation of an API update on Neuronpedia's side, correct? eg it's blocked on my side to implement the API support for this first.

@TDulka
Copy link
Author

TDulka commented Sep 16, 2024

@hijohnnylin Yes, this was indeed drafted in preparation for API update on Neuronpedia side, to give a sense how the query parameters can look like. Let me know if something doesn't make sense.

@TDulka
Copy link
Author

TDulka commented Sep 16, 2024

I removed the .ipynb formatting changes, they were indeed not appropriate for this PR.

@jbloomAus
Copy link
Owner

Will attempt to make API support on our backend servers for this (search for a specific group of features) which I think will enable the api this will hit to be written / work efficiently :)

@hijohnnylin
Copy link
Collaborator

Hey, while implementing the quick list with test text on the server side, I realized it's a bit difficult to cache these quick list activations, because the quick lists aren't stored in the database, nor are the activations that would be created. And we really do want to cache these activations, because we don't want to hit our GPUs every time that someone loads the page.

Proposal: Use the (non-quick) Lists API to create these lists. The server-side already exists, though it was missing proper documentation. The documentation was created today.

The difference is:

  • POST call instead of GET call to create the list
    • the POST call will return a permanent link to the list that anyone can access
  • You need a Neuronpedia API key, which you can get from "Settings" after logging in.

The documentation for doing this: https://www.neuronpedia.org/api-doc#tag/lists/POST/api/list/new-with-features
(the rest of the List docs are here: https://www.neuronpedia.org/api-doc#tag/lists)

Example code:

import requests

url = "https://www.neuronpedia.org/api/list/new-with-features"

payload = {
    "name": "Twitter features",
    "description": "This is a twitter list",
    "testText": "something about twitter accounts and Tweets. Follow them at @abc",
    "features": [
        {
            "modelId": "gpt2-small",
            "layer": "9-res-jb",
            "index": 1124
        },
        {
            "modelId": "gpt2-small",
            "layer": "2-att-kk",
            "index": 10116
        }
    ]
}
headers = {
    "Content-Type": "application/json",
    "X-Api-Key": "YOUR_API_KEY"
}

response = requests.post(url, json=payload, headers=headers)

print(response.json())

@TDulka @jbloomAus thoughts on this?

My apologies for not raising this earlier!

@TDulka
Copy link
Author

TDulka commented Nov 4, 2024

@hijohnnylin I'm not too familiar with the codebase, it seems from what I see that the advantage of these quick lists is that they don't rely on working with authentication and they are nice for just getting a quick view of a couple of features next to each other. I understand the point about the activations and caching making it complicated to include the test text in that. It seems in that case that it might not be appropriate to have the test text be a part of the quick list and if we'd like to have the test text it would be better to make another method which would incorporate authentication and use the api you mentioned.

My proposal would be to remove the test text from this PR to make it work without authentication and hitting the GPUs (if I understand it correctly). I think it could still be worth merging the rest to close this. We can leave the test text for another PR if someone still needs that feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants