Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Som plugin #1269

Merged
merged 28 commits into from
Oct 9, 2023
Merged

Som plugin #1269

merged 28 commits into from
Oct 9, 2023

Conversation

LuisSanchez25
Copy link
Contributor

@LuisSanchez25 LuisSanchez25 commented Sep 24, 2023

Before you submit this PR: make sure to put all operations-related information in a wiki-note, a PR should be about code and is publicly accessible

What does the code in this PR do / what does it improve?

Creates a new peaklet classification plugin that uses SOM to refine our current peaklet classification algorithm.

Can you briefly describe how it works?

We first classify the data using out traditional peaklet classification algorithm, then all the data that gets classified as either an S1 or an S2 are passed to the SOM where the new data can be reclassified as type 0, 1 or 2, corresponding to the typical designation of junk, S1 and S2.

We also create a new context xenonnt_som which uses this pluggin instead of the usual peaklet classification plugin.

Can you give a minimal working example (or illustrate with a figure)?

Please include the following if applicable:

  • Update the docstring(s)
  • Update the documentation
  • Tests to check the (new) code is working as desired.
  • Does it solve one of the open issues on github?

Notes on testing

  • Until the automated tests pass, please mark the PR as a draft.
  • On the XENONnT fork we test with database access, on private forks there is no database access for security considerations.

All italic comments can be removed from this template.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit

pep8

straxen/plugins/peaklets/peaklet_classification_som.py|103 col 6| N806 variable 'SOM_ydim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|103 col 6| N806 variable 'SOM_zdim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|104 col 5| WPS359 Found an iterable unpacking to list
straxen/plugins/peaklets/peaklet_classification_som.py|104 col 6| N806 variable 'IMG_xdim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|104 col 6| N806 variable 'IMG_ydim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|104 col 6| N806 variable 'IMG_zdim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|105 col 5| WPS221 Found line with high Jones Complexity: 16 > 14
straxen/plugins/peaklets/peaklet_classification_som.py|107 col 101| E501 line too long (125 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|108 col 101| E501 line too long (125 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|116 col 6| N806 variable 'SOM_cls_array' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|117 col 5| WPS362 Found assignment to a subscript slice
straxen/plugins/peaklets/peaklet_classification_som.py|119 col 6| N806 variable 'data_with_SOM_cls' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|121 col 101| E501 line too long (111 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|125 col 1| D103 Missing docstring in public function
straxen/plugins/peaklets/peaklet_classification_som.py|128 col 9| WPS221 Found line with high Jones Complexity: 16 > 14
straxen/plugins/peaklets/peaklet_classification_som.py|131 col 13| WPS221 Found line with high Jones Complexity: 15 > 14
straxen/plugins/peaklets/peaklet_classification_som.py|135 col 1| D103 Missing docstring in public function
straxen/plugins/peaklets/peaklet_classification_som.py|135 col 6| N802 function name 'SOM_cls_recall' should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|135 col 36| N803 argument name 'data_in_SOM_fmt' should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|136 col 5| WPS359 Found an iterable unpacking to list
straxen/plugins/peaklets/peaklet_classification_som.py|136 col 6| N806 variable 'SOM_xdim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|136 col 6| N806 variable 'SOM_ydim' in function should be lowercase
straxen/plugins/peaklets/peaklet_classification_som.py|138 col 5| WPS221 Found line with high Jones Complexity: 17 > 14
straxen/plugins/peaklets/peaklet_classification_som.py|138 col 101| E501 line too long (106 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|149 col 1| RST301 Unexpected indentation.
straxen/plugins/peaklets/peaklet_classification_som.py|161 col 1| E800 Found commented out code
straxen/plugins/peaklets/peaklet_classification_som.py|161 col 5| E265 block comment should start with '# '
straxen/plugins/peaklets/peaklet_classification_som.py|161 col 101| E501 line too long (125 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|172 col 101| E501 line too long (109 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|176 col 1| E800 Found commented out code
straxen/plugins/peaklets/peaklet_classification_som.py|182 col 5| WPS221 Found line with high Jones Complexity: 20 > 14
straxen/plugins/peaklets/peaklet_classification_som.py|182 col 101| E501 line too long (105 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|186 col 101| E501 line too long (119 > 100 characters)
straxen/plugins/peaklets/peaklet_classification_som.py|200 col 10| WPS459 Found comparison with float or complex number
straxen/plugins/peaklets/peaklet_classification_som.py|200 col 17| WPS358 Found a float zero (0.0)
straxen/plugins/peaklets/peaklet_classification_som.py|200 col 24| WPS358 Found a float zero (0.0)
straxen/plugins/peaklets/peaklet_classification_som.py|238 col 9| WPS362 Found assignment to a subscript slice
straxen/plugins/peaklets/peaklet_classification_som.py|239 col 9| WPS362 Found assignment to a subscript slice
straxen/plugins/peaklets/peaklet_classification_som.py|240 col 9| WPS362 Found assignment to a subscript slice
straxen/plugins/peaklets/peaklet_classification_som.py|241 col 9| WPS362 Found assignment to a subscript slice
straxen/plugins/peaklets/peaklet_classification_som.py|242 col 9| WPS362 Found assignment to a subscript slice

straxen/plugins/peaklets/__init__.py Outdated Show resolved Hide resolved
straxen/plugins/peaklets/__init__.py Outdated Show resolved Hide resolved
straxen/plugins/peaklets/__init__.py Outdated Show resolved Hide resolved
straxen/plugins/peaklets/__init__.py Outdated Show resolved Hide resolved
result['length'] = peaklets['length']
result['dt'] = peaklets['dt']
result['type'][mask_non_zero] = strax_type
#result['som_type'][mask_non_zero] = som_type + 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
E265 block comment should start with '# '

result['dt'] = peaklets['dt']
result['type'][mask_non_zero] = strax_type
#result['som_type'][mask_non_zero] = som_type + 1
return dict(peaklet_classification_som=result, som_peaklet_data=som_info)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
C408 Unnecessary dict call - rewrite as a literal.

return dict(peaklet_classification_som=result, som_peaklet_data=som_info)


def recall_populations(dataset, weight_cube, SOM_cls_img, norm_factors):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
N803 argument name 'SOM_cls_img' should be lowercase

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix this

dataset: Data to preform the recall on (Should be peaklet level data)
normfactos: A set of 11 numbers to normalize the data so we can preform a recall
"""
[SOM_xdim, SOM_ydim, SOM_zdim] = weight_cube.shape

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
WPS359 Found an iterable unpacking to list

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[SOM_xdim, SOM_ydim, SOM_zdim] = weight_cube.shape
xdim, ydim, zdim = weight_cube.shape

dataset: Data to preform the recall on (Should be peaklet level data)
normfactos: A set of 11 numbers to normalize the data so we can preform a recall
"""
[SOM_xdim, SOM_ydim, SOM_zdim] = weight_cube.shape

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pep8] reported by reviewdog 🐶
N806 variable 'SOM_xdim' in function should be lowercase

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove all these CamelCase variable names. Use snake_case for variable names.

@coveralls
Copy link

coveralls commented Oct 4, 2023

Coverage Status

coverage: 92.659% (-1.0%) from 93.643% when pulling d5cf03f on SOM_plugin into 8c70f8f on master.

@LuisSanchez25 LuisSanchez25 marked this pull request as ready for review October 4, 2023 22:18
Copy link
Contributor

@jmosbacher jmosbacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are committing to collaborative project. Please conform to existing coding styles so that others can easily read your work.

straxen/contexts.py Show resolved Hide resolved
straxen/contexts.py Show resolved Hide resolved
straxen/plugins/peaklets/__init__.py Outdated Show resolved Hide resolved
som_files = straxen.URLConfig(default='resource:///stor2/data/LS_data/SOM_data/som_data_v4.1.npz?fmt=npy')

def infer_dtype(self):
dtype = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dtype = dict()
dtype = {}

return dict(peaklet_classification_som=result, som_peaklet_data=som_info)


def recall_populations(dataset, weight_cube, SOM_cls_img, norm_factors):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix this

dataset: Data to preform the recall on (Should be peaklet level data)
normfactos: A set of 11 numbers to normalize the data so we can preform a recall
"""
[SOM_xdim, SOM_ydim, SOM_zdim] = weight_cube.shape
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[SOM_xdim, SOM_ydim, SOM_zdim] = weight_cube.shape
xdim, ydim, zdim = weight_cube.shape

dataset: Data to preform the recall on (Should be peaklet level data)
normfactos: A set of 11 numbers to normalize the data so we can preform a recall
"""
[SOM_xdim, SOM_ydim, SOM_zdim] = weight_cube.shape
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove all these CamelCase variable names. Use snake_case for variable names.

@LuisSanchez25
Copy link
Contributor Author

The problem with changing the structure with how you suggested (just importing straxen and using straxen.PeakletClassification for the plugin) is that it creates a circular import error since straxen cant initialize it. This is also part of the reason I removed it from the init.py file since it can just be initialized later by the context, the other reason is that in contexts if you look at line 88, it will register all plugins and since the SOM pluggin also provides peaklet classification I think it will cause problems. I could explicitly state not to fetch that config in the context where its registering everything but I think we probably do not want to hardcode that.

jmosbacher
jmosbacher previously approved these changes Oct 6, 2023
missed a function with upper cases
@LuisSanchez25
Copy link
Contributor Author

@jmosbacher all test have passed so I think this is ready to merge now

@LuisSanchez25
Copy link
Contributor Author

I cant merge the pull request so if you could merge it that would be great!

@ahiguera-mx ahiguera-mx merged commit 02621e1 into master Oct 9, 2023
7 checks passed
@yuema137 yuema137 deleted the SOM_plugin branch September 14, 2024 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants