-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update som classifcation #1300
Update som classifcation #1300
Conversation
straxen.PeakBasicsSOM, | ||
straxen.EventBasicsSOM, | ||
) | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarification, the PeaksSOM, PeakBasicsSOM, and EventBasicsSOM is so that we do not interfere with the lineage of the non-SOM plugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few more thoughts about it during the weekend. This is the idea yes. However, this is only true for not using SOM at all. If we want to use SOM as a cut we also have to use the SOM plugins to do it properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lineages of the two will always be separate since the peaklet clasification will have a different liniage as that propagates all the way down the processing pipeline, I thought the main Idea was just to propagate the SOM clusters down the chain no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well yes and no, when I made this PR I made different assumptions. Right now one cannot separate the two.
self.peak_properties = tuple(self.peak_properties) | ||
|
||
def compute(self, events, peaks): | ||
result = super().compute(events, peaks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume that the super().compute()
will copy the peak info into EventBasics. But just for clarification, the reason we don't need to use depends_on = (..., peaks_som)
or something like that is because peaks
itself is being provided by the PeaksSOM plugin which is registered in the context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the reason is PeakBasicsSOM provides a different version of peak_basics
which EventBasics(SOM) depends on.
@@ -16,7 +16,7 @@ class MergedS2s(strax.OverlapWindowPlugin): | |||
"""Merge together peaklets if peak finding favours that they would form a single peak | |||
instead.""" | |||
|
|||
__version__ = "1.0.2" | |||
__version__ = "1.1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably changes in this file are all from PR #1299?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they are. Sorry that it is a bit messy, but I thought you guys would look first at #1299
@@ -12,7 +12,7 @@ | |||
class PeakletClassification(strax.Plugin): | |||
"""Classify peaklets as unknown, S1, or S2.""" | |||
|
|||
__version__ = "3.0.3" | |||
__version__ = "3.0.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no problems with the change here, but out of curiosity, why did we return peaklet_classification as a dict before? Also, @FaroutYLq , will this be an issue for reprocessing if the version changes but the output is functionally the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the headup @JYangQi00 , very important catch! I would try to avoid touching the standard peaklet_classification
in this SOM PR. It doesn't look to me that the performance of peaklet_classification
is meaningfully changed, and if it is just about style we should also have a different PR for this. I would try to avoid bumping version which changes lineage so we have to reprocess again the standard pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
straxen/plugins/peaks/peak_basics.py
Outdated
@@ -15,40 +15,10 @@ class PeakBasics(strax.Plugin): | |||
|
|||
""" | |||
|
|||
__version__ = "0.1.4" | |||
__version__ = "0.1.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this is also just from PR #1299 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, same as above purely optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with concern by @WenzDaniel and suggest we don't touch the regular peaklet_classification
in this PR. The rest look good to me, but I have to admit I didn't run tests locally.
Another general comments: I think for this level of complexity and significance in update we should add some tests. I would let strax* maintainers to comment more here.
@@ -12,7 +12,7 @@ | |||
class PeakletClassification(strax.Plugin): | |||
"""Classify peaklets as unknown, S1, or S2.""" | |||
|
|||
__version__ = "3.0.3" | |||
__version__ = "3.0.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the headup @JYangQi00 , very important catch! I would try to avoid touching the standard peaklet_classification
in this SOM PR. It doesn't look to me that the performance of peaklet_classification
is meaningfully changed, and if it is just about style we should also have a different PR for this. I would try to avoid bumping version which changes lineage so we have to reprocess again the standard pipeline.
def compute(self, peaklets, merged_s2s): | ||
result = super().compute(peaklets, merged_s2s) | ||
|
||
# For merged S2s SOM and straxen type are undefined: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this is a really basic thing that I'm missing, but why are types undefined for merged S2s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because SOM classifies only based on peaklets. Merged S2s is only called on S2s (type==2
) and we assume that these merged peaks stay S2s (type==2
). However, for each S2 there are multiple SOM sub-types and it is currently not defined which sub-type should be inherited to the merged S2s after merging. We could do something like the sub-type of the peak with the largest area, but not sure if this make sense.
If one uses straxen types as classification for peaks then of course type == straxen_type
and the straxen_type
of MergedS2s would be defined again, however in that case the information is trivial to retrieve. So I preferred to not add an extra if case and make the code more complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Daniel, thanks for the improvements! Everything looks fine to me I am just confused about a few things that I left in the comments. If you think they are fine then I will approve.
…raxen into make_peaklets_dtype_flexiable
Hmm @FaroutYLq which type of test do you have in mind? I can try to add some in #1299 because there we make actually all technical changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey guys I think I made all needed modifications. I also added some comments to the code for guidance. I conducted the following test to see if everything works as intended:
- Process run 025423 up to event basics using the current master using the default offline context. Switch to this branch and see if I can still load peaklet_classifcation, merged_s2s, peak_basics and event_basics with the default nT offline context.
- Processed for the same run the three new SOM data-types and checked if I can load them with meaningful values too.
- I repeated the test of 1. with run
053494
using the XENONnT online context.
If you guys do not have any objections I think we can merge.
straxen/contexts.py
Outdated
@@ -19,12 +19,15 @@ | |||
straxen.PulseProcessing, | |||
straxen.Peaklets, | |||
straxen.PeakletClassification, | |||
straxen.PeakletSOMClass, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three plugins registered to the default nT context are additional plugins. They will allow us to keep the already processed data as is and provide the additional SOM fields at peak and event level as additional data-types. Thus, after processing on midway we have to use OSG to process peak_som_classification
and event_som_classification
.
straxen.PeakletClassificationSOM, | ||
straxen.PeaksSOM, | ||
straxen.PeakBasicsSOM, | ||
straxen.EventBasicsSOM, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are new plugins which extend the already existing classical versions. With this context we will have a new lineage for everything above peaklets. The same data-type names as for the classical case are used (peaklet_classification, peaks, peak_basics, event_basics), such that we can use the standard processing procedure.
PeakletClassificationSOM
computes first the classical classification before adding the SOM information.PeaksSOM
propagates the information to Peaks such that we can propagate it to peak basics inPeakBasicsSOM
.EventBasicsSOM
is the same as the classical EventBasics, but also extended by the additional SOM fields.
|
||
|
||
@export | ||
class EventBasicsSOM(EventBasics): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin is inherited from the classic event-basics and only adds in addition the new SOM fields. It will also provide event_basics
and is used in the SOM context.
|
||
|
||
@export | ||
class EventSOMClassification(strax.Plugin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin is the extra plugin for the default context case in which we provide in addition event_som_clasification.
For peaks which do not provide any SOM information like S2s -1 is set as default value.
channel=-1, | ||
length=peaklets["length"], | ||
) | ||
peaklets_classification = np.zeros(len(peaklets), dtype=self.dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I change the plugin return from a dict into an array. There is no impact whatsoever for our processing. Only more convenient to work with when inhering.
|
||
|
||
@export | ||
class PeakletSOMClass(PeakletClassificationSOM): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin is used in the default straxen context as the provided data-type needs to be different from peaklet_classification
. The rest is inherited from above.
@@ -66,6 +36,45 @@ class PeakBasics(strax.Plugin): | |||
), | |||
) | |||
|
|||
def infer_dtype(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only moving the dtype in here so it can be better inherited.
|
||
|
||
@export | ||
class PeakBasicsSOM(PeakBasics): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin is used in the SOM context and replaces peak_basics with a new version including the SOM parameters.
|
||
|
||
@export | ||
class PeaksSOM(Peaks): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin is used in the SOM context to propagate the SOM parameters from peaklet_classification
to peaks
.
|
||
|
||
@export | ||
class PeaksSOMClassification(strax.Plugin): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is again a stand alone plugin to propagate the SOM settings to peaks for the default nT context using the default classification.
I checked that the That peaklet_classification (SOM context) has the same straxen_type as the type of peaklet_classification (nT online context), and the same som fields as peaklet_classification_som (also in nT context). I did similar things as well for the peak and event basics as well, but only looking at the S1 SOM information (as intended by this PR). Everything looks good on my end and I think we are good to merge! |
This PR will change the lineage for > peaklets!
What does the code in this PR do / what does it improve?
In this PR we change the SOM plugin such that it only returns a single data-type. It is based on the changes made in #1299. Further, we propagate the SOM and straxen classification, including the SOM subclasses, up to event basics for a better analysis.
I test processed a short run 025423 with the normal offline and SOM context.