-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip the holes for GPU CKF #556
Conversation
ef6cdba
to
a90e721
Compare
@krasznaa it is still a long shot but could you try this PR with the ODD chain? |
The CPU version seems to behave reasonably. But the CUDA version doesn't seem to finish running. 🤔 I let it run for a few minutes, but it's still processing the first event. In principle you should be able to test this as well. Since this branch seems to be up to date with the main branch.
|
So in my case, I'm just stuck here:
(I tried both the G4 and Fatras files, just to see if one would behave better than the other.) |
ad2c004
to
3d062e9
Compare
@krasznaa It is good to go now
The programs wre stalled because tracks lost its way and propagate forever in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am extremely happy that you managed to make it work! 🥳
I'd prefer some of the access operators to be changed, but all in all I don't feel super strongly about any of my comments. As I'll want to use the code next week, in whatever form I can. 😄
device/common/include/traccc/finding/device/impl/prune_tracks.ipp
Outdated
Show resolved
Hide resolved
device/common/include/traccc/finding/device/impl/add_links_for_holes.ipp
Outdated
Show resolved
Hide resolved
90d66b9
to
a92ce5f
Compare
a92ce5f
to
74772b9
Compare
fe27b42
to
60d580d
Compare
60d580d
to
e253aa3
Compare
So, where do we stand with this? From a quick glance, I'm okay with how this looks, if you claim that it works for you at the moment Beomki. I'd be keen to merge this in, and then work with @asalzburger and @niermann999 to make it work with the latest geometry / simulation files. 🤔 |
This PR works for my setup (gcc 11.2 and cuda 12.4) with data v6. Please go ahead with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR is extending the scope of #529 to GPU as well, which jump over surfaces in case of no branches.
There are two new GPU kernels introduced:
add_links_for_holes
#529 used dummy measurements where the Kalman Fitter (KF) update is applied but this PR just uses the bound track parameter from propagator w/o KF, which is more precise. This is what is done by
add_links_for_holes
prune_tracks
This comes after the
build_tracks
kernel, hence the last GPU kernel. What it does is filtering the tracks whose the number of measurements is out of the given range of [min,max]. This is used to be done insidebuild_tracks
in the past, but I had to add a new kernel because I could not count the number of holes and filter the track in the same kernel ofbuild_tracks
.Due to the technical issue, I had to remove one functionality of #529 which counts the number of holes on the fly and abort tracks with (the number of holes > given value). I will take care of this in the later PR (Apologies to @shimasnd)(GPU version of #529 is fully implemented in this PR now)Extending #488 to the GPU CKF is also not addressed in this PR, hopefully I can handle this in the near future.
CPU CKF is also refactored to make the code sequence as similar to the GPU CKF as possible. I admit it is a bit anti-performance change but it is snecessary to keep the codes understandable.
Previous results with Toy detector
Following are the results from toy geometry simulation and reconstruction:
Update with ODD
The ODD is now working! I attached the logs for double and single precision (gcc 13.1 & cuda 12.4)
Also note that the matching rate of the track fitting with single precision is not good. But it is not the matter of matching rate but it is because of the poor precision of single precision KF. Both CPU and GPU KF yield negative chi2 value sometimes and the precision of Jacobian is very poor with single precision. We will need to do a dedicated study to improve the physical outputs of KF with single precision. Must be a good research topic for new comers to start with...
Previous results with ODD
Double precision
Single precision