Memory Consumption of Sensitive Detectors #1285
Comments
Hi @s6anloes, can you reduce your example to something that still shows the scaling behaviour, but runs in a much shorter time and uses no more than, for example, 1GB of memory? Thanks, |
I think the issue is that there is an entry for each sensitive element with its unique path. DD4hep/DDG4/src/Geant4VolumeManager.cpp Line 180 in 3ccf907
|
Yes. I confirm this. There is an entry for each path to allow lookups using the touchable history. |
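To make the scaling concrete, here is a minimal Python sketch (not DD4hep code) of a lookup table keyed by the full placement path. The toy hierarchy and all names in it (`tower_i`, `tube_j`, `cladding`, `core`) are assumptions for illustration only. The point is that the number of entries follows the number of placed fibres, while the number of distinct logical volumes stays constant.

```python
def placement_paths(n_towers, n_tubes_per_tower):
    """Yield the full placement path of every fibre core in a toy hierarchy.

    Hypothetical hierarchy: world -> tower_i -> tube_j -> cladding -> core
    """
    for i in range(n_towers):
        for j in range(n_tubes_per_tower):
            yield ("world", f"tower_{i}", f"tube_{j}", "cladding", "core")


def count_entries(n_towers, n_tubes_per_tower):
    """Return (entries keyed by unique path, number of distinct leaf volumes)."""
    # One table entry per unique path, as needed for a lookup by touchable history.
    per_path = {
        path: index
        for index, path in enumerate(placement_paths(n_towers, n_tubes_per_tower))
    }
    # The leaf ("core") is the same logical volume in every path.
    logical = {path[-1] for path in per_path}
    return len(per_path), len(logical)


if __name__ == "__main__":
    for towers, tubes in [(10, 100), (100, 1000)]:
        n_paths, n_logical = count_entries(towers, tubes)
        print(f"{towers} towers x {tubes} tubes: "
              f"{n_paths} path entries, {n_logical} logical volume(s)")
```

The table grows with the product of towers and tubes per tower, which is consistent with the memory growing with the number of sensitive placements rather than with the number of volume types.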
I guess one shouldn't add a DetElement for each fiber, but depend on a segmentation to give a number to the fiber in the tower? |
I'm not sure how this could work. Currently we need to mark the fibres as sensitive for signal generation. We have had some discussion with Sanghyun, whose geometry propagates optical photons, but it takes several minutes to simulate one event. That is something we would like to avoid. |
This path reflects the path of volumes. Having a DetElement at each level makes things worse, but already the In DD4hep such situations are meant to be handled by relatively large sensitive volumes and then the little sensitive In this case the envelope of fibers would be the sensitive volume and the individual fibres would then be handled by a |
I think I understand how this approach might work, although I have one question. You say the envelope of the fibre would be the sensitive volume; I guess this means the mother volume. For our geometry, the sensible choice for the sensitive volume would be three levels of hierarchy higher (the great-grandmother volume), since each fibre core is placed within a cladding volume within a tube volume. The tower would then be the large sensitive volume which can be segmented. Would this approach still work? It is not clear to me how sensitive volumes treat daughter and grand-daughter volumes and even deeper levels. Are these volumes no longer 'sensitive', in the sense that the sensitive detector action would not be called for steps in such a daughter volume? |
This is sort of the idea behind the segmentation concept. Whether this works for fibers (which I guess are thin cylinders) I cannot tell, because there is some space between them, probably filled with some glue. This would then not be handled correctly by Geant4, because the glue has different material characteristics than the fibers. |
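A rough Python sketch of the segmentation idea discussed above, assuming the fibres sit on a square grid with a fixed pitch inside a tower envelope centred on the local origin. The grid layout, the pitch, and the function name `locate_fibre` are assumptions for illustration; this is not the actual geometry and not the DD4hep segmentation API. It maps a local hit position to a fibre number and flags whether the point lies within the core radius, which is the kind of check that would be needed to reject steps in the space between fibres.

```python
import math


def locate_fibre(x, y, pitch, n_cols, n_rows, core_radius):
    """Map a local (x, y) position in the envelope to (fibre_id, in_core).

    Assumes a square n_cols x n_rows grid of cells of size `pitch`,
    centred on the local origin. Returns None outside the grid.
    """
    col = math.floor(x / pitch + n_cols / 2)
    row = math.floor(y / pitch + n_rows / 2)
    if not (0 <= col < n_cols and 0 <= row < n_rows):
        return None
    # Centre of the cell that contains the point.
    cx = (col + 0.5 - n_cols / 2) * pitch
    cy = (row + 0.5 - n_rows / 2) * pitch
    in_core = math.hypot(x - cx, y - cy) <= core_radius
    return row * n_cols + col, in_core


if __name__ == "__main__":
    # 4 x 4 grid, 1.0 pitch, 0.4 core radius (arbitrary illustrative units).
    for point in [(0.5, 0.5), (0.99, 0.99), (5.0, 0.0)]:
        print(point, locate_fibre(*point, pitch=1.0, n_cols=4, n_rows=4,
                                  core_radius=0.4))
```

This only covers the bookkeeping side. It does not address the material caveat raised above: if the envelope is a single volume, Geant4 transports particles through one material, regardless of which cell a step falls into.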
@s6anloes Nevertheless: do you know where the memory really goes?
So where does the memory really go? |
I ran heaptrack to monitor allocations
With and without setting the couple volumes as sensitive. And the line I link to above is the main difference between the two runs, as far as I can tell. This is a bit complicated because I have never used heaptrack before, and the recursion makes callstacks a little bit broader. |
Hmm, these are really good questions that I wish I knew the answer to. I'm not really an expert on these things, so if you know any way I can figure this out, it would be greatly appreciated. |
@s6anloes Now for the facts:
So where does the memory go? |
@s6anloes , @andresailer Apparently the LCG views miss the Geant4 data tables:
Do I miss some environment or is LCG_106 incomplete? |
Works for me on lxplus.
|
Yes this is the problem: |
It seems geant4 cvmfs is the hidden dependency. |
There are more problems. I tried to build on lxplus, but there I got a clash with python between system python and hsf python:
|
That is working for me:
echo "Sourcing environment dirs for lxplus9 [zsh|bash]"
echo "Sourcing environment dirs for AlmaLinux 9.4"
#========================================================================================================================
export LIBGL_ALWAYS_INDIRECT=1
# Including key4hep (EDM4hep, podio)
source /cvmfs/fcc.cern.ch/sw/latest/setup.sh
source /cvmfs/sft.cern.ch/lcg/views/LCG_105b/x86_64-el9-gcc13-opt/setup.sh
export PATH=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/CMake/3.26.2/x86_64-el9-gcc13-opt/bin:$PATH
export PATH=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/ninja/1.10.0/x86_64-el9-gcc13-opt/bin:$PATH
export CMAKE_PREFIX_PATH=/cvmfs/sft.cern.ch/lcg/releases/cfitsio/3.48-e4bb8/x86_64-el9-gcc13-dbg/:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/hepmc3/3.2.7/x86_64-el9-gcc13-opt/:$CMAKE_PREFIX_PATH
export CMAKE_PREFIX_PATH=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/xrootd/5.6.3/x86_64-el9-gcc13-opt/:$CMAKE_PREFIX_PATH
export Python_ROOT_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/Python/3.9.12/x86_64-el9-gcc13-opt/
export Boost_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/Boost/1.82.0/x86_64-el9-gcc13-opt/
export LCIO_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/LCIO/02.20/x86_64-el9-gcc13-opt/
export Qt5_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/qt5/5.15.9/x86_64-el9-gcc13-opt/
export TBB_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/tbb/2021.10.0/x86_64-el9-gcc13-opt/
export VDT_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/vdt/0.4.4/x86_64-el9-gcc13-opt/
export Vc_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/Vc/1.4.4/x86_64-el9-gcc13-opt/
export HEPMC3=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/hepmc3/3.2.7/x86_64-el9-gcc13-opt/
export PYTHIA8=/cvmfs/sft.cern.ch/lcg/releases/MCGenerators/pythia8/310-2f242/x86_64-el9-gcc13-opt
export PYTHIA8DATA=/cvmfs/sft.cern.ch/lcg/releases/MCGenerators/pythia8/310-2f242/x86_64-el9-gcc13-opt/share/Pythia8/xmldoc
export XercesC_LIBRARY=/cvmfs/sft.cern.ch/lcg/releases/XercesC/3.2.4-9e637/x86_64-el9-gcc13-opt/lib/libxerces-c.so
export XercesC_INCLUDE_DIR=/cvmfs/sft.cern.ch/lcg/releases/XercesC/3.2.4-9e637/x86_64-el9-gcc13-opt//include/
export CLHEP_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/clhep/2.4.7.1/x86_64-el9-gcc13-opt/
export LCIO_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/LCIO/02.20/x86_64-el9-gcc13-opt/
export CMAKE_PREFIX_PATH=$Qt5_DIR:$VDT_DIR:$CMAKE_PREFIX_PATH
# ROOT
source /cvmfs/sft.cern.ch/lcg/releases/LCG_105b/ROOT/6.30.06/x86_64-el9-gcc13-opt/bin/thisroot.sh
#GEANT4
export Geant4_DIR=/cvmfs/sft.cern.ch/lcg/releases/LCG_105b/Geant4/11.2.0/x86_64-el9-gcc13-opt/
export G4INSTALL=$Geant4_DIR
source /cvmfs/sft.cern.ch/lcg/releases/LCG_105b/Geant4/11.2.0/x86_64-el9-gcc13-opt/share/Geant4/geant4make/geant4make.sh;
cd -
|
Here are some results from simply using Invocation of
Tests involving
Hence:
Hence a possible strategy would be:
None of this is impossible, but it requires significant work and cannot be done in an afternoon. We can develop this as a common effort, provided several people work together. |
How do you know How did you get this output? I would be interested to see how this scales with the full (or at least more complete) geometry. @MarkusFrankATcernch |
@s6anloes So what? |
Regarding @s6anloes description here: #1285 (comment) . Shall we first try to understand why Scenario 1 leads to no difference with/without sensitive volumes while Scenario 2 leads to significant differences with/without sensitive volumes? Can someone explain that to me? |
Check duplicate issues.
Goal
I'm trying to understand the memory usage of sensitive volumes in dd4hep. I have a detector with a large number of sensitive volumes which seem to have a large impact on the memory consumption. More details are given below.
Operating System and Version
CentOS 7
compiler
GCC 12.2.0
ROOT Version
6.28/10
DD4hep Version
1.28
Reproducer
To install the dual-readout calorimeter geometry:
Note: the export command will need to be executed in each new shell
To run the simulation with all fibres marked as sensitive detector and monitor the memory usage via htop:
Note: With the full geometry, this will take ~10GB of memory and about 10 minutes to build the geometry (at around 3 1/2 minutes you should be able to see the memory usage increase gradually).
Then to run the simulation without fibres being sensitive:
In the DDDRCaloTibes.xml file in lines 333 and 335 change the "sensitive" value to false and run with the same command. This should take less than 1GB of memory.
Note: the simulation will take slightly longer, because optical photons are propagated instead of killed like in the custom sensitive detector action.
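For switching between the two configurations repeatedly, the attribute change could also be scripted. The following is a minimal Python sketch using only the standard library. The element and attribute layout in `EXAMPLE` is hypothetical and does not reproduce the real compact file; only the attribute name "sensitive" is taken from the description above. Unlike the manual edit of two specific lines, this sketch changes every element carrying a "sensitive" attribute unless a set of names is given.

```python
import xml.etree.ElementTree as ET


def set_sensitive(xml_text, value, names=None):
    """Set the 'sensitive' attribute to `value` on matching elements.

    If `names` is given, only elements whose 'name' attribute is in
    `names` are changed. Returns (new_xml_text, number_of_changes).
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for elem in root.iter():
        if "sensitive" in elem.attrib and (names is None or elem.get("name") in names):
            elem.set("sensitive", value)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical fragment; the tag and name values are illustrative only.
EXAMPLE = """<detector name="toy">
  <core name="scin_core" sensitive="true"/>
  <core name="cher_core" sensitive="true"/>
  <clad name="cladding"/>
</detector>"""

if __name__ == "__main__":
    new_text, n_changed = set_sensitive(EXAMPLE, "false")
    print(f"changed {n_changed} element(s)")
    print(new_text)
```

Note that `xml.etree.ElementTree` drops XML comments when parsing with default settings, so for a real compact file a manual edit, or a parser configured to keep comments, may be the safer choice.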
Additional context
I have been trying to improve the memory consumption of the calorimeter for some time now and have had meetings with, and feedback from, some of the experts. In a recent FCC Full Sim Working Group meeting I presented some studies I did on the memory consumption. Mainly I show the CPU and memory usage as a function of time using the psrecord software. There you can also find one slide on the geometry and volume hierarchy of the detector.
The slides are mainly about trying different options to improve the memory consumption, but one important point was also the discrepancy in memory consumption between the ddsim and dd4hep2root commands. While running ddsim takes 10GB of memory, dd4hep2root takes less than 1GB.
In this meeting, a colleague working on the 'monolithic' version of the geometry suggested running the simulation without having any volume marked as sensitive. And indeed, this seems to be the cause of the large discrepancy.
A colleague said that in Geant4 the sensitive volume is linked to the logical volume, so even if the volume is placed many times (as is the case for the fibres in my geometry), there is still just one sensitive volume. It looks like this is not the case in dd4hep, where it seems that the sensitive volume is tied to the placed volume.
I wonder if this is something that can be changed, as it causes a problem with geometries with many small sensitive volumes.
Sidenote: this issue is somewhat related to issue #1173, where Sarah Eno was also looking at the memory consumption of the dual-readout calorimeter. While I think there is still some optimisation possible for my geometry (I'm not using all possible symmetries at the moment), this doesn't seem to be the main problem here