Circuitscape in Julia unable to process a 1.4 Billion pixel landscape #232
1.4 billion just sounds really big. You should try with 1 core first. And use the 64-bit indexing.
Hi Viral. Thanks for the reply. I have upgraded my RAM to 348GB now. Still, I will go with 1 core first as you suggested. Issue #200 provided some help. I will be using the following to enable 64-bit indexing:
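[Editor's note: the snippet the commenter referred to did not survive in this thread. As a sketch only, the relevant ini lines might look like the following; the flag name `use_64bit_indexing` is an assumption inferred from the discussion and should be verified against the Circuitscape.jl documentation or the fix in issue #200.]

```ini
; Sketch only -- verify the exact option name against the Circuitscape.jl docs
[Calculation options]
use_64bit_indexing = True   ; use Int64 indices so matrices > 2^31-1 entries fit
```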
Great. Parallel processing also needs extra memory. So the larger the problem, the more you have to start slowly and push the parallelism.
Hi Viral. Ran the same landscape with 1 core and 64-bit indexing. FAILED! OutOfMemoryError(). My conductance surface is 10.5GB and the core area raster is 7.69GB. I am unable to understand why 348GB of RAM is not enough. Attaching a screenshot. Also, this is being run on Windows 10 Pro (if that information is of any use). I have access to a Linux machine as well; I can swap the RAM modules and try on that if it helps.
Which solver are you using? Make sure it is cg+amg.
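[Editor's note: for readers following along, the solver is selected in the `[Calculation options]` section of the ini file. `cg+amg` (preconditioned conjugate gradient with algebraic multigrid) is iterative and memory-lean; `cholmod` is a direct sparse Cholesky factorization that is fast on small grids but, as noted below, infeasible beyond roughly 1-10M cells.]

```ini
[Calculation options]
solver = cg+amg   ; iterative solver; scales to large grids, unlike cholmod
```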
The underlying graph and solver data structures easily take 10x-20x more memory than your surface. We have done reasonably careful memory profiling, but there may be more options.
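[Editor's note: a rough, illustrative back-of-envelope supporting the 10x-20x figure. Every constant below is an assumption for the sketch (8-neighbor connectivity, CSC sparse storage, Float64 values, Int64 indices), not a measured value from Circuitscape.]

```python
# Rough estimate of sparse-Laplacian memory for a ~1.4 billion cell grid.
# All constants are assumptions for this sketch, not measured values.

cells = 1_400_000_000          # ~1.4 billion pixels
neighbors = 8                  # 8-neighbor connection scheme
nnz = cells * neighbors        # off-diagonal nonzeros (upper bound)

bytes_per_value = 8            # Float64
bytes_per_index = 8            # Int64 (64-bit indexing)

# CSC storage: nonzero values + row indices + column pointers
laplacian_bytes = (nnz * (bytes_per_value + bytes_per_index)
                   + (cells + 1) * bytes_per_index)

gib = laplacian_bytes / 2**30
print(f"~{gib:.0f} GiB for the Laplacian alone")  # ~177 GiB under these assumptions
```

Against the commenter's 10.5GB conductance surface, that is already well over 10x before counting the right-hand sides, preconditioner, and solution vectors.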
I was using the cholmod solver. I will give it a try with cg+amg. |
Oh yeah - cholmod will not work for anything more than 1-10M cell landscapes. Maybe we should print a warning. |
Hi Viral, with reduced file sizes things seem to be running, even with 4 cores in parallel! Processing time in each step has drastically reduced too, along with a much lower memory footprint. Of all the ini files provided with the test run, can you please point me to one which has all the settings honored by Circuitscape in Julia? In case there isn't one, can we compile an exhaustive list of all the settings in an ini file and keep it on the main GitHub page of Circuitscape.jl? I also believe it would be helpful if the ini file were annotated. Users who have graduated from the Python, ArcGIS, or Windows standalone versions will have clues to what each setting means, but annotations would help first-time Circuitscape users taking their first go at it in Julia.
The old files are fully compatible. Yes, we should document these things. Beyond what I shared here, there really are not many more new options in the ini files. Would you be able to submit a PR to the documentation or to this repo? |
I am new to GitHub as well, so I will need to read up about pull requests before I am able to do so. However, I can easily prepare a template ini file with annotation. |
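[Editor's note: as a starting point for such a template, here is a minimal annotated pairwise ini sketch. Option names follow the Circuitscape 4.0 conventions and the file paths are placeholders; verify everything against the official documentation before use.]

```ini
; Minimal annotated pairwise ini (sketch; paths are placeholders)

[Circuitscape mode]
scenario = pairwise                 ; pairwise / advanced / one-to-all / all-to-one

[Habitat raster or graph]
habitat_file = resistance.asc       ; resistance/conductance raster (ESRI ASCII)
habitat_map_is_resistances = False  ; False = cell values are conductances

[Options for pairwise and one-to-all and all-to-one modes]
point_file = focal_points.asc       ; focal nodes; one-pixel points parallelize best

[Calculation options]
solver = cg+amg                     ; iterative solver; scales far beyond cholmod

[Output options]
output_file = output/run1.out
write_cur_maps = True               ; write cumulative current maps
```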
Close and move to #233? |
Once I resampled and made my landscape smaller in MBs, Circuitscape in Julia started working without any error. In fact, the analysis that I started is still running. It has 19,900 pairs, and at this rate, I calculated, it will take 200 days! I am running on 4 cores in parallel with about 50% of 384GB memory in use. My question is: how does this parallelism work? I was thinking each core would process a pair at a time, so 4 pairs would be processed simultaneously, but I don't see that in the logs. I wanted to use parallelism to save me some time. This is when the job started:
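[Editor's note: for reference, parallelism is requested through the ini file. In plain pairwise mode each pairwise solve is independent, so the scheduler can hand different pairs to different workers; these are the standard option names, shown here as a sketch.]

```ini
[Calculation options]
parallelize = True   ; enable parallel pairwise solves
max_parallel = 4     ; number of worker processes (each needs its own memory)
```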
We need to move to the multi-threading version of Circuitscape that will allow more parallelism and less memory. But it is not even ready to try yet, since we are running into some issues. For now, you can make your problem even smaller and increase parallelism, and/or reduce the number of pairs. What mode are you using? Simple pairwise resistance, no polygons? The simpler you can make it, the faster it will go.
We can keep this open to discuss the particular problem at hand. |
Are you using an include pairs file to specify which pairs you want connected? That can affect the scheduler and make parallelism less efficient (#165) |
I am using pairwise mode and I am not using an include pairs file. I have made a raster from the polygon shapefile of the core area polygons and set its path in the "point_file" option. The contents of the ini file are as below. Please tell me if anything is extra or if I am missing anything.

[Options for advanced mode]
[Calculation options]
[Options for pairwise and one-to-all and all-to-one modes]
[Output options]
[Short circuit regions (aka polygons)]
[Connection scheme for raster habitat data]
[Habitat raster or graph]
[Options for one-to-all and all-to-one modes]
[Version]
[Mask file]
[Circuitscape mode]
Nothing jumps out to me as problematic in your .ini. There seems to be something going on with the task scheduler. @ranjanan or @ViralBShah, would there be any reason that the scheduler would behave differently in Julia 1.4? I'm seeing similar output patterns (that imply no parallel processing) for a specific test on Julia 1.4 as well.
@indranil-wii it seems that your point file contains polygons. That is, you have focal regions you'd like to collapse into one point in your point file. We do not support parallelism (yet) in this mode because you have to keep recomputing the graph data structure for every solve, and this would take a lot more memory if done in parallel (imagine holding 4 of these large graph data structures in memory). Is there a way you can make your point files have points instead of regions? (Use the centroids of those focal regions, for example. @vlandau you may have better advice on this.) If you do this, the parallelism will work and your problem may be solved faster.
I would do exactly that. |
I could do that to simplify the problem, and conceptually it seems fine to use the centroids. However, there are practical problems: the corridors won't be mapped right. On the ground, the movement of animals is not directed towards the centroid of a park from outside but towards the nearest place on the perimeter. So I have to preserve the actual shape of the core area polygon.
You could set the sources to be centroids and set your original source polygons as short circuit regions. I believe that would resolve the issue and current (i.e. movement) would be directed toward the nearest polygon edge instead of the centroid. @ranjanan would that work with parallelism? |
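[Editor's note: the suggestion above would translate to ini settings along these lines. `use_polygons`/`polygon_file` are the standard short-circuit-region options; file paths are placeholders. A short circuit region has effectively zero internal resistance, so current entering anywhere on the polygon edge reaches the centroid source without extra cost.]

```ini
[Options for pairwise and one-to-all and all-to-one modes]
point_file = centroids.asc     ; one-pixel focal points at polygon centroids

[Short circuit regions (aka polygons)]
use_polygons = True
polygon_file = core_areas.asc  ; original core polygons as zero-resistance regions
```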
@indranil-wii Can we request you to take the learnings from this discussion and add them to the README for now: https://github.com/Circuitscape/Circuitscape.jl/blob/master/README.md? There is a little pen icon next to the README file, so you can just edit the file and submit some text. We can add a new section called "Notes for using Circuitscape on very large grids".
Sure, I can do that. |
I would like to try once with point files as my core area. However, I have never done that before. How do I format my core area file then, if they are to be points? |
The suggestion is to not use the core area polygons at all, and just compute resistance between pairs of central points in all your core areas.
Yes, I understand. In what format do I supply the central points file? |
This is in the documentation. See the section titled "Focal node location and data type". Focal points can be in the same ASCII format; they just must be one pixel each.
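[Editor's note: a toy illustration of that format. This is the standard ESRI ASCII grid layout, with the same header as the habitat raster; each positive integer is a focal node ID occupying exactly one pixel, and NODATA everywhere else. Dimensions and values here are made up.]

```text
ncols         4
nrows         3
xllcorner     0.0
yllcorner     0.0
cellsize      30.0
NODATA_value  -9999
-9999 -9999 -9999     1
    2 -9999 -9999 -9999
-9999 -9999     3 -9999
```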
Should work. |
Just read through this, and while I can't help on the technical issues, I'm curious to know what is being modeled here, given there is such a huge number of pairs to connect. If you are connecting "everything to everything", maybe Omniscape would be worth a try. While Vincent is the expert, I'm happy to do a Zoom overview of that tool and show you some examples to see if it might fit your problem. I'm assuming similar issues would arise given you still have a gigantic grid, but for that tool you would only be connecting "everything to everything" in a moving window, not trying to directly connect pairs that are long distances apart.
This issue appears to have been sufficiently addressed and gone somewhat stale. I'm going to close for now. Can reopen later if needed. |
Hello,
I am new to Circuitscape in Julia and also to Julia per se. I am trying to run Circuitscape on a landscape of 37946 x 36946 pixels at 30m spatial resolution. I have scaled the conductance values from 1 to 100.
The specifications of my system are:
2 x Intel Xeon Gold 6136, for a total of 24 physical processor cores
128GB RAM.
When I tried parallel processing using 10 cores, my system ran out of memory. Then I disabled parallel processing and ran it. First, it took a long time to read the maps, even on this system, which I thought had good specifications. Then it gave this new error:
ArgumentError: the index type Int32 cannot hold 4075170303 elements; use a larger index type
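[Editor's note: this error is a plain integer-range limit, which the quick check below illustrates. The element count is taken verbatim from the error message above.]

```python
# Why Int32 indexing fails here: the matrix has more entries than a
# signed 32-bit integer can address.
elements = 4_075_170_303     # from the error message
int32_max = 2**31 - 1        # 2,147,483,647

print(elements > int32_max)  # True -> 32-bit indices overflow
print(elements < 2**63 - 1)  # True -> 64-bit indices are sufficient
```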
I am attaching a screen shot of the console.
Regards,
Indranil.