[Suggestion] Decreasing the size of test data #104
Comments
Yeah that's a good idea :)
Thank you for the feedback, I'll start working on this ASAP.
Seems like a good idea to me! Commenting here as a way to follow this activity, as GeoArrays over at https://github.com/evetion/GeoArrays.jl/blob/master/test/get_testdata.jl sneakily also uses the same testdata, so this could break tests.
Yeah, I noticed this about GeoArrays. In my opinion, after ArchGDAL, the same treatment can be given to GeoArrays as well, or maybe hosting different sets of test data could also work.
I was working on this, starting with the largest raster, `ospy/data4/aster.img`. I was testing the testsets of `test/test_array.jl` with a cropped version of the raster file. @yeesian, can you specify what this test does? Here is the code:

```julia
@testset "int indexing" begin
    @test ds[755, 2107, 1] == 0xff
end
```
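For context on the test above: `ds[755, 2107, 1]` reads the pixel at column 755, row 2107 in band 1. If the raster is cropped to a window, any hard-coded pixel index in the tests has to be shifted by the window's offset. A minimal Python sketch of that arithmetic (the function name and the 1-based indexing, chosen to mirror Julia, are assumptions, not anything from the actual branch):

```python
def shifted_index(col, row, xoff, yoff):
    """Map a 1-based pixel index in the original raster to the
    corresponding index in a crop whose window starts (xoff, yoff)
    pixels from the top-left corner of the original."""
    new_col, new_row = col - xoff, row - yoff
    if new_col < 1 or new_row < 1:
        raise ValueError("pixel falls outside the crop window")
    return new_col, new_row
```

With GDAL itself the crop would typically be done via `gdal_translate -srcwin xoff yoff xsize ysize`, after which the expected value `0xff` stays the same but the index must be remapped as above.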
The largest rasters are a good place to start :) In #10 I noted that these two files took up the bulk of the test size. In general, small test datasets suffice for unit testing; larger datasets can be useful for benchmarking, however. But since we currently don't have benchmarks set up, perhaps we should take these two rasters out of the unit tests altogether and run the same tests on one of the smaller rasters instead. To answer your question, if I open the raster in QGIS, I only see a value of … By the way, I just noticed that a lot of the …
Thanks for the suggestions! Comparing the checksums didn't really cross my mind, and now that I am looking at the data, these are indeed the same datasets. Removing the redundant files brings down the size to … And choosing the pixel values of the clouds does seem reasonable; that being said, I'll see if the clipping of the large …
[UPDATE] So I've been working on this issue, and with help and suggestions from fellow contributors, I've managed to decrease the size of the required dataset from ~200MB to 32MB.

Changes made: …

Other changes I plan to work on: …

Please feel free to have a look at the changes made at pritamd47/ArchGDAL.jl and make any suggestions.
Nice work! Change overview: master...pritamd47:master
Will do this soon. I plan to write some tests and then make a pull request for discussion of the changes.
Closed by #121
Currently the size of the test data that is downloaded is ~200MB, and depending on the connection speed it can take quite some time to download (I couldn't run the ArchGDAL tests while away from the university, due to a slow connection).
I believe trimming the datasets to something smaller than their current size, say by using cropped versions wherever possible, can drastically improve the time required to test fresh installs (or whenever downloading the datasets is necessary).
Looking for people's thoughts on this suggestion.