Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

names lost with writeCDF #1077

Closed
dramanica opened this issue Mar 24, 2023 · 6 comments
Closed

names lost with writeCDF #1077

dramanica opened this issue Mar 24, 2023 · 6 comments

Comments

@dramanica
Copy link
Contributor

When saving a SpatRaster to netCDF, layers are saved as different times, with a loss of variable names. For example:

x <- rast(nrows=5, ncols=9, vals=1:45)
add(x) <- x*3
names(x) <-c("var1","var2")
x
writeCDF(x,"test.nc", overwrite=TRUE)
x1<-rast("test.nc")
names(x1)==names(x)

gives FALSE, as the variable names have been replaced with the name of the file ("test_1" and "test_2"). In the netcdf, the variables have been turned into different time steps for a single variable "test" (thus messing up any real time information if present in the original SpatRaster object).
To allow saving and reloading the object as is, we would need to save each layer with a unique name in the SpatRaster object as a different variable within the netcdf file, and use time only if there are multiple layers with the same name but different times (and I guess throw an error if there are multiple instances of same name and same time, which is possible in terra).

@rhijmans
Copy link
Member

A netCDF file does not have layers. It has a 3rd dimension that you can name, and the layer names are derived from that. As for time information, that does not need to be (and should not, I think) be handled as layer names. So I might do

x <- rast(nrows=5, ncols=9, vals=1:45)
add(x) <- x*3
names(x) <-c("var1","var2")
time(x) <- as.Date("2023-03-25") + 0:1
writeCDF(x,"test.nc", overwrite=TRUE, varname="var")

r <- rast("test.nc")
#class       : SpatRaster 
#dimensions  : 5, 9, 2  (nrow, ncol, nlyr)
#resolution  : 40, 36  (x, y)
#extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 
#source      : test.nc 
#varname     : var 
#names       : var_1, var_2 
#time (days) : 2023-03-25 to 2023-03-26 

Does that work for you?

@dramanica
Copy link
Contributor Author

My issue with that solution is that the SpatRaster that you read from the netcdf has different variable names from the SpatRaster that generated the netcdf. I would favour an approach where layers becomes variables in the netcdf, so that you can save and reload the same object. In other words, if my layers are called "foo" and "bar", I would like to get back "foo" and "bar" from the netcdf. I am aware that I can achieve that by creating an sds object with each layer as a dataset, and then saving to netcdf, but it seems convoluted.

@rhijmans
Copy link
Member

I think that what you want does not really match the netCDF format. The third dimension is normally time or something of that nature. It can also be used for different variables that can then become layers, but that is not the most common or useful use of that format. If you must use that format, you could, as you point out, use a SpatRasterDataset instead. For example like this:

x <- rast(nrows=5, ncols=9, vals=1:45)
add(x) <- x*3
names(x) <-c("var1","var2")

# coerce to sds
s <- sds(as.list(x))
names(s) <- names(x)

writeCDF(s, "test.nc", overwrite=TRUE)
rast("test.nc")

#class       : SpatRaster 
#dimensions  : 5, 9, 2  (nrow, ncol, nlyr)
#resolution  : 40, 36  (x, y)
#extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 
#sources     : test.nc:var1  
#              test.nc:var2  
#varnames    : var1 
#              var2 
#names       : var1, var2 

I considered adding an argument "split=FALSE" to writeCDF that could do this automatically when set to TRUE, but it would be messy to deal with the other arguments.

@dramanica
Copy link
Contributor Author

On this one, we might have to agree to disagree. The way I see it, when an object is saved, it should be possible to retrieve it as is, without losing information. The netCDF format is really a container, and one can have as many variables and dimensions as needed, so I am not sure I see the problem of turning layers into variables.
Having said all that, you are the package maintainer, so I defer to you. I can just keep coercing to sds as I have done in the past.

@dramanica
Copy link
Contributor Author

@rhijmans Thinking about it, a split=TRUE could be easily fitted without too much trouble by simply imposing that the other variables have to be unset if you use split (as names and units then come from the SpatRaster itself). So, the following would allow for objects to be saved and reloaded without loss of information:

setMethod("writeCDF", signature(x="SpatRaster"),
          function(x, filename, varname, longname="", unit="", split=FALSE, ...) {
            filename <- trimws(filename)
            stopifnot(filename != "")
            if (split==FALSE){
              if (missing(varname)) {
                varname <- tools::file_path_sans_ext(basename(filename))
              }
              varnames(x) <- varname
              longnames(x) <- longname
              units(x) <- unit
              y <- sds(x)              
            } else {
              if (any(!missing(varname),
                      longname!="",
                      unit!="")){
                stop("split=TRUE requires varname, longname and unit not to be set to any value")
              }
              y <- sds(as.list(x))
              names(y) <- names(x)
              units(y) <- units(x)
            }
            invisible( writeCDF(y, filename=filename, ...) )
          }
)

And now everything works just fine:

x <- rast(nrows=5, ncols=9, vals=1:45)
add(x) <- x*3
names(x) <-c("var1","var2")
units(x) <- c("unit1","unit2")
time(x,tstep="years")<-c(20,20)
x
writeCDF(x, "test.nc", overwrite=TRUE, split=TRUE)
rast("test.nc")

Acceptable? Or too messy for your taste? If acceptable, happy to make a pull request and provide appropriate guidance in the manpage.

@rhijmans
Copy link
Member

rhijmans commented Apr 4, 2023

Thanks, I have now added argument "split=FALSE".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants