-
Notifications
You must be signed in to change notification settings - Fork 3
/
README.Rmd
195 lines (145 loc) · 7.73 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# tidyrgee
<!-- badges: start -->
[![R-CMD-check](https://github.com/r-tidy-remote-sensing/tidyrgee/workflows/R-CMD-check/badge.svg)](https://github.com/r-tidy-remote-sensing/tidyrgee/actions)
[![CRAN status](https://www.r-pkg.org/badges/version/tidyrgee)](https://CRAN.R-project.org/package=tidyrgee)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![codecov](https://codecov.io/gh/r-tidy-remote-sensing/tidyrgee/branch/main/graph/badge.svg)](https://app.codecov.io/gh/r-tidy-remote-sensing/tidyrgee)
[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/dwyl/esta/issues)
<!-- badges: end -->
tidyrgee brings components of [dplyr's](https://github.com/tidyverse/dplyr/) syntax to remote sensing analysis, using the [rgee](https://github.com/r-spatial/rgee) package.
rgee is an R-API for the [Google Earth Engine (GEE)](https://earthengine.google.com/) which provides R support to the methods/functions available in the JavaScript code editor and python API. The `rgee` syntax was written to be very similar to the GEE Javascript/python. However, this syntax can feel unnatural and difficult at times especially to users with less experience in GEE. Simple concepts that are easy express verbally can be cumbersome even to advanced users (see *Syntax Comparison*). The `tidyverse` has provided [principals and concepts](https://tidyr.tidyverse.org/articles/tidy-data.html) that help data scientists/R-users efficiently write and communicate there code in a clear and concise manner. `tidyrgee` aims to bring these principals to GEE-remote sensing analyses.
tidyrgee provides the convenience of pipe-able dplyr style methods such as `filter`, `group_by`, `summarise`, `select`,`mutate`,etc. using [rlang's](https://github.com/r-lib/rlang) style of non-standard evaluation (NSE)
try it out!
## Installation
Install from CRAN with:
``` r
install.packages("tidyrgee")
```
You can install the development version of tidyrgee from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("r-tidy-remote-sensing/tidyrgee")
```
It is important to note that to use tidyrgee you must be signed up for a GEE developer account. Additionally you must install the rgee package following there [installation and setup instructions here](https://github.com/r-spatial/rgee)
## Syntax Comparison
Below is a quick example demonstrating the simplified syntax. Note that the `rgee` syntax is very similar to the syntax in the Javascript code editor. In this example I want to simply calculate mean monthly NDVI (per pixel) for every year from 2000-2015. This is clearly a fairly simple analysis to verbalize/conceptualize. Yet, using using standard GEE conventions, the process is not so simple. Aside, from many peculiarities such as `flattening` a list and then calling and then rebuilding the `imageCollection` at the end, I also have to write and **think about** a double mapping statement using months and years (sort of like a double for-loop). By comparison the tidyrgee syntax removes the complexity and allows me to write the code in a more human readable/interpretable format.
<table class='table' width="100%">
<tr>
<th>rgee (similar to Javascript)</th>
<th>tidyrgee</th> <tr>
<tr>
<td>
```{r, eval=F}
modis <- ee$ImageCollection( "MODIS/006/MOD13Q1")
modis_ndvi <- modis$select("NDVI")
month_list <- ee$List$sequence(1,12)
year_list <- ee$List$sequence(2000,2015)
mean_ndvi <- ee$ImageCollection$fromImages(
year_list$map(
ee_utils_pyfunc(function (y) {
month_list$map(
ee_utils_pyfunc(function (m) {
# dat_pre_filt <-
modis_ndvi$
filter(ee$Filter$calendarRange(y, y, 'year'))$
filter(ee$Filter$calendarRange(m, m, 'month'))$
mean()$
set('year',y)$
set('month',m)$
set('date',ee$Date$fromYMD(y,m,1))$
set('system:time_start',ee$Date$millis(ee$Date$fromYMD(y,m,1)))
})
)
}))$flatten())
```
</td>
<td>
```{r,eval =F}
modis <- ee$ImageCollection( "MODIS/006/MOD13Q1")
modis_tidy <- as_tidyee(modis)
mean_ndvi <- modis_tidy |>
select("NDVI") |>
filter(year %in% 2000:2015) |>
group_by(year, month) |>
summarise(stat= "mean")
```
</td>
<tr>
</table>
## Example usage
Below are a couple examples showing some of the available functions.
To load images/imageCollections you follow the standard approach using `rgee`:
- load libraries
- initialize the GEE session
- load `ee$ImageCollection`/ `ee$Image`
```{r example,warning= F,message=FALSE,eval=T}
library(tidyrgee)
library(rgee)
ee_Initialize(quiet = T)
modis_ic <- ee$ImageCollection("MODIS/006/MOD13Q1")
```
Once the above steps are performed you can convert the `ee$ImageCollection` to a `tidyee` object with the function `as_tidyee`. The tidyee object stores the original `ee$ImageCollection` as `ee_ob` (for earth engine object) and produces as virtual table/data.frame stored as `vrt`. This vrt not only facilitates the use of dplyr/tidyverse methods, but also allows the user to better view the data stored in the accompanying imageCollection. The `ee_ob` and `vrt` inside the tidyee object are linked, any function applied to the tidyee object will apply to them both so that they remain in sync.
```{r, eval=T}
modis_tidy <- as_tidyee(modis_ic)
```
the `vrt` comes with a few built in columns which you can use off the bat for filtering and grouping, but you can also `mutate` additional info for filtering and grouping (i.e using `lubridate` to create new temporal groupings)
```{r}
knitr::kable(modis_tidy$vrt |> head())
```
Next we demonstrate filtering by date, month, and year. The `vrt` and `ee_ob` are always filtered together
- **by date**
```{r}
modis_tidy |>
filter(date>="2021-06-01")
```
- **by year**
```{r}
modis_tidy |>
filter(year%in% 2010:2011)
```
- **by month**
```{r}
modis_tidy |>
filter(month%in% c(7,8))
```
### Putting a dplyr-like chain together:
In this next example we pipe together multiple functions (`select`, `filter`, `group_by`, `summarise`) to
1. select the `NDVI` band from the ImageCollection
2. filter the imageCollection to a desired date range
2. group the filtered ImageCollection by month
3. summarizing each pixel by year and month.
The result will be an `ImageCollection` with the one `Image` per month (12 images) where each pixel in each image represents the average NDVI value for that month calculated using monthly data from 2000 2015.
```{r}
modis_tidy |>
select("NDVI") |>
filter(year %in% 2000:2015) |>
group_by(month) |>
summarise(stat= "mean")
```
You can easily `group_by` more than 1 property to calculate different summary stats. Below we
1. filter to only data from 2021-2022
2. group by year, month and calculate the median NDVI pixel value
As we are using the MODIS 16-day composite we summarising approximately 2 images per month to create median composite image fo reach month in the specified years. The `vrt` holds a `list-col` containing all the dates summarised per new composite image.
```{r}
modis_tidy |>
select("NDVI") |>
filter(year %in% 2021:2022) |>
group_by(year,month) |>
summarise(stat= "median")
```
To improve interoperability with `rgee` we have included the `as_ee` function to return the `tidyee` object back to `rgee` classes when necessary
```{r,eval=T}
modis_ic <- modis_tidy |> as_ee()
```