From 64ead998736577a49e96e1c86d02e66a20b772f6 Mon Sep 17 00:00:00 2001
From: Robinlovelace
Date: Wed, 25 Sep 2024 12:53:14 +0000
Subject: [PATCH] Deploy commit: improves ch16 (#1127)
5e2dd849b5336e4c2605a103d8d23bf30ec7e893
---
13-transport.md | 2 +-
16-synthesis.md | 57 +++++++++++++-------------
conclusion.html | 66 +++++++++++++++---------------
figures/circle-intersection-1.png | Bin 13642 -> 13642 bytes
figures/cycleways-1.png | Bin 107988 -> 108473 bytes
figures/points-1.png | Bin 15698 -> 15698 bytes
figures/rnetvis-1.png | Bin 53514 -> 52169 bytes
figures/rnetvis-2.png | Bin 44331 -> 50205 bytes
figures/routes-1.png | Bin 130691 -> 132440 bytes
figures/wayssln-1.png | Bin 116559 -> 126094 bytes
search.json | 2 +-
transport.html | 2 +-
12 files changed, 65 insertions(+), 64 deletions(-)
diff --git a/13-transport.md b/13-transport.md
index fe28f1b2d..cb68bb7b3 100644
--- a/13-transport.md
+++ b/13-transport.md
@@ -570,7 +570,7 @@ routes_short_scenario = routes_short |>
mutate(bicycle = bicycle + car_driver * uptake,
car_driver = car_driver * (1 - uptake))
sum(routes_short_scenario$bicycle) - sum(routes_short$bicycle)
-#> [1] 558
+#> [1] 583
```
Having created a scenario in which approximately 4000 trips have switched from driving to cycling, we can now model where this updated cycling activity will take place.
diff --git a/16-synthesis.md b/16-synthesis.md
index a1bf2d7c8..7f0d653b3 100644
--- a/16-synthesis.md
+++ b/16-synthesis.md
@@ -4,10 +4,10 @@
## Introduction
-Like the introduction, this concluding chapter contains few code chunks.
+Like the introduction, this concluding chapter contains a few code chunks.
The aim is to synthesize the contents of the book, with reference to recurring themes/concepts, and to inspire future directions of application and development.
The chapter has no prerequisites.
-However, you may get more out of it if you have read and attempted the exercises in Part I (Foundations), try more advances approaches in Part II, and considered how geocomputation can help you solve work, research or other problems, with reference to the chapters in Part III (Applications).
+However, you may get more out of it if you have read and attempted the exercises in Part I (Foundations), tried more advanced approaches in Part II (Extensions), and considered how geocomputation can help you solve work, research or other problems, with reference to the chapters in Part III (Applications).
The chapter is organized as follows.
Section \@ref(package-choice) discusses the wide range of options for handling geographic data in R.
@@ -65,7 +65,7 @@ identical(nz_name1$Name, nz_name2$Name) # check results
This raises the question: which to use?
The answer is: it depends.
-Each approach has advantages: base R\index{R!base} tends to be stable, well-known, and has minimal dependencies which is why it is often preferred for software (package) development.
+Each approach has advantages: base R\index{R!base} tends to be stable, well-known, and has minimal dependencies, which is why it is often preferred for software (package) development.
The tidyverse approach, on the other hand, is often preferred for interactive programming.
Choosing between the two approaches is therefore a matter of preference and application.
@@ -89,8 +89,8 @@ Overlapping functionality can be good.
A new package with functionality similar (but not identical) to that of an existing package can increase resilience, performance (partly driven by friendly competition and mutual learning between developers) and choice, all of which are key benefits of doing geocomputation with open source software.
In this context, deciding which combination of **sf**, **tidyverse**, **terra** and other packages to use should be made with knowledge of alternatives.
The **sp** ecosystem that **sf**\index{sf} superseded, for example, can do many of the things covered in this book and, due to its age, is built on by many other packages.
-At the time of writing in 2024, 463 package `Depend` on or `Import` **sp**, up slightly from 452 in October 2018, showing that its data structures are widely used and have been extended in many directions.
-The equivalent numbers for **sf** are 69 in 2018 and 431 in 2024, highlighting that the package is future proof and has a growing user base and developer community [@bivand_progress_2021].
+At the time of writing in 2024, 463 packages `Depend` on or `Import` **sp**, up slightly from 452 in October 2018, showing that its data structures are widely used and have been extended in many directions.
+The equivalent numbers for **sf** are 69 in 2018 and 431 in 2024, highlighting that the package is future-proof and has a growing user base and developer community [@bivand_progress_2021].
Although best known for point pattern analysis, the **spatstat** package also supports raster\index{raster} and other vector geometries and provides powerful functionality for spatial statistics and more [@baddeley_spatstat_2005].
It may also be worth researching new alternatives that are under development if you have needs that are not met by established packages.
@@ -114,8 +114,8 @@ These resources include @zuur_mixed_2009, @zuur_beginners_2017 which focus on ec
[*R for Geographic Data Science*](https://sdesabbata.github.io/r-for-geographic-data-science/) provides an introduction to R for geographic data science and modeling.
We have largely omitted geocomputation on 'big data'\index{big data}, by which we mean datasets that do not fit on a high-spec laptop.
-This decision is justified by the fact that the majority of geographic datasets that are needed for common research or policy applications *do* fit on consumer hardware, large high resolution remote sensing datasets being a notable exception (see Section \@ref(cloud)).
-It is possible to get more RAM on your computer or to temporarily 'renting' compute power available on platforms such as [GitHub Codespaces, which can be used to run the code in this book](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=84222786&machine=basicLinux32gb&devcontainer_path=.devcontainer.json&location=WestEurope).
+This decision is justified by the fact that the majority of geographic datasets that are needed for common research or policy applications *do* fit on consumer hardware, large high-resolution remote sensing datasets being a notable exception (see Section \@ref(cloud)).
+It is possible to get more RAM on your computer or to temporarily 'rent' compute power available on platforms such as [GitHub Codespaces, which can be used to run the code in this book](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=84222786&machine=basicLinux32gb&devcontainer_path=.devcontainer.json&location=WestEurope).
Furthermore, learning to solve problems on small datasets is a prerequisite to solving problems on huge datasets; the emphasis in this book is on getting started, and the skills you learn here will be useful when you move to bigger datasets.
Analysis of 'big data' often involves extracting a small amount of data from a database for a specific statistical analysis.
Spatial databases, covered in Chapter \@ref(gis), can help with the analysis of datasets that do not fit in memory.
@@ -136,16 +136,16 @@ If you need to work with big geographic datasets, we also recommend exploring pr
Geocomputation is a large and challenging field, making issues and temporary blockers nearly inevitable.
In many cases you may just 'get stuck' at a particular point in your data analysis workflow, facing cryptic error messages that are hard to debug.
Or you may get unexpected results with few clues about what is going on.
-This section provides pointers to help you overcome such problems, by clearly defining the problem, searching for existing knowledge on solutions and, if those approaches to not solve the problem, through the art of asking good questions.
+This section provides pointers to help you overcome such problems, by clearly defining the problem, searching for existing knowledge on solutions and, if those approaches do not solve the problem, through the art of asking good questions.
When you get stuck at a particular point, it is worth first taking a step back and working out which approach is most likely to solve the issue.
-Trying each of the following steps --- skipping steps already taken --- provides a structured approach to problem solving:
+Trying each of the following steps --- skipping steps already taken --- provides a structured approach to problem-solving:
1. Define exactly what you are trying to achieve, starting from first principles (and often a sketch, as outlined below)
-2. Diagnose exactly where in your code the unexpected results arise, by running and exploring the outputs of individual lines of code and their individual components (you can can run individual parts of a complex command by selecting them with a cursor and pressing Ctrl+Enter in RStudio, for example)
+2. Diagnose exactly where in your code the unexpected results arise, by running and exploring the outputs of individual lines of code and their individual components (you can run individual parts of a complex command by selecting them with a cursor and pressing Ctrl+Enter in RStudio, for example)
3. Read the documentation of the function that has been diagnosed as the 'point of failure' in the previous step. Simply understanding the required inputs to functions, and running the examples that are often provided at the bottom of help pages, can help solve a surprisingly large proportion of issues (run the command `?terra::rast` and scroll down to the examples that are worth reproducing when getting started with the function, for example)
-4. If reading R's built-in documentation, as outlined in the previous step, does not help to solve the problem, it is probably time to do a broader search online to see if others have written about the issue you're seeing. See a list of places to search for help below for places to search
+4. If reading R's built-in documentation, as outlined in the previous step, does not help to solve the problem, it is probably time to do a broader search online to see if others have written about the issue you're seeing. See a list of places to search for help below
5. If all the previous steps above fail, and you cannot find a solution from your online searches, it may be time to compose a question with a reproducible example and post it in an appropriate place
Steps 1 to 3 outlined above are fairly self-explanatory but, due to the vastness of the internet and multitude of search options, it is worth considering effective search strategies before deciding to compose a question.
@@ -154,11 +154,11 @@ Steps 1 to 3 outlined above are fairly self-explanatory but, due to the vastness
Search engines are a logical place to start for many issues.
'Googling it' can in some cases result in the discovery of blog posts, forum messages and other online content about the precise issue you're having.
-Simply typing in a clear description of the problem/question is a valid approach here but it is important to be specific (e.g., with reference to function and package names and input dataset sources if the problem is dataset specific).
+Simply typing in a clear description of the problem/question is a valid approach here, but it is important to be specific (e.g., with reference to function and package names and input dataset sources if the problem is dataset-specific).
You can also make online searches more effective by including additional detail:
-- Use quote marks to maximize the chances that 'hits' relate to the exact issue you're having by reducing the number of results returned. For example, if you try and fail to save a GeoJSON file in a location that already exists, you will get an error containing the message "GDAL Error 6: DeleteLayer() not supported by this dataset". A specific search query such as `"GDAL Error 6" sf` is more likely to yield a solution than searching for `GDAL Error 6` without the quote marks
+- Use quotation marks to maximize the chances that 'hits' relate to the exact issue you're having by reducing the number of results returned. For example, if you try and fail to save a GeoJSON file in a location that already exists, you will get an error containing the message "GDAL Error 6: DeleteLayer() not supported by this dataset". A specific search query such as `"GDAL Error 6" sf` is more likely to yield a solution than searching for `GDAL Error 6` without the quotation marks
- Set [time constraints](https://uk.pcmag.com/software-services/138320/21-google-search-tips-youll-want-to-learn): for example, only returning content created within the last year can be useful when searching for help on an evolving package
- Make use of additional [search engine features](https://www.makeuseof.com/tag/6-ways-to-search-by-date-on-google/), for example restricting searches to content hosted on CRAN with `site:r-project.org`
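The "GDAL Error 6" example above can be reproduced with a minimal sketch (not from the book; the file name is hypothetical and **sf** and **spData** are assumed to be installed):

```r
library(sf)
library(spData)
st_write(world, "world.geojson")   # first write succeeds
# Re-running the same command errors with a message containing
# "GDAL Error 6: DeleteLayer() not supported by this dataset";
# overwriting the whole file explicitly avoids the error:
st_write(world, "world.geojson", delete_dsn = TRUE)
```

Searching for the quoted error string together with the package name, as suggested above, is usually more effective than pasting the whole traceback.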
@@ -176,10 +176,10 @@ There are many forums where you can do this, including:
### Reproducible examples with **reprex** {#reprex}
-In terms of asking a good question, a clearly stated questions supported by an accessible and fully reproducible example is key (see also https://r4ds.hadley.nz/workflow-help.html).
+In terms of asking a good question, a clearly stated question supported by an accessible and fully reproducible example is key (see also https://r4ds.hadley.nz/workflow-help.html).
It is also helpful, after showing the code that 'did not work' from the user's perspective, to explain what you would like to see.
A very useful tool for creating reproducible examples is the **reprex** package\index{reproducibility}.
-To highlight unexpected behaviour, you can write completely reproducible code that demonstrates the issue and then use the `reprex()` function to create a copy of your code that can be pasted into a forum or other online space.
+To highlight unexpected behavior, you can write completely reproducible code that demonstrates the issue and then use the `reprex()` function to create a copy of your code that can be pasted into a forum or other online space.
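A minimal sketch of this workflow (not from the book; assumes the **reprex** package is installed) is to pass the problematic code to `reprex()`:

```r
# install.packages("reprex")  # if not already installed
library(reprex)
reprex({
  library(sf)
  library(spData)
  plot(st_geometry(world), col = "green")
})
# reprex() runs the code in a clean session, renders it together with
# its output and copies forum-ready markdown to the clipboard
```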
Imagine you are trying to create a map of the world with blue sea and green land.
You could simply ask how to do this in one of the places outlined in the previous section.
@@ -192,7 +192,7 @@ library(spData)
plot(st_geometry(world), col = "green")
```
-If you post this code in a forum, it is likely that you will get a more specific and useful responses.
+If you post this code in a forum, it is likely that you will get a more specific and useful response.
For example, someone might respond with the following code, which demonstrably solves the problem, as illustrated in Figure \@ref(fig:16-synthesis-reprex):
```r
@@ -216,7 +216,7 @@ Demonstrating your own efforts to solve a problem, and providing a reproducible
In some cases, you may not be able to find a solution to your problem online, or you may not be able to formulate a question that can be answered by a search engine.
The best starting point in such cases, or when developing a new geocomputational methodology, may be a pen and paper (or equivalent digital sketching tools such as [Excalidraw](https://excalidraw.com/) and [tldraw](https://www.tldraw.com/) which allow collaborative sketching and rapid sharing of ideas).
-During the most creative early stages of methodological development work software *of any kind* can slow down your thoughts and direct your thinking away from important abstract thoughts.
+During the most creative early stages of methodological development work, software *of any kind* can slow down your thoughts and direct them away from important abstract thoughts.
Framing the question with mathematics is also highly recommended, with reference to a minimal example that you can sketch 'before and after' versions of numerically.
If you have the skills and if the problem warrants it, describing the approach algebraically can in some cases help develop effective implementations.
@@ -241,7 +241,7 @@ Each has evolving geospatial capabilities.
[**rasterio**](https://github.com/rasterio/rasterio), for example, is a Python package with functionality similar to that of the **terra** package used in this book.
See [*Geocomputation with Python*](https://py.geocompx.org/) for an introduction to geocomputation with Python.
-Dozens of geospatial libraries have been developed in C++\index{C++}, including well known libraries such as GDAL\index{GDAL} and GEOS\index{GEOS}, and less well known libraries such as the **[Orfeo Toolbox](https://github.com/orfeotoolbox/OTB)** for processing remote sensing (raster) data.
+Dozens of geospatial libraries have been developed in C++\index{C++}, including well-known libraries such as GDAL\index{GDAL} and GEOS\index{GEOS}, and less well-known libraries such as the **[Orfeo Toolbox](https://github.com/orfeotoolbox/OTB)** for processing remote sensing (raster) data.
[**Turf.js**](https://github.com/Turfjs/turf) is an example of the potential for doing geocomputation with JavaScript.
[GeoTrellis](https://geotrellis.io/) provides functions for working with raster and vector data in the Java-based language Scala.
And [WhiteBoxTools](https://github.com/jblindsay/whitebox-tools) provides an example of a rapidly evolving command line GIS implemented in Rust.
@@ -258,10 +258,10 @@ A next step in this case is to read-up on relevant articles in the area such as
## The open source approach {#benefit}
-This is a technical book so it makes sense for the next steps, outlined in the previous section, to also be technical.
+This is a technical book, so it makes sense for the next steps, outlined in the previous section, to also be technical.
However, there are wider issues worth considering in this final section, which returns to our definition of geocomputation\index{geocomputation}.
One of the elements of the term introduced in Chapter \@ref(intro) was that geographic methods should have a positive impact.
-Of course, how to define and measure 'positive' is a subjective, philosophical question, beyond the scope of this book.
+Of course, how to define and measure 'positive' is a subjective, philosophical question that is beyond the scope of this book.
Regardless of your worldview, consideration of the impacts of geocomputational work is a useful exercise:
the potential for positive impacts can provide a powerful motivation for future learning and, conversely, new methods can open up many possible fields of application.
These considerations lead to the conclusion that geocomputation is part of a wider 'open source approach'.
@@ -275,13 +275,14 @@ Both capture the essence of working with geographic data, but geocomputation has
- Reproducibility\index{reproducibility}
We added the final ingredient: reproducibility was barely mentioned in early work on geocomputation, yet a strong case can be made for it being a vital component of the first two ingredients.
+
Reproducibility\index{reproducibility}:
-- Encourages *creativity* by shifting the focus away from the basics (which are readily available through shared code) and towards applications
-- Discourages people from 'reinventing the wheel': there is no need to re-do what others have done if their methods can be used by others
-- Makes research more conducive to real-world applications, by enabling anyone in any sector to apply your methods in new areas
+- Encourages *creativity* by shifting the focus away from the basics (which are readily available through shared code) and toward applications
+- Discourages people from 'reinventing the wheel': there is no need to redo what others have done if their methods can be used by others
+- Makes research more conducive to real-world applications, by enabling anyone in any sector to apply one's methods in new areas
-If reproducibility is the defining asset of geocomputation (or command line GIS) it is worth considering what makes it reproducible.
+If reproducibility is the defining asset of geocomputation (or command line GIS), it is worth considering what makes it reproducible.
This brings us to the 'open source approach', which has three main components:
- A command line interface\index{command line interface} (CLI), encouraging scripts recording geographic work to be shared and reproduced
@@ -289,19 +290,19 @@ This brings us to the 'open source approach', which has three main components:
- An active user and developer community, which collaborates and self-organizes to build complementary and modular tools
Like the term geocomputation\index{geocomputation}, the open source approach is more than a technical entity.
-It is a community composed of people interacting daily with shared aims: to produce high performance tools, free from commercial or legal restrictions, that are accessible for anyone to use.
+It is a community composed of people interacting daily with shared aims: to produce high-performance tools, free from commercial or legal restrictions, that are accessible for anyone to use.
The open source approach to working with geographic data has advantages that transcend the technicalities of how the software works, encouraging learning, collaboration and an efficient division of labor.
There are many ways to engage in this community, especially with the emergence of code hosting sites, such as GitHub, which encourage communication and collaboration.
A good place to start is simply browsing through some of the source code, 'issues' and 'commits' in a geographic package of interest.
A quick glance at the `r-spatial/sf` GitHub repository, which hosts the code underlying the **sf**\index{sf} package, shows that 100+ people have contributed to the codebase and documentation.
-Dozens more people have contributed by asking question and by contributing to 'upstream' packages that **sf** uses.
+Dozens more people have contributed by asking questions and by contributing to 'upstream' packages that **sf** uses.
More than 1,500 issues have been closed on its [issue tracker](https://github.com/r-spatial/sf/issues), representing a huge amount of work to make **sf** faster, more stable and user-friendly.
This example, from just one package out of dozens, shows the scale of the intellectual operation underway to make R a highly effective and continuously evolving language for geocomputation.
It is instructive to watch the incessant development activity happen in public fora such as GitHub, but it is even more rewarding to become an active participant.
This is one of the greatest features of the open source approach: it encourages people to get involved.
-This book itself is a result of the open source approach:
-it was motivated by the amazing developments in R's geographic capabilities over the last two decades, but made practically possible by dialogue and code sharing on platforms for collaboration.
+This book is a result of the open source approach:
+it was motivated by the amazing developments in R's geographic capabilities over the last two decades, but made practically possible by dialogue and code-sharing on platforms for collaboration.
We hope that in addition to disseminating useful methods for working with geographic data, this book inspires you to take a more open source approach.
diff --git a/conclusion.html b/conclusion.html
index a8f93ec4e..c86fccdb1 100644
--- a/conclusion.html
+++ b/conclusion.html
@@ -6,16 +6,16 @@
Chapter 16 Conclusion | Geocomputation with R
@@ -112,10 +112,10 @@
16.1 Introduction
-Like the introduction, this concluding chapter contains few code chunks.
+Like the introduction, this concluding chapter contains a few code chunks.
The aim is to synthesize the contents of the book, with reference to recurring themes/concepts, and to inspire future directions of application and development.
The chapter has no prerequisites.
-However, you may get more out of it if you have read and attempted the exercises in Part I (Foundations), try more advances approaches in Part II, and considered how geocomputation can help you solve work, research or other problems, with reference to the chapters in Part III (Applications).
+However, you may get more out of it if you have read and attempted the exercises in Part I (Foundations), tried more advanced approaches in Part II (Extensions), and considered how geocomputation can help you solve work, research or other problems, with reference to the chapters in Part III (Applications).
The chapter is organized as follows.
Section 16.2 discusses the wide range of options for handling geographic data in R.
Choice is a key feature of open source software; the section provides guidance on choosing between the various options.
@@ -163,7 +163,7 @@
#> [1] TRUE
This raises the question: which to use?
The answer is: it depends.
-Each approach has advantages: base R tends to be stable, well-known, and has minimal dependencies which is why it is often preferred for software (package) development.
+Each approach has advantages: base R tends to be stable, well-known, and has minimal dependencies, which is why it is often preferred for software (package) development.
The tidyverse approach, on the other hand, is often preferred for interactive programming.
Choosing between the two approaches is therefore a matter of preference and application.
While this book covers commonly needed functions — such as the base R [ subsetting operator and the dplyr function select() demonstrated in the code chunk above — there are many other functions for working with geographic data, from other packages, that have not been mentioned.
@@ -181,8 +181,8 @@
A new package with functionality similar (but not identical) to that of an existing package can increase resilience, performance (partly driven by friendly competition and mutual learning between developers) and choice, all of which are key benefits of doing geocomputation with open source software.
In this context, deciding which combination of sf, tidyverse, terra and other packages to use should be made with knowledge of alternatives.
The sp ecosystem that sf superseded, for example, can do many of the things covered in this book and, due to its age, is built on by many other packages.
-At the time of writing in 2024, 463 package Depend on or Import sp, up slightly from 452 in October 2018, showing that its data structures are widely used and have been extended in many directions.
-The equivalent numbers for sf are 69 in 2018 and 431 in 2024, highlighting that the package is future proof and has a growing user base and developer community (Bivand 2021).
+At the time of writing in 2024, 463 packages Depend on or Import sp, up slightly from 452 in October 2018, showing that its data structures are widely used and have been extended in many directions.
+The equivalent numbers for sf are 69 in 2018 and 431 in 2024, highlighting that the package is future-proof and has a growing user base and developer community (Bivand 2021).
Although best known for point pattern analysis, the spatstat package also supports raster and other vector geometries and provides powerful functionality for spatial statistics and more (Baddeley and Turner 2005).
It may also be worth researching new alternatives that are under development if you have needs that are not met by established packages.
@@ -203,8 +203,8 @@
These resources include A. Zuur et al. (2009) and A. F. Zuur et al. (2017), which focus on ecological use cases, and freely available teaching material and code on Geostatistics & Open-source Statistical Computing hosted at css.cornell.edu/faculty/dgr2.
R for Geographic Data Science provides an introduction to R for geographic data science and modeling.
We have largely omitted geocomputation on ‘big data’, by which we mean datasets that do not fit on a high-spec laptop.
-This decision is justified by the fact that the majority of geographic datasets that are needed for common research or policy applications do fit on consumer hardware, large high resolution remote sensing datasets being a notable exception (see Section 10.8).
-It is possible to get more RAM on your computer or to temporarily ‘renting’ compute power available on platforms such as GitHub Codespaces, which can be used to run the code in this book.
+This decision is justified by the fact that the majority of geographic datasets that are needed for common research or policy applications do fit on consumer hardware, large high-resolution remote sensing datasets being a notable exception (see Section 10.8).
+It is possible to get more RAM on your computer or to temporarily ‘rent’ compute power available on platforms such as GitHub Codespaces, which can be used to run the code in this book.
Furthermore, learning to solve problems on small datasets is a prerequisite to solving problems on huge datasets; the emphasis in this book is on getting started, and the skills you learn here will be useful when you move to bigger datasets.
Analysis of ‘big data’ often involves extracting a small amount of data from a database for a specific statistical analysis.
Spatial databases, covered in Chapter 10, can help with the analysis of datasets that do not fit in memory.
@@ -224,15 +224,15 @@
Geocomputation is a large and challenging field, making issues and temporary blockers nearly inevitable.
In many cases you may just ‘get stuck’ at a particular point in your data analysis workflow, facing cryptic error messages that are hard to debug.
Or you may get unexpected results with few clues about what is going on.
-This section provides pointers to help you overcome such problems, by clearly defining the problem, searching for existing knowledge on solutions and, if those approaches to not solve the problem, through the art of asking good questions.
+This section provides pointers to help you overcome such problems, by clearly defining the problem, searching for existing knowledge on solutions and, if those approaches do not solve the problem, through the art of asking good questions.
When you get stuck at a particular point, it is worth first taking a step back and working out which approach is most likely to solve the issue.
-Trying each of the following steps — skipping steps already taken — provides a structured approach to problem solving:
+Trying each of the following steps — skipping steps already taken — provides a structured approach to problem-solving:
Define exactly what you are trying to achieve, starting from first principles (and often a sketch, as outlined below)
-Diagnose exactly where in your code the unexpected results arise, by running and exploring the outputs of individual lines of code and their individual components (you can can run individual parts of a complex command by selecting them with a cursor and pressing Ctrl+Enter in RStudio, for example)
+Diagnose exactly where in your code the unexpected results arise, by running and exploring the outputs of individual lines of code and their individual components (you can run individual parts of a complex command by selecting them with a cursor and pressing Ctrl+Enter in RStudio, for example)
Read the documentation of the function that has been diagnosed as the ‘point of failure’ in the previous step. Simply understanding the required inputs to functions, and running the examples that are often provided at the bottom of help pages, can help solve a surprisingly large proportion of issues (run the command ?terra::rast and scroll down to the examples that are worth reproducing when getting started with the function, for example)
-If reading R’s built-in documentation, as outlined in the previous step, does not help to solve the problem, it is probably time to do a broader search online to see if others have written about the issue you’re seeing. See a list of places to search for help below for places to search
+If reading R’s built-in documentation, as outlined in the previous step, does not help to solve the problem, it is probably time to do a broader search online to see if others have written about the issue you’re seeing. See a list of places to search for help below
If all the previous steps above fail, and you cannot find a solution from your online searches, it may be time to compose a question with a reproducible example and post it in an appropriate place
Steps 1 to 3 outlined above are fairly self-explanatory but, due to the vastness of the internet and multitude of search options, it is worth considering effective search strategies before deciding to compose a question.
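The diagnostic workflow in steps 1 to 3 can be sketched with a minimal (hypothetical, non-spatial) example: run each component of a failing command in isolation to find the point of failure, then read that function's help page:

```r
# A command that gives an unexpected result (hypothetical example data)
x = c(1, 2, NA, 4)
mean(sqrt(x))              # returns NA: but which step introduced it?

# Step 2: diagnose by running each component separately
sqrt(x)                    # the NA is already present after sqrt()

# Step 3: read the documentation of the diagnosed function (?mean),
# which reveals the na.rm argument
mean(sqrt(x), na.rm = TRUE)
```

The same approach scales to complex piped commands: select and run each part of the pipeline in turn until the unexpected output appears.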
@@ -242,11 +242,11 @@
Search engines are a logical place to start for many issues.
‘Googling it’ can in some cases result in the discovery of blog posts, forum messages and other online content about the precise issue you’re having.
-Simply typing in a clear description of the problem/question is a valid approach here but it is important to be specific (e.g., with reference to function and package names and input dataset sources if the problem is dataset specific).
+Simply typing in a clear description of the problem/question is a valid approach here, but it is important to be specific (e.g., with reference to function and package names and input dataset sources if the problem is dataset-specific).
You can also make online searches more effective by including additional detail:
-Use quote marks to maximize the chances that ‘hits’ relate to the exact issue you’re having by reducing the number of results returned. For example, if you try and fail to save a GeoJSON file in a location that already exists, you will get an error containing the message “GDAL Error 6: DeleteLayer() not supported by this dataset”. A specific search query such as "GDAL Error 6" sf is more likely to yield a solution than searching for GDAL Error 6 without the quote marks
+Use quotation marks to maximize the chances that ‘hits’ relate to the exact issue you’re having by reducing the number of results returned. For example, if you try and fail to save a GeoJSON file in a location that already exists, you will get an error containing the message “GDAL Error 6: DeleteLayer() not supported by this dataset”. A specific search query such as "GDAL Error 6" sf is more likely to yield a solution than searching for GDAL Error 6 without the quotation marks
Set time constraints: only returning content created within the last year, for example, can be useful when searching for help on an evolving package
Make use of additional search engine features, for example restricting searches to content hosted on CRAN with site:r-project.org
@@ -272,10 +272,10 @@
16.4.3 Reproducible examples with reprex
-In terms of asking a good question, a clearly stated questions supported by an accessible and fully reproducible example is key (see also https://r4ds.hadley.nz/workflow-help.html).
+In terms of asking a good question, a clearly stated question supported by an accessible and fully reproducible example is key (see also https://r4ds.hadley.nz/workflow-help.html).
It is also helpful, after showing the code that ‘did not work’ from the user’s perspective, to explain what you would like to see.
A very useful tool for creating reproducible examples is the reprex package.
-To highlight unexpected behaviour, you can write completely reproducible code that demonstrates the issue and then use the reprex() function to create a copy of your code that can be pasted into a forum or other online space.
+To highlight unexpected behavior, you can write completely reproducible code that demonstrates the issue and then use the reprex() function to create a copy of your code that can be pasted into a forum or other online space.
Imagine you are trying to create a map of the world with blue sea and green land.
You could simply ask how to do this in one of the places outlined in the previous section.
However, it is likely that you will get a better response if you provide a reproducible example of what you have tried so far.
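As a minimal sketch of this workflow (assuming the sf and spData packages are installed; the plotting code is an illustrative first attempt, not the final answer), code demonstrating the issue can be wrapped in reprex() to generate a self-contained, paste-ready version for a forum post:

```r
# Code that 'does not work' as hoped: the land is plotted,
# but the sea is not blue
library(sf)
library(spData)
plot(st_geometry(world), col = "green")

# Wrap the same code in reprex() to render it (code plus output)
# in a format that can be pasted directly into a forum or issue
reprex::reprex({
  library(sf)
  library(spData)
  plot(st_geometry(world), col = "green")
})
```

Running reprex() evaluates the code in a clean R session, so anything the example depends on (packages, data) must be loaded inside it, which is exactly what makes the result reproducible for others.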
@@ -284,7 +284,7 @@
-If you post this code in a forum, it is likely that you will get a more specific and useful responses.
+If you post this code in a forum, it is likely that you will get a more specific and useful response.
For example, someone might respond with the following code, which demonstrably solves the problem, as illustrated in Figure 16.1:
In some cases, you may not be able to find a solution to your problem online, or you may not be able to formulate a question that can be answered by a search engine.
The best starting point in such cases, or when developing a new geocomputational methodology, may be a pen and paper (or equivalent digital sketching tools such as Excalidraw and tldraw, which allow collaborative sketching and rapid sharing of ideas).
-During the most creative early stages of methodological development work software of any kind can slow down your thoughts and direct your thinking away from important abstract thoughts.
+During the most creative early stages of methodological development work, software of any kind can slow down your thoughts and direct them away from important abstract thoughts.
Framing the question with mathematics is also highly recommended, with reference to a minimal example that you can sketch ‘before and after’ versions of numerically.
If you have the skills and if the problem warrants it, describing the approach algebraically can in some cases help develop effective implementations.
@@ -331,7 +331,7 @@
Each has evolving geospatial capabilities.
rasterio, for example, is a Python package with similar functionality to the terra package used in this book.
See Geocomputation with Python for an introduction to geocomputation with Python.
-Dozens of geospatial libraries have been developed in C++, including well known libraries such as GDAL and GEOS, and less well known libraries such as the Orfeo Toolbox for processing remote sensing (raster) data.
+Dozens of geospatial libraries have been developed in C++, including well-known libraries such as GDAL and GEOS, and less well-known libraries such as the Orfeo Toolbox for processing remote sensing (raster) data.
Turf.js is an example of the potential for doing geocomputation with JavaScript.
GeoTrellis provides functions for working with raster and vector data in the Java-based language Scala.
And WhiteBoxTools provides an example of a rapidly evolving command line GIS implemented in Rust.
@@ -347,10 +347,10 @@
16.6 The open source approach
-This is a technical book so it makes sense for the next steps, outlined in the previous section, to also be technical.
+This is a technical book, so it makes sense for the next steps, outlined in the previous section, to also be technical.
However, there are wider issues worth considering in this final section, which returns to our definition of geocomputation.
One of the elements of the term introduced in Chapter 1 was that geographic methods should have a positive impact.
-Of course, how to define and measure ‘positive’ is a subjective, philosophical question, beyond the scope of this book.
+Of course, how to define and measure ‘positive’ is a subjective, philosophical question that is beyond the scope of this book.
Regardless of your worldview, consideration of the impacts of geocomputational work is a useful exercise:
the potential for positive impacts can provide a powerful motivation for future learning and, conversely, new methods can open up many possible fields of application.
These considerations lead to the conclusion that geocomputation is part of a wider ‘open source approach’.
@@ -363,14 +363,14 @@
Building ‘scientific’ tools
Reproducibility
-We added the final ingredient: reproducibility was barely mentioned in early work on geocomputation, yet a strong case can be made for it being a vital component of the first two ingredients.
-Reproducibility:
+We added the final ingredient: reproducibility was barely mentioned in early work on geocomputation, yet a strong case can be made for it being a vital component of the first two ingredients.
+
+Reproducibility:
-Encourages creativity by shifting the focus away from the basics (which are readily available through shared code) and towards applications
-Discourages people from ‘reinventing the wheel’: there is no need to re-do what others have done if their methods can be used by others
-Makes research more conducive to real-world applications, by enabling anyone in any sector to apply your methods in new areas
+Encourages creativity by shifting the focus away from the basics (which are readily available through shared code) and toward applications
+Discourages people from ‘reinventing the wheel’: there is no need to redo what others have done if their methods can be used by others
+Makes research more conducive to real-world applications, by enabling anyone in any sector to apply one’s methods in new areas
-If reproducibility is the defining asset of geocomputation (or command line GIS) it is worth considering what makes it reproducible.
+If reproducibility is the defining asset of geocomputation (or command line GIS), it is worth considering what makes it reproducible.
This brings us to the ‘open source approach’, which has three main components:
A command line interface (CLI), encouraging scripts recording geographic work to be shared and reproduced
@@ -378,18 +378,18 @@
An active user and developer community, which collaborates and self-organizes to build complementary and modular tools
Like the term geocomputation, the open source approach is more than a technical entity.
-It is a community composed of people interacting daily with shared aims: to produce high performance tools, free from commercial or legal restrictions, that are accessible for anyone to use.
+It is a community composed of people interacting daily with shared aims: to produce high-performance tools, free from commercial or legal restrictions, that are accessible for anyone to use.
The open source approach to working with geographic data has advantages that transcend the technicalities of how the software works, encouraging learning, collaboration and an efficient division of labor.
There are many ways to engage in this community, especially with the emergence of code hosting sites, such as GitHub, which encourage communication and collaboration.
A good place to start is simply browsing through some of the source code, ‘issues’ and ‘commits’ in a geographic package of interest.
A quick glance at the r-spatial/sf GitHub repository, which hosts the code underlying the sf package, shows that 100+ people have contributed to the codebase and documentation.
-Dozens more people have contributed by asking question and by contributing to ‘upstream’ packages that sf uses.
+Dozens more people have contributed by asking questions and by contributing to ‘upstream’ packages that sf uses.
More than 1,500 issues have been closed on its issue tracker, representing a huge amount of work to make sf faster, more stable and user-friendly.
This example, from just one package out of dozens, shows the scale of the intellectual operation underway to make R a highly effective and continuously evolving language for geocomputation.
It is instructive to watch the incessant development activity happen in public fora such as GitHub, but it is even more rewarding to become an active participant.
This is one of the greatest features of the open source approach: it encourages people to get involved.
-This book itself is a result of the open source approach:
-it was motivated by the amazing developments in R’s geographic capabilities over the last two decades, but made practically possible by dialogue and code sharing on platforms for collaboration.
+This book is a result of the open source approach:
+it was motivated by the amazing developments in R’s geographic capabilities over the last two decades, but made practically possible by dialogue and code-sharing on platforms for collaboration.
We hope that in addition to disseminating useful methods for working with geographic data, this book inspires you to take a more open source approach.