r.learn.ml

GRASS GIS add-on for applying machine learning to GRASS GIS spatial data

DESCRIPTION

r.learn.ml is a front-end to the scikit-learn Python package. The module enables scikit-learn classification and regression models to be applied to GRASS GIS rasters that are stored as part of an imagery group (the group parameter) or specified as individual maps in the optional raster parameter.

The training component of the machine learning workflow is performed using the r.learn.train module. This module uses training data consisting of labelled pixels in a GRASS GIS raster map, or a GRASS GIS vector containing points, and develops a machine learning model on the rasters within a GRASS imagery group. This model needs to be saved to a file and can be automatically compressed if the .gz file extension is used.
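The extension-driven compression described above can be illustrated with a minimal, standard-library-only sketch. It is not the module's actual code: `save_model` and `load_model` are hypothetical helper names, and pickle plus gzip stand in for whatever serializer r.learn.train uses internally.

```python
import gzip
import pickle
from pathlib import Path


def save_model(model, path):
    """Pickle `model` to `path`, gzip-compressing when the name ends in .gz."""
    path = Path(path)
    # Assumption: compression is chosen purely from the file extension.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load an object written by save_model, detecting compression by extension."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as f:
        return pickle.load(f)
```

Any picklable object, including a fitted scikit-learn estimator, can be passed as `model`. As with all pickle files, only load models from sources you trust.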

After a model is trained, the r.learn.predict module needs to be called, which will retrieve the saved, pre-fitted model and apply it to a GRASS GIS imagery group.

NOTES

r.learn.ml uses the scikit-learn machine learning Python package (version ≥ 0.20) along with the pandas package. Both packages need to be installed within your GRASS GIS Python environment. For Linux users, these packages should be available through the distribution's package manager. For MS-Windows users running a 64-bit GRASS, the easiest way to install the packages is to use the precompiled binaries from Christoph Gohlke together with the OSGeo4W installation method of GRASS, where the Python setuptools can also be installed. You can then run 'easy_install pip' to install the pip package manager, download the NumPy+MKL and scikit-learn .whl files, and install them using 'pip install packagename.whl'. For MS-Windows with a 32-bit GRASS, scikit-learn is available in the OSGeo4W installer.
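To confirm that the requirements are visible from the Python interpreter that GRASS GIS uses, a small check along these lines may help. It is only a sketch: `parse_version` and `check_requirements` are hypothetical helpers, the version parsing is deliberately simple, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.20.3' into (0, 20, 3); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums):
    """Return {distribution name: problem} for missing or too-old packages."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if minimum is not None and parse_version(installed) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems[name] = f"found {installed}, need >= {wanted}"
    return problems
```

Calling `check_requirements({"scikit-learn": (0, 20), "pandas": None})` from within the GRASS Python environment returns an empty dictionary when both packages are present; the `None` reflects that no minimum pandas version is stated here.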

EXAMPLE

Here we use the GRASS GIS North Carolina sample data set as a basis for a Landsat classification. We are going to classify a Landsat 7 scene from 2000, using training information from an older (1996) land cover dataset.

Landsat 7 (2000) bands 7,4,2 color composite example:


Note that this example must be run in the "landsat" mapset of the North Carolina sample data set location.

First, we are going to generate some training pixels from an older (1996) land cover classification:

g.region raster=landclass96 -p
r.random input=landclass96 npoints=1000 raster=landclass96_roi
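Conceptually, this step keeps a random subset of the labelled cells and sets every other cell to null. The following pure-Python sketch illustrates the idea on a small grid; it is not how r.random is implemented, and `sample_labelled_cells` is a hypothetical name, with `None` standing in for a null cell.

```python
import random


def sample_labelled_cells(grid, npoints, seed=None):
    """Keep `npoints` randomly chosen non-null cells of `grid`; null out the rest.

    `grid` is a list of rows; None stands for a null (no-data) cell.
    """
    # Collect the positions of all cells that carry a label.
    labelled = [(r, c) for r, row in enumerate(grid)
                for c, value in enumerate(row) if value is not None]
    if npoints > len(labelled):
        raise ValueError("npoints exceeds the number of non-null cells")
    # Draw the positions to keep, without replacement.
    keep = set(random.Random(seed).sample(labelled, npoints))
    return [[value if (r, c) in keep else None for c, value in enumerate(row)]
            for r, row in enumerate(grid)]
```

The returned grid has the same shape as the input, so it can serve as a training map aligned with the imagery rasters.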

Then we can use these training pixels to perform a classification on the more recently obtained Landsat 7 image:

r.learn.train group=lsat7_2000 training_map=landclass96_roi \
	model_name=RandomForestClassifier n_estimators=500 save_model=rf_model.gz

r.learn.predict group=lsat7_2000 load_model=rf_model.gz output=rf_classification

Now display the results:

# copy category labels from landclass training map to result
r.category rf_classification raster=landclass96_roi

# copy color scheme from landclass training map to result
r.colors rf_classification raster=landclass96_roi
r.category rf_classification

Random forest classification result:


ACKNOWLEDGEMENTS

Thanks to Paulo van Breugel and Vaclav Petras for testing.

REFERENCES

Brenning, A. 2012. Spatial cross-validation and bootstrap for the assessment of prediction rules in remote sensing: the R package 'sperrorest'. 2012 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 23-27 July 2012, p. 5372-5375.

Pedregosa, F. et al. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, p. 2825-2830.

AUTHOR

Steven Pawley

Last changed: $Date: 2019-02-08 15:41:00 -0600 (Fri, 08 Feb 2019) $