Emmanuel Blondel edited this page Oct 19, 2019 · 19 revisions

geoflow – R engine to orchestrate and run geospatial (meta)data workflows



If you wish to sponsor geoflow, do not hesitate to contact me

Many thanks to the following organizations that have provided funding for strengthening the geoflow package:


Table of contents

1. Overview and vision
2. Development status
3. Credits
4. User guide
   4.1 How to install geoflow
   4.2 How to use geoflow in R
   4.3 How to create a geoflow configuration file
      4.3.1 Create manually a configuration file
      4.3.2 Use the geoflow configuration Shiny User Interface
5. Issue reporting

1. Overview and vision


The principle of geoflow is to offer a simple framework in R to execute and orchestrate geospatial (meta)data management and publication tasks in an automated way.

2. Development status


The package is available on GitHub and is under consolidation.

A first version on CRAN is expected by the end of 2019.

3. Credits


(c) 2019, Emmanuel Blondel, Julien Barde, Wilfried Heintz

Package distributed under MIT license.

If you use geoflow, we would be very grateful if you could cite it in your published work. By citing geoflow, beyond acknowledging the work, you help make it more visible and support its growth and sustainability. For citation, please use the DOI: DOI

4. User guide


4.1 How to install geoflow

For now, the package can be installed from GitHub:

install.packages("devtools")

Once the devtools package is loaded, you can use install_github to install geoflow. By default, the package is installed from master, which is the current development version (likely to be unstable).

require("devtools")
install_github("eblondel/geoflow")

4.2 How to use geoflow in R

In R, using geoflow essentially consists of running the function executeWorkflow, which takes a single parameter: the name of a configuration file in JSON format:

library(geoflow)
executeWorkflow("config.json")

The workflow to be executed is entirely described in the configuration file. The main preparatory work of the data manager is therefore to prepare this configuration file, depending on the tasks to perform.

Note: It is planned to offer a Shiny app interface, through geoflow, for configuring the workflow in a user-friendly manner. The Shiny app will then take care of creating the appropriate JSON configuration file in a transparent way.

4.3 How to create a geoflow configuration file

Before creating a geoflow configuration file, let's describe how it is structured.

A geoflow configuration contains several parts, some of them optional, which are defined below.

| Name | Definition | Optional/Required |
|------|------------|-------------------|
| id | A string identifier/name for the workflow. | Required |
| mode | A string, either 'raw' or 'entity', that defines the workflow mode (the two modes are described below the table). | Required |
| metadata | Part where the entity set is defined, to be used for executing actions in mode entity. | Required with entity mode |
| software | Part where the software to interact with is defined. It can be software from which the user wants to get data, or software where data is published using geoflow, e.g. a GeoNetwork metadata catalogue, a GeoServer, etc. | Optional |
| actions | Part where the actions to use are defined. These can be source R scripts in the case of the raw mode, or entity-based actions in the case of mode entity. An action put in the list can be enabled/disabled and parameterized with a set of options that is specific to each action. | Required |
| profile | Global workflow metadata: information that is common to all entities in the case of mode entity, and that can be exploited in some of the actions, e.g. adding a project logo to all dataset descriptions. | Optional |
| options | Global workflow options. | Optional |

The two workflow modes are:
  • raw mode: simple mode that triggers basic tasks with R (known in geoflow as actions) in a sequential way. This mode can be used by users who just want to chain R scripts.
  • entity mode: mode where all the actions are performed based on a set of entities. In geoflow, an entity includes both metadata and data elements. In most cases, an entity describes a dataset for which we want to perform actions such as metadata handling/publishing in a web metadata catalogue, spatial data upload in GeoServer, etc. With this mode, geoflow takes each entity in turn and executes a set of actions for it.
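The Required/Optional rules listed above can be sketched as a small check on a configuration held as an R list. This is only an illustration under stated assumptions: check_config_parts is a hypothetical helper, not a geoflow function, and it only looks at the presence of the top-level parts, not at their contents.

```r
# Hypothetical helper (not part of geoflow): checks the top-level parts of a
# configuration, given as an R list, against the rules described above.
# Returns a character vector of problems (empty if none were found).
check_config_parts <- function(config) {
  problems <- character(0)

  # 'id' is required and must be a single string
  id <- config[["id"]]
  if (!is.character(id) || length(id) != 1) {
    problems <- c(problems, "'id' is required and must be a string")
  }

  # 'mode' is required and must be either 'raw' or 'entity'
  mode <- config[["mode"]]
  mode_ok <- is.character(mode) && length(mode) == 1 &&
    mode %in% c("raw", "entity")
  if (!mode_ok) {
    problems <- c(problems, "'mode' is required and must be 'raw' or 'entity'")
  }

  # 'metadata' is required only with the entity mode
  if (mode_ok && mode == "entity" && is.null(config[["metadata"]])) {
    problems <- c(problems, "'metadata' is required with mode 'entity'")
  }

  # 'actions' is required in both modes
  if (is.null(config[["actions"]])) {
    problems <- c(problems, "'actions' is required")
  }

  problems
}

# A raw-mode configuration needs no 'metadata' part
raw_config <- list(id = "my-workflow", mode = "raw", actions = list())

# An entity-mode configuration without 'metadata' is incomplete
entity_config <- list(id = "my-workflow", mode = "entity", actions = list())

print(check_config_parts(raw_config))
print(check_config_parts(entity_config))
```

Returning the list of problems rather than stopping at the first one makes it easier to fix a configuration in a single pass.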

The skeleton of the JSON configuration file is then as follows:

{
  "id": "my-workflow",
  "mode": "entity",
  "metadata": { <metadata sources defined here> },
  "software": [ <pieces of software defined here> ],
  "actions": [ <actions defined here>  ],
  "profile": { <global profile (metadata) defined here> },
  "options": { <global options defined here> }
}
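
As a rough sketch, the same skeleton could also be generated from R rather than typed by hand. The example below assumes the jsonlite package, which is not mentioned in this guide, and writes to a temporary file chosen for illustration. The metadata, software, actions, profile and options parts are left empty because their contents are not described in this section.

```r
library(jsonlite)  # assumption: jsonlite is installed

# jsonlite writes an empty *named* list as a JSON object ({}),
# and an empty unnamed list as a JSON array ([])
empty_object <- setNames(list(), character(0))

# Top-level parts of the configuration, mirroring the skeleton above
config <- list(
  id = "my-workflow",
  mode = "entity",
  metadata = empty_object,  # metadata sources would be defined here
  software = list(),        # pieces of software would be defined here
  actions = list(),         # actions would be defined here
  profile = empty_object,   # global profile (metadata) would be defined here
  options = empty_object    # global options would be defined here
)

# Illustrative output location; any writable path would do
config_file <- file.path(tempdir(), "config.json")

# auto_unbox = TRUE writes "id" and "mode" as plain strings, not 1-element arrays
write_json(config, config_file, auto_unbox = TRUE, pretty = TRUE)

# Read the file back to confirm that it is valid JSON
parsed <- fromJSON(config_file)
print(names(parsed))
```

A file written this way only has the right overall shape. It still needs real metadata sources and actions before it would be useful as input to executeWorkflow.
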
4.3.1 Create manually a configuration file

TODO

4.3.2 Use the geoflow configuration Shiny User Interface

NOT YET AVAILABLE

5. Issue reporting


Issues can be reported at https://github.com/eblondel/geoflow/issues
