Releases: mmaelicke/scikit-gstat

Version 1.0

04 Feb 09:00

Here we present SciKit-GStat, an open-source Python package for variogram estimation that fits well into established frameworks for scientific computing such as SciPy, NumPy, GSTools, and pandas. SciKit-GStat is written in a mutable, object-oriented way that mimics the typical geostatistical analysis workflow. Its main strengths are ease of use and interactivity, so it is usable with little or even no knowledge of Python.

SciKit-GStat ships with a large number of predefined procedures, algorithms, and models, such as variogram estimators, theoretical spatial models, and binning algorithms. Common approaches to estimating variograms are covered and can be used out of the box. At the same time, the base class is very flexible and can be adapted to less common problems as well.

SciKit-GStat can easily interface to GSTools.

If you use SciKit-GStat, please cite this publication:

Mälicke, M.: SciKit-GStat 1.0: A SciPy flavoured geostatistical variogram estimation toolbox written in Python, Geosci. Model Dev. Discuss.
[preprint], https://doi.org/10.5194/gmd-2021-174, in review, 2021.

The code itself can also be cited:

Mirko Mälicke, Egil Möller, Helge David Schneider, & Sebastian Müller. (2021, May 28).
mmaelicke/scikit-gstat: A scipy flavoured geostatistical variogram analysis toolbox (Version v0.6.0). Zenodo.
http://doi.org/10.5281/zenodo.4835779

Full Changelog: v0.6.0...v1.0.0

Version 0.6

28 May 08:13

Description

SciKit-Gstat is a scipy-styled geostatistical toolbox for variogram estimation. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

This may be the last minor version before the first stable release, 1.0!

Version 0.6 brings several smaller adjustments. A new interface was introduced to export a Variogram directly into a gstools.Krige instance. This makes kriging even more seamless between scikit-gstat and gstools.
The Variogram has a new method called cross_validate to validate variograms by a leave-one-out Kriging interpolation. This is accompanied by some internals to estimate observation uncertainty and plot error bars in the default plot. Proper uncertainty estimation still has a long way to go and is possibly a good objective for version 1.1.
Finally, SciKit-GStat has a skgstat.data submodule that can return sample data.

Changes since 0.5

  • The util and data submodules are now always loaded at top-level
  • fixed a potential circular import
  • added uncertainty tools to util. This is not yet finished and may change the signature before it gets stable with Version 1.0.0

Version 0.5.6

  • [Variogram] the internal MetricSpace instance used to calculate the distance matrix is now available as the Variogram.metric_space property.
  • [Variogram] Variogram.metric_space is now read-only.
  • [unittest] two unittests are changed (linting, not functionality)

Version 0.5.5

  • [data] new submodule skgstat.data contains sample random fields and methods for sampling these fields in a reproducible way at random locations and different sample sizes.

Version 0.5.4

  • [util] added a new cross_validation utility module to cross-validate variograms with leave-one-out Kriging cross validations.
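The idea behind leave-one-out cross-validation can be sketched in a few lines of plain Python. This is only an illustration, not the library's implementation: the function names are hypothetical, and a simple inverse-distance-weighted interpolator stands in for Kriging so that the example stays self-contained.

```python
import math

def idw_predict(coords, values, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` (a stand-in for Kriging)."""
    num = 0.0
    den = 0.0
    for (x, y), v in zip(coords, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # exact hit: return the observation itself
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def leave_one_out_rmse(coords, values, predict=idw_predict):
    """Predict each observation from all the others and return the RMSE."""
    squared_errors = []
    for i in range(len(values)):
        # drop observation i, then estimate it from the remaining points
        rest_coords = coords[:i] + coords[i + 1:]
        rest_values = values[:i] + values[i + 1:]
        estimate = predict(rest_coords, rest_values, coords[i])
        squared_errors.append((estimate - values[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# made-up sample: four observations on the corners of a unit square
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 2.0, 2.0, 3.0]
rmse = leave_one_out_rmse(coords, values)
```

A lower score means the interpolator reproduces held-out observations more closely; with Kriging in place of the stand-in, the score reflects the quality of the fitted variogram.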

Version 0.5.3

  • [MetricSpace] new class skgstat.MetricSpace.ProbabilisticMetricSpace that extends the metric space by a stochastic element to draw samples from the input data, instead of using the full dataset.

Version 0.5.2

  • [interface] new interface function added: skgstat.Variogram.to_gs_krige. This interface will return a gstools.Krige instance from the fitted variogram.
  • some typos were corrected
  • some code refactored (mainly linting errors)

Version 0.5.1

  • [plotting] the spatio-temporal 2D and 3D plots now label the axis correctly.
  • [plotting] fixed swapped plotting axes for spatio-temporal plots.

Version 0.5

20 Apr 08:01

Description

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

Version 0.5 brings two major improvements: Instead of passing a numpy.ndarray, you can now use the new class skgstat.MetricSpace, which can pre-calculate distances in case they are used all over the place. Secondly, the new interface functions Variogram.to_gstools and Variogram.to_empirical can be used to export a Variogram to gstools and use their field generation, kriging and all the other fancy stuff there.

Changes since 0.4

  • [MetricSpace] A new class skgstat.MetricSpace was introduced. It can be passed
    to any class that accepted coordinates so far. This wrapper can be used to pre-calculate large distance
    matrices and pass them to many Variograms.
  • [MetricSpacePair] A new class skgstat.MetricSpacePair was introduced.
    It pairs two skgstat.MetricSpace instances and pre-calculates all distances between
    the two spaces. This is used, for example, in Kriging to pre-calculate all distances between the input coordinates and
    the interpolation grid only once.
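The benefit of pre-calculating distances can be illustrated with a small stand-alone sketch. The class and function below are hypothetical and do not mirror the actual skgstat.MetricSpace API; they only show the pattern of computing pairwise distances once and sharing them between several consumers.

```python
import math

class CachedDistances:
    """Hypothetical helper: compute pairwise distances once, reuse them many times."""

    def __init__(self, coords):
        self.coords = list(coords)
        self._dists = None        # filled lazily on first access
        self.n_computations = 0   # counts how often the distances were computed

    @property
    def dists(self):
        if self._dists is None:
            self.n_computations += 1
            self._dists = [
                math.dist(self.coords[i], self.coords[j])
                for i in range(len(self.coords))
                for j in range(i + 1, len(self.coords))
            ]
        return self._dists

def count_pairs_within(space, max_lag):
    """Stand-in for any analysis that needs the pairwise distances."""
    return sum(1 for d in space.dists if d <= max_lag)

space = CachedDistances([(0, 0), (3, 4), (6, 8)])
short = count_pairs_within(space, 5)       # first use triggers the computation
all_pairs = count_pairs_within(space, 10)  # second use hits the cache
```

Both calls share one distance computation, which is the saving that matters once the coordinate set is large.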

Version 0.4.4

  • [models] the changes to skgstat.models.matern introduced in 0.3.2 are reverted.
    The Matérn model no longer adapts the smoothness scaling to the effective range, as the behavior was too
    inconsistent.
  • [interface] minor bugfix of a circular import in the variogram_estimator interface
  • [models] skgstat.models.matern(0, ...) now returns the nugget instead of numpy.NaN
  • [models] skgstat.models.stable(0, ...) now returns the nugget instead of returning numpy.NaN or raising a
    ZeroDivisionError.
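The zero-lag behavior can be sketched for the stable model as follows. This is a minimal stand-alone version that assumes a common textbook parametrization (effective range r, sill c0, shape s, nugget b); the signature is an assumption and may differ from skgstat.models.stable.

```python
import math

def stable(h, r, c0, s, b=0.0):
    """Stable variogram model (assumed parametrization).

    h: lag distance, r: effective range, c0: sill, s: shape, b: nugget.
    """
    if h == 0:
        return b  # at zero lag only the nugget remains
    a = r / 3.0 ** (1.0 / s)  # range parameter derived from the effective range
    return b + c0 * (1.0 - math.exp(-((h / a) ** s)))

at_zero = stable(0, r=10.0, c0=2.0, s=1.5, b=0.3)
at_range = stable(10.0, r=10.0, c0=2.0, s=1.5)
```

With this parametrization the model reaches about 95 % of the sill at the effective range, which is why the range parameter is rescaled by a power of 3.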

Version 0.4.3

  • [Variogram] skgstat.Variogram.dim now returns the spatial dimensionality of the input data.
  • [Variogram] fixed a numpy deprecation warning in _calc_distances

Version 0.4.2

  • [Variogram] skgstat.Variogram.bins now casts manually set bin edges automatically
    to a numpy.array.
  • [Variogram] skgstat.Variogram.get_empirical returns the empirical variogram.
    That is a tuple of the current skgstat.Variogram.bins and
    skgstat.Variogram.experimental arrays, with the option to move the bins to the
    lag class centers.
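Moving bins to the lag class centers can be pictured with a short sketch. It assumes that the bins are stored as upper edges and that the first lag class starts at zero; the helper name is hypothetical and not part of the library.

```python
def to_lag_centers(bin_edges):
    """Convert upper bin edges to lag class centers (first class starts at 0)."""
    centers = []
    lower = 0.0
    for upper in bin_edges:
        centers.append((lower + upper) / 2.0)  # midpoint of the lag class
        lower = upper
    return centers

centers = to_lag_centers([10.0, 20.0, 40.0])
# centers == [5.0, 15.0, 30.0]
```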

Version 0.4.1

  • [Variogram] moved the bin function setting into a wrapper instance method, which was an anonymous lambda before.
    This makes the Variogram serializable again.
  • [Variogram] a number of pylint errors were resolved. Plenty are still left.
  • [binning] added 'stable_entropy' option that will optimize the lag class edges to be of comparable Shannon Entropy.

Version 0.4

22 Mar 08:22

Description

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

Breaking change

There is one potential breaking change compared to version 0.3.0: The lag_classes generator now yields empty arrays for unoccupied lag classes. This will result in NaN values for the semi-variance.
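The effect of unoccupied lag classes can be sketched with a plain Python version of the Matheron estimator. This is an illustration under simplifying assumptions (bins given as upper edges, hypothetical function name), not the library's code.

```python
import math

def experimental_variogram(dists, diffs, bin_edges):
    """Matheron semi-variance per lag class; NaN where a class has no pairs."""
    gamma = []
    lower = 0.0
    for upper in bin_edges:
        # pairwise value differences whose separating distance falls into this class
        members = [dz for d, dz in zip(dists, diffs) if lower < d <= upper]
        if members:
            gamma.append(sum(dz ** 2 for dz in members) / (2 * len(members)))
        else:
            gamma.append(math.nan)  # unoccupied lag class
        lower = upper
    return gamma

dists = [1.0, 2.0, 8.0]   # made-up pairwise distances
diffs = [0.5, 1.0, 2.0]   # made-up pairwise value differences
gamma = experimental_variogram(dists, diffs, [3.0, 6.0, 9.0])

# downstream code has to skip the NaN entries explicitly
valid = [g for g in gamma if not math.isnan(g)]
```

Code that previously assumed one finite value per lag class needs a filter like the last line, which is why the change is flagged as potentially breaking.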

Changes since 0.3

Version 0.3.11

  • [binning] added stable_entropy option that will optimize the lag class edges to be of comparable Shannon Entropy.
  • [Variogram] A new method is introduced to calculate fitting weights. Works for all but the manual fit method. By setting fit_sigma='entropy', the fitting weights will be adjusted according to the lag classes’ Shannon entropy. That will ignore lag classes of high uncertainty and emphasize lags of low uncertainty.

Version 0.3.10

  • [binning] added a median aggregation option to ward. This can be enabled by setting binning_agg_func='median'. The cluster centroids will be derived from the members' median value, instead of the mean value.
  • [Variogram] added fit_method='ml' - a maximum likelihood fitting procedure to fit the theoretical variogram to the experimental variogram
  • [Variogram] added fit_method='manual'. This is a manual fitting method that takes the variogram parameters either at instantiation prefixed by fit_, or as keyword arguments passed to fit.
  • [Variogram] the manual fitting method will preserve the previous parameters, if the Variogram was fitted before and the fitting parameters are not manually overwritten.

Version 0.3.9

  • [binning] added kmeans and ward for forming non-equidistant lag classes based on a distance matrix clustering
  • [Kriging] Kriging now stores the last interpolated field as z. This is the first of a few changes in future releases, which will ultimately add some plotting methods to Kriging.

Version 0.3.8

  • [plotting] minor bugfixes in plotting routines (wrong arguments, plotting issues)
  • [docs] added a tutorial about plotting
  • [binning] added auto_derived_lags for a variety of different methods that find a good estimate for either the number of lag classes or the lag class width. These can be used by passing the method name as bin_func parameter: Freedman-Diaconis ('fd'), Sturges' rule ('sturges'), Scott's rule ('scott') and Doane's extension to Sturges' rule ('doane'). Uses numpy.histogram_bin_edges internally.
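Two of these rules are simple enough to write out by hand. The sketch below uses their textbook forms with only the standard library; the function names are hypothetical, and the library itself delegates to numpy.histogram_bin_edges rather than to code like this.

```python
import math
import statistics

def sturges_n_lags(distances):
    """Sturges' rule: the number of classes grows with log2 of the sample size."""
    return math.ceil(math.log2(len(distances))) + 1

def scott_lag_width(distances):
    """Scott's rule: class width from the standard deviation and the sample size."""
    n = len(distances)
    factor = (24.0 * math.sqrt(math.pi)) ** (1.0 / 3.0)  # roughly 3.49
    return factor * statistics.pstdev(distances) * n ** (-1.0 / 3.0)

distances = [float(d) for d in range(1, 65)]  # 64 made-up pairwise distances
n_lags = sturges_n_lags(distances)
width = scott_lag_width(distances)
```

Sturges' rule fixes the number of lag classes, while Scott's rule fixes their width, which matches the distinction drawn in the note above.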

Version 0.3.7

  • [Variogram] now accepts arbitrary kwargs. These can be used to further specify functional behavior of the class. As of Version 0.3.7 this is used to pass arguments down to the entropy and percentile estimators.
  • [Variogram] the describe method now adds the init arguments to the output by default. The method can output the init params as a nested dict inside the output or flatten the output dict.

Version 0.3.6

  • [Variogram] some internal code cleanup. Removed some unnecessary loops
  • [Variogram] setting the n_lags property now correctly forces a recalculation of the lag groupings. So far they were kept untouched, which might result in old experimental variogram values for the changed instance. This is a potential breaking change.
  • [Variogram] The lag_classes generator now yields empty arrays for unoccupied lag classes. This will result in NaN values for the semi-variance. This is actually a bug fix, but also a potential breaking change.

Version 0.3.5

  • [plotting] The location_trend can now add trend model lines to the scatter plot for the ‘plotly’ backend and calculate the R² for the trend model.
  • [Variogram] the internal attribute holding the name of the current distance function was renamed from _dict_func to _dist_func_name

Version 0.3.4

  • [plotting] The scattergram functions color the plotted points with respect to the lag bin they originate from. For matplotlib, this coloring is suppressed, but can be activated by passing the argument scattergram(single_color=False).

Version 0.3.3

  • [plotting] a new submodule is introduced: skgstat.plotting. This contains all plotting functions. The plotting behavior is not changed, but the plotting library can be switched from matplotlib to plotly using skgstat.plotting.backend()
  • [stmodels] some code cleanup
  • [SpaceTimeVariogram] finally can fit the product-sum model to the experimental variogram

Version 0.3.2

  • [models] Matérn model now adapts effective range to smoothness parameter
  • [models] Matérn model documentation updated
  • [models] some minor updates to references in the docs

Version 0.3.1

  • [Variogram] - internal distance calculations were refactored, to speed things up
  • [Kriging] - internal distance calculations were refactored, to speed things up

Version 0.3

04 Oct 09:58

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

This version changed the DirectionalVariogram class quite substantially. The circular search area was removed; shapely is therefore no longer a dependency, and the estimation of directional variograms became faster by several orders of magnitude.

Version 0.2.8

18 Sep 07:17

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

This version changed some of the internal parameter settings and removed old, non-working code. An interface to the gstools CovModel was added, which is still experimental and untested.

Version 0.2.7

25 Nov 09:35
2a9ab55

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

This version increases the test coverage a bit, and the documentation made progress. Besides some minor bug fixes, the main new feature of this version is the module skgstat.interfaces, which collects interfaces to other packages. PyKrige and scikit-learn are available. GSTools will follow with the next release.

Version 0.2.6

07 Nov 14:34

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions while being extensible at the same time.

Note that there are no unit tests for Kriging so far, and they are not documented. Kriging got some new keywords in this version and there are some strategies to increase performance or gain better results. The main bottleneck for performance is not handled yet (on purpose).
The Variogram.compiled_model function is deprecated and was replaced by the much faster Variogram.fitted_model.

Version 0.2.5

14 Mar 08:21

SciKit-Gstat is a scipy-styled analysis module for geostatistics. It includes two base classes Variogram and OrdinaryKriging. Additionally, various variogram classes inheriting from Variogram are available for solving directional or space-time related tasks. The module makes use of a rich selection of semi-variance estimators and variogram model functions, while being extensible at the same time.

Note that there are no unit tests for Kriging so far and they are not documented. At the current stage, the Kriging is also not optimized for performance. It may change significantly in a future version.

Version 0.2.3

02 Nov 07:40
dccdbd2

[severe bug] A severe bug in Variogram.__vdiff_indexer was found and fixed. The iterator was indexing the Variogram._diff array differently from Variogram.distance. This led to wrong semivariance values for all versions > 0.1.8!

Besides this major bug fix, unit tests for parameter setting were added and the fit_sigma setting of 'exp' was fixed.
The formula was changed from e^(1 / x) to 1. - e^(1 / x) in order to increase with distance and, thus, give less weight to distant lag classes during fitting.