Releases: opendatacube/datacube-core
1.6.0 (23 August 2018)
- Enable use of aliases when specifying band names
- Fix ingestion failing after the first run #510
- Docker images now know which version of ODC they contain #523
- Fix data loading when `nodata` is `NaN` #531
- Allow querying based on Python `datetime.datetime` objects (see the sketch after this list). #499
- Require rasterio 1.0.2 or higher, which fixes several critical bugs when loading and reprojecting from multi-band files.
- Assume fixed paths for `id` and `sources` metadata fields #482
- `datacube.model.Measurement` was put to use for loading in attributes and made to inherit from `dict` to preserve current behaviour. #502
- Updates when indexing data with `datacube dataset add` (see #485, #451 and #480):
  - Allow indexing without lineage: `datacube dataset add --ignore-lineage`
  - Removed the `--sources-policy=skip|verify|ensure` option. Instead use `--[no-]auto-add-lineage` and `--[no-]verify-lineage`
  - New option `datacube dataset add --exclude-product <name>` allows excluding some products from auto-matching
- Preliminary API for indexing datasets #511
- Enable creation of MetadataTypes without having an active database connection #535
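Querying with `datetime.datetime` objects (#499) removes the need to format times as strings first. A minimal sketch, assuming a configured index; the product name and extents are hypothetical:

```python
import datetime

import datacube

dc = datacube.Datacube(app='datetime-query-example')

data = dc.load(
    product='ls8_nbar_albers',            # hypothetical product name
    x=(149.0, 149.2), y=(-35.4, -35.2),   # hypothetical extents
    # time ranges may now be native datetime objects, not just strings
    time=(datetime.datetime(2018, 1, 1),
          datetime.datetime(2018, 3, 31)),
)
```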
v1.6rc2 (29 June 2018)
Backwards Incompatible Changes
- The `helpers.write_geotiff()` function has been updated to support files smaller than 256x256. It also no longer supports specifying the time index. Before passing data in, use `xarray_data.isel(time=<my_time_index>)` (see the first sketch after this list). (#277)
- Removed product matching options from `datacube dataset update` (#445). No matching is needed in this case, as all datasets are already in the database and are associated with products.
- Removed the `--match-rules` option from `datacube dataset add` (#447)
- The seldom-used `stack` keyword argument has been removed from `Datacube.load`. (#461)
- The behaviour of time range queries has changed to be compatible with standard Python searches (e.g. time slicing an xarray). The time range selection is now inclusive of any unspecified time units (see the second sketch after this list). (#440)
  - Example 1: `time=('2008-01', '2008-03')` previously would have returned all data from the start of 1st January, 2008 to the end of 1st March, 2008. Now this query will return all data from the start of 1st January, 2008 to 23:59:59.999 on 31st March, 2008.
  - Example 2: To specify a search time between 1st January and 29th February, 2008 (inclusive), use a search query like `time=('2008-01', '2008-02')`. This query is equivalent to using any of the following as the second time element: `('2008-02-29')`, `('2008-02-29 23')`, `('2008-02-29 23:59')`, `('2008-02-29 23:59:59')`, `('2008-02-29 23:59:59.999')`
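For the `write_geotiff()` change (#277), the time slice is now selected before the call rather than passed in. A minimal sketch, assuming a configured index; the product name and extents are hypothetical:

```python
import datacube
from datacube.helpers import write_geotiff

dc = datacube.Datacube(app='write-geotiff-example')
data = dc.load(product='ls5_nbar_albers',            # hypothetical product name
               x=(142.0, 142.5), y=(-32.5, -32.0))   # hypothetical extents

# write_geotiff() no longer accepts a time index, so select the
# desired slice with isel() before writing.
single_slice = data.isel(time=0)
write_geotiff('output.tif', single_slice)
```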
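And for the inclusive time range semantics (#440), the same query now captures the whole final month. A sketch under the same assumptions:

```python
import datacube

dc = datacube.Datacube(app='time-range-example')

# Both endpoints are inclusive of their unspecified units: this selects
# everything from 2008-01-01 00:00:00 to 2008-03-31 23:59:59.999,
# not just up to the start of March as in earlier releases.
data = dc.load(product='ls5_nbar_albers',            # hypothetical product name
               x=(142.0, 142.5), y=(-32.5, -32.0),   # hypothetical extents
               time=('2008-01', '2008-03'))
```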
Changes
- A `--location-policy` option has been added to the `datacube dataset update` command. Previously this command would always add a new location to the list of URIs associated with a dataset. It is now possible to specify `archive` and `forget` options, which will mark previous locations as archived or remove them from the index altogether. The default behaviour is unchanged. (#469)
- The masking-related function `describe_variable_flags()` now returns a pandas DataFrame by default, which will display as a table in Jupyter Notebooks (see the first sketch after this list). (#422)
- Usability improvements in `datacube dataset [add|update]` commands (#447, #448, #398):
  - Embedded documentation updates
  - Deprecated `--auto-match` (it was always on anyway)
  - Renamed `--dtype` to `--product` (the old name will still work, but with a warning)
  - Added an option to skip lineage data when indexing (useful for saving time when testing) (#473)
- Enable compression for metadata documents stored in NetCDFs generated by `stacker` and `ingestor` (#452)
- Implement better handling of stacked NetCDF files (#415):
  - Record the slice index as part of the dataset location URI, using `#part=<int>` syntax; the index is 0-based, so (as a hypothetical example) `file:///data/stacked.nc#part=3` refers to the fourth slice
  - Use this index when loading data instead of fuzzy searching by timestamp
  - Fall back to the old behaviour when `#part=<int>` is missing and the file is more than one time slice deep
- Expose the following dataset fields and make them searchable (see #432 for more details, and the second sketch after this list):
  - `indexed_time` (when the dataset was indexed)
  - `indexed_by` (user who indexed the dataset)
  - `creation_time` (creation of dataset: when it was processed)
  - `label` (the label for a dataset)
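The `describe_variable_flags()` change (#422) is easiest to see in a notebook. A minimal sketch, assuming a configured index; the pixel-quality product and band names are hypothetical:

```python
import datacube
from datacube.storage import masking

dc = datacube.Datacube(app='flags-example')
data = dc.load(product='ls8_pq_albers',              # hypothetical product name
               x=(149.0, 149.2), y=(-35.4, -35.2))   # hypothetical extents

# Now returns a pandas DataFrame by default, which renders as a table
# in a Jupyter Notebook; 'pixelquality' is an assumed band name.
flags = masking.describe_variable_flags(data.pixelquality)
print(flags)
```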
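The newly searchable fields (#432) can then be used like other search terms. A sketch of one such query; that each field is accepted directly by `datasets.search()`, and the user name shown, are assumptions:

```python
import datacube

dc = datacube.Datacube(app='search-fields-example')

# Find datasets indexed by a particular user ('ingest_user' is a
# hypothetical name); indexed_time, creation_time and label can
# presumably be queried the same way.
for dataset in dc.index.datasets.search(indexed_by='ingest_user'):
    print(dataset.id)
```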
Bug Fixes
- The `.dimensions` property of a product no longer crashes when the product is missing a `grid_spec`. It instead defaults to `time,y,x`.
- Fix a regression in `v1.6rc1` which made it impossible to run `datacube ingest` to create products which were defined in `1.5.5` and earlier versions of ODC. (#423, #436)
- Allow specifying the chunking for string variables when writing NetCDFs (#453)
v1.6rc1 Easter Bilby (10 April 2018)
This is the first release in a while, so there are a lot of changes, including some significant refactoring, with the potential for issues when upgrading.
Backwards Incompatible Fixes
- Drop Support for Python 2. Python 3.5 is now the earliest supported Python version.
- Removed the old `ndexpr`, `analytics` and `execution engine` code. There is work underway in the execution engine branch to replace these features.
Enhancements
- Support for third party drivers, for custom data storage and custom index implementations
- The correct way to get an Index connection in code is to use `datacube.index.index_connect()` (see the first sketch after this list).
- Changes in ingestion configuration:
  - Must now specify the Data Write Plug-ins to use. For S3 ingestion there was a top level `container` specified, which has been renamed and moved under `storage`. The entire `storage` section is passed through to the Data Write Plug-ins, so drivers requiring other configuration can include it here, e.g.:

    ```yaml
    ...
    storage:
      ...
      driver: s3aio
      bucket: my_s3_bucket
    ...
    ```

- Added a `Dockerfile` to enable automated builds for a reference Docker image.
- Multiple environments can now be specified in one datacube config (see the second sketch after this list). See PR 298 and the Runtime Config documentation.
  - Allow specifying which `index_driver` should be used for an environment.
- Command line tools can now output CSV or YAML. (Issue 206, PR 390)
- Support for saving data to NetCDF using a Lambert Conformal Conic Projection (PR 329)
- Lots of documentation updates:
  - Information about Bit Masking.
  - A description of how data is loaded.
  - Some higher level architecture documentation.
  - Updates on how to index new data.
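A minimal sketch of obtaining an Index connection with `datacube.index.index_connect()`, assuming a configured default environment; the application name is arbitrary:

```python
from datacube.index import index_connect

# Connect using the default environment from the runtime config.
index = index_connect(application_name='index-connect-example')

# List the products known to this index.
for product in index.products.get_all():
    print(product.name)
```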
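With multiple environments defined, one can be selected by name at runtime. A sketch, assuming an environment called `dev` exists in the config and that `Datacube` accepts an `env` argument as described in the Runtime Config docs:

```python
import datacube

# Select the 'dev' environment (hypothetical name) instead of the default.
dc = datacube.Datacube(env='dev', app='multi-env-example')
```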
Bug Fixes
- Allow creation of `datacube.utils.geometry.Geometry` objects from 3D representations. The Z axis is simply thrown away (see the sketch after this list).
- The `datacube --config_file` option has been renamed to `datacube --config`, which is shorter and more consistent with the other options. The old name can still be used for now.
- Fix a severe performance regression when extracting and reprojecting a small region of data. (PR 393)
- Fix for a somewhat rare bug causing read failures by attempting to read data from a negative index into a file. (PR 376)
- Make `CRS` equality comparisons a little bit looser. Trust either a Proj.4 based comparison or a GDAL based comparison. (Closed issue 243)
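A minimal sketch of the 3D `Geometry` fix; the coordinates are arbitrary, and the constant Z value of 10.0 is simply discarded:

```python
from datacube.utils import geometry

# A polygon supplied with 3D (x, y, z) coordinates; the Z axis is
# thrown away when the Geometry is constructed.
poly = geometry.Geometry(
    {
        'type': 'Polygon',
        'coordinates': [[
            (148.0, -35.0, 10.0),
            (148.0, -34.0, 10.0),
            (149.0, -34.0, 10.0),
            (148.0, -35.0, 10.0),
        ]],
    },
    crs=geometry.CRS('EPSG:4326'),
)
print(poly.crs)
```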
New Data Support
1.5.5
- Fixes to package dependencies. No code changes.
1.5.4
- Minor features backported from 2.0:
  - Support for `limit` in searches (see the sketch after this list)
  - Alternative lazy search method `find_lazy`
- Fixes:
  - Improve native field descriptions
  - Connection should not be held open between multi-product searches
  - Disable prefetch for celery workers
  - Support jsonify-ing decimals
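A minimal sketch of the backported `limit` support, assuming `datasets.search()` accepts it as a keyword; the product name is hypothetical:

```python
import datacube

dc = datacube.Datacube(app='limit-example')

# Cap the search at five results instead of iterating everything.
for dataset in dc.index.datasets.search(product='ls8_nbar_albers', limit=5):
    print(dataset.id)
```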
1.5.3
- Use `cloudpickle` as the `celery` serialiser
- Allow `celery` tests to run without installing it
- Move `datacube-worker` inside the main datacube package
- Write `metadata_type` from the ingest configuration if available
- Support config parsing limitations of Python 2
- Fix #303: resolve GDAL build dependencies on Travis
- Upgrade `rasterio` to a newer version
1.5.2
New Features
- Support for AWS S3 array storage
- Driver Manager support for NetCDF, S3, S3-file drivers.
Usability Improvements
- When `datacube dataset add` is unable to add a Dataset to the index, print out the entire Dataset to make it easier to debug the problem.
- Give `datacube system check` prettier and more readable output.
- Make `celery` and `redis` optional when installing.
- Significantly reduced disk space usage for integration tests.
- `Dataset` objects now have an `is_active` field to mirror `is_archived`.
- Added `index.datasets.get_archived_location_times()` to see when each location was archived (see the sketch after this list).
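A sketch of the two index additions; the dataset id is a hypothetical UUID, and the exact signature of `get_archived_location_times()` is an assumption:

```python
import datacube

dc = datacube.Datacube(app='locations-example')

dataset = dc.index.datasets.get('f7018d80-8807-11e7-aa05-0000000000e1')
print(dataset.is_active)   # True whenever the dataset is not archived

# When was each location of this dataset archived?
print(dc.index.datasets.get_archived_location_times(dataset.id))
```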
Bug Fixes
- Fix bug when reading data in native projection, but outside the `source` area. Often hit when running `datacube-stats`.
- Fix error loading and fusing data using `dask`. (Fixes #276)
- When reading data, implement `skip_broken_datasets` for the `dask` case too.
- Fix bug #261: unable to load Australian Rainfall Grid Data. This was a result of the CRS/Transformation override functionality being broken when using the latest `rasterio` version, `1.0a9`.
1.4.1
- Support for reading multiband HDF datasets, such as MODIS Collection 6
- Workaround for a rasterio issue when reprojecting stacked data
- Bug fixes for command line argument handling
1.4.0
- Adds more convenient year/date range search expressions (see #226 and the sketch after this list)
- Adds a simple replication utility (see #223)
- Fixed issue reading products without embedded CRS info, such as bom_rainfall_grid (see #224)
- Fixed issues with stacking and ncml creation for NetCDF files
- Various documentation and bug fixes
- Added CircleCI as a continuous build system, for previewing generated documentation on pull requests
- Require xarray >= 0.9. Solves common problems caused by losing embedded flag_def and crs attributes.
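A minimal sketch of the year shorthand from #226, assuming a configured index; the product name and extents are hypothetical:

```python
import datacube

dc = datacube.Datacube(app='year-query-example')

# A bare year expands to the full range 2008-01-01 to 2008-12-31.
data = dc.load(product='ls5_nbar_albers',            # hypothetical product name
               x=(142.0, 142.5), y=(-32.5, -32.0),   # hypothetical extents
               time='2008')
```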
1.3.2
- Docs now refer to "Open Data Cube".
- Docs describe how to use `conda` to install datacube.
- Bug fixes for the stacking process.
- Minor model changes:
- Various other bug fixes and document updates.