Releases: coralnet/pyspacer

0.10.0

07 Sep 01:45
ec420c6
  • AWS credentials can now be obtained through the following methods, in addition to spacer config values as before:

    • AWS's metadata service (STS); based on a proof of concept from @michaelconnor00
    • boto's auto-detection logic, when neither STS nor spacer config is used (this was intended to work before, but needed fixing)
  • Updates to pip-install dependencies:

    • numpy: >=1.19 to >=1.21.4,<2
    • boto3: nothing to >=1.26.0

0.9.0

26 Mar 21:14
2ebf61a
  • Support for Python 3.8 and 3.9 has been dropped; support for Python 3.11 has been added.

  • torch and torchvision accepted versions have been relaxed to accommodate Python 3.11. (torch==1.13.1 to torch>=1.13.1,<2.3; torchvision==0.14.1 to torchvision>=0.14.1,<0.18)

  • task_utils.preprocess_labels() now has three available modes for splitting training annotations between the train, ref, and val sets. The differences between the three modes (VECTORS, POINTS, and POINTS_STRATIFIED) are explained in the SplitMode Enum's comments. Additionally, all three modes now ensure that the ordering of the given training data has no effect on which data goes into train, ref, and val.

    The table below compares the three modes to the splitting functionality of earlier versions of pyspacer. Note that it's still possible to split train/ref/val yourself instead of letting pyspacer do it.

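    | Mode              | Sets split in pyspacer | Order agnostic | Vectors can be split | Stratifies by label |
    |-------------------|------------------------|----------------|----------------------|---------------------|
    | 0.6.1 and earlier | Train/ref              | No             | No                   | No                  |
    | 0.7.0 - 0.8.0     | Train/ref/val          | No             | No                   | No                  |
    | VECTORS           | Train/ref/val          | Yes            | No                   | No                  |
    | POINTS            | Train/ref/val          | Yes            | Yes                  | No                  |
    | POINTS_STRATIFIED | Train/ref/val          | Yes            | Yes                  | Yes                 |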
  • The train_classifier task now accepts label IDs as either integers or strings, not just integers.

  • The train_classifier task is now able to locally cache feature vectors which were loaded from remote storage, which can greatly speed up training from epoch 2 onward. This is optional and enabled by default; the location of the cache directory is also configurable.
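
To illustrate what "order agnostic" and "stratifies by label" mean in the split-mode comparison above, here is a minimal sketch in plain Python. It is not pyspacer's implementation: the function name stratified_split, the (image_key, row, col, label) tuple layout, the seed, and the 80/10/10 ratios are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(points, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split (image_key, row, col, label) tuples into train/ref/val lists.

    Hypothetical helper for illustration only; not part of pyspacer's API.
    """
    # Group the annotated points by label so each label is split separately.
    by_label = defaultdict(list)
    for point in points:
        by_label[point[3]].append(point)

    rng = random.Random(seed)
    train, ref, val = [], [], []
    # Sorting the labels and the points within each label before shuffling
    # makes the result independent of the order of the input.
    for label in sorted(by_label, key=repr):
        group = sorted(by_label[label])
        rng.shuffle(group)
        n_ref = int(len(group) * ratios[1])
        n_val = int(len(group) * ratios[2])
        ref.extend(group[:n_ref])
        val.extend(group[n_ref:n_ref + n_val])
        train.extend(group[n_ref + n_val:])
    return train, ref, val
```

Calling stratified_split() on the same points in a different order returns the same three lists, and each label contributes to ref and val in proportion to its frequency.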

0.8.0

29 Jan 09:36
6180550
  • ImageFeatures with valid_rowcol=False are no longer supported for training. For now they are still supported for classification.

  • S3 downloads are now always performed in the main thread, to prevent RuntimeError: cannot schedule new futures after interpreter shutdown.

  • S3Storage and storage_factory() now use the parameter name bucket_name instead of bucketname to be consistent with other usages in pyspacer (by @yeelauren).

  • URLStorage downloads and existence checks now have an explicit timeout of 20 seconds (this is a timeout for continuous unresponsiveness, not for the whole response).

  • EfficientNet feature extraction now uses CUDA if available (by @yeelauren).

  • Updates to pip-install dependencies:

    • Pillow: >=10.0.1 to >=10.2.0
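
The distinction drawn in the URLStorage note, a timeout on continuous unresponsiveness rather than on the whole response, matches how socket-level timeouts behave in Python's standard library. The sketch below uses urllib.request to show that behavior; whether pyspacer uses this module internally is not stated in these notes, and fetch_bytes is a hypothetical helper.

```python
import urllib.request


def fetch_bytes(url: str, timeout: float = 20.0) -> bytes:
    """Download a URL's content. Hypothetical helper, not pyspacer's API."""
    # The timeout applies to each blocking socket operation (the connection
    # attempt, each read), so a slow but steadily responding server does not
    # trigger it. Only a stall longer than `timeout` seconds raises an error.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

A large download that takes several minutes can therefore still succeed, as long as the server never goes silent for more than 20 seconds at a time.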

0.7.0

04 Jan 03:12
77d9f8b
  • TrainClassifierMsg labels arguments have changed. Instead of train_labels and val_labels, it now takes a single argument labels, which is a TrainingTaskLabels object (basically a set of 3 ImageLabels objects: training set, reference set, and validation set).

  • The new function task_utils.preprocess_labels() can be called in advance of building a TrainClassifierMsg, to 1) split a single ImageLabels instance into reasonably-proportioned train/ref/val sets, 2) filter labels to only a desired set of classes, and 3) run error checks.

  • Removed MIN_TRAINIMAGES config var. The minimum number of images for training is now 1 each for the train, ref, and val sets; or 3 total if leaving the split to pyspacer.

  • Added LOG_DESTINATION and LOG_LEVEL config vars, providing configurable logging for test-suite runs or quick scripts.

  • Logging statements throughout pyspacer's codebase now use module-name loggers rather than the root logger, allowing end-applications to keep their logs organized.

  • Fixed bug where int config vars couldn't be configured through environment vars or secrets.json.

  • Updated various error cases (mainly SpacerInputErrors, asserts, and ValueErrors) with more descriptive error classes. The SpacerInputError class is no longer available.
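
The benefit of module-name loggers mentioned above can be sketched with the standard logging module. The logger name "spacer.example_module" is an assumption used only for this illustration; the point is that an end-application can attach a handler and level to the shared name prefix without touching the root logger.

```python
import io
import logging

# Library side: each module creates its own named logger, typically via
# logging.getLogger(__name__). The name below is assumed for this sketch.
library_logger = logging.getLogger("spacer.example_module")

# Application side: route everything under the "spacer" prefix to one handler.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
spacer_logger = logging.getLogger("spacer")
spacer_logger.addHandler(handler)
spacer_logger.setLevel(logging.INFO)
# Keep these records out of whatever handlers the root logger has.
spacer_logger.propagate = False

library_logger.info("training started")
library_logger.debug("filtered out at INFO level")
captured = stream.getvalue()
```

With the root logger, the application would have had no way to filter or redirect the library's records separately from its own.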

0.6.1

11 Nov 00:32
510e9a6
  • In 0.5.0, the hash check when loading a feature extractor was broken in two ways. First, the check itself raised an error. Second, if the hash check failed for a remote-loaded extractor file, a second loading attempt would still allow extraction to proceed. This release fixes both problems.
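
One general way to avoid the second problem is to verify downloaded bytes before they are written to a cache, so that a failed check leaves nothing behind for a later attempt to load. The sketch below shows that pattern only; it is not pyspacer's code. The use of SHA-256, the function name cache_verified, and the local HashMismatchError stand-in are assumptions.

```python
import hashlib
from pathlib import Path


class HashMismatchError(Exception):
    """Local stand-in for illustration; pyspacer defines its own error class."""


def cache_verified(data: bytes, expected_sha256: str, cache_path: Path) -> Path:
    """Write `data` to `cache_path` only if its SHA-256 digest matches.

    Hypothetical helper; the hash algorithm is an assumption.
    """
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        # Raise before writing, so a later load cannot pick up an
        # unverified file from the cache.
        raise HashMismatchError(f"expected {expected_sha256}, got {actual}")
    cache_path.write_bytes(data)
    return cache_path
```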

0.6.0

09 Nov 08:34
6428514
  • Fixed DummyExtractor constructor so that data_locations defaults to an empty dict, not an empty list. This fixes serialization of an ExtractFeaturesMsg containing DummyExtractor.

  • Updates to pip-install dependencies:

    • Pillow: >=9.0.1 to >=10.0.1

0.5.0

02 Oct 20:47
b35efa8
  • Generalized feature extractor support by allowing use of any FeatureExtractor subclass instance, and extractor files loaded from anywhere (not just from CoralNet's S3 bucket, which requires CoralNet auth).

  • In ExtractFeaturesMsg and ClassifyImageMsg, the parameter feature_extractor_name (a string) has been replaced with extractor (a FeatureExtractor instance).

  • In ExtractFeaturesReturnMsg, model_was_cached has been replaced by extractor_loaded_remotely, because now filesystem-caching doesn't apply to some extractor files (they may originally be from the filesystem).

  • Config variable LOCAL_MODEL_PATH is now EXTRACTORS_CACHE_DIR. This is now used by any remote-loaded (S3 or URL based) extractor files. If extractor files are loaded from the filesystem, then it's now possible to run PySpacer without defining any config variable values.

  • Added AWS_REGION config var, which is now required for S3 usage.

  • Added TEST_EXTRACTORS_BUCKET and TEST_BUCKET config vars for unit tests, but these are not really usable by anyone besides core devs at the moment.

  • Some raised errors' types have changed to PySpacer's own ConfigError or HashMismatchError, and there are cases where error-raising semantics/timing have changed slightly.

0.4.1

05 Aug 03:20
1b092b6
  • Allow configuration of MAX_IMAGE_PIXELS, MAX_POINTS_PER_IMAGE, and MIN_TRAINIMAGES.

  • Previously, if secrets.json was present but missing a config value, then pyspacer would go on to look for that config value in Django settings. This is no longer the case; pyspacer now only respects at most one of secrets.json or Django settings (secrets take precedence).

  • Update repo URL from beijbom/pyspacer to coralnet/pyspacer.
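
The "at most one source" rule for secrets.json and Django settings described above can be sketched as follows. This is an illustration of the stated behavior, not pyspacer's code; pick_config_source and its return format are hypothetical.

```python
import json
from pathlib import Path


def pick_config_source(secrets_path, django_settings=None):
    """Return (source_name, values) for exactly one config source.

    Hypothetical helper illustrating the 0.4.1 rule; not pyspacer's API.
    """
    path = Path(secrets_path)
    if path.exists():
        # secrets.json takes precedence outright. Keys missing from it are
        # NOT looked up in Django settings afterwards.
        return "secrets", json.loads(path.read_text())
    if django_settings is not None:
        return "django", dict(django_settings)
    return "none", {}
```

Under this rule, a value defined only in Django settings is ignored whenever secrets.json is present, which makes it easier to tell where a given config value came from.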

First EfficientNet integrated.

07 May 03:21
b106de7

Code refactored from the ground up. Added first EfficientNet extractor.