# Releases: coralnet/pyspacer

## 0.10.0

- AWS credentials can now be obtained through the following methods, in addition to spacer config values as before:
  - AWS's metadata service (STS); based on a proof of concept from @michaelconnor00
  - boto's auto-detection logic, used when neither STS nor spacer config is in play (this was intended to work before, but needed fixing); see the sketch after this list
- Updates to pip-install dependencies:
  - numpy: `>=1.19` to `>=1.21.4,<2`
  - boto3: newly added, at `>=1.26.0`
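
For the boto fallback, nothing pyspacer-specific is needed. A minimal sketch, assuming credentials are available through one of boto3's usual sources (environment variables, the shared credentials file, or instance metadata):

```python
import boto3

# boto3 resolves credentials through its standard chain here; pyspacer's
# S3 access follows the same fallback when STS and spacer config are unset.
s3 = boto3.client('s3')
print(s3.list_buckets()['Buckets'])
```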

## 0.9.0

- Python 3.8 and 3.9 support has been dropped; Python 3.11 support has been added.
- Accepted torch and torchvision versions have been relaxed to accommodate Python 3.11 (`torch==1.13.1` to `torch>=1.13.1,<2.3`; `torchvision==0.14.1` to `torchvision>=0.14.1,<0.18`).
- `task_utils.preprocess_labels()` now has three available modes for splitting training annotations between train, ref, and val sets. The differences between the three modes (`VECTORS`, `POINTS`, and `POINTS_STRATIFIED`) are explained in the `SplitMode` Enum's comments. Additionally, all three modes now ensure that the ordering of the given training data has no effect on which data goes into train, ref, and val. The table below compares the three modes to the splitting functionality of earlier versions of pyspacer. Note that it's still possible to split train/ref/val yourself instead of letting pyspacer do it. (A usage sketch follows this list.)

  | Mode | Sets split in pyspacer | Order agnostic | Vectors can be split | Stratifies by label |
  |------|------------------------|----------------|----------------------|---------------------|
  | 0.6.1 and earlier | Train/ref | No | No | No |
  | 0.7.0 - 0.8.0 | Train/ref/val | No | No | No |
  | `VECTORS` | Train/ref/val | Yes | No | No |
  | `POINTS` | Train/ref/val | Yes | Yes | No |
  | `POINTS_STRATIFIED` | Train/ref/val | Yes | Yes | Yes |
- The `train_classifier` task now accepts label IDs as either integers or strings, not just integers.
- The `train_classifier` task can now locally cache feature vectors loaded from remote storage, which can greatly speed up training from epoch 2 onward. This is optional and enabled by default; the location of the cache directory is also configurable.
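
A usage sketch of the split modes. The import locations, the `ImageLabels` dict constructor, and the `split_mode` parameter name are assumptions based on these notes; check `task_utils` in the pyspacer source for the exact signature.

```python
from spacer.data_classes import ImageLabels                 # import location assumed
from spacer.task_utils import preprocess_labels, SplitMode  # SplitMode location assumed

# ImageLabels maps a feature-vector key to (row, col, label_id) points.
labels = ImageLabels({
    'img1.featurevector': [(100, 100, 1), (200, 200, 2)],
    'img2.featurevector': [(50, 50, 1), (150, 150, 2)],
    'img3.featurevector': [(10, 10, 2), (20, 20, 1)],
})

# POINTS_STRATIFIED may split an image's points across sets, stratified
# by label; the result bundles the train, ref, and val ImageLabels.
task_labels = preprocess_labels(labels, split_mode=SplitMode.POINTS_STRATIFIED)
```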

## 0.8.0

- `ImageFeatures` with `valid_rowcol=False` are no longer supported for training. For now, they are still supported for classification.
- S3 downloads are now always performed in the main thread, to prevent `RuntimeError: cannot schedule new futures after interpreter shutdown`.
- `S3Storage` and `storage_factory()` now use the parameter name `bucket_name` instead of `bucketname`, for consistency with other usages in pyspacer (by @yeelauren). See the sketch after this list.
- `URLStorage` downloads and existence checks now have an explicit timeout of 20 seconds (this is a timeout for continuous unresponsiveness, not for the whole response).
- EfficientNet feature extraction now uses CUDA if available (by @yeelauren).
- Updates to pip-install dependencies:
  - Pillow: `>=10.0.1` to `>=10.2.0`
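
A sketch of the renamed parameter; the `'s3'` storage-type string is an assumption here, so check `spacer.storage` for the exact call shape.

```python
from spacer.storage import storage_factory

# Previously: storage_factory('s3', bucketname='my-bucket')
storage = storage_factory('s3', bucket_name='my-bucket')
```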

## 0.7.0

- `TrainClassifierMsg` labels arguments have changed. Instead of `train_labels` and `val_labels`, it now takes a single argument `labels`, which is a `TrainingTaskLabels` object (basically a set of 3 `ImageLabels` objects: training set, reference set, and validation set). See the first sketch after this list.
- The new function `task_utils.preprocess_labels()` can be called in advance of building a `TrainClassifierMsg`, to 1) split a single `ImageLabels` instance into reasonably-proportioned train/ref/val sets, 2) filter labels to only a desired set of classes, and 3) run error checks.
- Removed the `MIN_TRAINIMAGES` config var. The minimum number of images for training is now 1 each for the train, ref, and val sets; or 3 total if leaving the split to pyspacer.
- Added `LOG_DESTINATION` and `LOG_LEVEL` config vars, providing configurable logging for test-suite runs or quick scripts.
- Logging statements throughout pyspacer's codebase now use module-name loggers rather than the root logger, allowing end applications to keep their logs organized. See the second sketch after this list.
- Fixed a bug where int config vars couldn't be configured through environment vars or secrets.json.
- Updated various error cases (mainly SpacerInputErrors, asserts, and ValueErrors) with more descriptive error classes. The `SpacerInputError` class is no longer available.
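
First, a sketch of the new single `labels` argument. The `TrainingTaskLabels` field names (`train`, `ref`, `val`) and the import locations are assumptions inferred from the description above.

```python
from spacer.data_classes import ImageLabels     # import location assumed
from spacer.messages import TrainingTaskLabels  # import location assumed

# Three tiny, hypothetical label sets; ImageLabels maps a feature-vector
# key to (row, col, label_id) points.
labels = TrainingTaskLabels(
    train=ImageLabels({'a.featurevector': [(10, 10, 1), (20, 20, 2)]}),
    ref=ImageLabels({'b.featurevector': [(10, 10, 1), (20, 20, 2)]}),
    val=ImageLabels({'c.featurevector': [(10, 10, 1), (20, 20, 2)]}),
)
# This object is passed to TrainClassifierMsg as labels=..., replacing
# the old train_labels/val_labels pair.
```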
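
Second, since pyspacer now logs through module-name loggers, an end application can manage them under the shared `spacer` namespace with the standard library alone:

```python
import logging

logging.basicConfig(level=logging.INFO)
# Quiet pyspacer's internals without touching the app's own loggers:
logging.getLogger('spacer').setLevel(logging.WARNING)
```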

## 0.6.1

- In 0.5.0, the hash check performed when loading a feature extractor was broken in two ways. First, it raised an error when trying to check the hash. Second, if the hash check failed for a remote-loaded extractor file, a second attempt at loading would still allow extraction to proceed. This release fixes both problems.

## 0.6.0

- Fixed the `DummyExtractor` constructor so that `data_locations` defaults to an empty dict, not an empty list. This fixes serialization of an `ExtractFeaturesMsg` containing a `DummyExtractor`.
- Updates to pip-install dependencies:
  - Pillow: `>=9.0.1` to `>=10.0.1`

## 0.5.0

- Generalized feature extractor support by allowing use of any `FeatureExtractor` subclass instance, and extractor files loaded from anywhere (not just from CoralNet's S3 bucket, which requires CoralNet auth). See the sketch after this list.
- In `ExtractFeaturesMsg` and `ClassifyImageMsg`, the parameter `feature_extractor_name` (a string) has been replaced with `extractor` (a `FeatureExtractor` instance).
- In `ExtractFeaturesReturnMsg`, `model_was_cached` has been replaced by `extractor_loaded_remotely`, because filesystem caching no longer applies to some extractor files (they may originally be from the filesystem).
- The config variable `LOCAL_MODEL_PATH` is now `EXTRACTORS_CACHE_DIR`. This is now used by any remote-loaded (S3 or URL based) extractor files. If extractor files are loaded from the filesystem, it's now possible to run PySpacer without defining any config variable values.
- Added the `AWS_REGION` config var, which is now required for S3 usage.
- Added `TEST_EXTRACTORS_BUCKET` and `TEST_BUCKET` config vars for unit tests, but these are not really usable by anyone besides the core devs at the moment.
- Some raised errors' types have changed to PySpacer's own `ConfigError` or `HashMismatchError`, and there are cases where error-raising semantics/timing have changed slightly.

## 0.4.1

- Allow configuration of `MAX_IMAGE_PIXELS`, `MAX_POINTS_PER_IMAGE`, and `MIN_TRAINIMAGES`. See the sketch after this list.
- Previously, if `secrets.json` was present but missing a config value, pyspacer would go on to look for that config value in Django settings. This is no longer the case; pyspacer now respects at most one of secrets.json or Django settings (secrets take precedence).
- Updated the repo URL from `beijbom/pyspacer` to `coralnet/pyspacer`.
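
A sketch of setting these vars through environment variables; the `SPACER_` prefix is an assumption based on pyspacer's config documentation (secrets.json and Django settings are the other documented sources).

```python
import os

# Set the environment before importing spacer so the values are picked up.
os.environ['SPACER_MAX_IMAGE_PIXELS'] = '10000000'
os.environ['SPACER_MAX_POINTS_PER_IMAGE'] = '1000'
os.environ['SPACER_MIN_TRAINIMAGES'] = '5'

from spacer import config  # config reads the environment at import time (assumed)
```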

First EfficientNet integrated.

Code refactored from the ground up. Added the first EfficientNet extractor.