This repository has been archived by the owner on Jun 18, 2023. It is now read-only.

Config file algorithm specification #29

Closed
ceholden opened this issue Aug 9, 2015 · 1 comment

Comments

@ceholden
Owner

ceholden commented Aug 9, 2015

Make room for more time series algorithms within the module by changing the configuration file so that it can point to any of several algorithms:

  1. Add a new submodule, algorithms.
  2. Rename yatsm to ccdc and move it into the algorithms submodule. Rename the YATSM class to CCDCesque.
  3. Change the configuration file by adding an algorithm key under the YATSM section. The value of the algorithm key is used as the title of the section from which the algorithm's parameters are read.
  4. Add a new YATSM section for options generic to all time series algorithms, like reverse or robust.
  5. Remove the robust results and the omission and commission tests from the current YATSM class (the future CCDCesque) and place them into yatsm.algorithms.yatsm. These will be parameterized in the YATSM metadata section.
  6. Add a section for regression/predictive model method configuration (see Model prediction / regression selection #26).

Proposed change, as an example:

```ini
[metadata]
version = 0.5

[YATSM]
algorithm = CCDCesque
regression = LassoCV
design_matrix = 1 + x + harm(x, 1)
reverse = False
robust = False
commission_alpha =
...

[CCDCesque]
consecutive = 5

[LassoCV]
pickle = somefile.pkl
...
```
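
To make the lookup in step 3 concrete, here is a minimal sketch using Python's standard `configparser`. It is not the YATSM implementation: the function name `read_algorithm_config` and the trimmed config text are assumptions made for illustration, and only the section and key names come from the proposal above.

```python
import configparser

# Hypothetical, trimmed copy of the proposed layout (elided keys left out).
CONFIG_TEXT = """
[YATSM]
algorithm = CCDCesque
regression = LassoCV
reverse = False
robust = False

[CCDCesque]
consecutive = 5

[LassoCV]
pickle = somefile.pkl
"""


def read_algorithm_config(text):
    """Return (algorithm name, generic options, algorithm parameters)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    # Options shared by every algorithm live in the YATSM section.
    generic = dict(parser["YATSM"])
    algorithm = generic.pop("algorithm")

    # The value of `algorithm` doubles as the title of the section to read.
    if not parser.has_section(algorithm):
        raise KeyError("No section named %r for the chosen algorithm" % algorithm)
    return algorithm, generic, dict(parser[algorithm])


name, generic, params = read_algorithm_config(CONFIG_TEXT)
```

Note that `configparser` returns every value as a string, so a real reader would still need to convert entries such as `consecutive` (for example with `parser.getint`).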

It is very difficult to imagine specifying every argument to a sklearn classifier or regression estimator via a config file. Simple scalars like n_alphas could play well, but how would we specify alphas = np.logspace(0.001, 30, 50)? The proposed format sidesteps these concerns by requiring that regression options point to a file pickled with sklearn.externals.joblib that already contains the desired parameterization. If the pickle item is not provided but the section is present, default to a pickle of an existing regression object packaged with yatsm.
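
The fallback rule described above could look roughly like the following. This is only a sketch under stated assumptions: `load_regressor` and the file names are hypothetical, the standard library's `pickle` stands in for joblib, and plain dicts stand in for estimators so that the example runs without scikit-learn installed.

```python
import os
import pickle
import tempfile


def load_regressor(section, default_path):
    """Load the object named by the `pickle` key, else the packaged default.

    `section` is a mapping of the regression section's options
    (hypothetical helper, not part of yatsm).
    """
    path = section.get("pickle") or default_path
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in "estimators": plain dicts written to temporary pickle files.
workdir = tempfile.mkdtemp()
default_path = os.path.join(workdir, "default.pkl")
custom_path = os.path.join(workdir, "custom.pkl")

with open(default_path, "wb") as f:
    pickle.dump({"estimator": "LassoCV", "n_alphas": 100}, f)
with open(custom_path, "wb") as f:
    pickle.dump({"estimator": "LassoCV", "n_alphas": 50}, f)

# Section with a `pickle` key: the user's file is used.
custom = load_regressor({"pickle": custom_path}, default_path)
# Section present but without `pickle`: fall back to the default.
fallback = load_regressor({}, default_path)
```

With scikit-learn available, the pickled object would instead be a configured estimator, which is how an array-valued argument such as `alphas` can be carried without expressing it in the config file.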

Target v0.5.0 as the milestone, to coincide with another major overhaul (#28).

@ceholden ceholden added this to the v0.5.0 milestone Aug 9, 2015
ceholden added a commit that referenced this issue Aug 21, 2015
- YATSM is now a baseclass for code reuse (plotting, predictions, etc)
- YATSM also defines timeseries model interface resembling sklearn
    - `__init__` contains 'hyperparameters'
    - `fit` runs model; predict/plot/score methods for results
    - `__iter__` yields segment records over all segments
    - `__len__` defines how many segments in model
- Move commission/omission/robust re-fits to postprocess.py
- Temporarily breaks postprocessing in line/pixel CLI
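
The interface listed in this commit message can be pictured with a small sketch. The class names `BaseTimeSeriesModel` and `SingleSegment`, the `record` attribute, and the segment fields are all assumptions chosen for illustration; they are not taken from the YATSM source.

```python
class BaseTimeSeriesModel:
    """Hypothetical base class with an sklearn-like interface."""

    def __init__(self, **hyperparameters):
        # `__init__` only stores hyperparameters; no data is touched here.
        self.hyperparameters = hyperparameters
        self.record = []  # one entry per fitted segment

    def fit(self, X, Y):
        """Run the model; subclasses fill `self.record`."""
        raise NotImplementedError

    def __iter__(self):
        # Yield segment records over all segments.
        for segment in self.record:
            yield segment

    def __len__(self):
        # Number of segments in the fitted model.
        return len(self.record)


class SingleSegment(BaseTimeSeriesModel):
    """Toy algorithm: one segment spanning every observation."""

    def fit(self, X, Y):
        self.record = [
            {"start": 0, "end": len(Y) - 1, "mean": sum(Y) / len(Y)}
        ]
        return self


model = SingleSegment(consecutive=5).fit(
    [[1, 0], [1, 1], [1, 2]], [2.0, 4.0, 6.0]
)
segments = list(model)
```

Keeping shared behaviour in the base class means a new algorithm only has to supply `fit`, while iteration and length come for free.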
@ceholden
Owner Author

Also now using YAML! See #30

ceholden added a commit that referenced this issue Aug 21, 2015
ceholden added a commit that referenced this issue Aug 25, 2015
- Add min_values/max_values in place of valid_range
- CSV file has header
- Use Pandas to parse CSV (so add as requirement)
- Update examples
- Bump version
- Check for implementation of YATSM algorithm
- Put YATSM algo class in config
ceholden added a commit that referenced this issue Aug 25, 2015
ceholden added a commit that referenced this issue Aug 25, 2015
@ceholden ceholden closed this as completed Sep 9, 2015
ceholden added a commit that referenced this issue Sep 11, 2015
- fit_indices were never used; fit all of Y as it is passed for a reason
- pass `dates` to fit() rather than relying on ordinal dates in X
    - should be faster and less confusing
- design_info isn't needed anymore; remove tie to X
- test_indices lingers as not so hyper hyperparameter