Move less mature extensions to their own repo #946

Closed
2 of 10 tasks
cholmes opened this issue Jan 22, 2021 · 8 comments
Labels
prio: must-have required for release associated with

@cholmes
Contributor

cholmes commented Jan 22, 2021

(@m-mohr & @matthewhanson - I'm forgetting the exact conclusion we reached on this, so please chime in if I got it wrong).

Once we release 1.0, our aim is to have a very stable core spec that does not release very often. But we also want a robust ecosystem of extensions that continue to evolve. Currently almost all extensions live in the same repository as the core spec, which would mean that we'd need to do a 'core' release whenever any young extension improves.

The current idea to balance those two tensions is to move all less mature extensions (as measured by our extension maturity) into their own repository, which can release more often (like every 2-3 months). The core spec will aim to release only once or twice a year.

We need to update the maturity of the extensions, with help from #652, so we can designate which ones are stable and will be in the main release.

  • Make template repo in stac-extensions org
  • Make a 'central' repo in stac-extensions org that is the main 'list' of non-core extensions
  • Update the extensions readme in the main repo to link to the 'central' extensions repo and the template repo, and explain to people how to make their own repo
  • Port over 3-4 extensions to make sure template is working right
  • remove all non-core extensions from dev branch
  • port over all extensions
  • put extension PRs (like cloud storage) into their own repos
  • (after 1.0.0-RC1 release) - make sure CI works, and probably release each extension so they can be used?
  • Review all the new extension repos to make sure they are in good shape after migration (someone other than the person who ported)
  • update code owners file
@jisantuc
Contributor

I have a radical opinion that every extension should live in its own repo (forced versioning independent of changes Feels Bad ™️ to me), but certainly less mature extensions would be a great start. There's always an option to separate an extension that ends up with too much activity for the main spec after the fact or migrate things back to the main repo as they achieve maturity benchmarks.

@m-mohr
Collaborator

m-mohr commented Jan 22, 2021

It could be a good idea to make a stac-spec-extensions repo; in that case we wouldn't have the shortcut issue, as we would just deploy everything from there. I'm not sure about a separate repo per extension, as this would basically remove all shortcuts. Splitting them will lead to issues, I think, as some would lose their shortcut. I can't remember the decision from the call either.

@cholmes
Contributor Author

cholmes commented Jan 25, 2021

@jisantuc - I don't think that's such a radical idea. But I think we're not yet ready to embrace it for the most mature extensions, as having the shortcut is really nice, and our hope is that the most stable extensions won't change too much. But yes, like you say, an extension could move out of the main spec. Or I could also see a new extension, like eo-plus or something that has more 'eo' fields.

@m-mohr and I discussed this today, and we think the path forward should be to make a stac-extensions organization on GitHub, and then each extension can have its own repo. We'd ideally make a forkable template repo that has GitHub Actions set up to easily 'publish' an extension version to be used. I might just move all extensions to one repo in a stac-extensions org to start, and work to get them each in their own after 1.0, or at least after RC1.

cholmes added the "prio: must-have" label Feb 1, 2021
cholmes added this to the 1.0.0-RC.1 milestone Feb 1, 2021
@cholmes
Contributor Author

cholmes commented Feb 4, 2021

OK, the very first pass at implementation totals comes from some crawling @m-mohr did of catalogs listed in STAC Index:

"projection" 5
"eo" 3
"view" 3
"pointcloud" 2
"datacube" 1
"landsat" 1

I started a spreadsheet to track / total up ones that stac index can't crawl:

Counting both of those brings our totals to:

eo: 8
projection: 7
view: 6
sat: 2
scientific: 2
label: 2
pointcloud: 2
sar: 2
datacube: 1

With our original extension maturity levels, that puts eo, projection and view at 'stable' (6 or more), none at 'candidate' (3), and all the others above at 'pilot'.
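The classification above can be sketched in a few lines of Python. This is only an illustration: the 'stable' (6 or more) and 'candidate' (3) thresholds are the ones quoted in this comment, while the 'pilot' threshold of 1, the 'proposal' fallback, and the function name `maturity` are assumptions made for the example.

```python
# Thresholds quoted in this thread: 'stable' at 6 or more, 'candidate' at 3.
# The 'pilot' threshold of 1 and the 'proposal' fallback are assumptions.
THRESHOLDS = [("stable", 6), ("candidate", 3), ("pilot", 1)]


def maturity(count):
    """Return the highest maturity level whose threshold the count meets."""
    for level, minimum in THRESHOLDS:
        if count >= minimum:
            return level
    return "proposal"  # assumed label for extensions with no implementations


# Combined totals as listed in the comment above.
totals = {
    "eo": 8, "projection": 7, "view": 6,
    "sat": 2, "scientific": 2, "label": 2,
    "pointcloud": 2, "sar": 2, "datacube": 1,
}

levels = {name: maturity(count) for name, count in totals.items()}
for name, level in levels.items():
    print(f"{name}: {level}")
```

With these totals, only eo, projection and view reach 'stable', and nothing lands on 'candidate'.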

This seems quite reasonable and potentially easier than I was anticipating? I figured we'd discuss the ones on the 'edge', but was thinking there'd be more at 'candidate'. We may well be missing some catalogs, but I think we've got a majority of the ones that are at beta.2. My take on the ones at '2':

  • I think 'label' may have a bit more use that pops up (ground work uses it, there was a world bank project with it), but with #938 (Separate label extensions) I don't see an argument that it's 'stable'.
  • 'sat' seems useful, but most of the key satellite fields are covered in the core, and it doesn't seem like the remaining fields really need to be in the core.
  • scientific - does seem useful, but not something everyone needs. I'd see this picking up steam, but I think it could evolve to a core extension later.
  • sar - also a super useful one, but it does feel like we're still figuring out exactly what goes in there, as with #961 (SAR backscatter measurement and convention). So it would be weird to call it 'stable', but I think it perhaps gets there soon.

The other two to consider would be 'file' and 'processing'. Processing I do think needs more time to 'bake', though I think it's a very promising extension. File I think could be a good one, but I also see little harm in not having it in core, and it's a bit weird to have an almost fully 'new' one get in. It is a 'child' of checksum, and if checksum were widely adopted I'd see an argument to bring it in as stable. But it didn't show up in our totals at all.

To reiterate - stable extensions that are included in the repo should really be ones we don't anticipate moving. We don't want to keep doing releases of the core, so having most of the extensions outside of the core spec repo makes a lot of sense.

So I think we should just go ahead and mark eo, projection and view as 'stable', in the core spec. And then move the rest out to a stac-extensions organization. I think for ease, to get the core spec out, I may just fork the whole extensions folder into one repo, and then we can work to make each one its own repo.

@cholmes
Contributor Author

cholmes commented Feb 4, 2021

We had been thinking we'd put a 'call' to have people submit their catalogs and make a case for certain extensions they really want to see in the core. But seeing these numbers I'm less inclined to do that.

It was a bit of a pain to total up the numbers. I think we should consider trying to utilize https://github.com/stac-utils/stac-examples as the place where extension maturity gets 'counted', at least until we have some tool that magically crawls all catalogs and gives us total counts.
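As a rough sketch of what such counting could look like, the snippet below tallies how many catalogs use each extension, based on the `stac_extensions` array found in STAC documents. The function name `count_implementations`, the in-memory input shape, and the sample catalogs are all hypothetical; a real tool would load the JSON from the examples repo or from crawled catalogs.

```python
from collections import Counter


def count_implementations(catalogs):
    """Count, per extension, how many catalogs use it at least once.

    `catalogs` maps a catalog name to a list of STAC documents (dicts).
    An extension counts once per catalog, however many documents list it.
    """
    counts = Counter()
    for documents in catalogs.values():
        used = set()
        for doc in documents:
            used.update(doc.get("stac_extensions", []))
        counts.update(used)
    return counts


# Hypothetical sample data, only to show the input shape.
sample = {
    "catalog-a": [
        {"id": "item-1", "stac_extensions": ["eo", "projection"]},
        {"id": "item-2", "stac_extensions": ["eo"]},
    ],
    "catalog-b": [{"id": "item-3", "stac_extensions": ["projection", "view"]}],
    "catalog-c": [{"id": "item-4"}],  # no extensions declared
}

counts = count_implementations(sample)
for name, total in sorted(counts.items()):
    print(f"{name}: {total}")
```

Counting per catalog rather than per Item keeps one large catalog from inflating an extension's total.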

@m-mohr
Collaborator

m-mohr commented Feb 5, 2021

I mostly agree, with the exception that this only talks about Items. I still think it's a good idea to ask people to submit their private implementations as examples. That would give a better overview, as private implementations can never be crawled. Also, in openEO for example we generate STACs for processed data, but of course this can't be crawled, as it is user data and not public by default. Thus, for example, there are in principle seven "timestamp" extension implementations that don't show up here. (Not saying that this should be stable though; I'm counting openEO implementations as a single group for now.)

For Catalogs/Collections there are also extensions, which may also increase the counts above. All counts are based mostly on STAC Index.

  • scientific works also for Collections and has at least two additional implementations (GEE, openEO) there, which would make it a candidate.
  • Collection Assets seems pretty stable, I can't think of anything that should ever change in this extension (as long as Item Assets are stable). There are a number of implementations, at least GEE, openEO, SpaceBel, the zarr folks have built on top of it, ...
  • Item Asset Definition seems like a good candidate with 3 implementations.
  • Data Cube has 8 implementations, but 7 of them are through openEO. Also, there's #713 (Data Cube Extension: Variables and more), which seems to be required to make it more general.
  • Version has one implementation. It has fields that are required at a later stage (deprecation, version numbers) so it's somewhat clear that it has not so many implementations yet.
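Counting related implementations as a single group, as suggested for openEO above, could be sketched like this. The backend names and the `count_groups` helper are hypothetical; only the figures (8 Data Cube implementations, 7 of them through openEO) come from this comment.

```python
def count_groups(implementations):
    """Count distinct implementer groups from (name, group) pairs."""
    return len({group for _name, group in implementations})


# Hypothetical names: seven openEO back-ends plus one unrelated catalog.
datacube = [(f"openeo-backend-{i}", "openEO") for i in range(1, 8)]
datacube.append(("other-catalog", "other-catalog"))

raw_count = len(datacube)
grouped_count = count_groups(datacube)
print(f"raw: {raw_count}, grouped: {grouped_count}")
```

Under this grouping, Data Cube's 8 raw implementations collapse to 2 independent ones, which is why it does not clear the 'stable' threshold on numbers alone.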

@cholmes
Contributor Author

cholmes commented Feb 5, 2021

Ah, good point on collections. I think we should include at least one catalog/collection extension, to make clear that's 'a thing'. I think scientific plus collection assets make sense. It's great if the collection-level implementations tip scientific over the threshold. And I agree collection assets is quite valuable. There's maybe an argument to make it 'core' though, if it's stable enough to be a core extension...

@cholmes
Contributor Author

cholmes commented Feb 5, 2021

We made some decisions about this on the call.

  • scientific will become a 'core' extension, and will be the one that is used at the collection level.
  • Collection assets we decided should be part of core, as it has proven itself as an extension. We decided it was stable enough and well used. But it wouldn't go into 'common metadata', as it's not really a content extension - it's a mechanism for collections. So we decided it's better to just get it in core, as it'd be a bit weird to migrate it.
  • We talked about Item Assets, as it is somewhat used. But we decided we'd like to see a bit more adoption. Planet and potentially Microsoft will check it out - we felt it might need some changes to accommodate different use cases (whereas for collection assets we couldn't think of anything that would actually change or needed clarification, and it has wider adoption).
