Metadata file #202
Great idea:
|
|
For posterity, here's what Apple folks say about indicating Linux support in
I hope that SPI allows rich metadata for indicating platform support, not just a single |
Another metadata field could indicate whether a package is looking for funding/sponsorship, with a link to the sponsorship page, which NPM currently supports, as far as I'm aware. Another thing I'd quite like to see is security advisories, again from my experience with NPM. Interestingly enough, the Package Registry pitch (see the "Security auditing" section) suggests storing that data in the registry. It's an interesting question overall: do you want to store some metadata in the package, in the registry, or purely in the index? (cc @mattt) |
@MaxDesiatov How registries communicate and how SPM acts upon advisories is still up in the air. @daveverwer @finestructure A few things to consider with your metadata file:
|
Absolutely, our database design already stores metadata next to every version and we'd want this additional metadata to be stored in exactly the same way. I agree that the history of this is important.
This is fantastic, thank you @mattt! |
More metadata:
We need to think carefully about this. Is it a predefined list? Is it just any keyword the authors want to include? Is it supplemented by GitHub tags for a project? I think it'd be really nice to do a mix of both. Maybe have a broad category which is a fixed list, but then also let people include tags, and bring the GitHub tags into that data. |
In case it's not clear, I didn't mean both of those in a single field. They'd be separated. There are also exciting ideas around here for auto-categorisation based on:
I think figuring out the frameworks could lead to some interesting auto-categorisation though. |
Moving the conversation here as you requested, @daveverwer. For me, the priorities are an array of tags, which are strings, probably kebab-case (which can be freeform or documented or whatever), and to a much lesser degree a single string abstract. Those two tweaks massively enhance discoverability and documentation.
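A minimal sketch of what "an array of kebab-case tags plus a single string abstract" could look like in practice. Everything here is an assumption for illustration: the field names `tags` and `abstract`, the exact kebab-case rule, and the helper names are hypothetical and not part of any agreed format.

```python
import re

# Hypothetical rule: lowercase alphanumeric words joined by single hyphens.
KEBAB_TAG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def normalize_tag(raw):
    """Turn a freeform tag into kebab-case (lowercase, hyphen-separated)."""
    words = re.findall(r"[A-Za-z0-9]+", raw)
    return "-".join(word.lower() for word in words)


def validate_metadata(metadata):
    """Return a list of problems; an empty list means this sketch's rules pass."""
    problems = []
    for tag in metadata.get("tags", []):
        if not isinstance(tag, str) or not KEBAB_TAG.match(tag):
            problems.append(f"tag {tag!r} is not kebab-case")
    abstract = metadata.get("abstract")
    if abstract is not None and not isinstance(abstract, str):
        problems.append("abstract must be a single string")
    return problems
```

Normalizing on ingestion rather than rejecting would keep the file easy to fill in while still giving the index consistent tags to search on.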
Addressed with tag:
The hosting repo IMO is the home page.
Ditto
Tag:
Tag:
Tags:
Not a big fan of this
Tags
Tags: Sometimes a single good hammer is as good as a few dozen individual fields. |
Thank you Erica! I wasn't necessarily thinking all of these would be independent fields. I realise that I did start my original suggestions with data types, but it was more about capturing all the different things we might want to think about. Your point is very well taken though! |
Think about this too, try to search for my package |
What @erica's proposing here can be described as a Folksonomy. And short of developing a more comprehensive taxonomy / ontology, I agree that this would probably be the best-fit solution for the problems you're most interested in solving. You get 80% of the benefit of classification with 20% of the effort. @daveverwer @finestructure As you consider adopting this, I'd encourage you to take a look at that Wikipedia page for Folksonomies to understand the specific trade-offs you're making and challenges you're most likely to encounter. |
I don't have a huge amount of time right now, but just to chime in. If search were the only issue here, that'd be one thing, but it's about more than search. Being able to give rich, actionable data on the package pages will require more structure to some of these bits of metadata. |
Do you have a list of goals driving your need for metadata? For example, take the mention of listing Authors -- which I think has the worthy outcomes of being able to give due credit, being able to contact individuals who participated in the development, being able to see what an individual has contributed to. What would be the driving use of this specifically within the SPI project? |
Yes, there are two:
We want to bring as much of the information that's needed to judge the quality of a package into one place. For example, instead of having to check how many pull requests/issues there are and when the last one was closed, we bring that in automatically, right alongside information about what versions of Swift the package supports, and whether the stable release is the right one to target, or if there's actually a beta which would better suit your needs. All of that data so far is structured, as it comes from the manifest, from GitHub, and from the repository itself. There's a place for unstructured/tag-based data, but I don't think it completely replaces the need for structure. We also want to use some of this structured data to drive a "quality score" for a package. I don't think it's clear yet whether this quality score would be made public or just used internally for search ranking (we have a version of this already); there are pros and cons to both. But if metadata is just tag-based, it's much harder to do that, especially when tags can be typed incorrectly or interpreted in different ways (do
This is absolutely one of the areas where I'd want structured metadata: allowing people who have not necessarily committed code to be credited by name, and allowing people to define a custom link so they can be credited in the way they'd like rather than just assuming they want a link to their GitHub profile. But it's more than that. We get platform data for Apple platforms from the package manifest, so if we are able to define Linux support as structured data rather than just a tag, then we can place that information on the package page next to the Apple platforms, rather than having the Apple platforms in one place and Linux mixed in with the other tag data. That's not a comprehensive list of where I think structured data will be necessary, and I think we can make that decision when we have a better sense of what metadata people would find useful. I'd expect some of it to be represented as tags/searchable metadata. We also need to be careful not to prematurely go towards everything being a tag. Once people start filling in this file it's going to be hard to effect widespread changes to it, so I'd rather get it right before giving it a push in terms of getting it adopted. At the same time, we don't want this metadata file to be overly onerous to fill in. I think we'll find a balance. But defining what data people might like to see feels like a good place to start. |
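To make the structured-versus-tag distinction concrete, here is a rough sketch of how author credits and extra platform support might be modelled as typed fields and merged with the platforms read from the manifest. The type names, the field names (`authors`, `extra_platforms`), and the `supported_platforms` helper are all hypothetical; nothing here reflects an actual SPI schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Author:
    name: str
    url: Optional[str] = None  # custom credit link chosen by the author


@dataclass
class PackageMetadata:
    authors: List[Author] = field(default_factory=list)
    extra_platforms: List[str] = field(default_factory=list)  # e.g. ["linux"]
    tags: List[str] = field(default_factory=list)


def supported_platforms(manifest_platforms, metadata):
    """Merge manifest platforms with metadata-declared ones, in order, without duplicates."""
    merged = list(manifest_platforms)
    for platform in metadata.extra_platforms:
        if platform not in merged:
            merged.append(platform)
    return merged
```

With platform support held in its own field, a package page can render Linux alongside the Apple platforms instead of searching the tag list for it.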
Then I think the most feasible way forward is to co-locate a metadata file with Package.swift (at the same level one normally finds README, CHANGELOG, LICENSE) whose structure we define and hope it takes off. I can see where tags alone aren't going to get you where you need to go, even though they are the path of least resistance for modifying the Package spec from the formal review process. |
As an aside, the Rust ecosystem's package index crates.io has a distinction between homepage, documentation and repository, which I personally find quite valuable. Of course many crates don't have a specific homepage or just list their repo there as well, but the option of pointing to a specific landing page for a project is quite comfortable. See the page for the popular crate serde for an example. |
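A small sketch of how that three-way split could coexist with the view that the hosting repo is the home page: keep the fields separate, but fall back to the repository when no homepage is given. The function name and the fallback rule are assumptions for illustration only.

```python
def resolve_links(repository, homepage=None, documentation=None):
    """Return the three links, defaulting the homepage to the repository URL."""
    return {
        "repository": repository,
        "homepage": homepage or repository,  # assumed fallback, not an agreed rule
        "documentation": documentation,      # stays None when not provided
    }
```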
Just a note here that I've done some clean up, aggregation and work on this today and started a new issue to track it: #435. Before launch, Sven and I were working mostly alone on this project, and chatting on a call most days. We were very much on the same page, so many of the issues here start with nothing more than a word or two. The first couple of posts in this issue are a perfect example of that, and they can be confusing to new people coming into the thread. Now that more people are involved, those few words can seem short/abrupt as they have no context. That's why I'm making a new issue to clarify our original intent. I've linked relevant comments in this thread from that new issue. I'll close this issue, but please do feel free to move the conversation to #435. |
We've been referring to the need for a metadata file here and there, so I thought I'd add a dumping ground for things we'd like it to include.