Consider removing ability to filter on file extensions #74

Closed
mgiuca opened this issue Nov 21, 2018 · 39 comments

@mgiuca
Collaborator

mgiuca commented Nov 21, 2018

See discussion starting on this Chromium code review. Pasting relevant quotes:

@mounirlamouri:

Interesting. I don't think we should actually implement this. Using a file extension for the accept field seems to be against the current platform patterns. I understand that this is trying to follow the accept attribute from forms (ie. https://html.spec.whatwg.org/multipage/input.html#attr-input-accept) but this is carrying baggage from another decade.

@ericwilligers:

I collected the following stats while working on File Handling:

For some years, Chrome packaged apps have been able to register one or more file handlers using a Chrome-specific API where each handler may handle specific MIME types and/or file extensions. This provides some insight into the demand for such functionality. As of August 2018, 619 file handlers handle MIME types, while 509 file handlers handle file extensions. Some of the packaged apps have more than one file handler. Overall, 580 packaged apps handle MIME types and 337 packaged apps handle file extensions. This usage, and the use cases above, demonstrate the value of being able to associate web applications with MIME types and/or file extensions.

@mgiuca:

In addition to the data, I think this is absolutely necessary, because there is no standard / reliable way to translate a file name to a MIME type.

If you're sharing content with a MIME type (e.g. it comes from a web server served with a Content-Type), then we use the MIME type matching. But if you're sharing content without a MIME type (e.g. you right-click a file in Windows Explorer -> Share To...), there isn't a reliable way for us to convert ".docx" into an appropriate MIME type. Therefore, the recommended behaviour for a site is to accept both the common MIME types and the common file extensions for that type of file.

This isn't anything to do with carrying over baggage from the "accept" attribute (in fact, we designed this option before we realised that "accept" existed in HTML forms).

My position remains the same as in the above quote.

@mounirlamouri
Member

My apologies for assuming it came from the HTML Forms. They looked very similar.

I would rather have the callers convert the file extension to a guessed mime type rather than supporting file extensions. For example, a docx file could be converted to a mime type and be sent to the web app. Why require support for file extensions in the entire pipeline?

I also wonder if we could look at what Android is doing. My understanding is that the share api isn't very far from the intent mechanism in Android and it seems that it's mostly based on mime types, isn't it?

@mgiuca
Collaborator Author

mgiuca commented Dec 12, 2018

Android intent filters do seem to only deal with MIME types, not file extensions. However, Android is much less focused on files in general (and editors for particular file types) than a desktop environment, and that is essentially what we're trying to build. (Even though Web Share Target isn't particularly desktop-focused, we will be with File Handlers which will use the same syntax.)

> I would rather have the callers convert the file extension to a guessed mime type rather than supporting file extensions. For example, a docx file could be converted to a mime type and be sent to the web app.

The problem with this approach is that it centralizes registration of new file types, and also hands control of it to individual user agents (not standardized). There are two key issues here:

  1. An app can't invent its own file extension without first having it registered with all the operating systems and/or user agents. If I want to create a 3D modelling app called "Sketchly" with its own custom object format ".sket", there would be no way to associate ".sket" files with my app, unless I got an "application/sketchly" file type registered in enough places that I could be confident that the OS or UA will guess that the MIME type of a ".sket" file is "application/sketchly". Remember, most file systems have no intrinsic concept of a MIME type, so we would be relying on the guessing for core functionality.
  2. Even for very popular file extensions, like ".docx", we still have to rely on non-standard mapping to "application/vnd.openxmlformats-officedocument.wordprocessingml.document". It's possible that many systems have this in their mapping table, but this isn't in a web standard. We won't be able to guarantee that your web app can open ".docx" files, only that it will be able to open MS Word files if your OS or UA knows about that format. The alternative is that we create a web standards document that precisely specifies the mapping table, but then we will become the bottleneck for adding new file formats.

> Why require support for file extensions in the entire pipeline?

It's not the "entire pipeline". We are proposing a very simple thing, which is a suffix match on the filename.
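
To make the suffix match concrete, here is a minimal sketch. It assumes an accept list that mixes MIME types and extensions, with extension entries starting with ".". The function name `matches_accept` and the case-insensitive comparison are illustrative assumptions, not spec text.

```python
from typing import Optional, Sequence


def matches_accept(filename: str, mime_type: Optional[str], accept: Sequence[str]) -> bool:
    """Return True if a shared file matches any entry in an accept list.

    Entries starting with "." are treated as file extensions (suffix match on
    the filename); all other entries are treated as MIME types or wildcards.
    """
    name = filename.lower()
    mime = mime_type.lower() if mime_type else None
    for entry in accept:
        entry = entry.strip().lower()
        if entry.startswith("."):
            # Extension entry: a plain suffix match on the filename.
            if name.endswith(entry):
                return True
        elif mime is not None:
            # MIME entry: exact match, "*/*", or a "type/*" wildcard.
            if entry == "*/*" or entry == mime:
                return True
            if entry.endswith("/*") and mime.startswith(entry[:-1]):
                return True
    return False
```

A file shared from a file manager with no MIME type (`mime_type=None`) can then only match through an extension entry, which is the case the extension filter is meant to cover.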

@mounirlamouri
Member

I don't know what other browsers do, but when you open foo.ext, Chrome first checks whether it knows the MIME type associated with ".ext". If it can't find anything, it asks the system whether it knows of one. All the work done after that is based on the MIME type. To solve 1. above, I wonder whether we may want to add an API that registers a MIME/extension association. This may have benefits larger than just this API.

Regarding 2., as mentioned, it sounds like this is already what Chrome relies on. I would expect that this is also what Android does because in the Files application, it must convert the file name into a MIME type in order to call the right VIEW intent. I do not know if Android has an API to add extension/MIME type associations though.

Something worth considering too is if we allow extensions and mime types, we would need to support both on all platforms. With mime types being first class citizens on Android, do we know if we can implement a decent extension-based filtering on the platform?

@ewilligers
Collaborator

> Something worth considering too is if we allow extensions and mime types, we would need to support both on all platforms.

Implementations can do what makes sense on each platform.

"An implementation MUST support filtering on MIME types or filtering on file extensions, or both."

@ewilligers
Collaborator

> To solve 1. above, I wonder whether we may want to add an API that registers a MIME/extension association. This may have benefits larger than just this API.

This is also being worked on - the explainer was presented to the Service Worker and Web Platform working groups at TPAC. This allows installed web applications to register themselves to handle specific MIME types and/or file extensions.

I will be sending an Intent to Implement soon.

@mgiuca
Collaborator Author

mgiuca commented Dec 20, 2018

> This is also being worked on - the explainer was presented to the Service Worker and Web Platform working groups at TPAC. This allows installed web applications to register themselves to handle specific MIME types and/or file extensions.

I think what @mounirlamouri means is not an API for registering an app as a MIME type handler, but an API for registering a {file extension, MIME type} association pair, so that instead of having extensions here, we just have MIME types here and rely on the dynamically updated mapping table to map extensions to MIMEs.

I don't see how that fits here, though, or how it would work in general. Would this be a global mapping (where the user agent keeps track of all the associations in a table, and sites can add new entries)? Or per-site? If it's a global mapping, I don't like it, because it means that the way share targets / file handlers behave on site A is influenced by whether you've previously been to site B. If it's per-site, then I don't really see the point of adding the extra layer of indirection when you could just say "this share target handles files with these extensions."

> Regarding 2., as mentioned, it sounds like this is already what Chrome relies on. I would expect that this is also what Android does because in the Files application, it must convert the file name into a MIME type in order to call the right VIEW intent. I do not know if Android has an API to add extension/MIME type associations though.

This argument is basically "Android apps that accept specific MIME types must rely on a proprietary ext -> MIME table in the Files app, therefore why can't Web apps do so too?" Answer: Because it's the Web and we shouldn't rely on any proprietary table for apps to work properly. If you write an Android app, you only have to make sure that your file extension is mapped correctly by the Android files app in order to make sure it works. With a Web Share Target, you'd have to make sure it's mapped correctly by all the OS/user agents' mapping tables, and there would be no guarantee of it mapping correctly on all platforms.

Further, Android apps using the share system aren't likely to invent their own new file extensions since sharing is all about interop between apps. Whereas when we do file handlers, that's much more likely since you might just want to associate your own proprietary file type with your app (not something you really have on Android).

@mounirlamouri
Member

> > Something worth considering too is if we allow extensions and mime types, we would need to support both on all platforms.
>
> Implementations can do what makes sense on each platform.
>
> "An implementation MUST support filtering on MIME types or filtering on file extensions, or both."

I think it's incorrect to say that an implementation must support one OR the other. An implementation must support BOTH, because it would be valid for a page to support only [".mp4"] or only ["video/mp4"]. To allow implementations to support one OR the other, we would need to change the spec so that each entry has both an extension and a MIME type representation. It would still make compat hard, but less hard.

@ewilligers
Collaborator

> An implementation must support BOTH. ... It would still make compat hard but less hard.

No, the spec isn't written to achieve compat in which share targets are presented to a user, e.g.

> The user agent MAY automatically register all web share targets as the user visits the site, but it is RECOMMENDED that more discretion is applied, to avoid overwhelming the user with the choice of a large number of targets.

We could emphasize with a non-normative note that if a web app supports only [".mp4"] or only ["video/mp4"], some implementations will not be able to include the app when presenting the user with a choice of share targets.

@ewilligers
Collaborator

We could require each files entry to specify at least one mime type if it specifies any extensions, and to specify at least one extension if it specifies any MIME type more specific than */*. This would guide web authors towards supporting the widest range of implementations/platforms.
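
As a rough sketch of that requirement, assuming an accept list where extension entries start with "." and everything else is a MIME type or wildcard (the function name `check_files_entry` and the message wording are hypothetical):

```python
from typing import List, Sequence


def check_files_entry(accept: Sequence[str]) -> List[str]:
    """Return a list of problems for one `files` entry's accept list."""
    extensions = [a for a in accept if a.startswith(".")]
    mime_types = [a for a in accept if not a.startswith(".")]
    # "*/*" is the only MIME pattern that does not need a matching extension.
    specific = [m for m in mime_types if m != "*/*"]

    problems = []
    if extensions and not mime_types:
        problems.append("extensions listed without any MIME type")
    if specific and not extensions:
        problems.append("MIME type more specific than */* listed without any extension")
    return problems
```

With this check, both `[".mp4"]` and `["video/mp4"]` on their own would be flagged, while `[".mp4", "video/mp4"]` and `["*/*"]` would pass.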

@mounirlamouri
Member

Would it be reasonable to have something like:

[
  { "contentType": "video/mp4", "extension": "mp4" },
  { "contentType": "audio/mp3", "extension": "mp3" }
]

With a format like this, we could guarantee that each entry has at least one valid extension and a valid MIME type (named contentType for consistency with other specs). It's a bit heavy-handed, I realise :)

@ewilligers
Collaborator

> [
>   { "contentType": "video/mp4", "extension": "mp4" },
>   { "contentType": "audio/mp3", "extension": "mp3" }
> ]

There is no obvious extension for applications supporting */* or type/*

@mgiuca
Collaborator Author

mgiuca commented Jan 3, 2019

If we went this route, I assume "*/*" would implicitly map to extension "*" without any need to specify it.

For "type/*", we'd want to let you specify a list of extensions. For example, if your app is accepting "image/*", on systems that don't know about MIME types, you'd want that to translate to ["gif", "jpg", "jpeg", "png"] and possibly other image formats your app supports.

@raymeskhoury
Collaborator

@mounirlamouri @ericwilligers

Please correct me if I'm wrong, but I think that we're trying to solve a few problems here:

  1. To head toward a web platform that primarily supports mime types (I think you're saying this is the direction other platforms are heading)
  2. To ensure that we support embedder platforms that primarily support mime types over file extensions
  3. We need to still support embedder platforms that only support file extensions
  4. We need to support the case where there is a custom file type/mime type that the app wants to be able to handle
  5. We don't want to bake a mapping into the web platform because then it becomes the bottleneck for adding new types.

Is that correct? Am I missing anything?

I'd like to ask if (1) is really true? Are there signs that desktop OS's are heading in this direction also? Even putting that aside though, I agree that the web is primarily mime type based (from my very limited expertise in the area) so it may be good to base things around mime types primarily.

@ewilligers
Collaborator

Would the following approach meet all the requirements?

{
  "name": "Aggregator",
  "share_target": {
    "action": "/cgi-bin/aggregate",
    "method": "POST",
    "enctype": "multipart/form-data",
    "extension_mapping": [
      { "extension": ".csv", "content_type": "text/csv" },
      { "extension": ".svg", "content_type": "image/svg+xml" },
      { "extension": ".html", "content_type": "text/html" },
      { "extension": ".htm", "content_type": "text/html" }
    ],
    "params": {
      "title": "name",
      "text": "description",
      "url": "link",
      "files": [
        {
          "name": "records",
          "accept": "text/csv"
        },
        {
          "name": "graphs",
          "accept": "image/svg+xml"
        },
        {
          "name": "pages",
          "accept": "text/html"
        }
      ]
    }
  }
}

Each extension may only appear once in the extension mapping. Every MIME type in an accept must appear in the extension mapping. If an accept contains a type/*, at least one matching content type must appear in the mapping.

This guides developers towards specifying share targets that should work on all platforms. If the platform only supports share target selection based on MIME types, then extension_mapping plays no role after validation.
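
The three validation rules could be checked roughly as follows. This is only a sketch: the function name `validate_share_target` and the error wording are hypothetical, `accept` is allowed to be a string or a list, and skipping `*/*` is an assumption since the rules above do not cover it.

```python
from typing import Dict, List


def validate_share_target(share_target: Dict) -> List[str]:
    """Check a share_target dict against the proposed extension_mapping rules."""
    errors = []
    mapping = share_target.get("extension_mapping", [])

    # Rule 1: each extension may only appear once in the mapping.
    seen = set()
    for item in mapping:
        ext = item["extension"].lower()
        if ext in seen:
            errors.append(f"extension {ext} appears more than once")
        seen.add(ext)

    mapped_types = {item["content_type"].lower() for item in mapping}

    for file_entry in share_target.get("params", {}).get("files", []):
        accept = file_entry["accept"]
        patterns = [accept] if isinstance(accept, str) else list(accept)
        for pattern in patterns:
            pattern = pattern.lower()
            if pattern == "*/*":
                continue  # Assumption: "*/*" needs no mapping entry.
            if pattern.endswith("/*"):
                # Rule 3: a type/* pattern needs at least one mapped type under it.
                if not any(t.startswith(pattern[:-1]) for t in mapped_types):
                    errors.append(f"no mapped content type matches {pattern}")
            elif pattern not in mapped_types:
                # Rule 2: every concrete MIME type must appear in the mapping.
                errors.append(f"{pattern} does not appear in extension_mapping")
    return errors
```

The "Aggregator" manifest above passes all three checks, because every accepted type has at least one extension mapped to it.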

@mounirlamouri
Member

> To head toward a web platform that primarily supports mime types (I think you're saying this is the direction other platforms are heading)

It's not really "head toward" but as you pointed out later in your reply, the platform is rarely using extensions. I think there is a valid use case around custom files here though.

> To ensure that we support embedder platforms that primarily support mime types over file extensions

👍

> We need to still support embedder platforms that only support file extensions

I think the UA will handle this for regular files as it does today. The real use case is the point below.

> We need to support the case where there is a custom file type/mime type that the app wants to be able to handle

👍

> We don't want to bake a mapping into the web platform because then it becomes the bottleneck for adding new types.

I'm not sure what that means but in general I like having low level APIs that we can compose together. I realise that it's not easy/worth it in this case.

> Every MIME type in an accept must appear in the extension mapping.

Why? I like your proposal but I wonder if we could leave extension_mapping optional. The UA will handle common MIME types and the websites can specify MIME types if they are exotic or custom. WDYT?

@ewilligers
Collaborator

> > Every MIME type in an accept must appear in the extension mapping.
>
> Why? I like your proposal but I wonder if we could leave extension_mapping optional. The UA will handle common MIME types and the websites can specify MIME types if they are exotic or custom. WDYT?

We could leave it optional, and then any web share targets that omit the mapping would not appear in the share menu on platforms like Windows that use extensions.

I don't think we should be introducing a new web standard for the common MIME types and their common extensions. Every time a browser added support for a new image or video format, they would want it added to the list. We'd need a new repo dedicated to those conversations (which could degenerate into Wikipedia notability discussions).

If the default mapping only included file types that are understood by all browsers, then "image/*" would accept more image types on platforms that use MIME types (they'd ignore the mapping and use the MIME type), than on platforms that use file extensions and the default mapping, and it wouldn't be obvious why. Developers might not be aware of the default mapping.

There are a couple of lists with the extensions people might want: MDN's "Complete list of MIME types" or Apache's "mime.types"

@ewilligers
Collaborator

@inexorabletash what do you think?

We will want Web Share Target and File Handling to use the same approach. We have a few main options:

  • lists containing MIME types and/or file extensions (current spec, reviewed by TAG, same syntax as HTML forms)
  • list of MIME types plus an explicit extension mapping for the benefit of platforms that don't use MIME types for their share dialogs or "Open With" menus. The mapping is initially empty.
  • list of MIME types plus a pre-populated extension mapping that authors can extend.

@raymeskhoury
Collaborator

@benfredwells for thoughts also

Thanks @ericwilligers. So the main point of contention seems to be whether we bake in a default mapping of file extensions to mime types into the platform? The downsides you listed are:

  1. Developers may not understand what types are included in the default mapping and so may be surprised if a file extension they expect to show up for a particular mime type doesn't.
  2. It's a central list that needs to be maintained and updated. People will be incentivized to want new types to be added to the list to avoid the developer issue described above. Maintaining this may be annoying.

The main pro to a default mapping of extension<->mime type is that it means that developers don't have to worry about specifying extensions for common mime types.

@inexorabletash
Member

Long reply after a long weekend - sorry for the wall of text...

Given the issues we've seen with developers struggling with mime types due to inconsistencies across platforms, browsers, and individual devices, I'm very skeptical of attempting to elevate mime types above extensions. I think they're a lovely idea much like other additional file metadata in directories/auxiliary streams/resource forks, which breaks down rapidly in the face of sharing content across heterogeneous systems where a file being (name, bytes) is all you can count on. File extensions are widely understood by users, operating systems, and tools, even if they are often hidden from users, and you can view them as a hack - smuggling metadata for type in the name field - but it's been a widely successful hack.

> bake in a default mapping of file extensions to mime types into the platform?

Yeah, so I filed w3c/FileAPI#51 years ago. Seemed like a good idea at the time. Looking at it from another perspective, it's inventing a way to turn data we have (extensions) into data we're missing (mime type). This might help developers in some cases, but I think practically they're going to rely on extensions (to filter client-side in the UI) + content verification (server-side).

I'm pretty firmly in the "lists containing MIME types and/or file extensions (current spec, reviewed by TAG, same syntax as HTML forms)" camp.

My reaction to the other options: having to declare a mapping is a burden on developers; they will likely end up copying/pasting a chunk of the manifest, and shaking fists at us dumb browser developers. Building a mapping into browsers first couples this feature to one that hasn't even been designed yet, and seems like something that could take a long time to get agreement on and ship, with challenges around versioning/iterating. I think even with the "list types and/or extensions" option there's lots of room for evolution by the UA to improve the experience: if ".jpg" is listed, user agents could infer that the desired canonical type is "image/jpeg" via a UA/OS mapping and offer ".jpeg" files or even transcode PNG files.

@mounirlamouri
Member

@inexorabletash there is already a mapping in browsers and they otherwise rely on the system mappings.

I am not sure that in practice this would be a burden for authors as they would in most cases rely on known extensions/mime types already handled by the browsers. It would only apply when they have to deal with custom files.

@inexorabletash
Member

Current browser/system mappings are inconsistent across browsers, operating systems, and even individual user's systems. For example, on Windows, the mapping of extension to mime type depends on what applications are installed. "docx" may be known on a machine where Microsoft Office is installed, or be an unknown extension on another. Developers will not be aware that "docx" needs to be treated as a custom file type; testing that "application/vnd.openxmlformats-officedocument.wordprocessingml.document" will work on Chrome on one Windows 7 machine is insufficient, which complicates the testing matrix even beyond the pain we inflict on developers today.
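
The same machine dependence is easy to see outside the browser. As an analogy only (this is Python's standard `mimetypes` module, not what any browser does), a lookup like the one below combines a built-in table with whatever system sources are present, such as mime.types files or the Windows registry, so less common extensions can resolve differently from one machine to the next. The helper name `guess_mime` and the sample filenames are illustrative.

```python
import mimetypes
from typing import Optional


def guess_mime(filename: str) -> Optional[str]:
    """Guess a MIME type from the filename alone; None if the extension is unknown."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


# The result for each name depends on the tables available on this machine.
for name in ["photo.png", "report.docx", "model.sket"]:
    print(f"{name}: {guess_mime(name)}")
```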

@reillyeon
Member

The goal from my perspective is to avoid cases in which files cannot be shared because the developer missed a particular MIME type or extension that is only applicable on a minority platform. In my opinion the mapping table suggested by @ewilligers mitigates this problem but only if we don't fall back to system defaults because as @inexorabletash points out those expand the testing matrix. The behavior of the manifest should be completely described by the specifications and the manifest itself.

@mounirlamouri
Member

@inexorabletash my intent is for things like docx to be handled by the author (with an extension/mime type association in the manifest) while basic web platform types like jpg, gif, png, webm, wav wouldn't need to have an association as browsers know about these for certain. The intent here is to avoid a design that fully relies on file extensions or requires verbose mandatory mappings.

I like @reillyeon's suggestion but the cost of adding the mapping to a spec, checking compat with tests, and likely changing Chromium's behaviour isn't trivial.

@reillyeon
Member

The default mapping could be empty, so all file extensions need to be manually specified by the developer. This doesn't actually seem so bad because native file selection APIs already require developers to list the file extensions they are interested in. This only adds the requirement that they must be mapped to a MIME type which helps us with the desire for all content entering the platform to be appropriately tagged with a content type.

@mounirlamouri
Member

The mapping can be a bit tedious as it can lead to mistakes or omissions (is .jpeg included?). Also, it can make wildcards a bit hard to handle: how can one map image/*?

@reillyeon
Member

The problem as I see it is that extensions are not standardized anywhere so it has always been up to developers to know what extensions they want to support. Yes, some will forget to include .jpeg but that's a bug I don't see a way to fix without agreeing to take on that standardization burden or inviting cross-platform inconsistencies. The solution proposed here is essentially what Windows does but without leaking mappings between applications.

We could perhaps make it less tedious by allowing multiple extensions to be mapped to the same MIME type in a single entry if there are enough cases (e.g. .htm and .html) where that is the desired behavior.

The mapping is extension → MIME type. Mapping to image/* doesn't make sense to me. If a target accepts image/* then all extensions mapped to a type matching image/* would be accepted.
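
A small sketch of that wildcard behaviour, assuming the manifest's mapping has already been read into a dict from extension to MIME type (the function name `accepted_extensions` and the sample data in the usage note are made up for illustration):

```python
from typing import Dict, Set


def accepted_extensions(accept_pattern: str, extension_to_mime: Dict[str, str]) -> Set[str]:
    """Return every mapped extension whose MIME type matches the accept pattern."""
    pattern = accept_pattern.lower()
    result = set()
    for ext, mime in extension_to_mime.items():
        mime = mime.lower()
        if pattern == "*/*" or pattern == mime:
            result.add(ext)
        elif pattern.endswith("/*") and mime.startswith(pattern[:-1]):
            # "image/*" matches any mapped type that starts with "image/".
            result.add(ext)
    return result
```

For a mapping of `.jpg` and `.jpeg` to `image/jpeg`, `.png` to `image/png`, and `.htm` to `text/html`, the pattern `image/*` yields `.jpg`, `.jpeg` and `.png`, so nothing ever has to be mapped to `image/*` itself.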

@benfredwells

@raymeskhoury asked my opinion so I'll give it, but y'all should trust the experienced specs folks more than me.

My opinion is that the original proposal "lists containing MIME types and/or file extensions (current spec, reviewed by TAG, same syntax as HTML forms)" is both simpler and clearer than the mapping that has been outlined. Chrome Apps used this and we had zero complaints: it just worked, and it worked across a variety of platforms.

I understand that MIME types are in many ways better (e.g. more flexible) than file extensions, but practically they both have their place and I worry that developers will bear the brunt of us trying to make MIME types the primary primitive for idealistic reasons.

To explain what I mean by 'clearer', it isn't obvious to me what happens to files that don't have extensions in the map but which are recognized by the system as a listed particular MIME type. I assume the match is actually an OR: things the UA recognizes as matching the MIME type or things which match the extension, but that is basically the original proposal expressed in a confusing way.

@raymeskhoury
Collaborator

Thanks all for chiming in.

> We could perhaps make it less tedious by allowing multiple extensions to be mapped to the same MIME type in a single entry if there are enough cases (e.g. .htm and .html) where that is the desired behavior.

Hmm, I think that would only make it marginally less tedious. Thinking about it some more, it seems to me that we shouldn't go down the path of forcing developers to specify a full mapping. We know that most apps are going to use the same common types and we're asking developers to put the same mapping down every time.

> The goal from my perspective is to avoid cases in which files cannot be shared because the developer missed a particular MIME type or extension that is only applicable on a minority platform.

I think that's a good way to summarise the benefit of forcing a mapping. If a developer only specified .docx in the manifest, the app would receive .docx on Windows but not on Android (since it only supports mime-types). That seems bad because it makes the web app feel like it's not cross-platform.

I only really see 2 ways to address this:

  1. Let the UA provide a default mapping of its own (@inexorabletash alluded to this when suggesting that the UA could interpret .png to mean that the app also accepts image/png)
  2. Bake a default mapping into the spec (as @mounirlamouri suggested)

(1) is simpler because we don't maintain a centralized list, but it carries some risk of inconsistencies across platforms as has been mentioned. Each platform will have a different set of these and to really make sure their app works the way they want it, developers would have to test on all the platforms in theory. I think the risk is very low because the vast majority of apps are going to use standard mime types that all UAs would be able to know about. But it's still a risk.
(2) is more work in the short and long term to create/maintain this list.

I wonder if there is a way we can start with (1) and move to (2) later if it turns out we need to do the work (due to too much platform inconsistency).

We could start with a similar proposal to @ericwilligers. You would only specify mime types in the manifest and the UA would guess the extensions which correspond when needed, based on its own list:
"accept": ["image/gif", "image/png"]

The vast majority of developers would only need to specify this syntax and would not need to specify extensions ever.

In rare cases where weird mime types are being handled, they could also specify extensions:
accept: ["image/gif", "image/png", "application/x-photoshop"]
accept_extensions: [
{ content_type: "application/x-photoshop", extension: "psd" }
]

Later down the road, if we find that there is too much inconsistency across platforms, we could spec out the default mapping.
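
To sketch how a UA might resolve this on a platform that needs extensions: the table `UA_DEFAULT_EXTENSIONS` below is a made-up stand-in for the UA's own list, and `extensions_for` is a hypothetical helper, not a proposed API.

```python
from typing import Dict, List

# Hypothetical UA-provided table; a real UA's list would differ.
UA_DEFAULT_EXTENSIONS: Dict[str, List[str]] = {
    "image/gif": ["gif"],
    "image/png": ["png"],
}


def extensions_for(entry: Dict) -> Dict[str, List[str]]:
    """Map each accepted MIME type to the extensions the UA would match."""
    table = {mime: list(exts) for mime, exts in UA_DEFAULT_EXTENSIONS.items()}
    # Site-declared associations extend the UA's own list.
    for item in entry.get("accept_extensions", []):
        exts = table.setdefault(item["content_type"], [])
        if item["extension"] not in exts:
            exts.append(item["extension"])
    # Types with no known extension resolve to an empty list.
    return {mime: table.get(mime, []) for mime in entry["accept"]}
```

In the Photoshop example, `image/gif` and `image/png` resolve through the UA's table, while `application/x-photoshop` resolves to `psd` only because the manifest declares it.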

Maybe this is still all overkill though and the risk for (1) is low enough that we don't need to ever do (2) and we can just stick with the current spec. Thoughts?

@reillyeon
Member

If we do attempt (2), a look into the history of kPrimaryMappings in Chromium's net/base/mime_util.cc may be a good starting point, as it is "for mappings that are critical to the web platform." If they are indeed critical then they should probably be in a spec somewhere.

@mounirlamouri
Member

> Bake a default mapping into the spec (as @mounirlamouri suggested)

To clarify, I didn't suggest having a mapping in the spec, but having the UA have a default mapping (I do not believe having it defined in a spec is mandatory, given that the Web already needs this to work). What I suggested is more for the website to be able to specify its custom mapping. In other words, when a file is received, the UA could check its hard-coded mapping, the system mapping and the site mapping.
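
Roughly, the lookup could be sketched like this. The three tables are plain dicts keyed by lower-case extension without the dot, the name `resolve_mime_type` is hypothetical, and treating the order listed above as precedence (first match wins) is an assumption.

```python
from typing import Mapping, Optional


def resolve_mime_type(
    filename: str,
    ua_mapping: Mapping[str, str],
    system_mapping: Mapping[str, str],
    site_mapping: Mapping[str, str],
) -> Optional[str]:
    """Resolve a filename to a MIME type using three layered tables."""
    _base, dot, ext = filename.rpartition(".")
    if not dot:
        return None  # No extension to look up.
    ext = ext.lower()
    # Assumed precedence: UA hard-coded table, then system, then site manifest.
    for table in (ua_mapping, system_mapping, site_mapping):
        if ext in table:
            return table[ext]
    return None
```

With this ordering a site can add types the UA and system do not know, but cannot override the ones they already map.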

I think the added amount of work is mostly if this spec has to maintain the "official" mapping.

@raymeskhoury
Collaborator

raymeskhoury commented Feb 5, 2019

@marcoscaceres @aliams do other browser vendors have thoughts on this?

I'm going to number the proposals we've had so far:

  1. (Current spec): Let the app specify a list of mixed extensions/mime types. e.g.
    .gif, .png, image/gif
    Pro: Simple
    Con: Means that the developer might only specify mime types OR might only specify extensions, which means that things might not work well cross-platform. e.g. in the example above, an Android app wouldn't know that the app handles png files because image/png isn't listed.

  2. A variation on (1) is to still let the app specify a list of mixed extensions/mime types but for the browser to guess a translation between mime type and extension for common types. For example, with:
    .gif, .png, image/gif
    browsers would be allowed to guess that the app intended to support image/png because .png was specified. This would solve the problem of things working differently across different platforms.
    Pro: Still simple for developers in the common cases, and things will work across platforms that support either mime types or extensions.
    Con: The mapping of mime type <-> extension isn't standardized, so apps may have different behavior on different browsers, even when running on the same OS. For example, Chrome may have a mapping of .docx <-> application/vnd.openxmlformats-officedocument.wordprocessingml.document but FF may not. That would mean that an app that lists .docx as an accept type would work in Chrome on Android, but not on FF on Android. Developers would have no easy way to know about this.

  3. A variation on (2) is to put the mime type <-> extension mapping into a standard somewhere. If a developer listed a type that wasn't in the standard mapping, they would have to explicitly state the mapping for those types, extensions. e.g.

accept: ["image/gif", "image/png", "application/x-photoshop"]
accept_extensions: [
  { content_type: "application/x-photoshop", extension: "psd" }
]

In the example above, the mime types image/gif and image/png are in the standard mapping, so browsers would all know what types they map to. application/x-photoshop isn't in the mapping so listing it would raise an error unless the developer also specified what extensions mapped to it.
Pro: Simple for developers in the common case where the mime type is in the standard mapping. Only complicated when they need to specify a custom mapping.
Con: Exactly what types are in the standard may be confusing to developers. Further, maintaining the standard would be complicated and add overhead.

  4. Developers must specify a full mapping of mime type <-> extension. For every mime type they list, they must list 1 or more extensions that map to that type, e.g.:
    extension_mapping: [
      { extension: ".csv", content_type: "text/csv" },
      { extension: ".svg", content_type: "image/svg+xml" },
      { extension: ".html", content_type: "text/html" },
      { extension: ".htm", content_type: "text/html" }
    ],

Pro: We ensure developers consider both platforms that only deal with mime types as well as platforms that only deal with extensions. No need to maintain the standard mapping.
Con: It's very verbose and potentially confusing to developers how to specify this. It also means that for common cases, developers will end up listing the same types over and over.
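
To make option 2 concrete, here is a minimal TypeScript sketch of how a browser might expand a mixed accept list. The `KNOWN_TYPES` table and the `expandAccept` function are hypothetical names invented for this illustration, and the two-entry table stands in for whatever non-standardized mapping a given browser actually ships.

```typescript
// Illustrative table only: each browser would have its own mapping,
// which is exactly the "Con" listed under option 2.
const KNOWN_TYPES: ReadonlyMap<string, string> = new Map([
  [".gif", "image/gif"],
  [".png", "image/png"],
]);

// Expand a mixed accept list so that every known extension also contributes
// its MIME type, and every known MIME type also contributes its extension(s).
function expandAccept(accept: readonly string[]): string[] {
  const result = new Set<string>(accept.map((entry) => entry.trim().toLowerCase()));
  // Iterate over a snapshot so additions do not affect the loop.
  for (const entry of Array.from(result)) {
    if (entry.charAt(0) === ".") {
      const type = KNOWN_TYPES.get(entry);
      if (type !== undefined) result.add(type);
    } else {
      KNOWN_TYPES.forEach((type, ext) => {
        if (type === entry) result.add(ext);
      });
    }
  }
  return Array.from(result);
}

console.log(expandAccept([".gif", ".png", "image/gif"]));
// [".gif", ".png", "image/gif", "image/png"]
```

An entry that is missing from the table (for example `.docx` here) passes through unchanged, which is how two browsers with different tables could end up behaving differently for the same manifest.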

@aliams

aliams commented Feb 9, 2019

Thank you @raymeskhoury for enumerating the options discussed as well as providing summaries for the pros and cons. I think I like option 1 as it seems to keep things simple and does not involve automatically determining values on behalf of the developer as you would get with option 2. I think options 3 and 4 can make it challenging for web developers to keep track of the various extensions and content type matches and they may not always get them right. It may perhaps be useful to get some insight from various web developers and see what they think about these different options.

@raymeskhoury
Collaborator

raymeskhoury commented Feb 12, 2019

@jakearchibald can you suggest some developers who would want to use web share target/file handlers who might have an opinion on the above? Thanks!

@jakearchibald

jakearchibald commented Feb 13, 2019

I have opinions! 😄

I've only skimmed the thread (a summary might be useful if you're looking for others to join in), so apologies if I'm retreading old ground.

Mixing mimetypes & extensions already happens with the accept attribute on input elements.

Inferring mimetypes from extensions already happens with <input type="file"> & the drag & drop API, so browsers already have this code. This mapping goes both ways.

Being able to specify file extensions feels important for types the browser doesn't understand. We hit this with squoosh.app, where we wanted the browser to be able to open .webp even if it didn't understand that WebP is an image format. And of course example.com may want to create and later open its own format .exampleml.

I think option 2 is best since the format matches <input type="file" accept> and it uses existing extension-mime mapping behaviour. It feels like the problems with option 2 are already a problem with <input type="file" accept>, so any solution should target both.

@tomayac

tomayac commented Mar 5, 2019

+1 to @jakearchibald's proposal to align the behavior of <input type="file"> and Web Share Target, that is, option 2. The elegance of this is that it allows both for sweeping something/*, as well as for something/specific or even something/specific+json to be specified by developers, depending on their needs or their app's capability.

Great point in Jake's comment that apps, through wasm or JavaScript, can indeed handle more formats than the browser itself, like the WebP example from squoosh.

@jakearchibald

Btw, I like the idea of option 3, but only if it can be provided to other file-accepting APIs. It'd be terrible if we end up in a place where I can Share Target a file, but can't <input type="file"> it, despite them having the same accept instructions.

I guess you could have a format like accept="foo/bar=.foo, .jpg". This would allow the browser to warn about types/extensions that don't have an extension/type provided in the accept field or the mapping spec.

Since a solution to this problem should target file inputs and drag & drop, it seems fine for share target to do the same thing as other file inputs (option 2) until a solution can be agreed & implemented.
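
For what it's worth, the `accept="foo/bar=.foo, .jpg"` idea could be parsed along these lines. This is only a sketch of a format floated in this thread, not anything specified: the `AcceptEntry` shape and the `parseAccept` and `unpaired` helpers are hypothetical names.

```typescript
// One parsed token from the hypothetical accept syntax.
interface AcceptEntry {
  mimeType?: string;
  extension?: string;
}

// Parse a comma-separated accept string where a token may be
// "type/subtype", ".ext", or the paired form "type/subtype=.ext".
function parseAccept(accept: string): AcceptEntry[] {
  return accept
    .split(",")
    .map((token) => token.trim())
    .filter((token) => token.length > 0)
    .map((token): AcceptEntry => {
      const eq = token.indexOf("=");
      if (eq !== -1) {
        return {
          mimeType: token.slice(0, eq).trim(),
          extension: token.slice(eq + 1).trim(),
        };
      }
      return token.charAt(0) === "." ? { extension: token } : { mimeType: token };
    });
}

// Entries missing one half are the ones a browser could warn about,
// unless a shared mapping already covers them.
function unpaired(entries: readonly AcceptEntry[]): AcceptEntry[] {
  return entries.filter(
    (entry) => entry.mimeType === undefined || entry.extension === undefined
  );
}

const entries = parseAccept("foo/bar=.foo, .jpg");
console.log(entries);           // the pair for foo/bar, plus a bare ".jpg"
console.log(unpaired(entries)); // only the ".jpg" entry
```

In a real implementation the `.jpg` entry would presumably not trigger a warning, since a standard mapping would already cover it; the sketch leaves that check out.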

@raymeskhoury
Collaborator

Thanks all, it seems like we're converging on option 2 as a step forward and working out something like option 3 in the longer term. We will need to update the spec slightly to give browsers the freedom to guess the matching mime type/extension but otherwise I think we're ok. @jakearchibald any suggestions on where we should create an issue to track that?

@jakearchibald

@ewilligers
Collaborator

Raised whatwg/html#4459.
