Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: embed: remove import restriction #43217

Closed
rsc opened this issue Dec 16, 2020 · 84 comments
Closed

proposal: embed: remove import restriction #43217

rsc opened this issue Dec 16, 2020 · 84 comments

Comments

@rsc
Copy link
Contributor

rsc commented Dec 16, 2020

@mdempsky points out in #41191 (comment) that the late addition of string and []byte to the embedding proposal has some unfortunate side effects that we might want to make sure we are happy with.

There are three details of string and []byte embedding that are at least surprising. This issue is about two of them. See #43216 for the third. The two for this issue are as follows.

First, to make sure source files containing embedding directives are clearly identified (by an import), we require an import _ "embed" to write

//go:embed hello.txt
var msg string

even though there is no use of the embed package. This requires a special case in goimports. (This is noted in the proposal as accepted.)

Second, the late addition says “a plain string or []byte variable”, and I really did mean those words. Only those specific types are allowed, not []uint8 and not other user-defined aliases for those types.

I see two options here. One is to do nothing and leave things as they are. That's possible and is the "this issue gets declined" outcome. The other option I can see is to introduce new type aliases embed.String and embed.Bytes and require those to be used as the types, which is what this proposal issue is meant to discuss and decide whether to accept.

Specifically:

package embed

// A String is a string that can be initialized using a //go:embed directive.
// It is an alias so that embed.String can be used as a string without
// any explicit conversion. The only expected use of this type is in
// var declarations accompanied by //go:embed directives. Elsewhere,
// code should simply use string.
type String = string

// A Bytes is a []byte that can be initialized using a //go:embed directive.
// It is an alias so that embed.Bytes can be used as a []byte without
// any explicit conversion. The only expected use of this type is in
// var declarations accompanied by //go:embed directives. Elsewhere,
// code should simply use []byte.
type Bytes = []byte

Then instead of writing:

//go:embed hello.txt
var msg string

You have to write:

//go:embed hello.txt
var msg embed.String

That solves both of the issues: the import is now required without any special case in goimports, and there is no question about whether []uint8 or other user-defined alias is OK. Only embed.String and embed.Bytes are OK.

What do people think? (Thumbs up / thumbs down is fine.)

[One option that's not on the table is removing string and []byte embedding entirely. Many people asked for that during the pre-proposal discussions, and I think it is important to respect that process. We can fix problems, but I think it would be improper to drop the functionality completely.]

Thank you!

@SamWhited
Copy link
Member

Using a magic import import _ "embed" just feels confusing and makes me wonder what magic the Go compiler is hiding behind the scenes. I believe this sort of import was a mistake in the original design of the language and that its use in database/sql and the image libraries and the like was a mistake that we shouldn't add more of. I have run into any number of unexpected panics and other confusion due to import side effects. This specific magic import may not have any side effects that could cause problems, but I can't know that from looking at it. Please don't hide behavior behind magic imports. I'd almost rather the compiler implicitly magically import embed without that line being present just so that I don't have to see it since this would just be part of the go:embed directive behavior which is already magic that I sort of hate anyways but, per your email saying this wasn't going to change, we have to live with already.

I'm more okay with embed.{Bytes,String} because it at least gives me some reason to be importing embed without magic. Without knowing anything about the internals of the package (which will be the case for most Go users) I still immediately think "this is weird, it's just an alias, so why can't I just use a string?". This isn't the end of the world, Go has plenty of other more glaring quirks, but it's where my mind goes.

@hherman1
Copy link

Personally I like that its just a plain string and a []byte. The type wrappers aren't the end of the world, but I don't find the "_" import offensive enough to justify the expanded api personally, esp. considering there aren't actually any side effects of this import.

@earthboundkid
Copy link
Contributor

FWIW, I started writing a blog post about //go:embed and had a little section about remembering to use import _ "embed" and example showing how it wouldn't work if you left it out. This simplifies the blog post because now I don't have to explain about a possible pitfall. I think being able to simplifying teaching materials is a good sign.

@hherman1
Copy link

I’m sure there’s already a great explanation of why this is a bad idea that I missed, but what if you just had no import at all, and //go:embed worked on its own? I take it that’s off the table?

@mdempsky
Copy link
Contributor

I’m sure there’s already a great explanation of why this is a bad idea that I missed, but what if you just had no import at all, and //go:embed worked on its own? I take it that’s off the table?

That would slow down go/build (e.g., cmd/go). Currently to compute package dependencies, it only parses Go source files up through the import declarations.

With //go:embed directives, it's necessary for go/build to parse these too, to figure out which files are embedded. By requiring that package embed is imported to use them, go/build can continue parsing only the import declarations for most files; and only continue to parse the rest of the file when it saw an import for package embed.

@marwan-at-work
Copy link
Contributor

marwan-at-work commented Dec 18, 2020

One potential expansion on this proposal:

The declaration of a var <name> embed.{FS,String,Bytes} can be a hint to the Go compiler to validate the comment directive that are defined on top of the declaration.

For example,

//go;embed: file.txt 
var file embed.String

would cause a compiler error which will alleviate programmer confusion that they misspelled go:embed .

Similarly, the lack of a go directive comment could also be a compiler error, because why would a programmer declare an embed.{Type} without actually needing to embed a file?

This would address both @mdempsky and @rsc 's comments here by keeping the embedding syntax as is while providing better developer experience.

Hopefully my skimming of the comments didn't miss a similar suggestion.

@mdempsky
Copy link
Contributor

mdempsky commented Dec 18, 2020

Similarly, the lack of a go directive comment could also be a compiler error, because why would a programmer declare an embed.{Type} without actually needing to embed a file?

The draft proposal specifically states: "It is not an error to declare an embed.Files without a //go:embed directive. That variable simply contains no embedded files."

Note also that a single variable declaration can have multiple //go:embed directives. (Unlike normal Go function call syntax, which allow spreading across multiple lines, //go: directives are line oriented.) Reporting an error about no //go:embed directives wouldn't catch the case where the user typed some //go:embed directives correctly, but not all.

(That said, because all of the //go:embed lines would be clustered together, it would probably be rather visually obvious if there were any mistakes on some lines but not all.)

@hherman1
Copy link

hherman1 commented Dec 19, 2020

I've been playing with go:embed on the beta a bit, and I noticed something I'd like to add to the mix. If I start working on a file with an embedded variable, and then I decide I want to move this embedded variable to a different file or module, its very easy to forget to remove the _ "embed" import, and because the import is explicitly _ my IDE/go itself don't complain about its presence or auto-remove it.

For that reason, I think it would be valuable at the very least for go to complain about unused _ "embed" imports, but I also think this is a point for the embed.Bytes proposal.

I really like the idea of embedding simple []byte/string. I don't like needing to learn about new types or the obscurity that comes with it, although slight. That being said, I think the case for these special types is stronger than I initially gave it credit.

@hherman1
Copy link

While I'm here, I'll add on that I found it surprising that ../../resources/file.zip was an invalid pattern. This is clearly not the right place for this comment, but I'm not aware of another place to put embed feedback.

@marwan-at-work
Copy link
Contributor

I really like the idea of embedding simple []byte/string. I don't like needing to learn about new types or the obscurity that comes with it, although slight.

I also think it's really nice that you can use the plain string/[]byte. And while the drawback is that an _ "embed" import might get orphaned, tools like gopls or go vet can catch that pretty easily.

As a proof of concept, I wrote a quick commit on top of gopls that automatically adds or removes an _ "embed" import based on your file's state: marwan-at-work/tools@214d261

@rsc
Copy link
Contributor Author

rsc commented Dec 23, 2020

People seem overwhelmingly happy with this change, from what I'm seeing here.

To reply to @marwan-at-work, you can't exactly use the plain string/[]byte. You have to adorn them with import _ "embed".
It's only a little different for the adornment to be to write embed.String and embed.Bytes instead. While you can write a special case for gopls (and all the other tools that care), it seems much nicer to use a design that avoids the special case entirely: we can fit nicely into the surrounding system, instead of causing more work for it.

@zigo101
Copy link

zigo101 commented Dec 24, 2020

I have a question, why embed.FS instead of io/fs.MemoryFS?

[eidt] And is it ok to move embed.Bytes as bytes.Embed?

@rsc
Copy link
Contributor Author

rsc commented Dec 24, 2020

@go101, the explicit goal of this proposal is to define types in package embed, so that the import will be required.

@rsc
Copy link
Contributor Author

rsc commented Dec 30, 2020

Based on the discussion above, this seems like a likely accept.

@earthboundkid
Copy link
Contributor

Quick question, does the import need to be named “embed” or can it be aliased (import xembed "embed")? I assume renaming the import is okay, but I don’t recall seeing it tested in the CLs yet.

@robpike
Copy link
Contributor

robpike commented Jan 2, 2021

I find the "import _ embed" more logical than an alias. You are, truly, importing embed for its side effects. Document that if you are using embed just for strings or bytes, you must use the underscore.

If you use the _ import, the problem is solved, only two keystrokes (space, underscore) are required, and all is smooth now.

If you use the alias, you need to type more and you need to understand that it is an alias and deal with the fact that your variable although declared as embed.String is really just a plain string, despite its declaration. That troubles me, as its effects could ripple through the file.

If an underscore import is good enough for image/jpeg, it's good enough for embed.

@robpike
Copy link
Contributor

robpike commented Jan 2, 2021

To reiterate: A big thumbs down for me. There is zero need for a new mechanism.

@earthboundkid
Copy link
Contributor

If import _ “embed” worked like import _ “image/jpeg”, it would be enough to just do the import in package main. As far as I know, this is not the case.

I think if embed.String/embed.Bytes aren’t the chosen form, then other type aliases for string and []byte should be allowed (e.g. []uint8), which was the original bug report that lead to this proposal.

@randall77
Copy link
Contributor

The big benefit I see with the aliases is that goimports will insert (and remove) the import for you. That makes all the changes you actually need to type in one location (2 lines, the //go:embed line and the var x embed.String line).

@robpike
Copy link
Contributor

robpike commented Jan 3, 2021

I understand the apparent convenience of the proposal, and how tools can use it to manage things for you, but no such argument - or tooling - is necessary if the documentation and practice say to use use an underscored import. I cannot stress enough that two keystrokes are all that are required. TWO! A third of what appear in the word "string".

All the mechanism necessary already exists and is widely used. There's no need to provide anything else. No documentation and explanation of the special aliases required. (No ugly aliases required, either, but the sentiment underlying that may hold a bias.)

Should embed acquire other types and other mechanisms, which is certainly possible, the underscore import will continue to solve the problem without further changes, but to go with the proposal would require expanding the space of magic names.

To put it another way, if the embed type is bringing in a string, let me declare it as a regular string.

We already have all we need to solve this. Please don't add more complexity.

@zikaeroh
Copy link
Contributor

zikaeroh commented Jan 8, 2021

I find that idea a little weird, as it requires devs to know at which version some particular feature they're using was introduced. If I was a newcomer who didn't know about the expanded struct tag syntax, I wouldn't know that I had to mark the file with my struct as a different version to use that feature. I can also see it being annoying (about as annoying as build tags) for features where maybe you do want it an old way and a new way, but I realize that's unlikely.

I think ideally, tag names would have needed a specific syntax such that they errored before with spaces, and then the new behavior is only new and is allowed (like other syntax expansions in the language), but too late for that (which is sort of why I think that particular change is doomed).

@zikaeroh
Copy link
Contributor

zikaeroh commented Jan 8, 2021

This is moderately related, but one benefit of dropping the import requirement would be that code could conditionally use embed directives protected by build tags without hitting #40067; right now the introduction of embed and io/fs are going to fracture the module ecosystem a bit, because anyone who wants to use a module that imports either new package will also need to move up to 1.16, even if build tags gate their use.

If an embed import isn't required (and only the directive), then the older compiler wouldn't even know something was wrong, then a build tag would take care of the rest. Of course, this violates my previous comment about users knowing when the feature was added, since they then need to know that go:embed needs to be behind a build tag too.

Unfortunately, I think the most useful bit is embed.FS, so using the elimination of embed as an import for bytes/strings is probably a bit lopsided in terms of the errors #40067 will lead to.

@ianthehat
Copy link

If we make the compiler require the "go 1.16" line to allow //go:embed to be used used then I think it covers both 1 and 2 (unless you are git cloning a repository that never worked in the first place), so the only issue is use case 3 which seems like a really minor issue (the user is directly using the feature and seeing it not work, they should be able to work out why very easily)

@zikaeroh
Copy link
Contributor

zikaeroh commented Jan 8, 2021

@ianthehat Unless I'm mistaken, modules are allowed to require modules of a higher Go version without erroring, so if you depend on a module that says go 1.16 to enable embed but yourself declare go 1.15, you may not know that the //go:embed lines have silently not worked on older compilers, hence #43558.

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Jan 9, 2021

With the transition concerns, I've become sympathetic to the suggestion someone already made (but I can't find it) which is that we just don't permit []byte or string types at all. The only type permitted would be embed.FS.

We can then add AsString and AsBytes methods to embed.FS, which panic if the FS contains more than one file, and otherwise return the contents of the single file as a string or a []byte. In the simple implementation the AsBytes method would copy the contents, but it would be feasible for the compiler (which already knows about embed.FS anyhow) to recognize cases where AsBytes is called only once, at initialization time, and skip the copy in that case.

//go:embed file.txt
var fileFS embed.FS
var file = fileFS.AsBytes() // file is type []byte, if this is the only use of fileFS no copy is required

Certainly adds boilerplate, but sidesteps the transition issue. We can still consider supporting simple string and []byte types in later releases, if we can figure out how.

@robpike
Copy link
Contributor

robpike commented Jan 9, 2021

That seems distastefully roundabout to me, multiple layers of special cases. Strive for easy generality rather than a unique sequence of steps that yields a single solution.

@ianlancetaylor
Copy link
Contributor

It seems to me that we already have the easy generality: the general case is embed.FS, and there doesn't seem to be any particular disagreement as to how that should work.

All we are talking about here is the special cases: []byte and string. I am not at all attached to my suggestion, but I don't think it introduces any more special cases than we introduce by writing

//go:embed file.txt // required to be exactly one file
var file string

@flibustenet
Copy link

flibustenet commented Jan 11, 2021

I also appreciate the only one way to do it and explicit with embed.FS. It will also incite to have only one place with embed which i believe is better for global vars.
In my use case i don't see the use of just one file.

@rsc
Copy link
Contributor Author

rsc commented Jan 12, 2021

At this point, we now know that the import serves three purposes:

  1. It makes clear to readers (people) early on that there are //go:embed comments.
  2. It makes clear to the build system (programs) early on that there are //go:embed comments. This has a minimal performance benefit.
  3. It ensures that programs using //go:embed fail to compile on older versions of Go, rather than silently compiling with empty strings and byte slices.

We were willing to throw out (1) and (2) in favor of the simplicity of not having an import. But (3) is a show-stopper. I am now convinced that our initial instincts were correct and we should keep the import requirement.

Ian suggested discarding strings and byte slices, but I am very reluctant to do that. I think they will be the primary mode of embedding for many people (same as embed.FS will be for many other people), and they were a key part of the proposal as accepted.

It seems like the best path forward is what we had planned, namely keeping the import requirement.

@earthboundkid
Copy link
Contributor

If the import requirement is back, what about embed.String/Bytes? Even if they are just type aliases to string/[]byte and other aliases are allowed, they make it so goimports works without needing to upgrade whatever version you're currently using, so it's convenient to have. It is also convenient for documentation.

@flibustenet
Copy link

@carlmjohnson those who want automatic import can use embed.FS and those who prefer raw string/[]byte can use raw import

@marwan-at-work
Copy link
Contributor

@carlmjohnson those who want automatic import can use embed.FS and those who prefer raw string/[]byte can use raw import

Worth re-mentioning that goimports can also be taught to underscore import "embed" for you by analyzing compiler directives in a file, which I have demonstrated here: marwan-at-work/tools@214d261

I believe @rsc 's comment above was that having embed.String/Bytes will alleviate the need to teach goimports and tools new rules.

But in light of the discussion that followed to avoid type aliases, maybe updating such tools is a worthy trade-off?

@rsc
Copy link
Contributor Author

rsc commented Jan 12, 2021

@carlmjohnson I've retracted the idea of embed.String and embed.Bytes. They are more awkward than requiring the import.

I'm not convinced goimports needs to put the import in automatically, although it is fine for it to do that. (It's not that hard for people to write the import, and we've now worked out a compelling explanation for why it is there.)

@hherman1
Copy link

In that vein, it would be nice for goimports to remove the import automatically, to deal with orphaned embed lines.

@rsc
Copy link
Contributor Author

rsc commented Jan 13, 2021

For the record, I don't mind goimports adding the blank import, but removing it seems beyond the pale.
The entire point of blank imports is to override the usual import requirement logic, and that has extended to goimports itself.

Breaking the general rule "goimports leaves blank imports alone" would require very significant justification, and I don't think this is it. Suppose it were completely justified to break the rule for "embed" (it's not - maybe the user wants to provoke a compiler error if "embed" is not available - how does that user override goimports? - but suppose it were), then you still have a special case behavior now in goimports that users can observe and misunderstand. When they see goimports remove the blank import for "embed", now they have to wonder about whether their other, perhaps more important, blank imports will also be removed. And maybe the decision is it's not worth the risk and stop using goimports.

Goimports really has to be 100% safe, never changing the meaning of code that compiles. It should continue to limit itself to fixing compile errors (missing imports, unnecessary imports) and not make other changes.

@rsc
Copy link
Contributor Author

rsc commented Jan 13, 2021

This was a "likely accept" before we understood the "silent miscompilation" problem described in #43217 (comment).
Now that we understand that, it seems like we should keep (not remove) the import restriction.
That would mean declining this issue.

@rsc
Copy link
Contributor Author

rsc commented Jan 13, 2021

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

gopherbot pushed a commit that referenced this issue Jan 15, 2021
The current implementation requires saying "string" or "[]byte"
and disallows aliases, defined types, and even "[]uint8".
This was not 100% intended and mostly just fell out of when
the checks were being done in the implementation (too early,
before typechecking).

After discussion on #43217 (forked into #43602),
the consensus was to allow all string and byte slice types,
same as we do for string conversions in the language itself.
This CL does that.

It's more code than you'd expect because the decision has
to be delayed until after typechecking.

But it also more closely aligns with the version that's
already on dev.regabi.

Fixes #43602.

Change-Id: Iba919cfadfbd5d7116f2bf47e2512fb1d5c36731
Reviewed-on: https://go-review.googlesource.com/c/go/+/282715
Trust: Russ Cox <rsc@golang.org>
Reviewed-by: Jay Conrod <jayconrod@google.com>
@earthboundkid
Copy link
Contributor

Is it enough for embed to be imported anywhere in a program, or does it need to be imported in the same package as the //go:embed comment?

@rsc
Copy link
Contributor Author

rsc commented Jan 20, 2021

It needs to be in the same file as the //go:embed comment.

@rsc
Copy link
Contributor Author

rsc commented Jan 20, 2021

No change in consensus, so declined.
— rsc for the proposal review group

@zigo101
Copy link

zigo101 commented Feb 3, 2021

As every //go:embed ... directive line requires importing embed, is it better not to view embedding as a language feature? #43980

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests