Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider import library ... #1457

Open
nigeltao opened this issue Jul 20, 2022 · 19 comments
Open

Reconsider import library ... #1457

nigeltao opened this issue Jul 20, 2022 · 19 comments
Labels
leads question A question for the leads team long term Issues expected to take over 90 days to resolve.

Comments

@nigeltao
Copy link

nigeltao commented Jul 20, 2022

https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/README.md#imports says "The import library ... syntax adds all the public top-level names within the given library to the top-level scope of the current file as private names". IIUC this is sort of like Java's import foo.bar.*.

OTOH, https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/name_lookup.md#unqualified-name-lookup says "Unqualified name lookup in Carbon will always find a file-local result".

IIUC, the two statements are contradictory.

Personally, from my Go experience, I prefer the "Unqualified name lookup" principle: users of type Stamp from package time (package in the Go sense) always refer to it by a qualified name: time.Stamp. Similarly, func Decode from package jpeg is referred to as jpeg.Decode, which is clearly distinct from png.Decode even though in the source code they're both defined as func Decode.

Requiring qualified names means that adding names to a library will not cause any downstream importers (that happen to also use that same name, privately) to stop compiling due to a newly introduced ambiguity.

It also means that IWYU (Include What You Use) tools can work purely on the AST without having to follow imports.

(I'll caveat all of the above that I may be misunderstanding Carbon's package/library distinction and the details of its Name Lookup model).

@beauxq
Copy link

beauxq commented Jul 21, 2022

I like Python's import syntax.

from jpg import Decode as jpgDecode
from png import Decode as pngDecode

Being able to rename things as they're imported is important.

import long_package_name_that_i_dont_want_to_clutter_my_code_with as x;

x.foo();

@nigeltao
Copy link
Author

import long_package_name_that_i_dont_want_to_clutter_my_code_with as x;

Go lets you do that to (with a different, Go-like syntax). Handy when you want to import two packages that are both called util.

@chandlerc
Copy link
Contributor

https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/README.md#imports says "The import library ... syntax adds all the public top-level names within the given library to the top-level scope of the current file as private names". IIUC this is sort of like Java's import foo.bar.*.

OTOH, https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/name_lookup.md#unqualified-name-lookup says "Unqualified name lookup in Carbon will always find a file-local result".

IIUC, the two statements are contradictory.

I think the second simply needs to be updated to reflect a change in our plans here.

Specifically, this was discussed pretty heavily and a decision that changed the behavior was made in #1136 -- see the next to last comment for I think a good summary. I think the design overview reflects this outcome, but the unqualified name lookup hasn't been updated.

Personally, from my Go experience, I prefer the "Unqualified name lookup" principle: users of type Stamp from package time (package in the Go sense) always refer to it by a qualified name: time.Stamp. Similarly, func Decode from package jpeg is referred to as jpeg.Decode, which is clearly distinct from png.Decode even though in the source code they're both defined as func Decode.

Requiring qualified names means that adding names to a library will not cause any downstream importers (that happen to also use that same name, privately) to stop compiling due to a newly introduced ambiguity.

It also means that IWYU (Include What You Use) tools can work purely on the AST without having to follow imports.

(I'll caveat all of the above that I may be misunderstanding Carbon's package/library distinction and the details of its Name Lookup model).

I think the biggest thing I want to emphasize is that this only works for libraries within a package, which are expected to be both very local and part of a fairly uniform namespace. I don't think that has many of the concerns here, but maybe I've misunderstood? Mostly want to make sure we're on the same page about what is actually being considered here.

@beauxq
Copy link

beauxq commented Jul 22, 2022

Something that I really dislike about C/C++, and what I like that is different in other languages:

#include "a.h"
#include "b.h"

Foo f;  // Where did this `Foo` name come from? I can't tell. This is a problem.

In Python:

import a
import b

f = a.Foo()  # I can see where the name `Foo` comes from. Good.

or

from a import Foo
from b import Bar

f = Foo()  # I can see where the name `Foo` comes from. Good.

For every name I bring into scope with imports, I want to be able to see where that name comes from.

This is recognized as bad practice by the Python community:

from a import *
from b import *

f = Foo()  # Where did this `Foo` name come from? I can't tell. This is a problem.

For every name I bring into scope with imports, I want to be able to see where that name comes from.

@beauxq
Copy link

beauxq commented Jul 22, 2022

I'm not sure whether I have the Carbon syntax right, but...

import Cpp library "a.h";
import Cpp library "b.h";

var f: Cpp.Foo;  // which import line gave me `Cpp.Foo`? I can't tell. This looks like a problem.

Can we do it like this?

import Cpp library "a.h" as a;
import Cpp library "b.h" as b;

var f: a.Foo;  // better

I want to be able to tell where things came from.

If the former remains in Carbon, I would hope that it immediately gets recognized as bad practice, in the same way that from a import * is recognized as bad practice by the Python community.

@chandlerc
Copy link
Contributor

import Cpp library "a.h";
import Cpp library "b.h";

var f: Cpp.Foo;  // which import line gave me `Cpp.Foo`? I can't tell. This looks like a problem.

Can we do it like this?

import Cpp library "a.h" as a;
import Cpp library "b.h" as b;

var f: a.Foo;  // better

I want to be able to tell where things came from.

I definitely understand the desire to have this behavior, but I think it would be quite challenging to get here.

I think the first challenge here is that things behave differently (perhaps very differently) depending on where you write the code based on how names are imported. The desire for code to retain the same meaning in different contexts is in tension with having very localized names. I think both of these are valuable, but I think we'll have to choose.

The second problem is in the face of templates. While we are supporting templates, we'll want cases where code from b.h needs to find names provided by a.h, and can't rely on the local renaming. For example in C++, this comes up with argument-dependent-lookup. We could hide this in Carbon, causing C++ to use a consistent flat namespace but Carbon not, but I think that would both impact Carbon templates and create some amount of confusion between templates in Carbon and C++.

I'll also point out that this is actually not the topic of this issue (I think), as the point you're making here comes up even without the same-package import library operation. You've pointed it out with two header files in the C++ package, and we can see it with two imports of separate libraries from another package. We could force packages to only have a single imported library to remove this, but this would also force either I think a painful expansion in how many package names folks use, or to significantly more coarse grained dependencies that would hurt large-scale build performance.

@nigeltao
Copy link
Author

nigeltao commented Jul 22, 2022

Foo f; // Where did this Foo name come from? I can't tell. This is a problem.

Yeah, that's exactly the problem I'd like to avoid.

Tangentially (possibly), I'd also like to avoid the "Abseil can't name something ERROR (which would otherwise be the perfectly obvious name) because WinGDI.h says #define ERROR 0" problem.

Nonetheless, in this Carbon snippet:

import Cpp library "a.h";
import Cpp library "b.h";

var f: Cpp.Foo;  // which import line gave me `Cpp.Foo`? I can't tell. This looks like a problem.

I'm willing to let import Cpp (as opposed to import Bar generally) be a special exemption, if it'd otherwise make it too hard to migrate or wrap existing C++ code (and its names).

@nigeltao
Copy link
Author

nigeltao commented Jul 22, 2022

We could force packages to only have a single imported library to remove this, but this would also force either I think a painful expansion in how many package names folks use, or to significantly more coarse grained dependencies that would hurt large-scale build performance.

I'm still not sure if I understand the Carbon package vs library distinction correctly, but my rough model (based on Go experience) is that Carbon package ~ Go module (the coarser unit of software distribution and versioning) and Carbon library ~ Go package (the finer unit of import), with the extra note that Carbon's Cpp package is the special-cased bridge to the enormous soup of existing C/C++ code (and C code doesn't even have namespace).

"ImageMagick / LibMagick" could be the coarse thing and "JPEG codec" or "PNG codec" could be the fine thing.

"The fine thing" is still a coarser concept than C/C++ (where the source code file is the unit of import) or Java (where the class is the unit of import, plus * for globbing).

In Go, modules are a build tool or package management concept but entirely invisible to the compiler and language. Go source code always imports individual Go packages and never mentions Go modules.

If Kubernetes (or one aspect, k8s.io/client-go) is the Go module, here's an example of importing multiple Go packages from within that Go module, and this (multiple imports) hasn't been painful in practice.

There's also a demonstration of locally renaming some Go package names in the link, if you're curious, although if starting the syntax from scratch, I'd put the local name on the right instead.

@chandlerc
Copy link
Contributor

I'm still not sure if I understand the Carbon package vs library distinction correctly, but my rough model (based on Go experience) is that Carbon package ~ Go module (the coarser unit of software distribution and versioning) and Carbon library ~ Go package (the finer unit of import), with the extra note that Carbon's Cpp package is the special-cased bridge to the enormous soup of existing C/C++ code (and C code doesn't even have namespace).

I think you roughly have the right model.

"ImageMagick / LibMagick" could be the coarse thing and "JPEG codec" or "PNG codec" could be the fine thing.

The example I've been using is Abseil might be a package in Carbon.

"The fine thing" is still a coarser concept than C/C++ (where the source code file is the unit of import) or Java (where the class is the unit of import, plus * for globbing).

I think we should be relatively close to where C/C++ is here, where it is the "external header file" that is the import unit. This often corresponds to a single file, but also often corresponds to several files, from an implementation file to internal helpers.

In Go, modules are a build tool or package management concept but entirely invisible to the compiler and language. Go source code always imports individual Go packages and never mentions Go modules.

If Kubernetes (or one aspect, k8s.io/client-go) is the Go module, here's an example of importing multiple Go packages from within that Go module, and this (multiple imports) hasn't been painful in practice.

It's likely this could also work, but I think the shared top-level namespace that libraries like Abseil use in C++ also works quite well.

I pointed out above some of the downsides of needing local renaming, and I think not having the coarse level thing be what carries the unique name would put a lot of pressure on longer names or lots of local renaming. It is a tradeoff, but one that I think we've already seen work reasonably well in C++ (where followed closely), and so it seems reasonable to match that pattern in Carbon.

@nigeltao
Copy link
Author

nigeltao commented Jul 23, 2022

I think we should be relatively close to where C/C++ is here, where it is the "external header file" that is the import unit. This often corresponds to a single file, but also often corresponds to several files, from an implementation file to internal helpers.

By "external header file", do you mean:

  • a file designed for external consumption: this is the .h file that users (as opposed to the people developing that library) should #include? This could be an 'umbrella' file that #includes other 'internal' files, e.g. Harfbuzz's hb.h.
  • a file in a separate or external directory: it lives in the extdir in cc -I extdir foo.c?

@nigeltao
Copy link
Author

I pointed out above some of the downsides of needing local renaming

There might be some confusion about whether you were replying about renaming, the as a in

import Cpp library "a.h" as a;
import Cpp library "b.h" as b;

var f: a.Foo;  // better

or replying about dot-dot-dot imports, where, after I import Abseil, I can just refer to a naked Mutex instead of having to qualify it as Absl.Mutex.

Or replying about both. There's also the axis on whether you're replying about all packages or just about the special Cpp package. (Is the Cpp package special or is it valid Carbon to use any Foo in import Foo library "foo.h" when foo.h is obviously C++, not Carbon??)

In case it got lost in the subsequent discussion, my original request is that Carbon programmers must always qualify (it's always Absl.Mutex) unless there's an explicit opt-in (alias Mutex = Absl.Mutex). Renaming packages/libraries (locally, at time-of-import) is related, but separate.

Specifically, templates and ADL (Argument Dependent Lookup) should still work with qualified names (e.g. I can say std::vector<absl::Mutex>), so it wouldn't break them to require qualified names, right?

@nigeltao
Copy link
Author

Ah, I though I understood ADL but re-reading up on it...

using std::swap;
swap(obj1, obj2);

is not the same as

std::swap(obj1, obj2)

That's C++. In the Carbon world, if I import Cpp library "algorithm" then I presumably need to get swap into the top-level namespace for the same reason.

But if algorithm was re-written to be pure Carbon (and Carbon generics are not just templates), would import Carborithm also need to put things into the top-level namespace? Again, could we treat import Cpp etc as a special case?

@nigeltao
Copy link
Author

I think the biggest thing I want to emphasize is that this only works for libraries within a package, which are expected to be both very local and part of a fairly uniform namespace. I don't think that has many of the concerns here, but maybe I've misunderstood? Mostly want to make sure we're on the same page about what is actually being considered here.

Ah, I missed that, in all the excitement.

That goes a long way to alleviate my concerns. It's closer to the Go model: "local" names are "local to the Go package", not "local to the file". Note though that there are still three levels of granularity:

  • Carbon package / Go module
  • Carbon library / Go package
  • Source code file

and "local" in Go means "the bottom 2 out of 3" but IIUC in Carbon means "all 3".

@chandlerc
Copy link
Contributor

By "external header file", do you mean:

  • a file designed for external consumption: this is the .h file that users (as opposed to the people developing that library) should #include? This could be an 'umbrella' file that #includes other 'internal' files, e.g. Harfbuzz's hb.h.

This.

We even expect to have analogs to an "umbrella" library that re-exports things if people desire it, but I think we'd like the default/typical case to be pretty fine grained for better build scaling.

I think the biggest thing I want to emphasize is that this only works for libraries within a package, which are expected to be both very local and part of a fairly uniform namespace. I don't think that has many of the concerns here, but maybe I've misunderstood? Mostly want to make sure we're on the same page about what is actually being considered here.

Ah, I missed that, in all the excitement.

Ah, yes. It's a super important point. =D Without this, much chaos.

For any name not local to your package, you're going to have to use its package name as a qualifier.

That goes a long way to alleviate my concerns. It's closer to the Go model: "local" names are "local to the Go package", not "local to the file". Note though that there are still three levels of granularity:

  • Carbon package / Go module
  • Carbon library / Go package
  • Source code file

and "local" in Go means "the bottom 2 out of 3" but IIUC in Carbon means "all 3".

For names that can be used without a qualifier, yes.

But we do plan to have things that are private to a library, and private to a source file for implementation files.

Now that this is refined/clarified a bit, is there still a question around reconsidering for the leads here?

@nigeltao
Copy link
Author

I am still a little confused. I'm not confident that I understand the ADL swap example, but I might just need to stare at it for longer.

Specifically, I'm not sure if it matters that, in Carbon, IIUC now, you can't import library "algorithm" to put things into your file's top-level scope unless your file is also in package std.

Possibly easier said than done, but if swap isn't a good Carbon example of where import library ... matters, do you have a better example (ADL or otherwise)?

@nigeltao
Copy link
Author

I'll step back from focusing on ADL and 'what breaks if we don't have import library ...'.


I'll start again. Let me know if I've got this wrong. In Carbon, I can say:

import PkgName library "LibName"

A detail I missed earlier is that the names within become qualified by the package name: it's PkgName.T not LibName.T. This is different from Go, where the qualified name (in practice) is prefixed by the equivalent of the Carbon library name, not the Carbon package name. As mentioned earlier, Go modules (roughly equivalent to Carbon packages) are invisible at the Go compiler / language level.

Whether or not C++ is a special case in Carbon, one reason Carbon does this is that C++ libraries' names weren't chosen with qualification in mind, yet in Carbon we'd like to be import multiple C++ libraries that might refer to each others' names (so we can't just automatically prefix these names with the library name):

import Cpp library "freetype2/ft2build.h"
// HB code refers to "FT_Face", not "freetype2.FT_Face".
import Cpp library "harfbuzz/hb.h"

Another point is that, for same-Carbon-package imports, it's not that the package name can be omitted, it's that it must be omitted.

I had originally thought that the import library ... syntax was for when my program uses an Absl.Mutex but I want to be lazy and just type Mutex. Maybe I want to use 30 different types and functions from the library but I don't want to write out 30 alias lines.

Instead, that syntax is for when another Abseil library wants to import Abseil's synchronization library, but there is no thispackage prefix that is for 'the current package' what this is for 'the current object'.


I still feel that the Carbon equivalent of:

#include "a.h"
#include "b.h"

Foo f;  // Where did this Foo name come from? I can't tell. This is a problem.

is a problem. In abstract terms, I'd still prefer that "when Abseil's time library imports Abseil's synchronization library, there's still some sort of prefix such that the time source code refers to prefix.Mutex". But I don't have a concrete proposal. I can see now that it's not that simple.

@nikola43

This comment was marked as off-topic.

@chandlerc
Copy link
Contributor

Sorry for delay in explicitly following up here...

Let me know if I've got this wrong.

I don't think so, your summary sounds right to me.

In abstract terms, I'd still prefer that "when Abseil's time library imports Abseil's synchronization library, there's still some sort of prefix such that the time source code refers to prefix.Mutex". But I don't have a concrete proposal. I can see now that it's not that simple.

Yeah, I can see the desire for this. But I think there are real tradeoffs here as well, and the current structure ended up feeling like a good balance of the different factors.

@github-actions
Copy link

github-actions bot commented Nov 6, 2022

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time.
This issue is labeled inactive because the last activity was over 90 days ago.

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label Nov 6, 2022
@jonmeow jonmeow added long term Issues expected to take over 90 days to resolve. leads question A question for the leads team and removed inactive Issues and PRs which have been inactive for at least 90 days. labels Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
leads question A question for the leads team long term Issues expected to take over 90 days to resolve.
Projects
None yet
Development

No branches or pull requests

5 participants