Move DelimitedFiles to a separate external package #44663

Closed
ViralBShah opened this issue Mar 17, 2022 · 17 comments · Fixed by #45540
Labels
excision (Removal of code from Base or the repository), packages (Package management and loading)

Comments

@ViralBShah
Member

Can we move DelimitedFiles to an external package and also perhaps remove it from the system image?

Now that we are considering moving SparseArrays out of the system image (#44247 (comment)), I thought I would ask the same about DelimitedFiles.

Is Dates.jl worth considering moving out? Any others?

@ViralBShah added the excision (Removal of code from Base or the repository) label Mar 17, 2022
@DilumAluthge
Member

DilumAluthge commented Mar 17, 2022

The sharpest constraint we have is that we cannot remove any stdlibs that are direct or indirect dependencies of Pkg. (Because otherwise, you cannot install any external packages.)

In my opinion, everything else seems like fair game.

The full list of direct and indirect dependencies of Pkg is:
  1. ArgTools
  2. Artifacts
  3. Base64
  4. Dates
  5. Downloads
  6. InteractiveUtils
  7. LibCURL
  8. LibCURL_jll
  9. LibGit2
  10. LibSSH2_jll
  11. Libdl
  12. Logging
  13. Markdown
  14. MbedTLS_jll
  15. MozillaCACerts_jll
  16. NetworkOptions
  17. Pkg
  18. Printf
  19. REPL
  20. Random
  21. SHA
  22. Serialization
  23. Sockets
  24. TOML
  25. Tar
  26. UUIDs
  27. Unicode
  28. Zlib_jll
  29. nghttp2_jll
  30. p7zip_jll
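
A list like this can be generated mechanically. The following is a minimal sketch, not the script actually used here. It assumes each stdlib directory under `Sys.STDLIB` ships a `Project.toml` with a `deps` table; `direct_deps` and `transitive_deps` are hypothetical helper names, not part of Pkg. The root is excluded from the result, so a package never appears as its own dependency.

```julia
using TOML

# Direct dependencies declared in the `deps` table of a stdlib's Project.toml.
# Assumes the layout <stdlib_dir>/<Name>/Project.toml (an assumption of this sketch).
function direct_deps(name::AbstractString; stdlib_dir::AbstractString = Sys.STDLIB)
    path = joinpath(stdlib_dir, name, "Project.toml")
    isfile(path) || return String[]
    project = TOML.parsefile(path)
    return collect(String, keys(get(project, "deps", Dict{String,Any}())))
end

# Breadth-first walk of the dependency graph. The root is skipped when it
# reappears through a cycle, so it never lists itself as a dependency.
function transitive_deps(root::AbstractString; deps_of = direct_deps)
    seen = Set{String}()
    queue = collect(String, deps_of(root))
    while !isempty(queue)
        name = popfirst!(queue)
        (name == root || name in seen) && continue
        push!(seen, name)
        append!(queue, deps_of(name))
    end
    return sort!(collect(seen))
end

# Possible usage: foreach(println, transitive_deps("Pkg"))
```

The `deps_of` keyword exists only so the walk can be tried on a hand-written graph without touching the filesystem.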

@ViralBShah
Member Author

ViralBShah commented Mar 17, 2022

Yikes, will we really move LinearAlgebra or Random out? Technically, they will just work like regular packages, but the annoyance level (due to type piracy issues) may be quite high.

There's also a two step process: (1) move things out into separate packages while they continue to be in the system image first, and then (2) in a future release, stop having them be stdlibs.

@DilumAluthge
Member

DilumAluthge commented Mar 17, 2022

> Yikes, will we really move LinearAlgebra or Random out? Technically, they will just work like regular packages, but the annoyance level (due to type piracy issues) may be quite high.

Fair enough 😂.

In addition to the direct and indirect dependencies of Pkg, I suppose that we should make a list (a very short list) of stdlibs that (1) would be very annoying to move out and (2) would cause a lot of user backlash if we did move them out.

@giordano
Contributor

> 1. Pkg

Pkg indirectly depends on itself?

@DilumAluthge
Member

DilumAluthge commented Mar 17, 2022

> 33. Pkg
>
> Pkg indirectly depends on itself?

Serves me right for auto-generating the list and not proofreading it 😂

@ViralBShah added the packages (Package management and loading) and triage (This should be discussed on a triage call) labels Mar 17, 2022
@adkabo
Contributor

adkabo commented Mar 22, 2022

I checked the usage of a few Pkg.jl dependencies

  • Unicode afaict isn't used
  • Serialization is barely used and could easily be replaced with TOML
  • Dates is barely used and could easily be replaced with Libc

I would be in favor of separating these. Perhaps some of the others could be removed if Pkg.jl were split into pieces, where the dependencies needed for installing other packages can be retained.

@LilithHafner
Member

> I suppose that we should make a list (a very short list) of stdlibs that (1) would be very annoying to move out and (2) would cause a lot of user backlash if we did move them out.

How frequently a package is used should be a factor in (2), because of the small but nonzero cost of pkg> add (e.g. for LinearAlgebra or Test).

@ViralBShah
Member Author

Note that this should be a non-issue when using a Project.toml or Manifest.toml, and in any case Julia autodetects uninstalled packages and offers to install them for you.

@adkabo
Contributor

adkabo commented Mar 24, 2022

To quantify the installation cost of a small package:

% JULIA_LOAD_PATH=@:@stdlib julia --project=. --startup-file=no --quiet
julia> import Pkg

julia> @time Pkg.add("IterTools")
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `/tmp/tmp.gmAhldmGp9/Project.toml`
  [c8e1da08] + IterTools v1.4.0
    Updating `/tmp/tmp.gmAhldmGp9/Manifest.toml`
  [c8e1da08] + IterTools v1.4.0
  3.437964 seconds (2.66 M allocations: 163.276 MiB, 1.66% gc time, 33.38% compilation time)

This will depend on internet speed etc. It could be worth profiling to find out where the time is spent.
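
As a first, coarse step before real profiling, the phases can be timed separately. This is a rough sketch: `time_phases` is a hypothetical helper, and the split into "registry update" and "resolve + install" is an assumption about where the time might go, not a measured result.

```julia
# Run each labelled zero-argument function in order and record its wall time.
function time_phases(phases)
    timings = Pair{String,Float64}[]
    for (label, f) in phases
        elapsed = @elapsed f()
        push!(timings, label => elapsed)
    end
    return timings
end

# Possible usage (needs network access, so left commented out):
# import Pkg
# time_phases([
#     "registry update"   => () -> Pkg.Registry.update(),
#     "resolve + install" => () -> Pkg.add("IterTools"),
# ])
```

The first call of each phase also includes compilation time, so running the phases a second time in the same session would help separate JIT cost from network and disk cost.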

@LilithHafner
Member

In my currently running session:

(@v1.9) pkg> activate --temp
  Activating new project at `/var/folders/hc/fn82kz1j5vl8w7lwd4l079y80000gn/T/jl_ZsugUQ`

julia> import Pkg

julia> @time Pkg.add("IterTools")
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `/private/var/folders/hc/fn82kz1j5vl8w7lwd4l079y80000gn/T/jl_ZsugUQ/Project.toml`
  [c8e1da08] + IterTools v1.4.0
    Updating `/private/var/folders/hc/fn82kz1j5vl8w7lwd4l079y80000gn/T/jl_ZsugUQ/Manifest.toml`
  [c8e1da08] + IterTools v1.4.0
Precompiling project...
  1 dependency successfully precompiled in 2 seconds
 16.597010 seconds (6.17 M allocations: 382.854 MiB, 0.82% gc time, 67.02% compilation time)

My internet speed is 218.9 Mbps download, 23.2 Mbps upload, with 18 ms latency to google.com.

@JeffBezanson
Copy link
Member

I say yes, let's move DelimitedFiles out of the stdlib and out of the system image (unless of course there is some kind of unforeseen breakage). The rest need to be taken on a case-by-case basis.

@JeffBezanson removed the triage (This should be discussed on a triage call) label Mar 31, 2022
@ViralBShah reopened this Apr 2, 2022
@ViralBShah
Member Author

Recipe for moving things out while retaining history: JuliaAI/MLJOpenML.jl#1

@ViralBShah changed the title from "Move DelimitedFiles (and possibly others) to a separate external package" to "Move DelimitedFiles to a separate external package" Apr 9, 2022
@ViralBShah
Member Author

ViralBShah commented Apr 9, 2022

Making this issue solely about moving DelimitedFiles.jl out of the stdlib. Steps as recommended by @KristofferC:

In this period, we should avoid making any changes to this stdlib.

@KristofferC As I post your steps, I wonder if we should remove it from julia as step 2 (like the other stdlibs that live in external repos), and once that is stabilized, we do the registration and subsequent removal from system image.

@brenhinkeller
Contributor

Why?

@ViralBShah
Member Author

#45121 moves it to an external package and out of the sysimage. However, it is still an stdlib at the moment (and that will be addressed by #45540).
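
One rough way to observe the effect of taking a stdlib out of the system image is to check whether it is already loaded in a fresh session. The sketch below relies on `Base.loaded_modules`, an internal table keyed by `Base.PkgId` that is not public API and may change; `is_loaded` is a hypothetical helper.

```julia
# True if a module with this name is among the given loaded modules.
# The default, `Base.loaded_modules`, is an internal Dict keyed by Base.PkgId.
function is_loaded(name::AbstractString; modules = Base.loaded_modules)
    return any(id -> id.name == name, keys(modules))
end

# Possible usage in a fresh `julia --startup-file=no` session:
# is_loaded("DelimitedFiles")   # check before loading the package
# using DelimitedFiles
# is_loaded("DelimitedFiles")   # check again after loading
```

On a build where DelimitedFiles is no longer in the system image, the first check would be expected to return `false`, though this is a heuristic rather than a definitive test.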

@A-childs-encyclopedia

Why is it being removed?

@StefanKarpinski
Member

There's no reason for it to be built into every Julia binary. It makes the binary larger and slower to load, and it means there is more precompiled code that can potentially be invalidated when other packages are loaded. It also makes it harder than necessary to develop and improve the DelimitedFiles package.
