Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No longer use pool size in MMTk allocation #17

Draft
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

qinsoon
Copy link
Member

@qinsoon qinsoon commented Jun 23, 2023

No description provided.

qinsoon pushed a commit to qinsoon/julia that referenced this pull request May 2, 2024
…iaLang#53012)

Stdlib: StyledStrings
URL: https://github.com/JuliaLang/StyledStrings.jl.git
Stdlib branch: main
Julia branch: master
Old commit: 61e7b10
New commit: 302a0d0
Julia version: 1.11.0-DEV
StyledStrings version: 1.11.0
Bump invoked by: @vchuravy
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
JuliaLang/StyledStrings.jl@61e7b10...302a0d0

```
$ git log --oneline 61e7b10..302a0d0
302a0d0 Directly import ScopedValue
3fab35e Fix showing AnnotatedChar with colour
44f5fd7 Restrict the Base docstrings included in the docs
4dee5d9 Add readme
c49ae82 Add documentation task to CI
38ae1b3 Setting the terminal colour to :default is special
2709150 Adjust face merge tests after inheritance change
036631f Swap inheritance processing in face merge
9b35f08 Merge branch 'jn/Statefulempty' [mmtk#21]
508ab57 Refactor zip of eachindex to just use pairs
02b3f81 Remove use of length of Stateful
41c8218 Merge branch 'lh/ci-codecov' [mmtk#15]
d581fda Disable codecov commenting in every PR
a8a25ba Merge branch 'lh/ci-old-julia' [mmtk#13]
b7fca5b Merge branch 'lh/newline' [mmtk#16]
984485e Adjust newline parsing to work with CLRF encoding
27b02d1 Test styled"" parsing of \-continued newlines
a7981fe Merge pull request mmtk#17 from caleb-allen/add-beep-face
c1ab675 Add repl_prompt_beep face
0d8f5df Merge pull request mmtk#18 from JuliaLang/whitespace-fixup
1ef0f90 Nicer whitespace alignment
91a24f8 Disable CI for older versions of Julia
63ff132 add tagbot
792fda7 add CI
506afe3 add .gitignore
74e7135 Load the JULIA_*_COLOR env vars for compat
87d1fb5 Replace within-module eval with hygienic eval
4777e60 Touchups to documented examples
```

Co-authored-by: Dilum Aluthge <dilum@aluthge.com>
qinsoon pushed a commit to qinsoon/julia that referenced this pull request May 2, 2024
`@something` eagerly unwraps any `Some` given to it, while keeping the
variable between its arguments the same. This can be an issue if a
previously unpacked value is used as input to `@something`, leading to a
type instability on more than two arguments (e.g. because of a fallback
to `Some(nothing)`). By using different variables for each argument,
type inference has an easier time handling these cases that are isolated
to single branches anyway.

This also adds some comments to the macro, since it's non-obvious what
it does.

Benchmarking the specific case I encountered this in led to a ~2x
performance improvement on multiple machines.

1.10-beta3/master:

```
[sukera@tower 01]$ jl1100 -q --project=. -L 01.jl -e 'bench()'
v"1.10.0-beta3"

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  38.670 μs … 70.350 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     43.340 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   43.395 μs ±  1.518 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              ▆█▂ ▁▁                           
  ▂▂▂▂▂▂▂▂▂▁▂▂▂▃▃▃▂▂▃▃▃▂▂▂▂▂▄▇███▆██▄▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂ ▃
  38.7 μs         Histogram: frequency by time          48 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
```

This PR:

```
[sukera@tower 01]$ julia -q --project=. -L 01.jl -e 'bench()'
v"1.11.0-DEV.970"

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  22.820 μs …  44.980 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     24.300 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.370 μs ± 832.239 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                ▂▅▇██▇▆▅▁                                       
  ▂▂▂▂▂▂▂▂▃▃▄▅▇███████████▅▄▃▃▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂ ▃
  22.8 μs         Histogram: frequency by time         27.7 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
``` 


<details>
<summary>Benchmarking code (spoilers for Advent Of Code 2023 Day 01,
Part 01). Running this requires the input of that Advent Of Code
day.</summary>

```julia
using BenchmarkTools
using InteractiveUtils

isdigit(d::UInt8) = UInt8('0') <= d <= UInt8('9')
someDigit(c::UInt8) = isdigit(c) ? Some(c - UInt8('0')) : nothing

function part1(data)
    total = 0
    may_a = nothing
    may_b = nothing

    for c in data
        digitRes = someDigit(c)
        may_a = @something may_a digitRes Some(nothing)
        may_b = @something digitRes may_b Some(nothing)
        if c == UInt8('\n')
            digit_a = may_a::UInt8
            digit_b = may_b::UInt8
            total += digit_a*0xa + digit_b
            may_a = nothing
            may_b = nothing
        end
    end

    return total
end

function bench()
    data = read("input.txt")
    display(VERSION)
    println()
    display(@benchmark part1($data))
    nothing
end
```
</details>

<details>
<summary>`@code_warntype` before</summary>

```julia
julia> @code_warntype part1(data)
MethodInstance for part1(::Vector{UInt8})
  from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7
Arguments
  #self#::Core.Const(part1)
  data::Vector{UInt8}
Locals
  @_3::Union{Nothing, Tuple{UInt8, Int64}}
  may_b::Union{Nothing, UInt8}
  may_a::Union{Nothing, UInt8}
  total::Int64
  c::UInt8
  digit_b::UInt8
  digit_a::UInt8
  val@_10::Any
  val@_11::Any
  digitRes::Union{Nothing, Some{UInt8}}
  @_13::Union{Some{Nothing}, Some{UInt8}, UInt8}
  @_14::Union{Some{Nothing}, Some{UInt8}}
  @_15::Some{Nothing}
  @_16::Union{Some{Nothing}, Some{UInt8}, UInt8}
  @_17::Union{Some{Nothing}, UInt8}
  @_18::Some{Nothing}
Body::Int64
1 ──       (total = 0)
│          (may_a = Main.nothing)
│          (may_b = Main.nothing)
│    %4  = data::Vector{UInt8}
│          (@_3 = Base.iterate(%4))
│    %6  = (@_3 === nothing)::Bool
│    %7  = Base.not_int(%6)::Bool
└───       goto mmtk#24 if not %7
2 ┄─       Core.NewvarNode(:(digit_b))
│          Core.NewvarNode(:(digit_a))
│          Core.NewvarNode(:(val@_10))
│    %12 = @_3::Tuple{UInt8, Int64}
│          (c = Core.getfield(%12, 1))
│    %14 = Core.getfield(%12, 2)::Int64
│          (digitRes = Main.someDigit(c))
│          (val@_11 = may_a)
│    %17 = (val@_11::Union{Nothing, UInt8} !== Base.nothing)::Bool
└───       goto mmtk#4 if not %17
3 ──       (@_13 = val@_11::UInt8)
└───       goto mmtk#11
4 ──       (val@_11 = digitRes)
│    %22 = (val@_11::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool
└───       goto mmtk#6 if not %22
5 ──       (@_14 = val@_11::Some{UInt8})
└───       goto mmtk#10
6 ──       (val@_11 = Main.Some(Main.nothing))
│    %27 = (val@_11::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true)
└───       goto mmtk#8 if not %27
7 ──       (@_15 = val@_11::Core.Const(Some(nothing)))
└───       goto mmtk#9
8 ──       Core.Const(:(@_15 = Base.nothing))
9 ┄─       (@_14 = @_15)
10 ┄       (@_13 = @_14)
11 ┄ %34 = @_13::Union{Some{Nothing}, Some{UInt8}, UInt8}
│          (may_a = Base.something(%34))
│          (val@_10 = digitRes)
│    %37 = (val@_10::Union{Nothing, Some{UInt8}} !== Base.nothing)::Bool
└───       goto mmtk#13 if not %37
12 ─       (@_16 = val@_10::Some{UInt8})
└───       goto mmtk#20
13 ─       (val@_10 = may_b)
│    %42 = (val@_10::Union{Nothing, UInt8} !== Base.nothing)::Bool
└───       goto mmtk#15 if not %42
14 ─       (@_17 = val@_10::UInt8)
└───       goto mmtk#19
15 ─       (val@_10 = Main.Some(Main.nothing))
│    %47 = (val@_10::Core.Const(Some(nothing)) !== Base.nothing)::Core.Const(true)
└───       goto mmtk#17 if not %47
16 ─       (@_18 = val@_10::Core.Const(Some(nothing)))
└───       goto mmtk#18
17 ─       Core.Const(:(@_18 = Base.nothing))
18 ┄       (@_17 = @_18)
19 ┄       (@_16 = @_17)
20 ┄ %54 = @_16::Union{Some{Nothing}, Some{UInt8}, UInt8}
│          (may_b = Base.something(%54))
│    %56 = c::UInt8
│    %57 = Main.UInt8('\n')::Core.Const(0x0a)
│    %58 = (%56 == %57)::Bool
└───       goto mmtk#22 if not %58
21 ─       (digit_a = Core.typeassert(may_a, Main.UInt8))
│          (digit_b = Core.typeassert(may_b, Main.UInt8))
│    %62 = total::Int64
│    %63 = (digit_a * 0x0a)::UInt8
│    %64 = (%63 + digit_b)::UInt8
│          (total = %62 + %64)
│          (may_a = Main.nothing)
└───       (may_b = Main.nothing)
22 ┄       (@_3 = Base.iterate(%4, %14))
│    %69 = (@_3 === nothing)::Bool
│    %70 = Base.not_int(%69)::Bool
└───       goto mmtk#24 if not %70
23 ─       goto mmtk#2
24 ┄       return total
```
</details>

<details>
<summary>`@code_native debuginfo=:none` Before </summary>

```julia
julia> @code_native debuginfo=:none part1(data)
	.text
	.file	"part1"
	.globl	julia_part1_418                 # -- Begin function julia_part1_418
	.p2align	4, 0x90
	.type	julia_part1_418,@function
julia_part1_418:                        # @julia_part1_418
# %bb.0:                                # %top
	push	rbp
	mov	rbp, rsp
	push	r15
	push	r14
	push	r13
	push	r12
	push	rbx
	sub	rsp, 40
	mov	rax, qword ptr [rdi + 8]
	test	rax, rax
	je	.LBB0_1
# %bb.2:                                # %L17
	mov	rcx, qword ptr [rdi]
	dec	rax
	mov	r10b, 1
	xor	r14d, r14d
                                        # implicit-def: $r12b
                                        # implicit-def: $r13b
                                        # implicit-def: $r9b
                                        # implicit-def: $sil
	mov	qword ptr [rbp - 64], rax       # 8-byte Spill
	mov	al, 1
	mov	dword ptr [rbp - 48], eax       # 4-byte Spill
                                        # implicit-def: $al
                                        # kill: killed $al
	xor	eax, eax
	mov	qword ptr [rbp - 56], rax       # 8-byte Spill
	mov	qword ptr [rbp - 72], rcx       # 8-byte Spill
                                        # implicit-def: $cl
	jmp	.LBB0_3
	.p2align	4, 0x90
.LBB0_8:                                #   in Loop: Header=BB0_3 Depth=1
	mov	dword ptr [rbp - 48], 0         # 4-byte Folded Spill
.LBB0_24:                               # %post_union_move
                                        #   in Loop: Header=BB0_3 Depth=1
	movzx	r13d, byte ptr [rbp - 41]       # 1-byte Folded Reload
	mov	r12d, r8d
	cmp	qword ptr [rbp - 64], r14       # 8-byte Folded Reload
	je	.LBB0_13
.LBB0_25:                               # %guard_exit113
                                        #   in Loop: Header=BB0_3 Depth=1
	inc	r14
	mov	r10d, ebx
.LBB0_3:                                # %L19
                                        # =>This Inner Loop Header: Depth=1
	mov	rax, qword ptr [rbp - 72]       # 8-byte Reload
	xor	ebx, ebx
	xor	edi, edi
	movzx	r15d, r9b
	movzx	ecx, cl
	movzx	esi, sil
	mov	r11b, 1
                                        # implicit-def: $r9b
	movzx	edx, byte ptr [rax + r14]
	lea	eax, [rdx - 58]
	lea	r8d, [rdx - 48]
	cmp	al, -10
	setae	bl
	setb	dil
	test	r10b, 1
	cmovne	r15d, edi
	mov	edi, 0
	cmovne	ecx, ebx
	mov	bl, 1
	cmovne	esi, edi
	test	r15b, 1
	jne	.LBB0_7
# %bb.4:                                # %L76
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	r11b, 2
	test	cl, 1
	jne	.LBB0_5
# %bb.6:                                # %L78
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	ebx, r10d
	mov	r9d, r15d
	mov	byte ptr [rbp - 41], r13b       # 1-byte Spill
	test	sil, 1
	je	.LBB0_26
.LBB0_7:                                # %L82
                                        #   in Loop: Header=BB0_3 Depth=1
	cmp	al, -11
	jbe	.LBB0_9
	jmp	.LBB0_8
	.p2align	4, 0x90
.LBB0_5:                                #   in Loop: Header=BB0_3 Depth=1
	mov	ecx, r8d
	mov	sil, 1
	xor	ebx, ebx
	mov	byte ptr [rbp - 41], r8b        # 1-byte Spill
	xor	r9d, r9d
	xor	ecx, ecx
	cmp	al, -11
	ja	.LBB0_8
.LBB0_9:                                # %L90
                                        #   in Loop: Header=BB0_3 Depth=1
	test	byte ptr [rbp - 48], 1          # 1-byte Folded Reload
	jne	.LBB0_23
# %bb.10:                               # %L115
                                        #   in Loop: Header=BB0_3 Depth=1
	cmp	dl, 10
	jne	.LBB0_11
# %bb.14:                               # %L122
                                        #   in Loop: Header=BB0_3 Depth=1
	test	r15b, 1
	jne	.LBB0_15
# %bb.12:                               # %L130.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	movzx	eax, byte ptr [rbp - 41]        # 1-byte Folded Reload
	mov	bl, 1
	add	eax, eax
	lea	eax, [rax + 4*rax]
	add	al, r12b
	movzx	eax, al
	add	qword ptr [rbp - 56], rax       # 8-byte Folded Spill
	mov	al, 1
	mov	dword ptr [rbp - 48], eax       # 4-byte Spill
	cmp	qword ptr [rbp - 64], r14       # 8-byte Folded Reload
	jne	.LBB0_25
	jmp	.LBB0_13
	.p2align	4, 0x90
.LBB0_23:                               # %L115.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	al, 1
                                        # implicit-def: $r8b
	mov	dword ptr [rbp - 48], eax       # 4-byte Spill
	cmp	dl, 10
	jne	.LBB0_24
	jmp	.LBB0_21
.LBB0_11:                               #   in Loop: Header=BB0_3 Depth=1
	mov	r8d, r12d
	jmp	.LBB0_24
.LBB0_1:
	xor	eax, eax
	mov	qword ptr [rbp - 56], rax       # 8-byte Spill
.LBB0_13:                               # %L159
	mov	rax, qword ptr [rbp - 56]       # 8-byte Reload
	add	rsp, 40
	pop	rbx
	pop	r12
	pop	r13
	pop	r14
	pop	r15
	pop	rbp
	ret
.LBB0_21:                               # %L122.thread
	test	r15b, 1
	jne	.LBB0_15
# %bb.22:                               # %post_box_union58
	movabs	rdi, offset .L_j_str1
	movabs	rax, offset ijl_type_error
	movabs	rsi, 140008511215408
	movabs	rdx, 140008667209736
	call	rax
.LBB0_15:                               # %fail
	cmp	r11b, 1
	je	.LBB0_19
# %bb.16:                               # %fail
	movzx	eax, r11b
	cmp	eax, 2
	jne	.LBB0_17
# %bb.20:                               # %box_union54
	movzx	eax, byte ptr [rbp - 41]        # 1-byte Folded Reload
	movabs	rcx, offset jl_boxed_uint8_cache
	mov	rdx, qword ptr [rcx + 8*rax]
	jmp	.LBB0_18
.LBB0_26:                               # %L80
	movabs	rax, offset ijl_throw
	movabs	rdi, 140008495049392
	call	rax
.LBB0_19:                               # %box_union
	movabs	rdx, 140008667209736
	jmp	.LBB0_18
.LBB0_17:
	xor	edx, edx
.LBB0_18:                               # %post_box_union
	movabs	rdi, offset .L_j_str1
	movabs	rax, offset ijl_type_error
	movabs	rsi, 140008511215408
	call	rax
.Lfunc_end0:
	.size	julia_part1_418, .Lfunc_end0-julia_part1_418
                                        # -- End function
	.type	.L_j_str1,@object               # @_j_str1
	.section	.rodata.str1.1,"aMS",@progbits,1
.L_j_str1:
	.asciz	"typeassert"
	.size	.L_j_str1, 11

	.section	".note.GNU-stack","",@progbits
```
</details>

<details>
<summary>`@code_warntype` After</summary>

```julia

[sukera@tower 01]$ julia -q --project=. -L 01.jl
julia> data = read("input.txt");

julia> @code_warntype part1(data)
MethodInstance for part1(::Vector{UInt8})
  from part1(data) @ Main ~/Documents/projects/AOC/2023/01/01.jl:7
Arguments
  #self#::Core.Const(part1)
  data::Vector{UInt8}
Locals
  @_3::Union{Nothing, Tuple{UInt8, Int64}}
  may_b::Union{Nothing, UInt8}
  may_a::Union{Nothing, UInt8}
  total::Int64
  val@_7::Union{}
  val@_8::Union{}
  c::UInt8
  digit_b::UInt8
  digit_a::UInt8
  #JuliaLang#215::Some{Nothing}
  #JuliaLang#216::Union{Nothing, UInt8}
  #JuliaLang#217::Union{Nothing, Some{UInt8}}
  #JuliaLang#212::Some{Nothing}
  #JuliaLang#213::Union{Nothing, Some{UInt8}}
  #JuliaLang#214::Union{Nothing, UInt8}
  digitRes::Union{Nothing, Some{UInt8}}
  @_19::Union{Nothing, UInt8}
  @_20::Union{Nothing, UInt8}
  @_21::Nothing
  @_22::Union{Nothing, UInt8}
  @_23::Union{Nothing, UInt8}
  @_24::Nothing
Body::Int64
1 ──        (total = 0)
│           (may_a = Main.nothing)
│           (may_b = Main.nothing)
│    %4   = data::Vector{UInt8}
│           (@_3 = Base.iterate(%4))
│    %6   = @_3::Union{Nothing, Tuple{UInt8, Int64}}
│    %7   = (%6 === nothing)::Bool
│    %8   = Base.not_int(%7)::Bool
└───        goto mmtk#24 if not %8
2 ┄─        Core.NewvarNode(:(val@_7))
│           Core.NewvarNode(:(val@_8))
│           Core.NewvarNode(:(digit_b))
│           Core.NewvarNode(:(digit_a))
│           Core.NewvarNode(:(#JuliaLang#215))
│           Core.NewvarNode(:(#JuliaLang#216))
│           Core.NewvarNode(:(#JuliaLang#217))
│           Core.NewvarNode(:(#JuliaLang#212))
│           Core.NewvarNode(:(#JuliaLang#213))
│    %19  = @_3::Tuple{UInt8, Int64}
│           (c = Core.getfield(%19, 1))
│    %21  = Core.getfield(%19, 2)::Int64
│    %22  = c::UInt8
│           (digitRes = Main.someDigit(%22))
│    %24  = may_a::Union{Nothing, UInt8}
│           (#JuliaLang#214 = %24)
│    %26  = Base.:!::Core.Const(!)
│    %27  = #JuliaLang#214::Union{Nothing, UInt8}
│    %28  = Base.isnothing(%27)::Bool
│    %29  = (%26)(%28)::Bool
└───        goto mmtk#4 if not %29
3 ── %31  = #JuliaLang#214::UInt8
│           (@_19 = Base.something(%31))
└───        goto mmtk#11
4 ── %34  = digitRes::Union{Nothing, Some{UInt8}}
│           (#JuliaLang#213 = %34)
│    %36  = Base.:!::Core.Const(!)
│    %37  = #JuliaLang#213::Union{Nothing, Some{UInt8}}
│    %38  = Base.isnothing(%37)::Bool
│    %39  = (%36)(%38)::Bool
└───        goto mmtk#6 if not %39
5 ── %41  = #JuliaLang#213::Some{UInt8}
│           (@_20 = Base.something(%41))
└───        goto mmtk#10
6 ── %44  = Main.Some::Core.Const(Some)
│    %45  = Main.nothing::Core.Const(nothing)
│           (#JuliaLang#212 = (%44)(%45))
│    %47  = Base.:!::Core.Const(!)
│    %48  = #JuliaLang#212::Core.Const(Some(nothing))
│    %49  = Base.isnothing(%48)::Core.Const(false)
│    %50  = (%47)(%49)::Core.Const(true)
└───        goto mmtk#8 if not %50
7 ── %52  = #JuliaLang#212::Core.Const(Some(nothing))
│           (@_21 = Base.something(%52))
└───        goto mmtk#9
8 ──        Core.Const(nothing)
│           Core.Const(:(val@_8 = Base.something(Base.nothing)))
│           Core.Const(nothing)
│           Core.Const(:(val@_8))
└───        Core.Const(:(@_21 = %58))
9 ┄─ %60  = @_21::Core.Const(nothing)
└───        (@_20 = %60)
10 ┄ %62  = @_20::Union{Nothing, UInt8}
└───        (@_19 = %62)
11 ┄ %64  = @_19::Union{Nothing, UInt8}
│           (may_a = %64)
│    %66  = digitRes::Union{Nothing, Some{UInt8}}
│           (#JuliaLang#217 = %66)
│    %68  = Base.:!::Core.Const(!)
│    %69  = #JuliaLang#217::Union{Nothing, Some{UInt8}}
│    %70  = Base.isnothing(%69)::Bool
│    %71  = (%68)(%70)::Bool
└───        goto mmtk#13 if not %71
12 ─ %73  = #JuliaLang#217::Some{UInt8}
│           (@_22 = Base.something(%73))
└───        goto mmtk#20
13 ─ %76  = may_b::Union{Nothing, UInt8}
│           (#JuliaLang#216 = %76)
│    %78  = Base.:!::Core.Const(!)
│    %79  = #JuliaLang#216::Union{Nothing, UInt8}
│    %80  = Base.isnothing(%79)::Bool
│    %81  = (%78)(%80)::Bool
└───        goto mmtk#15 if not %81
14 ─ %83  = #JuliaLang#216::UInt8
│           (@_23 = Base.something(%83))
└───        goto mmtk#19
15 ─ %86  = Main.Some::Core.Const(Some)
│    %87  = Main.nothing::Core.Const(nothing)
│           (#JuliaLang#215 = (%86)(%87))
│    %89  = Base.:!::Core.Const(!)
│    %90  = #JuliaLang#215::Core.Const(Some(nothing))
│    %91  = Base.isnothing(%90)::Core.Const(false)
│    %92  = (%89)(%91)::Core.Const(true)
└───        goto mmtk#17 if not %92
16 ─ %94  = #JuliaLang#215::Core.Const(Some(nothing))
│           (@_24 = Base.something(%94))
└───        goto mmtk#18
17 ─        Core.Const(nothing)
│           Core.Const(:(val@_7 = Base.something(Base.nothing)))
│           Core.Const(nothing)
│           Core.Const(:(val@_7))
└───        Core.Const(:(@_24 = %100))
18 ┄ %102 = @_24::Core.Const(nothing)
└───        (@_23 = %102)
19 ┄ %104 = @_23::Union{Nothing, UInt8}
└───        (@_22 = %104)
20 ┄ %106 = @_22::Union{Nothing, UInt8}
│           (may_b = %106)
│    %108 = Main.:(==)::Core.Const(==)
│    %109 = c::UInt8
│    %110 = Main.UInt8('\n')::Core.Const(0x0a)
│    %111 = (%108)(%109, %110)::Bool
└───        goto mmtk#22 if not %111
21 ─ %113 = may_a::Union{Nothing, UInt8}
│           (digit_a = Core.typeassert(%113, Main.UInt8))
│    %115 = may_b::Union{Nothing, UInt8}
│           (digit_b = Core.typeassert(%115, Main.UInt8))
│    %117 = Main.:+::Core.Const(+)
│    %118 = total::Int64
│    %119 = Main.:+::Core.Const(+)
│    %120 = Main.:*::Core.Const(*)
│    %121 = digit_a::UInt8
│    %122 = (%120)(%121, 0x0a)::UInt8
│    %123 = digit_b::UInt8
│    %124 = (%119)(%122, %123)::UInt8
│           (total = (%117)(%118, %124))
│           (may_a = Main.nothing)
└───        (may_b = Main.nothing)
22 ┄        (@_3 = Base.iterate(%4, %21))
│    %129 = @_3::Union{Nothing, Tuple{UInt8, Int64}}
│    %130 = (%129 === nothing)::Bool
│    %131 = Base.not_int(%130)::Bool
└───        goto mmtk#24 if not %131
23 ─        goto mmtk#2
24 ┄ %134 = total::Int64
└───        return %134
```
</details>


<details>
<summary>`@code_native debuginfo=:none` After </summary>

```julia

julia> @code_native debuginfo=:none part1(data)
	.text
	.file	"part1"
	.globl	julia_part1_1203                # -- Begin function julia_part1_1203
	.p2align	4, 0x90
	.type	julia_part1_1203,@function
julia_part1_1203:                       # @julia_part1_1203
; Function Signature: part1(Array{UInt8, 1})
# %bb.0:                                # %top
	#DEBUG_VALUE: part1:data <- [DW_OP_deref] $rdi
	push	rbp
	mov	rbp, rsp
	push	r15
	push	r14
	push	r13
	push	r12
	push	rbx
	sub	rsp, 40
	vxorps	xmm0, xmm0, xmm0
	#APP
	mov	rax, qword ptr fs:[0]
	#NO_APP
	lea	rdx, [rbp - 64]
	vmovaps	xmmword ptr [rbp - 64], xmm0
	mov	qword ptr [rbp - 48], 0
	mov	rcx, qword ptr [rax - 8]
	mov	qword ptr [rbp - 64], 4
	mov	rax, qword ptr [rcx]
	mov	qword ptr [rbp - 72], rcx       # 8-byte Spill
	mov	qword ptr [rbp - 56], rax
	mov	qword ptr [rcx], rdx
	#DEBUG_VALUE: part1:data <- [DW_OP_deref] 0
	mov	r15, qword ptr [rdi + 16]
	test	r15, r15
	je	.LBB0_1
# %bb.2:                                # %L34
	mov	r14, qword ptr [rdi]
	dec	r15
	mov	r11b, 1
	mov	r13b, 1
                                        # implicit-def: $r12b
                                        # implicit-def: $r10b
	xor	eax, eax
	jmp	.LBB0_3
	.p2align	4, 0x90
.LBB0_4:                                #   in Loop: Header=BB0_3 Depth=1
	xor	r11d, r11d
	mov	ebx, edi
	mov	r10d, r8d
.LBB0_9:                                # %L114
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	r12d, esi
	test	r15, r15
	je	.LBB0_12
.LBB0_10:                               # %guard_exit126
                                        #   in Loop: Header=BB0_3 Depth=1
	inc	r14
	dec	r15
	mov	r13d, ebx
.LBB0_3:                                # %L36
                                        # =>This Inner Loop Header: Depth=1
	movzx	edx, byte ptr [r14]
	test	r13b, 1
	movzx	edi, r13b
	mov	ebx, 1
	mov	ecx, 0
	cmove	ebx, edi
	cmovne	edi, ecx
	movzx	ecx, r10b
	lea	esi, [rdx - 48]
	lea	r9d, [rdx - 58]
	movzx	r8d, sil
	cmove	r8d, ecx
	cmp	r9b, -11
	ja	.LBB0_4
# %bb.5:                                # %L89
                                        #   in Loop: Header=BB0_3 Depth=1
	test	r11b, 1
	jne	.LBB0_8
# %bb.6:                                # %L102
                                        #   in Loop: Header=BB0_3 Depth=1
	cmp	dl, 10
	jne	.LBB0_7
# %bb.13:                               # %L106
                                        #   in Loop: Header=BB0_3 Depth=1
	test	r13b, 1
	jne	.LBB0_14
# %bb.11:                               # %L114.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	add	ecx, ecx
	mov	bl, 1
	mov	r11b, 1
	lea	ecx, [rcx + 4*rcx]
	add	cl, r12b
	movzx	ecx, cl
	add	rax, rcx
	test	r15, r15
	jne	.LBB0_10
	jmp	.LBB0_12
	.p2align	4, 0x90
.LBB0_8:                                # %L102.thread
                                        #   in Loop: Header=BB0_3 Depth=1
	mov	r11b, 1
                                        # implicit-def: $sil
	cmp	dl, 10
	jne	.LBB0_9
	jmp	.LBB0_15
.LBB0_7:                                #   in Loop: Header=BB0_3 Depth=1
	mov	esi, r12d
	jmp	.LBB0_9
.LBB0_1:
	xor	eax, eax
.LBB0_12:                               # %L154
	mov	rcx, qword ptr [rbp - 56]
	mov	rdx, qword ptr [rbp - 72]       # 8-byte Reload
	mov	qword ptr [rdx], rcx
	add	rsp, 40
	pop	rbx
	pop	r12
	pop	r13
	pop	r14
	pop	r15
	pop	rbp
	ret
.LBB0_15:                               # %L106.thread
	test	r13b, 1
	jne	.LBB0_14
# %bb.16:                               # %post_box_union47
	movabs	rax, offset jl_nothing
	movabs	rcx, offset jl_small_typeof
	movabs	rdi, offset ".L_j_str_typeassert#1"
	mov	rdx, qword ptr [rax]
	mov	rsi, qword ptr [rcx + 336]
	movabs	rax, offset ijl_type_error
	mov	qword ptr [rbp - 48], rsi
	call	rax
.LBB0_14:                               # %post_box_union
	movabs	rax, offset jl_nothing
	movabs	rcx, offset jl_small_typeof
	movabs	rdi, offset ".L_j_str_typeassert#1"
	mov	rdx, qword ptr [rax]
	mov	rsi, qword ptr [rcx + 336]
	movabs	rax, offset ijl_type_error
	mov	qword ptr [rbp - 48], rsi
	call	rax
.Lfunc_end0:
	.size	julia_part1_1203, .Lfunc_end0-julia_part1_1203
                                        # -- End function
	.type	".L_j_str_typeassert#1",@object # @"_j_str_typeassert#1"
	.section	.rodata.str1.1,"aMS",@progbits,1
".L_j_str_typeassert#1":
	.asciz	"typeassert"
	.size	".L_j_str_typeassert#1", 11

	.section	".note.GNU-stack","",@progbits
```
</details>

Co-authored-by: Sukera <Seelengrab@users.noreply.github.com>
udesou pushed a commit to udesou/julia that referenced this pull request Sep 25, 2024
With this, certain indexing operations involving a `BandIndex` may be
evaluated as constants. This isn't used directly presently, but might
allow for more performant broadcasting in the future.
With this,
```julia
julia> n = 3; T = Tridiagonal(rand(n-1), rand(n), rand(n-1));

julia> @code_warntype ((T,j) -> UpperTriangular(T)[LinearAlgebra.BandIndex(2,j)])(T, 1)
MethodInstance for (::var"mmtk#17#18")(::Tridiagonal{Float64, Vector{Float64}}, ::Int64)
  from (::var"mmtk#17#18")(T, j) @ Main REPL[12]:1
Arguments
  #self#::Core.Const(var"mmtk#17#18"())
  T::Tridiagonal{Float64, Vector{Float64}}
  j::Int64
Body::Float64
1 ─ %1 = Main.UpperTriangular(T)::UpperTriangular{Float64, Tridiagonal{Float64, Vector{Float64}}}
│   %2 = LinearAlgebra.BandIndex::Core.Const(LinearAlgebra.BandIndex)
│   %3 = (%2)(2, j)::Core.PartialStruct(LinearAlgebra.BandIndex, Any[Core.Const(2), Int64])
│   %4 = Base.getindex(%1, %3)::Core.Const(0.0)
└──      return %4
```
The indexing operation may be evaluated at compile-time, as the band
index is constant-propagated.
udesou added a commit that referenced this pull request Oct 14, 2024
* Improve type-stability in SymTridiagonal triu!/tril! (#55646)

Changing the final `elseif` branch to an `else` makes it clear that the
method definite returns a value, and the returned type is now a
`Tridiagonal` instead of a `Union{Nothing, Tridiagonal}`

* Reuse size-check function from `lacpy!` in `copytrito!` (#55664)

Since there is a size-check function in `lacpy!` that does the same
thing, we may reuse it instead of duplicating the check

* Update calling-c-and-fortran-code.md: fix ccall parameters (not a tuple) (#55665)

* Allow exact redefinition for types with recursive supertype reference (#55380)

This PR allows redefining a type when the new type is exactly identical
to the previous one (like #17618, #20592 and #21024), even if the type
has a reference to itself in its supertype. That particular case used to
error (issue #54757), whereas with this PR:
```julia
julia> struct Rec <: AbstractVector{Rec} end

julia> struct Rec <: AbstractVector{Rec} end # this used to error

julia>
```


Fix #54757 by implementing the solution proposed there. Hence, this
should also fix downstream Revise bug
https://github.com/timholy/Revise.jl/issues/813.

---------

Co-authored-by: N5N3 <2642243996@qq.com>

* Reroute Symmetric/Hermitian + Diagonal through triangular (#55605)

This should fix the `Diagonal`-related issue from
https://github.com/JuliaLang/julia/issues/55590, although the
`SymTridiagonal` one still remains.
```julia
julia> using LinearAlgebra

julia> a = Matrix{BigFloat}(undef, 2,2)
2×2 Matrix{BigFloat}:
 #undef  #undef
 #undef  #undef

julia> a[1] = 1; a[3] = 1; a[4] = 1
1

julia> a = Hermitian(a)
2×2 Hermitian{BigFloat, Matrix{BigFloat}}:
 1.0  1.0
 1.0  1.0

julia> b = Symmetric(a)
2×2 Symmetric{BigFloat, Matrix{BigFloat}}:
 1.0  1.0
 1.0  1.0

julia> c = Diagonal([1,1])
2×2 Diagonal{Int64, Vector{Int64}}:
 1  ⋅
 ⋅  1

julia> a+c
2×2 Hermitian{BigFloat, Matrix{BigFloat}}:
 2.0  1.0
 1.0  2.0

julia> b+c
2×2 Symmetric{BigFloat, Matrix{BigFloat}}:
 2.0  1.0
 1.0  2.0
```

* inference: check argtype compatibility in `abstract_call_opaque_closure` (#55672)

* Forward istriu/istril for triangular to parent (#55663)

* win: move stack_overflow_warning to the backtrace fiber (#55640)

There is not enough stack space remaining after a stack overflow on
Windows to allocate the 4k page used by `write` to call the WriteFile
syscall. This causes it to hard-crash. But we can simply run this on the
altstack implementation, where there is plenty of space.

* Check if ct is not null before doing is_addr_on_stack in the macos signal handler. (#55603)

Before the check we used to segfault while segfaulting and hang

---------

Co-authored-by: Jameson Nash <vtjnash@gmail.com>

* Profile.print: color Base/Core & packages. Make paths clickable (#55335)

Updated
## This PR
![Screenshot 2024-09-02 at 1 47
23 PM](https://github.com/user-attachments/assets/1264e623-70b2-462a-a595-1db2985caf64)


## master
![Screenshot 2024-09-02 at 1 49
42 PM](https://github.com/user-attachments/assets/14d62fe1-c317-4df5-86e9-7c555f9ab6f1)



Todo:
- [ ] ~Maybe drop the `@` prefix when coloring it, given it's obviously
special when colored~ If someone copy-pasted the profile into an issue
this would make it confusing.
- [ ] Figure out why `Profile.print(format=:flat)` is truncating before
the terminal width is used up
- [x] Make filepaths terminal links (even if they're truncated)

* better signal handling (#55623)

Instead of relying on creating a fake stack frame, and having no signals
delivered, kernel bugs, accidentally gc_collect, or other issues occur
during the delivery and execution of these calls, use the ability we
added recently to emulate a longjmp into a unw_context to eliminate any
time where there would exist any invalid states.

Secondly, when calling jl_exit_thread0_cb, we used to end up completely
smashing the unwind info (with CFI_NOUNWIND), but this makes core files
from SIGQUIT much less helpful, so we now have a `fake_stack_pop`
function with contains the necessary CFI directives such that a minimal
unwind from the debugger will likely still succeed up into the frames
that were removed. We cannot do this perfectly on AArch64 since that
platform's DWARF spec lacks the ability to do so. On other platforms,
this should be possible to implement exactly (subject to libunwind
implementation quality). This is currently thus only fully implemented for
x86_64 on Darwin Apple.

* fix `exct` for mismatched opaque closure call

* improve `exct` modeling for opaque closure calls

* fix `nothrow` modeling for `invoke` calls

* improve `exct` modeling for `invoke` calls

* show a bit more detail when finished precompiling (#55660)

* subtype: minor clean up for fast path for lhs union and rhs typevar (#55645)

Follow up #55413.
The error pattern mentioned in
https://github.com/JuliaLang/julia/pull/55413#issuecomment-2288384468
care's `∃y`'s ub in env rather than its original ub.
So it seems more robust to check the bounds in env directly.
The equivalent typevar propagation is lifted from `subtype_var` for the
same reason.

* Adding `JL_DATA_TYPE` annotation to `_jl_globalref_t` (#55684)

`_jl_globalref_t` seems to be allocated in the heap, and there is an
object `jl_globalref_type` which indicates that it is in fact, a data
type, thus it should be annotated with `JL_DATA_TYPE`??

* Make GEP when loading the PTLS an inbounds one. (#55682)

Non inbounds GEPs should only be used when doing pointer arithmethic i.e
Ptr or MemoryRef boundscheck.
Found when auditing non inbounds GEPs for
https://github.com/JuliaLang/julia/pull/55681

* codegen: make boundscheck GEP not be inbounds while the load GEP is inbounds (#55681)

Avoids undefined behavior on the boundschecking arithmetic, which is
correct only assuming overflow follows unsigned arithmetic wrap around
rules.

Also add names to the Memory related LLVM instructions to aid debugging

Closes: https://github.com/JuliaLang/julia/pull/55674

* Make `rename` public (#55652)

Fixes #41584. Follow up of #55503

I think `rename` is a very useful low-level file system operation. Many
other programming languages have this function, so it is useful when
porting IO code to Julia.

One use case is to improve the Zarr.jl package to be more compatible
with zarr-python.

https://github.com/zarr-developers/zarr-python/blob/0b5483a7958e2ae5512a14eb424a84b2a75dd727/src/zarr/v2/storage.py#L994
uses the `os.replace` function. It would be nice to be able to directly
use `Base.rename` as a replacement for `os.replace` to ensure
compatibility.

Another use case is writing a safe zip file extractor in pure Julia.
https://github.com/madler/sunzip/blob/34107fa9e2a2e36e7e72725dc4c58c9ad6179898/sunzip.c#L365
uses the `rename` function to do this in C.

Lastly in
https://github.com/medyan-dev/MEDYANSimRunner.jl/blob/67d5b42cc599670486d5d640260a95e951091f7a/src/file-saving.jl#L83
I am using `ccall(:jl_fs_rename` to save files, because I have large
numbers of Julia processes creating and reading these files at the same
time on a distributed file system on a cluster, so I don't want data to
become corrupted if one of the nodes crashes (which happens fairly
regularly). However `jl_fs_rename` is not public, and might break in a
future release.

This PR also adds a note to `mv` comparing it to the `mv` command,
similar to the note on the `cp` function.

* contrib: include private libdir in `ldflags` on macOS (#55687)

The private libdir is used on macOS, so it needs to be included in our
`ldflags`

* Profile.print: Shorten C paths too (#55683)

* [LLVMLibUnwindJLL] Update llvmlibunwind to 14.0.6 (#48140)

* Add `JL_DATA_TYPE` for `jl_line_info_node_t` and `jl_code_info_t` (#55698)

* Canonicalize names of nested functions by keeping a more fine grained counter -- per (module, method name) pair (#53719)

As mentioned in https://github.com/JuliaLang/julia/pull/53716, we've
been noticing that `precompile` statements lists from one version of our
codebase often don't apply cleanly in a slightly different version.

That's because a lot of nested and anonymous function names have a
global numeric suffix which is incremented every time a new name is
generated, and these numeric suffixes are not very stable across
codebase changes.

To solve this, this PR makes the numeric suffixes a bit more fine
grained: every pair of (module, top-level/outermost function name) will
have its own counter, which should make nested function names a bit more
stable across different versions.

This PR applies @JeffBezanson's idea of making the symbol name changes
directly in `current-julia-module-counter`.

Here is an example:

```Julia
julia> function foo(x)
           function bar(y)
               return x + y
           end
       end
foo (generic function with 1 method)

julia> f = foo(42)
(::var"#bar#foo##0"{Int64}) (generic function with 1 method)
```

* Use `uv_available_parallelism` inside `jl_effective_threads` (#55592)

* [LinearAlgebra] Initialise number of BLAS threads with `jl_effective_threads` (#55574)

This is a safer estimate than `Sys.CPU_THREADS` to avoid oversubscribing
the machine when running distributed applications, or when the Julia
process is constrained by external controls (`taskset`, `cgroups`,
etc.).

Fix #55572

* Artifacts: Improve type-stability (#55707)

This improves Artifacts.jl to make `artifact"..."` fully type-stable, so
that it can be used with `--trim`.

This is a requirement for JLL support w/ trimmed executables.

Dependent on https://github.com/JuliaLang/julia/pull/55016

---------

Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>

* Remove redundant conversion in structured matrix broadcasting (#55695)

The additional construction is unnecessary, as we are already
constructing a `Matrix`.
Performance:
```julia
julia> using LinearAlgebra

julia> U = UpperTriangular(rand(1000,1000));

julia> L = LowerTriangular(rand(1000,1000));

julia> @btime $U .+ $L;
  1.956 ms (6 allocations: 15.26 MiB) # nightly
  1.421 ms (3 allocations: 7.63 MiB) # This PR
```

* [Profile] fix threading issue (#55704)

I forgot about the existence of threads, so had hard-coded this to only
support one thread. Clearly that is not sufficient though, so use the
semaphore here as it is intended to be used.

Fixes #55703

---------

Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>

* delete flaky ranges/`TwicePrecision` test (#55712)

Fixes #55710

* Avoid stack overflow in triangular eigvecs (#55497)

This fixes a stack overflow in 
```julia
julia> using LinearAlgebra, StaticArrays

julia> U = UpperTriangular(SMatrix{2,2}(1:4))
2×2 UpperTriangular{Int64, SMatrix{2, 2, Int64, 4}} with indices SOneTo(2)×SOneTo(2):
 1  3
 ⋅  4

julia> eigvecs(U)
Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
ERROR: StackOverflowError:
Stacktrace:
 [1] eigvecs(A::UpperTriangular{Float32, SMatrix{2, 2, Float32, 4}}) (repeats 79984 times)
   @ LinearAlgebra ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/LinearAlgebra/src/triangular.jl:2749
```
After this,
```julia
julia> eigvecs(U)
2×2 Matrix{Float32}:
 1.0  1.0
 0.0  1.0
```

* builtins: add `Core.throw_methoderror` (#55705)

This allows us to simulate/mark calls that are known-to-fail.

Required for https://github.com/JuliaLang/julia/pull/54972/

* Small missing tests for Irrationals (#55657)

Looks like a bunch of methods for `Irrational`s are tested but not
picked up by coverage...

* Implement faster thread local rng for scheduler (#55501)

Implement optimal uniform random number generator using the method
proposed in https://github.com/swiftlang/swift/pull/39143 based on
OpenSSL's implementation of it in
https://github.com/openssl/openssl/blob/1d2cbd9b5a126189d5e9bc78a3bdb9709427d02b/crypto/rand/rand_uniform.c#L13-L99

This PR also fixes some bugs found while developing it. This is a
replacement for https://github.com/JuliaLang/julia/pull/50203 and fixes
the issues found by @IanButterworth with both rngs

C rng
<img width="1011" alt="image"
src="https://github.com/user-attachments/assets/0dd9d5f2-17ef-4a70-b275-1d12692be060">

New scheduler rng
<img width="985" alt="image"
src="https://github.com/user-attachments/assets/4abd0a57-a1d9-46ec-99a5-535f366ecafa">

~On my benchmarks the julia implementation seems to be almost 50% faster
than the current implementation.~
With oscars suggestion of removing the debiasing this is now almost 5x
faster than the original implementation. And almost fully branchless

We might want to backport the two previous commits since they
technically fix bugs.

---------

Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>

* Add precompile signatures to Markdown to reduce latency. (#55715)

Fixes #55706 that is seemingly a 4472x regression, not just 16x (was my
first guess, based on CondaPkg, also fixes or greatly mitigates
https://github.com/JuliaPy/CondaPkg.jl/issues/145), and large part of 3x
regression for PythonCall.

---------

Co-authored-by: Kristoffer Carlsson <kcarlsson89@gmail.com>

* Fix invalidations for FileIO (#55593)

Fixes https://github.com/JuliaIO/FileIO.jl/issues/396

* Fix various issues with PGO+LTO makefile (#55581)

This fixes various issues with the PGO+LTO makefile
- `USECCACHE` doesn't work throwing an error at
https://github.com/JuliaLang/julia/blob/eb5587dac02d1f6edf486a71b95149139cc5d9f7/Make.inc#L734
This is because setting `CC` and `CCX` by passing them as arguments to
`make` prevents `Make.inc` from prepending these variables with `ccache`
as `Make.inc` doesn't use override. To workaround this I instead set
`USECLANG` and add the toolchain to the `PATH`.
- To deal with similar issues for the other make flags, I pass them as
environment variables which can be edited in `Make.inc`.
- I add a way to build in one go by creating the `all` target, now you
can just run `make` and a PGO+LTO build that profiles Julia's build will
be generated.
- I workaround `PROFRAW_FILES` not being reevaluated after `stage1`
builds, this caused the generation of `PROFILE_FILE` to run an outdated
command if `stage1` was built and affected the profraw files. This is
important when building in one go.
- I add a way to run rules like `binary-dist` which are not defined in
this makefile with the correct toolchain which for example prevents
`make binary-dist` from unnecessarily rebuilding `sys.ji`.
- Include `-Wl,--undefined-version` till
https://github.com/JuliaLang/julia/issues/54533 gets fixed.

These changes need to be copied to the PGO+LTO+BOLT makefile and some to
the BOLT makefile in a later pr.

---------

Co-authored-by: Zentrik <Zentrik@users.noreply.github.com>

* Fix `pkgdir` for extensions (#55720)

Fixes https://github.com/JuliaLang/julia/issues/55719

---------

Co-authored-by: Max Horn <241512+fingolfin@users.noreply.github.com>

* Avoid materializing arrays in bidiag matmul (#55450)

Currently, small `Bidiagonal`/`Tridiagonal` matrices are materialized in
matrix multiplications, but this is wasteful and unnecessary. This PR
changes this to use a naive matrix multiplication for small matrices,
and fall back to the banded multiplication for larger ones.
Multiplication by a `Bidiagonal` falls back to a banded matrix
multiplication for all sizes in the current implementation, and iterates
in a cache-friendly manner for the non-`Bidiagonal` matrix.

In certain cases, the matrices were being materialized if the
non-structured matrix was small, even if the structured matrix was
large. This is changed as well in this PR.

Some improvements in performance:
```julia
julia> B = Bidiagonal(rand(3), rand(2), :U); A = rand(size(B)...); C = similar(A);

julia> @btime mul!($C, $A, $B);
  193.152 ns (6 allocations: 352 bytes) # nightly v"1.12.0-DEV.1034"
  18.826 ns (0 allocations: 0 bytes) # This PR

julia> T = Tridiagonal(rand(99), rand(100), rand(99)); A = rand(2, size(T,2)); C = similar(A);

julia> @btime mul!($C, $A, $T);
  9.398 μs (8 allocations: 79.94 KiB) # nightly
  416.407 ns (0 allocations: 0 bytes) # This PR

julia> B = Bidiagonal(rand(300), rand(299), :U); A = rand(20000, size(B,2)); C = similar(A);

julia> @btime mul!($C, $A, $B);
  33.395 ms (0 allocations: 0 bytes) # nightly
  6.695 ms (0 allocations: 0 bytes) # This PR (cache-friendly)
```

Closes https://github.com/JuliaLang/julia/pull/55414

---------

Co-authored-by: Daniel Karrasch <daniel.karrasch@posteo.de>

* Fix `@time_imports` extension recognition (#55718)

* drop typed GEP calls (#55708)

Now that we use LLVM 18, and almost have LLVM 19 support, do cleanup to
remove LLVM 15/16 type pointer support. LLVM now slightly prefers that
we rewrite our complex GEP to use a simple emit_ptrgep call instead,
which is also much simpler for julia to emit also.

* minor fixup for JuliaLang/julia#55705 (#55726)

* [REPL] prevent silent hang if precompile script async blocks fail (#55685)

* Various fixes to byte / bytearray search (#54579)

This was originally intended as a targeted fix to #54578, but I ran into
a bunch of smaller issues with this code that also needed to be solved
and it turned out to be difficult to fix them with small, trivial PRs.

I would also like to refactor this whole file, but I want these
correctness fixes to be merged first, because a larger refactoring has
higher risk of getting stuck without getting reviewed and merged.

## Larger things that needs decisions
* The internal union `Base.ByteArray` has been deleted. Instead, the
unions `DenseInt8` and `DenseUInt8` have been added. These more
comprehensively cover the types that was meant, e.g. `Memory{UInt8}` was
incorrectly not covered by the former. As stated in the TODO, the
concept of a "memory backed dense byte array" is needed throughout
Julia, so this ideally needs to be implemented as a single type and used
throughout Base. The fix here is a decent temporary solution. See #53178
#54581
* The `findall` docstring between two arrays was incorrectly not
attached to the method - now it is. **Note that this change _changes_
the documentation** since it includes a docstring that was previously
missed. Hence, it's an API addition.
* Added a new minimal `testhelpers/OffsetDenseArrays.jl` which provide a
`DenseVector` with offset axes for testing purposes.

## Trivial fixes
* `findfirst(==(Int8(-1)), [0xff])` and similar findlast, findnext and
findprev is no longer buggy, see #54578
* `findfirst([0x0ff], Int8[-1])` is similarly no longer buggy, see
#54578
* `findnext(==('\xa6'), "æ", 1)` and `findprev(==('\xa6'), "æa", 2)` no
longer incorrectly throws an error
* The byte-oriented find* functions now work correctly with offset
arrays
* Fixed incorrect use of `GC.@preserve`, where the pointer was taken
before the preserve block.
* More of the optimised string methods now also apply to
`SubString{String}`


Closes #54578
Co-authored-by: Martin Holters <martin.holters@hsu-hh.de>

* codegen: deduplicate code for calling a specsig (#55728)

I am tired of having 3 gratuitously different versions of this code to
maintain.

* Fix "Various fixes to byte / bytearray search"  (#55734)

Fixes the conflict between #54593 and #54579
`_search` returns `nothing` instead of zero as a sentinal in #54579

* Fix `make binary-dist` when using `USE_BINARYBUILDER_LLVM=0` (#55731)

`make binary-dist` expects lld to be in usr/tools but it ends up in
usr/bin so I copied it into usr/tools. Should fix the scheduled source
tests which currently fail at linking.

I think this is also broken with `USE_BINARYBUILDER_LLVM=0` and
`BUILD_LLD=0`, maybe
https://github.com/JuliaLang/julia/commit/ceaeb7b71bc76afaca2f3b80998164a47e30ce33
is the fix?

---------

Co-authored-by: Zentrik <Zentrik@users.noreply.github.com>

* Precompile the `@time_imports` printing so it doesn't confuse reports (#55729)

Makes functions for the report printing that can be precompiled into the
sysimage.

* codegen: some cleanup of layout computations (#55730)

Change Alloca to take an explicit alignment, rather than relying on LLVM
to guess our intended alignment from the DataLayout.

Eventually we should try to change this code to just get all layout data
from julia queries (jl_field_offset, julia_alignment, etc.) instead of
relying on creating an LLVM element type for memory and inspecting it
(CountTrackedPointers, DataLayout, and so on).

* Add some loading / LazyArtifacts precompiles to the sysimage (#55740)

Fixes https://github.com/JuliaLang/julia/issues/55725

These help LazyArtifacts mainly but seem beneficial for the sysimage.

* Update stable version number in readme to v1.10.5 (#55742)

* Add `invokelatest` barrier to `string(...)` in `@assert` (#55739)

This change protects `@assert` from invalidations to `Base.string(...)`
by adding an `invokelatest` barrier.

A common source of invalidations right now is `print(io,
join(args...))`. The problem is:
1. Inference concludes that `join(::Any...)` returns
`Union{String,AnnotatedString}`
2. The `print` call is union-split to `String` and `AnnotatedString`
3. This code is now invalidated when StyledStrings defines `print(io,
::AnnotatedString)`

The invalidation chain for `@assert` is similar: ` @assert 1 == 1` calls
into `string(::Expr)` which calls into `print(io, join(args::Any...))`.
Unfortunately that leads to the invalidation of almost all `@assert`s
without an explicit error message

Similar to
https://github.com/JuliaLang/julia/pull/55583#issuecomment-2308969806

* Don't show string concatenation error hint with zero arg `+` (#55749)

Closes #55745

* Don't leave trailing whitespace when printing do-block expr (#55738)

Before, when printing a `do`-block, we'd print a white-space after `do`
even if no arguments follow. Now we don't print that space.

---------

Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>

* Don't pass lSystem to the linker since macos always links it (#55722)

This stops it complaing about duplicated libs. 

For libunwind there isn't much we can do because it's part of lsystem
and we also need out own.

* define `numerator` and `denominator` for `Complex` (#55694)

Fixes #55693

* More testsets for SubString and a few missing tests (#55656)

Co-authored-by: Simeon David Schaub <simeon@schaub.rocks>

* Reorganize search tests into testsets (#55658)

Some of these tests are nearly 10 years old! Organized some of them into
testsets just in case one breaks in the future, should make it easier to
find the problem.

---------

Co-authored-by: Simeon David Schaub <simeon@schaub.rocks>

* fix #45494, error in ssa conversion with complex type decl (#55744)

We were missing a call to `renumber-assigned-ssavalues` in the case
where the declared type is used to assert the type of a value taken from
a closure box.

* Revert "Avoid materializing arrays in bidiag matmul" (#55737)

Reverts JuliaLang/julia#55450. @jishnub suggested reverting this PR to
fix #55727.

* Add a docs section about loading/precomp/ttfx time tuning (#55569)

* Add compat entry for `Base.donotdelete` (#55773)

* REPL: precompile in its own module because Main is closed. Add check for unexpected errors. (#55759)

* Try to put back previously flakey addmul tests (#55775)

Partial revert of #50071, inspired by conversation in
https://github.com/JuliaLang/julia/issues/49966#issuecomment-2350935477

Ran the tests 100 times to make sure we're not putting back
something that's still flaky.

Closes #49966

* Print results of `runtests` with `printstyled` (#55780)

This ensures escape characters are used only if `stdout` can accept
them.

* move null check in `unsafe_convert` of RefValue (#55766)

LLVM can optimize out this check but our optimizer can't, so this leads
to smaller IR in most cases.

* Fix hang in tmerge_types_slow (#55757)

Fixes https://github.com/JuliaLang/julia/issues/55751

Co-authored-by: Jameson Nash <jameson@juliacomputing.com>

* trace-compile: color recompilation yellow (#55763)

Marks recompilation of a method that produced a `precompile` statement
as yellow, or if color isn't supported adds a trailing comment: `#
recompilation`.

The coloring matches the `@time_imports` coloring. i.e. an excerpt of
```
% ./julia --start=no --trace-compile=stderr --trace-compile-timing -e "using InteractiveUtils; @time @time_imports using Plots"
```
![Screenshot 2024-09-13 at 5 04
24 PM](https://github.com/user-attachments/assets/85bd99e0-586e-4070-994f-2d845be0d9e7)

* Use PrecompileTools mechanics to compile REPL (#55782)

Fixes https://github.com/JuliaLang/julia/issues/55778

Based on discussion here
https://github.com/JuliaLang/julia/issues/55778#issuecomment-2352428043

With this `?reinterpret` feels instant, with only these precompiles at
the start.
![Screenshot 2024-09-16 at 9 49
39 AM](https://github.com/user-attachments/assets/20dc016d-c6f7-4870-acd7-0e795dcf541b)

* use `inferencebarrier` instead of `invokelatest` for 1-arg `@assert` (#55783)

This version would be better as per this comment:
<https://github.com/JuliaLang/julia/pull/55739#pullrequestreview-2304360447>
I confirmed this still allows us to avoid invalidations reported at
JuliaLang/julia#55583.

* Inline statically known method errors. (#54972)

This replaces the `Expr(:call, ...)` with a call of a new builtin
`Core.throw_methoderror`

This is useful because it makes very clear if something is a static
method error or a plain dynamic dispatch that always errors.
Tools such as AllocCheck or juliac can notice that this is not a genuine
dynamic dispatch, and prevent it from becoming a false positive
compile-time error.

Dependent on https://github.com/JuliaLang/julia/pull/55705

---------

Co-authored-by: Cody Tapscott <topolarity@tapscott.me>

* Fix shell `cd` error when working dir has been deleted (#41244)

root cause:
if current dir has been deleted, then pwd() will throw an IOError:
pwd(): no such file or directory (ENOENT)

---------

Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>

* codegen: fix bits compare for UnionAll (#55770)

Fixes #55768 in two parts: one is making the type computation in
emit_bits_compare agree with the parent function and two is not using
the optimized egal code for UnionAll kinds, which is different from how
the egal code itself works for kinds.

* use libuv to measure maxrss (#55806)

Libuv has a wrapper around rusage on Unix (and its equivalent on
Windows).

We should probably use it.

* REPL: use atreplinit to change the active module during precompilation (#55805)

* 🤖 [master] Bump the Pkg stdlib from 299a35610 to 308f9d32f (#55808)

* Improve codegen for `Core.throw_methoderror` and `Core.current_scope` (#55803)

This slightly improves our (LLVM) codegen for `Core.throw_methoderror`
and `Core.current_scope`

```julia
julia> foo() = Core.current_scope()
julia> bar() = Core.throw_methoderror(+, nothing)
```

Before:
```llvm
; Function Signature: foo()
define nonnull ptr @julia_foo_2488() #0 {
top:
  %0 = call ptr @jl_get_builtin_fptr(ptr nonnull @"+Core.#current_scope#2491.jit")
  %Builtin_ret = call nonnull ptr %0(ptr nonnull @"jl_global#2492.jit", ptr null, i32 0)
  ret ptr %Builtin_ret
}
; Function Signature: bar()
define void @julia_bar_589() #0 {
top:
  %jlcallframe1 = alloca [2 x ptr], align 8
  %0 = call ptr @jl_get_builtin_fptr(ptr nonnull @"+Core.#throw_methoderror#591.jit")
  %jl_nothing = load ptr, ptr @jl_nothing, align 8
  store ptr @"jl_global#593.jit", ptr %jlcallframe1, align 8
  %1 = getelementptr inbounds ptr, ptr %jlcallframe1, i64 1
  store ptr %jl_nothing, ptr %1, align 8
  %Builtin_ret = call nonnull ptr %0(ptr nonnull @"jl_global#592.jit", ptr nonnull %jlcallframe1, i32 2)
  call void @llvm.trap()
  unreachable
}
```

After:
```llvm
; Function Signature: foo()
define nonnull ptr @julia_foo_713() #0 {
top:
  %thread_ptr = call ptr asm "movq %fs:0, $0", "=r"() #5
  %tls_ppgcstack = getelementptr inbounds i8, ptr %thread_ptr, i64 -8
  %tls_pgcstack = load ptr, ptr %tls_ppgcstack, align 8
  %current_scope = getelementptr inbounds i8, ptr %tls_pgcstack, i64 -72
  %0 = load ptr, ptr %current_scope, align 8
  ret ptr %0
}
; Function Signature: bar()
define void @julia_bar_1581() #0 {
top:
  %jlcallframe1 = alloca [2 x ptr], align 8
  %jl_nothing = load ptr, ptr @jl_nothing, align 8
  store ptr @"jl_global#1583.jit", ptr %jlcallframe1, align 8
  %0 = getelementptr inbounds ptr, ptr %jlcallframe1, i64 1
  store ptr %jl_nothing, ptr %0, align 8
  %jl_f_throw_methoderror_ret = call nonnull ptr @jl_f_throw_methoderror(ptr null, ptr nonnull %jlcallframe1, i32 2)
  call void @llvm.trap()
  unreachable
}
```

* a minor improvement for EA-based `:effect_free`-ness refinement (#55796)

* fix #52986, regression in `@doc` of macro without REPL loaded (#55795)

fix #52986

* Assume that docstring code with no lang is julia (#55465)

* Broadcast binary ops involving strided triangular (#55798)

Currently, we evaluate expressions like `(A::UpperTriangular) +
(B::UpperTriangular)` using broadcasting if both `A` and `B` have
strided parents, and forward the summation to the parents otherwise.
This PR changes this to use broadcasting if either of the two has a
strided parent. This avoids accessing the parent corresponding to the
structural zero elements, as the index might not be initialized.

Fixes https://github.com/JuliaLang/julia/issues/55590

This isn't a general fix, as we still sum the parents if neither is
strided. However, it will address common cases.

This also improves performance, as we only need to loop over one half:
```julia
julia> using LinearAlgebra

julia> U = UpperTriangular(zeros(100,100));

julia> B = Bidiagonal(zeros(100), zeros(99), :U);

julia> @btime $U + $B;
  35.530 μs (4 allocations: 78.22 KiB) # nightly
  13.441 μs (4 allocations: 78.22 KiB) # This PR
```

* Reland " Avoid materializing arrays in bidiag matmul #55450" (#55777)

This relands #55450 and adds tests for the failing case noted in
https://github.com/JuliaLang/julia/issues/55727. The `addmul` tests that
were failing earlier pass with this change.

The issue in the earlier PR was that we were not exiting quickly for
`iszero(alpha)` in `_bibimul!` for small matrices, and were computing
the result as `C .= A * B * alpha + C * beta`. The problem with this is
that if `A * B` contains `NaN`s, this propagates to `C` even if `alpha
=== 0.0`. This is fixed now, and the result is only computed if
`!iszero(alpha)`.

* move the test case added in #50174 to test/core.jl (#55811)

Also renames the name of the test function to avoid name collision.

* [Random] Avoid conversion to `Float32` in `Float16` sampler (#55819)

* simplify the fields of `UnionSplitInfo` (#55815)

xref:
<https://github.com/JuliaLang/julia/pull/54972#discussion_r1766187771>

* Add errorhint for nonexisting fields and properties (#55165)

I played a bit with error hints and crafted this:
```julia
julia> (1+2im).real
ERROR: FieldError: type Complex has no field real, available fields: `re`, `im`

julia> nothing.xy
ERROR: FieldError: type Nothing has no field xy; Nothing has no fields at all.

julia> svd(rand(2,2)).VV
ERROR: FieldError: type SVD has no field VV, available fields: `U`, `S`, `Vt`
Available properties: `V`
```

---------

Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>

* Improve printing of several arguments (#55754)

Following a discussion on
[Discourse](https://discourse.julialang.org/t/string-optimisation-in-julia/119301/10?u=gdalle),
this PR tries to improve `print` (and variants) for more than one
argument.
The idea is that `for` is type-unstable over the tuple `args`, while
`foreach` unrolls.

---------

Co-authored-by: Steven G. Johnson <stevenj@mit.edu>

* Markdown: support `parse(::AbstractString)` (#55747)

`Markdown.parse` is documented to accept `AbstractString` but it was
implemented by calling `IOBuffer` on the string argument. `IOBuffer`,
however, is documented only for `String` arguments.

This commit changes the current `parse(::AbstractString)` to
`parse(::String)` and implements `parse(::AbstractString)` by converting
the argument to `String`.

Now, even `LazyString`s can be parsed to Markdown representation.

Fixes #55732

* better error for esc outside of macro expansion (#55797)

fixes #55788

---------

Co-authored-by: Jeff Bezanson <jeff.bezanson@gmail.com>

* allow kronecker product between recursive triangular matrices (#55527)

Using the recently introduced recursive `zero` I can remove the
specialization to `<:Number` as @dkarrasch wanted to do in #54413.

---------

Co-authored-by: Jishnu Bhattacharya <jishnub.github@gmail.com>

* [Dates] Make test more robust against non-UTC timezones (#55829)

`%M` is the format specifier for the minutes, not the month (which
should be `%m`), and it was used twice.

Also, on macOS `Libc.strptime` internally calls `mktime` which depends
on the local timezone. We now temporarily set `TZ=UTC` to avoid
depending on the local timezone.

Fix #55827.

* 🤖 [master] Bump the Pkg stdlib from 308f9d32f to ef9f76c17 (#55838)

* lmul!/rmul! for banded matrices (#55823)

This adds fast methods for `lmul!` and `rmul!` between banded matrices
and numbers.
Performance impact:
```julia
julia> T = Tridiagonal(rand(999), rand(1000), rand(999));

julia> @btime rmul!($T, 0.2);
  4.686 ms (0 allocations: 0 bytes) # nightly v"1.12.0-DEV.1225"
  669.355 ns (0 allocations: 0 bytes) # this PR
```

* Specialize indexing triangular matrices with BandIndex (#55644)

With this, certain indexing operations involving a `BandIndex` may be
evaluated as constants. This isn't used directly presently, but might
allow for more performant broadcasting in the future.
With this,
```julia
julia> n = 3; T = Tridiagonal(rand(n-1), rand(n), rand(n-1));

julia> @code_warntype ((T,j) -> UpperTriangular(T)[LinearAlgebra.BandIndex(2,j)])(T, 1)
MethodInstance for (::var"#17#18")(::Tridiagonal{Float64, Vector{Float64}}, ::Int64)
  from (::var"#17#18")(T, j) @ Main REPL[12]:1
Arguments
  #self#::Core.Const(var"#17#18"())
  T::Tridiagonal{Float64, Vector{Float64}}
  j::Int64
Body::Float64
1 ─ %1 = Main.UpperTriangular(T)::UpperTriangular{Float64, Tridiagonal{Float64, Vector{Float64}}}
│   %2 = LinearAlgebra.BandIndex::Core.Const(LinearAlgebra.BandIndex)
│   %3 = (%2)(2, j)::Core.PartialStruct(LinearAlgebra.BandIndex, Any[Core.Const(2), Int64])
│   %4 = Base.getindex(%1, %3)::Core.Const(0.0)
└──      return %4
```
The indexing operation may be evaluated at compile-time, as the band
index is constant-propagated.

* Replace regex package module checks with actual code checks (#55824)

Fixes https://github.com/JuliaLang/julia/issues/55792
Replaces https://github.com/JuliaLang/julia/pull/55822
Improves what https://github.com/JuliaLang/julia/pull/51635 was trying
to do

i.e.
```
ERROR: LoadError: `using/import Printf` outside of a Module detected. Importing a package outside of a module is not allowed during package precompilation.
```

* fall back to slower stat filesize if optimized filesize fails (#55641)

* Use "index" instead of "subscript" to refer to indexing in performance tips (#55846)

* privatize annotated string API, take two (#55845)

https://github.com/JuliaLang/julia/pull/55453 is stuck on StyledStrings
and Base documentation being entangled and there isn't a good way to
have the documentation of Base types / methods live in an stdlib. This
is a stop gap solution to finally be able to move forwards with 1.11.

* 🤖 [master] Bump the Downloads stdlib from 1061ecc to 89d3c7d (#55854)

Stdlib: Downloads
URL: https://github.com/JuliaLang/Downloads.jl.git
Stdlib branch: master
Julia branch: master
Old commit: 1061ecc
New commit: 89d3c7d
Julia version: 1.12.0-DEV
Downloads version: 1.6.0(It's okay that it doesn't match)
Bump invoked by: @KristofferC
Powered by:
[BumpStdlibs.jl](https://github.com/JuliaLang/BumpStdlibs.jl)

Diff:
https://github.com/JuliaLang/Downloads.jl/compare/1061ecc377a053fce0df94e1a19e5260f7c030f5...89d3c7dded535a77551e763a437a6d31e4d9bf84

```
$ git log --oneline 1061ecc..89d3c7d
89d3c7d fix cancelling upload requests (#259)
df33406 gracefully cancel a request (#256)
```

Co-authored-by: Dilum Aluthge <dilum@aluthge.com>

* docs: Small edits to noteworthy differences (#55852)

- The first line edit changes it so that the Julia example goes first,
not the Python example, keeping with the general flow of the lines
above.
- The second adds a "the" that is missing.

* Add filesystem func to transform a path to a URI (#55454)

In a few places across Base and the stdlib, we emit paths that we like
people to be able to click on in their terminal and editor. Up to this
point, we have relied on auto-filepath detection, but this does not
allow for alternative link text, such as contracted paths.

Doing so (via OSC 8 terminal links for example) requires filepath URI
encoding.

This functionality was previously part of a PR modifying stacktrace
printing (#51816), but after that became held up for unrelated reasons
and another PR appeared that would benefit from this utility (#55335),
I've split out this functionality so it can be used before the
stacktrace printing PR is resolved.

* constrain the path argument of `include` functions to `AbstractString` (#55466)

Each `Module` defined with `module` automatically gets an `include`
function with two methods. Each of those two methods takes a file path
as its last argument. Even though the path argument is unconstrained by
dispatch, it's documented as constrained with `::AbstractString`:

https://docs.julialang.org/en/v1.11-dev/base/base/#include

Furthermore, I think that any invocation of `include` with a
non-`AbstractString` path will necessarily throw a `MethodError`
eventually. Thus this change should be harmless.

Adding the type constraint to the path argument is an improvement
because any possible exception would be thrown earlier than before.

Apart from modules defined with `module`, the same issue is present with
the anonymous modules created by `evalfile`, which is also addressed.

Sidenote: `evalfile` seems to be completely untested apart from the test
added here.

Co-authored-by: Florian <florian.atteneder@gmail.com>

* Mmap: fix grow! for non file IOs (#55849)

Fixes https://github.com/JuliaLang/julia/issues/54203
Requires #55641

Based on
https://github.com/JuliaLang/julia/pull/55641#issuecomment-2334162489
cc. @JakeZw @ronisbr

---------

Co-authored-by: Jameson Nash <vtjnash@gmail.com>

* codegen: split gc roots from other bits on stack (#55767)

In order to help avoid memory provenance issues, and better utilize
stack space (somewhat), and use FCA less, change the preferred
representation of an immutable object to be a pair of
`<packed-data,roots>` values. This packing requires some care at the
boundaries and if the expected field alignment exceeds that of a
pointer. The change is expected to eventually make codegen more flexible
at representing unions of values with both bits and pointer regions.

Eventually we can also have someone improve the late-gc-lowering pass to
take advantage of this increased information accuracy, but currently it
will not be any better than before at laying out the frame.

* Refactoring to be considered before adding MMTk

* Removing jl_gc_notify_image_load, since it's a new function and not part of the refactoring

* Moving gc_enable code to gc-common.c

* Addressing PR comments

* Push resolution of merge conflict

* Removing jl_gc_mark_queue_obj_explicit extern definition from scheduler.c

* Don't need the getter function since it's possible to use jl_small_typeof directly

* WIP: Adding support for MMTk/Immix

* Refactoring to be considered before adding MMTk

* Adding fastpath allocation

* Fixing removed newlines

* Refactoring to be considered before adding MMTk

* Adding a few comments; Moving some functions to be closer together

* Fixing merge conflicts

* Applying changes from refactoring before adding MMTk

* Update TaskLocalRNG docstring according to #49110 (#55863)

Since #49110, which is included in 1.10 and 1.11, spawning a task no
longer advances the parent task's RNG state, so this statement in the
docs was incorrect.

* Root globals in toplevel exprs (#54433)

This fixes #54422, the code here assumes that top level exprs are always
rooted, but I don't see that referenced anywhere else, or guaranteed, so
conservatively always root objects that show up in code.

* codegen: fix alignment typos (#55880)

So easy to type jl_datatype_align to get the natural alignment instead
of julia_alignment to get the actual alignment. This should fix the
Revise workload.

Change is visible with
```
julia> code_llvm(Random.XoshiroSimd.forkRand, (Random.TaskLocalRNG, Base.Val{8}))
```

* Fix some corner cases of `isapprox` with unsigned integers (#55828)

* 🤖 [master] Bump the Pkg stdlib from ef9f76c17 to 51d4910c1 (#55896)

* Profile: fix order of fields in heapsnapshot & improve formatting (#55890)

* Profile: Improve generation of clickable terminal links (#55857)

* inference: add missing `TypeVar` handling for `instanceof_tfunc` (#55884)

I thought these sort of problems had been addressed by d60f92c, but it
seems some were missed. Specifically, `t.a` and `t.b` from `t::Union`
could be `TypeVar`, and if they are passed to a subroutine or recursed
without being unwrapped or rewrapped, errors like JuliaLang/julia#55882
could occur.

This commit resolves the issue by calling `unwraptv` in the `Union`
handling within `instanceof_tfunc`. I also found a similar issue inside
`nfields_tfunc`, so that has also been fixed, and test cases have been
added. While I haven't been able to make up a test case specifically for
the fix in `instanceof_tfunc`, I have confirmed that this commit
certainly fixes the issue reported in JuliaLang/julia#55882.

- fixes JuliaLang/julia#55882

* Install terminfo data under /usr/share/julia (#55881)

Just like all other libraries, we don't want internal Julia files to
mess with system files.

Introduced by https://github.com/JuliaLang/julia/pull/55411.

* expose metric to report reasons why full GCs were triggered (#55826)

Additional GC observability tool.

This will help us to diagnose why some of our servers are triggering so
many full GCs in certain circumstances.

* Revert "Improve printing of several arguments" (#55894)

Reverts JuliaLang/julia#55754 as it overrode some performance heuristics
which appeared to be giving a significant gain/loss in performance:
Closes https://github.com/JuliaLang/julia/issues/55893

* Do not trigger deprecation warnings in `Test.detect_ambiguities` and `Test.detect_unbound_args` (#55869)

#55868

* do not intentionally suppress errors in precompile script from being reported or failing the result (#55909)

I was slightly annoying that the build was set up to succeed if this
step failed, so I removed the error suppression and fixed up the script
slightly

* Remove eigvecs method for SymTridiagonal (#55903)

The fallback method does the same, so this specialized method isn't
necessary

* add --trim option for generating smaller binaries (#55047)

This adds a command line option `--trim` that builds images where code
is only included if it is statically reachable from methods marked using
the new function `entrypoint`. Compile-time errors are given for call
sites that are too dynamic to allow trimming the call graph (however
there is an `unsafe` option if you want to try building anyway to see
what happens).

The PR has two other components. One is changes to Base that generally
allow more code to be compiled in this mode. These changes will either
be merged in separate PRs or moved to a separate part of the workflow
(where we will build a custom system image for this purpose). The branch
is set up this way to make it easy to check out and try the
functionality.

The other component is everything in the `juliac/` directory, which
implements a compiler driver script based on this new option, along with
some examples and tests. This will eventually become a package "app"
that depends on PackageCompiler and provides a CLI for all of this
stuff, so it will not be merged here. To try an example:

```
julia contrib/juliac.jl --output-exe hello --trim test/trimming/hello.jl
```

When stripped the resulting executable is currently about 900kb on my
machine.

Also includes a lot of work by @topolarity

---------

Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Cody Tapscott <topolarity@tapscott.me>

* fix rawbigints OOB issues (#55917)

Fixes issues introduced in #50691 and found in #55906:
* use `@inbounds` and `@boundscheck` macros in rawbigints, for catching
OOB with `--check-bounds=yes`
* fix OOB in `truncate`

* prevent loading other extensions when precompiling an extension (#55589)

The current way of loading extensions when precompiling an extension
very easily leads to cycles. For example, if you have more than one
extension and you happen to transitively depend on the triggers of one
of your extensions you will immediately hit a cycle where the extensions
will try to load each other indefinitely. This is an issue because you
cannot directly influence your transitive dependency graph so from this
p.o.v the current system of loading extension is "unsound".

The test added here checks this scenario and we can now precompile and
load it without any warnings or issues.

Would have made https://github.com/JuliaLang/julia/issues/55517 a non
issue.

Fixes https://github.com/JuliaLang/julia/issues/55557

---------

Co-authored-by: KristofferC <kristoffer.carlsson@juliacomputing.com>

* TOML: Avoid type-pirating `Base.TOML.Parser` (#55892)

Since stdlibs can be duplicated but Base never is, `Base.require_stdlib`
makes type piracy even more complicated than it normally would be.

To adapt, this changes `TOML.Parser` to be a type defined by the TOML
stdlib, so that we can define methods on it without committing
type-piracy and avoid problems like Pkg.jl#4017

Resolves
https://github.com/JuliaLang/Pkg.jl/issues/4017#issuecomment-2377589989

* [FileWatching] fix PollingFileWatcher design and add workaround for a stat bug

What started as an innocent fix for a stat bug on Apple (#48667) turned
into a full blown investigation into the design problems with the libuv
backend for PollingFileWatcher, and writing my own implementation of it
instead which could avoid those singled-threaded concurrency bugs.

* [FileWatching] fix FileMonitor similarly and improve pidfile reliability

Previously pidfile used the same poll_interval as sleep to detect if
this code made any concurrency mistakes, but we do not really need to do
that once FileMonitor is fixed to be reliable in the presence of
parallel concurrency (instead of using watch_file).

* [FileWatching] reorganize file and add docs

* Add `--trace-dispatch` (#55848)

* relocation: account for trailing path separator in depot paths (#55355)

Fixes #55340

* change compiler to be stackless (#55575)

This change ensures the compiler uses very little stack, making it
compatible with running on any arbitrary system stack size and depths
much more reliably. It also could be further modified now to easily add
various forms of pause-able/resumable inference, since there is no
implicit state on the stack--everything is local and explicit now.

Whereas before, less than 900 frames would crash in less than a second:
```
$ time ./julia -e 'f(::Val{N}) where {N} = N <= 0 ? 0 : f(Val(N - 1)); f(Val(1000))'
Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
Internal error: during type inference of
f(Base.Val{1000})
Encountered stack overflow.
This might be caused by recursion over very long tuples or argument lists.

[23763] signal 6: Abort trap: 6
in expression starting at none:1
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 1 (Pool: 1; Big: 0); GC: 0
Abort trap: 6

real	0m0.233s
user	0m0.165s
sys	0m0.049s
````

Now: it is effectively unlimited, as long as you are willing to wait for
it:
```
$ time ./julia -e 'f(::Val{N}) where {N} = N <= 0 ? 0 : f(Val(N - 1)); f(Val(50000))'
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 2500 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 5000 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 10000 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 20000 frames (may be slow).
info: inference of f(Base.Val{50000}) from f(Base.Val{N}) where {N} exceeding 40000 frames (may be slow).
real	7m4.988s

$ time ./julia -e 'f(::Val{N}) where {N} = N <= 0 ? 0 : f(Val(N - 1)); f(Val(1000))'
real	0m0.214s
user	0m0.164s
sys	0m0.044s

$ time ./julia -e '@noinline f(::Val{N}) where {N} = N <= 0 ? GC.safepoint() : f(Val(N - 1)); f(Val(5000))'
info: inference of f(Base.Val{5000}) from f(Base.Val{N}) where {N} exceeding 2500 frames (may be slow).
info: inference of f(Base.Val{5000}) from f(Base.Val{N}) where {N} exceeding 5000 frames (may be slow).
real	0m8.609s
user	0m8.358s
sys	0m0.240s
```

* optimizer: simplify the finalizer inlining pass a bit (#55934)

Minor adjustments have been made to the algorithm of the finalizer
inlining pass. Previously, it required that the finalizer registration
dominate all uses, but this is not always necessary as far as the
finalizer inlining point dominates all the uses. So the check has been
relaxed. Other minor fixes have been made as well, but their importance
is low.

* Limit `@inbounds` to indexing in the dual-iterator branch in `copyto_unaliased!` (#55919)

This simplifies the `copyto_unalised!` implementation where the source
and destination have different `IndexStyle`s, and limits the `@inbounds`
to only the indexing operation. In particular, the iteration over
`eachindex(dest)` is not marked as `@inbounds` anymore. This seems to
help with performance when the destination uses Cartesian indexing.
Reduced implementation of the branch:
```julia
function copyto_proposed!(dest, src)
    axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes"))
    iterdest, itersrc = eachindex(dest), eachindex(src)
    for (destind, srcind) in zip(iterdest, itersrc)
        @inbounds dest[destind] = src[srcind]
    end
    dest
end

function copyto_current!(dest, src)
    axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes"))
    iterdest, itersrc = eachindex(dest), eachindex(src)
    ret = iterate(iterdest)
    @inbounds for a in src
        idx, state = ret::NTuple{2,Any}
        dest[idx] = a
        ret = iterate(iterdest, state)
    end
    dest
end

function copyto_current_limitinbounds!(dest, src)
    axes(dest) == axes(src) || throw(ArgumentError("incompatible sizes"))
    iterdest, itersrc = eachindex(dest), eachindex(src)
    ret = iterate(iterdest)
    for isrc in itersrc
        idx, state = ret::NTuple{2,Any}
        @inbounds dest[idx] = src[isrc]
        ret = iterate(iterdest, state)
    end
    dest
end
```
```julia
julia> a = zeros(40000,4000); b = rand(size(a)...);

julia> av = view(a, UnitRange.(axes(a))...);

julia> @btime copyto_current!($av, $b);
  617.704 ms (0 allocations: 0 bytes)

julia> @btime copyto_current_limitinbounds!($av, $b);
  304.146 ms (0 allocations: 0 bytes)

julia> @btime copyto_proposed!($av, $b);
  240.217 ms (0 allocations: 0 bytes)

julia> versioninfo()
Julia Version 1.12.0-DEV.1260
Commit 4a4ca9c8152 (2024-09-28 01:49 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × Intel(R) Core(TM) i5-10310U CPU @ 1.70GHz
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
  JULIA_EDITOR = subl
```
I'm not quite certain why the proposed implementation here
(`copyto_proposed!`) is even faster than
`copyto_current_limitinbounds!`. In any case, `copyto_proposed!` is
easier to read, so I'm not complaining.

This fixes https://github.com/JuliaLang/julia/issues/53158

* Strong zero in Diagonal triple multiplication (#55927)

Currently, triple multiplication with a `LinearAlgebra.BandedMatrix`
sandwiched between two `Diagonal`s isn't associative, as this is
implemented using broadcasting, which doesn't assume a strong zero,
whereas the two-term matrix multiplication does.
```julia
julia> D = Diagonal(StepRangeLen(NaN, 0, 3));

julia> B = Bidiagonal(1:3, 1:2, :U);

julia> D * B * D
3×3 Matrix{Float64}:
 NaN  NaN  NaN
 NaN  NaN  NaN
 NaN  NaN  NaN

julia> (D * B) * D
3×3 Bidiagonal{Float64, Vector{Float64}}:
 NaN    NaN       ⋅ 
    ⋅   NaN    NaN
    ⋅      ⋅   NaN

julia> D * (B * D)
3×3 Bidiagonal{Float64, Vector{Float64}}:
 NaN    NaN       ⋅ 
    ⋅   NaN    NaN
    ⋅      ⋅   NaN
```
This PR ensures that the 3-term multiplication is evaluated as a
sequence of two-term multiplications, which fixes this issue. This also
improves performance, as only the bands need to be evaluated now.
```julia
julia> D = Diagonal(1:1000); B = Bidiagonal(1:1000, 1:999, :U);

julia> @btime $D * $B * $D;
  656.364 μs (11 allocations: 7.63 MiB) # v"1.12.0-DEV.1262"
  2.483 μs (12 allocations: 31.50 KiB) # This PR
```

* Fix dispatch on `alg` in Float16 Hermitian eigen (#55928)

Currently,
```julia
julia> using LinearAlgebra

julia> A = Hermitian(reshape(Float16[1:16;], 4, 4));

julia> eigen(A).values |> typeof
Vector{Float16} (alias for Array{Float16, 1})

julia> eigen(A, LinearAlgebra.QRIteration()).values |> typeof
Vector{Float32} (alias for Array{Float32, 1})
```
This PR moves the specialization on the `eltype` to an internal method,
so that firstly all `alg`s dispatch to that method, and secondly, there
are no ambiguities introduce by specializing the top-level `eigen`. The
latter currently causes test failures in `StaticArrays`
(https://github.com/JuliaArrays/StaticArrays.jl/actions/runs/11092206012/job/30816955210?pr=1279),
and should be fixed by this PR.

* Remove specialized `ishermitian` method for `Diagonal{<:Real}` (#55948)

The fallback method for `Diagonal{<:Number}` handles this already by
checking that the `diag` is real, so we don't need this additional
specialization.

* Fix logic in `?` docstring example (#55945)

* fix `unwrap_macrocalls` (#55950)

The implementation of `unwrap_macrocalls` has assumed that what
`:macrocall` wraps is always an `Expr` object, but that is not
necessarily correct:
```julia
julia> Base.@assume_effects :nothrow @show 42
ERROR: LoadError: TypeError: in typeassert, expected Expr, got a value of type Int64
Stacktrace:
 [1] unwrap_macrocalls(ex::Expr)
   @ Base ./expr.jl:906
 [2] var"@assume_effects"(__source__::LineNumberNode, __module__::Module, args::Vararg{Any})
   @ Base ./expr.jl:756
in expression starting at REPL[1]:1
```
This commit addresses this issue.

* make faster BigFloats (#55906)

We can coalesce the two required allocations for the MFPR BigFloat API
design into one allocation, hopefully giving a easy performance boost.
It would have been slightly easier and more efficient if MPFR BigFloat
was already a VLA instead of containing a pointer here, but that does
not prevent the optimization.

* Add propagate_inbounds_meta to atomic genericmemory ops (#55902)

`memoryref(mem, i)` will otherwise emit a boundscheck.

```
; │ @ /home/vchuravy/WorkstealingQueues/src/CLL.jl:53 within `setindex_atomic!` @ genericmemory.jl:329
; │┌ @ boot.jl:545 within `memoryref`
    %ptls_field = getelementptr inbounds i8, ptr %tls_pgcstack, i64 16
    %ptls_load = load ptr, ptr %ptls_field, align 8
    %"box::GenericMemoryRef" = call noalias nonnull align 8 dereferenceable(32) ptr @ijl_gc_small_alloc(ptr %ptls_load, i32 552, i32 32, i64 23456076646928) #9
    %"box::GenericMemoryRef.tag_addr" = getelementptr inbounds i64, ptr %"box::GenericMemoryRef", i64 -1
    store atomic i64 23456076646928, ptr %"box::GenericMemoryRef.tag_addr" unordered, align 8
    store ptr %memoryref_data, ptr %"box::GenericMemoryRef", align 8
    %.repack8 = getelementptr inbounds { ptr, ptr }, ptr %"box::GenericMemoryRef", i64 0, i32 1
    store ptr %memoryref_mem, ptr %.repack8, align 8
    call void @ijl_bounds_error_int(ptr nonnull %"box::GenericMemoryRef", i64 %7)
    unreachable
```

For the Julia code:

```julia
function Base.setindex_atomic!(buf::WSBuffer{T}, order::Symbol, val::T, idx::Int64) where T
    @inbounds Base.setindex_atomic!(buf.buffer, order, val,((idx - 1) & buf.mask) + 1)
end
```

from
https://github.com/gbaraldi/WorkstealingQueues.jl/blob/0ebc57237cf0c90feedf99e4338577d04b67805b/src/CLL.jl#L41

* fix rounding mode in construction of `BigFloat` from pi (#55911)

The default argument of the method was outdated, reading the global
default rounding directly, bypassing the `ScopedValue` stuff.

* fix `nonsetable_type_hint_handler` (#55962)

The current implementation is wrong, causing it to display inappropriate
hints like the following:
```julia
julia> s = Some("foo");

julia> s[] = "bar"
ERROR: MethodError: no method matching setindex!(::Some{String}, ::String)
The function `setindex!` exists, but no method is defined for this combination of argument types.
You attempted to index the type String, rather than an instance of the type. Make sure you create the type using its constructor: d = String([...]) rather than d = String
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1
```

* REPL: make UndefVarError aware of imported modules (#55932)

* fix test/staged.jl (#55967)

In particular, the implementation of `overdub_generator54341` was
dangerous. This fixes it up.

* Explicitly store a module's location (#55963)

Revise wants to know what file a module's `module` definition is in.
Currently it does this by looking at the source location for the
implicitly generated `eval` method. This is terrible for two reasons:

1. The method may not exist if the module is a baremodule (which is not
particularly common, which is probably why we haven't seen it).
2. The fact that the implicitly generated `eval` method has this
location information is an implementation detail that I'd like to get
rid of (#55949).

This PR adds explicit file/line info to `Module`, so that Revise doesn't
have to use the hack anymore.

* mergewith: add single argument example to docstring (#55964)

I ran into this edge case. I though it should be documented.
---------

Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>

* [build] avoid libedit linkage and align libccalllazy* SONAMEs (#55968)

While building the 1.11.0-rc4 in Homebrew[^1] in preparation for 1.11.0
release (and to confirm Sequoia successfully builds) I noticed some odd
linkage for our Linux builds, which included of:

1. LLVM libraries were linking to `libedit.so`, e.g.
    ```
    Dynamic Section:
      NEEDED       libedit.so.0
      NEEDED       libz.so.1
      NEEDED       libzstd.so.1
      NEEDED       libstdc++.so.6
      NEEDED       libm.so.6
      NEEDED       libgcc_s.so.1
      NEEDED       libc.so.6
      NEEDED       ld-linux-x86-64.so.2
      SONAME       libLLVM-16jl.so
    ```
    CMakeCache.txt showed
    ```
    //Use libedit if available.
    LLVM_ENABLE_LIBEDIT:BOOL=ON
    ```
Which might be overriding `HAVE_LIBEDIT` at
https://github.com/JuliaLang/llvm-project/blob/julia-release/16.x/llvm/cmake/config-ix.cmake#L222-L225.
So just added `LLVM_ENABLE_LIBEDIT`

2. Wasn't sure if there was a reason for this but `libccalllazy*` had
mismatched SONAME:
    ```console
    ❯ objdump -p lib/julia/libccalllazy* | rg '\.so'
    lib/julia/libccalllazybar.so:	file format elf64-x86-64
      NEEDED       ccalllazyfoo.so
      SONAME       ccalllazybar.so
    lib/julia/libccalllazyfoo.so:	file format elf64-x86-64
      SONAME       ccalllazyfoo.so
    ```
    Modifying this, but can drop if intentional.

---

[^1]: https://github.com/Homebrew/homebrew-core/pull/192116

* Add missing `copy!(::AbstractMatrix, ::UniformScaling)` method (#55970)

Hi everyone! First PR to Julia here.

It was noticed in a Slack thread yesterday
that `copy!(A, I)` doesn't work, but `copyto!(A, I)` does. This PR adds
the missing method for `copy!(::AbstractMatrix, ::UniformScaling)`,
which simply defers to `copyto!`, and corresponding tests.

I added a `compat` notice for Julia 1.12.

---------

Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>

* Add forward progress update to NEWS.md (#54089)

Closes #40009 which was left open because of the needs news tag.

---------

Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com>

* Fix an intermittent test failure in `core` test (#55973)

The test wants to assert that `Module` is not resolved in `Main`, but
other tests do resolve this identifier, so the test can fail depending
on test order (and I've been seeing such failures on CI recently). Fix
that by running the test in a fresh subprocess.

* fix comma logic in time_print (#55977)

Minor formatting fix

* optimizer: fix up the inlining algorithm to use correct `nargs`/`isva` (#55976)

It appears that inlining.jl was not updated in JuliaLang/julia#54341.
Specifically, using `nargs`/`isva` from `mi.def::Method` in
`ir_prepare_inlining!` causes the following error to occur:
```julia
function generate_lambda_ex(world::UInt, source::LineNumberNode,
                            argnames, spnames, @nospecialize body)
    stub = Core.GeneratedFunctionStub(identity, Core.svec(argnames...), Core.svec(spnames...))
    return stub(world, source, body)
end
function overdubbee54341(a, b)
    return a + b
end
const overdubee_codeinfo54341 = code_lowered(overdubbee54341, Tuple{Any, Any})[1]
function overdub_generator54341(world::UInt, source::LineNumberNode, selftype, fargtypes)
    if length(fargtypes) != 2
        return generate_lambda_ex(world, source,
            (:overdub54341, :args), (), :(error("Wrong number of arguments")))
    else
        return copy(overdubee_codeinfo54341)
    end
end
@eval function overdub54341(args...)
    $(Expr(:meta, :generated, overdub_generator54341))
    $(Expr(:meta, :generated_only))
end
topfunc(x) = overdub54341(x, 2)
```
```julia
julia> topfunc(1)
Internal error: during type inference of
topfunc(Int64)
Encountered unexpected error in runtime:
BoundsError(a=Array{Any, 1}(dims=(2,), mem=Memory{Any}(8, 0x10632e780)[SSAValue(2), SSAValue(3), #<null>, #<null>, #<null>, #<null>, #<null>, #<null>]), i=(3,))
throw_boundserror at ./essentials.jl:14
getindex at ./essentials.jl:909 [inlined]
ssa_substitute_op! at ./compiler/ssair/inlining.jl:1798
ssa_substitute_op! at ./compiler/ssair/inlining.jl:1852
ir_inline_item! at ./compiler/ssair/inlining.jl:386
...
```

This commit updates the abstract interpretation and inlining algorithm
to use the `nargs`/`isva` values held by `CodeInfo`. Similar
modifications have also been made to EscapeAnalysis.jl.

@nanosoldier `runbenchmarks("inference", vs=":master")`

* Add `.zed` directory to `.gitignore` (#55974)

Similar to the `vscode` config directory, we may ignore the `zed`
directory as well.

* typeintersect: reduce unneeded allocations from `merge_env`

`merge_env` and `final_merge_env` could be skipped
for emptiness test or if we know there's only 1 valid Union state.

* typeintersect: trunc env before nested `intersect_all` if valid.

This only covers the simplest cases. We might want a full dependence analysis and keep env length minimum in the future.

* `@time` actually fix time report commas & add tests (#55982)

https://github.com/JuliaLang/julia/pull/55977 looked simple but wasn't
quite right because of a bad pattern in the lock conflicts report
section.

So fix and add tests.

* adjust EA to JuliaLang/julia#52527 (#55986)

`Ent…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant