Consider allowing variable length arrays #3952

ByLiZhao · 2019-12-20T06:06:53Z

I know some people don't like the idea, but Variable Length Arrays (VLA) are really useful for scientific computation. Below is a C99 code snippet using VLA:

// m and n are runtime variables.
double (*matrix_ptr)[m][n] = malloc(sizeof(double[m][n]));

This kind of thing will be very awkward to do without VLA.
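For readers less familiar with this C99 idiom, here is a minimal, self-contained sketch of the snippet above in use. The function name `fill_and_sum` and the fill pattern are made up for illustration; only the `matrix_ptr` declaration comes from the original post. Note that the storage lives on the heap; only the pointer's type is variably modified.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an m-by-n matrix on the heap through a
   pointer to a variably modified type, fill it, and return the sum of its
   elements. Returns -1.0 if the allocation fails. Assumes m > 0 and n > 0. */
double fill_and_sum(size_t m, size_t n)
{
    /* Heap storage, but the pointer type still carries both dimensions. */
    double (*matrix_ptr)[m][n] = malloc(sizeof(double[m][n]));
    if (matrix_ptr == NULL)
        return -1.0;

    double sum = 0.0;
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            /* Two-dimensional indexing; the compiler derives the offset. */
            (*matrix_ptr)[i][j] = (double)(i * n + j);
            sum += (*matrix_ptr)[i][j];
        }
    }

    free(matrix_ptr);
    return sum;
}
```

This relies on a compiler that supports variably modified types, which C11 made optional.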


JesseRMeyer · 2019-12-20T12:38:09Z

VLAs are handled in my C code base with a temporary arena with the classic begin() and end() pattern. Memory is sourced from a permanent arena so very often an allocation is not necessary. I don't have to concern myself with stack issues, and the code-gen is much better than what you get from VLAs.

That said, if the Frame situation can get sorted out, and the compiler can ensure that there is enough stack space for an array allocation, then I would expect the ability to use it that way.
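The begin()/end() pattern mentioned above could look roughly like the following. This is a sketch under assumptions, not JesseRMeyer's actual code: the `Arena` and `TempScope` types and all function names are hypothetical, and the backing buffer is assumed to be aligned for `max_align_t`.

```c
#include <stddef.h>

/* Hypothetical bump arena over a caller-provided buffer. */
typedef struct {
    unsigned char *base;
    size_t capacity;
    size_t used;
} Arena;

/* Remembers the arena's fill level at the start of a temporary scope. */
typedef struct {
    Arena *arena;
    size_t saved_used;
} TempScope;

void arena_init(Arena *a, void *buffer, size_t capacity)
{
    a->base = buffer;
    a->capacity = capacity;
    a->used = 0;
}

/* Bump-allocate size bytes; returns NULL when the arena is exhausted.
   Offsets are rounded up to the strictest fundamental alignment. */
void *arena_alloc(Arena *a, size_t size)
{
    size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* begin(): record the current fill level. */
TempScope temp_begin(Arena *a)
{
    return (TempScope){ a, a->used };
}

/* end(): release everything allocated since the matching begin(). */
void temp_end(TempScope scope)
{
    scope.arena->used = scope.saved_used;
}
```

A runtime-sized scratch array is then an `arena_alloc` call between `temp_begin` and `temp_end`, and releasing it is a single store rather than a call to `free`.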

ghost · 2019-12-20T20:04:44Z

Related closed issues: #225, #1374

See also #1006 (safe recursion)

andrewrk · 2019-12-20T20:19:04Z

I'm confident in the decision to not have runtime-bounded stack allocation in the language. Use the heap, or a stack allocation with a fixed upper bound.
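As an illustration of "a stack allocation with a fixed upper bound", here is a minimal C sketch that uses a fixed-size stack buffer for small inputs and falls back to the heap for larger ones. The bound `STACK_MAX` and the `median` function are hypothetical examples, not something proposed in this thread.

```c
#include <stdlib.h>
#include <string.h>

#define STACK_MAX 64 /* hypothetical upper bound on stack scratch space */

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Compute the median of n values without modifying xs.
   Returns 0 on success, -1 on empty input or allocation failure. */
int median(const double *xs, size_t n, double *out)
{
    double stack_buf[STACK_MAX]; /* fixed size, known at compile time */
    double *scratch = stack_buf;

    if (n == 0)
        return -1;
    if (n > STACK_MAX) {
        /* Input exceeds the fixed bound: fall back to the heap. */
        scratch = malloc(n * sizeof *scratch);
        if (scratch == NULL)
            return -1;
    }

    memcpy(scratch, xs, n * sizeof *scratch);
    qsort(scratch, n, sizeof *scratch, cmp_double);
    *out = (n % 2) ? scratch[n / 2]
                   : 0.5 * (scratch[n / 2 - 1] + scratch[n / 2]);

    if (scratch != stack_buf)
        free(scratch);
    return 0;
}
```

The trade-off is that every call reserves the full `STACK_MAX` elements on the stack regardless of `n`, but the worst-case stack usage is known at compile time.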

andrewrk · 2019-12-20T20:26:11Z

Here is some zig code to do what you are suggesting:

const matrix_ptr = try std.heap.c_allocator.create([n][m]f64);

It works as long as n and m are comptime known. I do not think this is awkward. If n and m are runtime known, then it is:

const matrix_ptr = try std.heap.c_allocator.alloc(f64, n * m);

ByLiZhao · 2019-12-21T01:52:54Z

@andrewrk I am only beginning with Zig. With your sample code, when m and n are runtime known, one can allocate heap memory as:

const matrix_ptr = try std.heap.c_allocator.alloc(f64, n * m);

With C, one can access the matrix elements like

(*matrix_ptr)[i][j]

How do you achieve that in Zig? Can the pointer be cast to a pointer to a 2-dimensional array?

ByLiZhao · 2019-12-21T02:25:30Z

@JesseRMeyer @andrewrk I want to make three points:

1. Even if VLAs are disallowed on the stack, they can still be very useful with heap allocation, as shown in my example.
2. If VLAs are allowed on the stack, that opens an optimization opportunity. Say a program receives small strings of varying size over TCP, does something with each string, and then sends back a feedback string, also of varying size. If all of this can be done without heap allocation, it will be faster.
3. If Zig wants to be radical, it could even make returning a VLA from a function possible. With this feature, one could do stack-only string processing. C++ gets quite a performance boost from Small String Optimization (SSO), but SSO only works for strings shorter than 24 or 32 chars (depending on which stdlib is used). This feature would make Zig faster than C++ in many cases.

pixelherodev · 2019-12-21T07:01:10Z

Given

const matrix_ptr = try std.heap.c_allocator.alloc(f64, n * m);

You should be able to access the memory as matrix_ptr.*[i * n + j]; with the same index calculation that would be used in C.
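To make "the same index calculation that would be used in C" concrete, the sketch below checks that two-dimensional indexing through a pointer to rows of n doubles lands on the same address as the flat row-major offset `i * n + j`, where n is the number of columns. The helper names `flat_index` and `views_agree` are made up for this illustration.

```c
#include <stddef.h>

/* Row-major offset of element (i, j) in a matrix with n columns. */
static inline size_t flat_index(size_t n, size_t i, size_t j)
{
    return i * n + j;
}

/* Returns 1 if 2D indexing and flat indexing address the same element
   for every (i, j) of an m-by-n matrix stored in flat, else 0. */
int views_agree(size_t m, size_t n, double *flat)
{
    /* View the flat buffer as rows of n doubles each. */
    double (*mat)[n] = (double (*)[n])flat;

    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            if (&mat[i][j] != &flat[flat_index(n, i, j)])
                return 0;
        }
    }
    return 1;
}
```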

daurnimator · 2019-12-21T07:13:25Z

> You should be able to access the memory as matrix_ptr.*[i * n + j];

I suppose that's the heart of the issue: if there was support for runtime-sized arrays then the much easier to review matrix_ptr[i][j] could be used.

JesseRMeyer · 2019-12-21T13:25:56Z

@ByLiZhao I agree with your points. Stack memory is almost certainly in the L1 cache, and so its access times are statistically very fast.

The issue with VLAs from Zig's point of view is that they tug at Zig's safety promises. Tracking how much memory is left on the stack is non-trivial with async functions, and in general impossible with recursive functions. So having VLAs in the language would both complicate it, in order to support whatever degree of safety is possible, and compromise Zig's safety umbrella.

For VLAs, Zig could safeguard against out-of-bounds accesses, but not against invalid memory accesses from stack overflows. But hey, Zig doesn't offer protection against accessing unmapped memory through a pointer either. So at some level this doesn't feel like as much of a compromise now that I've spelled it out.

JesseRMeyer · 2019-12-23T17:18:58Z

> > You should be able to access the memory as matrix_ptr.*[i * n + j];
>
> I suppose that's the heart of the issue: if there was support for runtime-sized arrays then the much easier to review matrix_ptr[i][j] could be used.

I think the first 3 staples of the Zen of Zig support this.

Does matrix_ptr.*[i * n + j]; communicate intent precisely? Is it an edge case that matters? Does it favor reading code over writing it?

marler8997 · 2019-12-25T14:54:09Z

I think OP's reason for supporting VLA in the description is incorrect. It's not awkward to allocate on the heap instead of the stack (as @andrewrk showed).

The reason for supporting VLA is so that a function can control its "memory locality", which is one of the most important factors in performance for modern processors. Without VLA, all stack allocations are forced to reserve the maximum size they might use on the stack rather than only taking what they need. This increases the memory footprint, causing more cache misses than necessary. We should certainly consider how this affects Zig's safety guarantees, but I see Zig as a competitor to C, and I don't see how it could claim that without supporting VLAs.

EDIT: I'm really talking about alloca. VLAs are an extra feature on top that probably isn't necessary.
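For context, this is roughly what `alloca` usage looks like in C. It is a sketch under assumptions: `alloca` is non-standard (the `<alloca.h>` header shown here is the glibc/Linux location), and `MAX_ALLOCA` and `matches_upper` are hypothetical names. The scratch buffer takes only `len + 1` bytes of stack rather than a fixed maximum, and the explicit bound check is the only thing standing between a large `len` and a stack overflow.

```c
#include <alloca.h> /* non-standard; glibc/Linux assumed */
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_ALLOCA 4096 /* hypothetical cap on a single stack request */

/* Uppercase msg into a stack scratch buffer sized to the input and compare
   it with expected_upper. Returns 1 on match, 0 on mismatch, and -1 if
   len exceeds the cap. */
int matches_upper(const char *msg, size_t len, const char *expected_upper)
{
    if (len > MAX_ALLOCA)
        return -1;

    /* Takes len + 1 bytes from the current stack frame; the memory is
       released automatically when this function returns. */
    char *scratch = alloca(len + 1);

    for (size_t i = 0; i < len; i++)
        scratch[i] = (char)toupper((unsigned char)msg[i]);
    scratch[len] = '\0';

    return strcmp(scratch, expected_upper) == 0;
}
```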

fengb · 2019-12-27T15:36:54Z

@JesseRMeyer it's not too difficult to have userland support for multi-dimensional arrays and slices: https://github.com/fengb/fundude/blob/master/src/util.zig
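To illustrate the general idea of userland support (this is not the code from the linked file, which is written in Zig), here is a minimal C sketch of a matrix wrapper that stores its dimensions next to a flat buffer and hides the index arithmetic behind a bounds-checked accessor. The `Matrix` type and its functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical runtime-sized matrix: dimensions plus flat row-major data. */
typedef struct {
    size_t rows;
    size_t cols;
    double *data;
} Matrix;

/* Allocate a zero-filled rows-by-cols matrix.
   Returns 0 on success, -1 on bad dimensions or allocation failure. */
int matrix_init(Matrix *mx, size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0 || rows > SIZE_MAX / cols)
        return -1;
    mx->data = calloc(rows * cols, sizeof *mx->data);
    if (mx->data == NULL)
        return -1;
    mx->rows = rows;
    mx->cols = cols;
    return 0;
}

/* Bounds-checked access to element (i, j); the row-major offset
   computation lives in exactly one place. */
double *matrix_at(Matrix *mx, size_t i, size_t j)
{
    assert(i < mx->rows && j < mx->cols);
    return &mx->data[i * mx->cols + j];
}

void matrix_free(Matrix *mx)
{
    free(mx->data);
    mx->data = NULL;
    mx->rows = 0;
    mx->cols = 0;
}
```

Call sites then read `*matrix_at(&mx, i, j)` instead of spelling out `i * cols + j` each time, which addresses part of the readability concern raised earlier in the thread.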

JesseRMeyer · 2019-12-27T16:14:08Z

@fengb Thanks for sharing! With some labor it can be made to work, although I think this approach does appear to violate some of Zig's Zen.

Has it been discussed why carefully maintaining a thread_local variable that tracks each thread's stack frame size is not suitable for supporting alloca? We don't need full safety guarantees to enable first-class language support for run-time known stack allocations in non-async, non-recursive functions. While having a discontinuity in support feels inelegant from a language-theoretic point of view, the pragmatist would notice that alloca in recursive functions is pulling on the sleeping dragon's tail. I'm not sure what the ramifications for async functions are just yet.

andrewrk closed this as completed Dec 20, 2019

iguessthislldo mentioned this issue May 13, 2021

Imported C Structs with VLAs #8759

Closed

MadLittleMods mentioned this issue Sep 12, 2023

Making the window transparent (X Window System, X11) MadLittleMods/fps-aim-analyzer#1

Closed

silversquirl mentioned this issue Jun 4, 2024

ability to allocate fixed buffer runtime in stack frame initiation #20185

Closed
