Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider allowing variable length arrays #3952

Closed
ByLiZhao opened this issue Dec 20, 2019 · 13 comments
Closed

Consider allowing variable length arrays #3952

ByLiZhao opened this issue Dec 20, 2019 · 13 comments

Comments

@ByLiZhao
Copy link

I know some people don't like the idea, but Variable Length Arrays (VLA) are really useful for scientific computation. Below is a C99 code snippet using VLA:

// m, and n are runtime variables.
double (*matrix_ptr)[m][n] = malloc(sizeof(double[m][n]));

This kind of thing will be very awkward to do without VLA.

@JesseRMeyer
Copy link

JesseRMeyer commented Dec 20, 2019

VLA's are handled in my C code base with a temporary arena with the classic begin() and end() pattern. Memory is sourced from a permanent arena so very often an allocation is not necessary. I don't have to concern myself with stack issues, and the code-gen is much better than what you get from VLAs.

That said, if the Frame situation can get sorted out, and the compiler can ensure that there is enough stack space for an array allocation, then I would expect the ability to use it that way.

@ghost
Copy link

ghost commented Dec 20, 2019

Related closed issues: #225, #1374

See also #1006 (safe recursion)

@andrewrk
Copy link
Member

andrewrk commented Dec 20, 2019

I'm confident in the decision to not have runtime-bounded stack allocation in the language. Use the heap, or a stack allocation with a fixed upper bound.

@andrewrk
Copy link
Member

Here is some zig code to do what you are suggesting:

const matrix_ptr = try std.heap.c_allocator.create([n][m]f64);

It works as long as n and m are comptime known. I do not think this is awkward. If n and m are runtime known, then it is:

const matrix_ptr = try std.heap.c_allocator.alloc(f64, n * m);

@ByLiZhao
Copy link
Author

@andrewrk I am only beginning with Zig. With your sample code, when m and n are runtime known, one can allocate heap memory as:

const matrix_ptr = try std.heap.c_allocator.alloc(f64, n * m);

With C, one can access the matrix elements like

(*matrix_ptr)[i][j] 

How do you achieve that in Zig, can the pointer be cast to a pointer to 2-dimensional array?

@ByLiZhao
Copy link
Author

@JesseRMeyer @andrewrk I want to make three points:

  • If VLAs are disallowed on the stack, it can still be very useful with heap allocation, as shown in my example.

  • If VLAs are allowed on the stack, it actually means an optimization opportunity. Say a program is receiving strings with small but varying size via TCP, do something with each string, followed by sending a feedback string which are also of varying size. If all this can be done without heap allocation, it will be faster.

  • If Zig wants to be radical, it can even make returning a VLA from functions possible. With this feature, one can do stack-only string processing. C++ gets quite some performance boost by Small String Optimization. But SSO only works for strings less than 24 or 32 chars (depending on which stdlib used) This feature will make Zig faster than C++ in many cases.

@pixelherodev
Copy link
Contributor

Given

const matrix_ptr = try std.heap.c_allocator.alloc(f64, n * m);

You should be able to access the memory as matrix_ptr.*[i * n + j]; with the same index calculation that would be used in C.

@daurnimator
Copy link
Contributor

You should be able to access the memory as matrix_ptr.*[i * n + j];

I suppose that's the heart of the issue: if there was support for runtime-sized arrays then the much easier to review matrix_ptr[i][j] could be used.

@JesseRMeyer
Copy link

@ByLiZhao I agree with your points. Stack memory is almost certainly in the L1 cache, and so its access times are statistically very fast.

The issue with VLA from Zig's point of view is that it tugs at Zig's safety promises. Tracking how much memory is left on the stack is non-trivial with async functions, and in general impossible with recursive functions. So to have VLAs in the language both complicates the language to support what degree of safety is possible while also compromising Zig's safety umbrella.

For VLA's Zig could safe-guard against out-of-bounds, but not against invalid memory access from stack overflows. But hey, Zig doesn't offer protection against accessing unmapped memory either through a pointer. So at some level this doesn't feel as much as a compromise now that I've spelt it out some.

@JesseRMeyer
Copy link

You should be able to access the memory as matrix_ptr.*[i * n + j];

I suppose that's the heart of the issue: if there was support for runtime-sized arrays then the much easier to review matrix_ptr[i][j] could be used.

I think the first 3 staples of the Zen of Zig support this.

Does matrix_ptr.*[i * n + j]; communicate intent precisely? Is it an edge case that matters? Does it favor reading code over writing it?

@marler8997
Copy link
Contributor

marler8997 commented Dec 25, 2019

I think OP's reason for supporting VLA in the description is incorrect. It's not awkward to allocate on the heap instead of the stack (as @andrewrk showed).

The reason for suppprting VLA is so that a function can control its "memory locality', which is one of the most important factors in performance for modern processors. Without VLA, all stack allocations are forced to reserve the maximum size they might use on the stack rather than only taking what they need. This increases the memory footprint causing more cache misses than necessary. We should certainly consider how this affects Zig's safety guarantees but I see Zig as a competitor to C, I dont see how it could say that without supporting VLAs.

EDIT: I'm really talking about alloca. VLAs are an extra feature on top that probably isn't necessary.

@fengb
Copy link
Contributor

fengb commented Dec 27, 2019

@JesseRMeyer its not too difficult to have userland support for multi-dimensional arrays and slices: https://github.com/fengb/fundude/blob/master/src/util.zig

@JesseRMeyer
Copy link

JesseRMeyer commented Dec 27, 2019

@fengb Thanks for sharing! With some labor it can be made to work, although I think this approach does appear to violate some of Zig's Zen.

Has it been discussed why careful tracking of a thread_local variable that tracks each thread's stack frame size is not suitable to support alloca? We don't need full safety guarantees to enable first class language support for run-time known stack allocations in non-async, non-recursive functions. While having a discontinuity in support feels inelegant from a language-theoretic point of view, the pragmatist would notice that alloca in recursive functions is pulling on the sleeping dragon's tail. I'm not sure what the ramifications are for async functions are just yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants