printf proof of concept #167

Closed
andrewrk opened this issue Jul 29, 2016 · 8 comments

andrewrk commented Jul 29, 2016

This is the next big hurdle for Zig. It solves one of the trickiest use cases, one we have so far been unable to tackle elegantly: printf.

This proposal changes the meaning of inline functions. It supersedes #132.

With this proposal, marking a function inline tells the compiler to attempt to evaluate the function at compile time, to the extent possible. Per callsite, it will codegen a unique LLVM function marked with the alwaysinline attribute, and then begin to interpret the function at compile time. The more arguments are known at compile time, the more of the function this interpretation can evaluate.

As Zig interprets the function, any expressions that cannot be interpreted, because they depend on runtime data or have runtime side effects, are emitted into the function's codegen.

Worst case scenario, nothing can be interpreted, and the function is codegened identically to a normal function marked with the alwaysinline LLVM attribute.

Best case scenario, large swaths of control structures and loops are eliminated and what is left is a series of straightforward expressions that LLVM can further optimize.

You can see how this is relevant for printf. This example uses variable argument functions (#77) which are not yet implemented, but you can see how they would work together.

pub struct OutStream {
    fd: i32,
    buffer: [buffer_size]u8,
    index: usize,

    pub fn printf(os: &OutStream, inline format: []const u8, args: []var) -> %usize {
        enum State {
            Start,
            Percent,
        }
        var bytes_written: usize = 0;
        var start_index: usize = 0;
        var state = State.Start;
        var next_arg: usize = 0;
        inline for (format) |c, i| {
            switch (state) {
                Start => {
                    if (c == '%') {
                        // Flush the literal text seen since the last escape.
                        bytes_written += %return os.write(format[start_index..i]);
                        state = State.Percent;
                    }
                },
                Percent => switch (c) {
                    '%' => {
                        bytes_written += %return os.write_byte('%');
                        state = State.Start;
                        // The next literal starts after the escape character.
                        start_index = i + 1;
                    },
                    'd' => {
                        const arg = args[next_arg];
                        next_arg += 1;
                        bytes_written += %return os.print_int(@typeof(arg), arg);
                        state = State.Start;
                        start_index = i + 1;
                    },
                    'f' => {
                        const arg = args[next_arg];
                        next_arg += 1;
                        bytes_written += %return os.print_float(@typeof(arg), arg);
                        state = State.Start;
                        start_index = i + 1;
                    },
                    else => @compile_err("unknown percent escape: " ++ c),
                },
            }
        }
        // Write whatever literal text follows the last escape.
        if (format.len > start_index) {
            bytes_written += %return os.write(format[start_index..format.len]);
        }
        assert(args.len == next_arg);
        %return os.flush();
        return bytes_written;
    }
}

The unique function Zig would generate for os.printf("percent sign: %%, integer: %d, float: %f\n", a_u32_value, a_float_value) while interpreting this function would look like:

fn anonymous_print_invocation(os: &OutStream, a: u32, b: f32) -> %usize {
    var bytes_written: usize = 0;
    bytes_written += %return os.write("percent sign: ");
    bytes_written += %return os.write("%");
    bytes_written += %return os.write(", integer: ");
    bytes_written += %return os.print_int(u32, a);
    bytes_written += %return os.write(", float: ");
    bytes_written += %return os.print_float(f32, b);
    bytes_written += %return os.write("\n");
    %return os.flush();
    return bytes_written;
}

Careful readers will notice that this implementation could be more efficient if it merged the adjacent literal strings "percent sign: ", "%", and ", integer: " into a single write.
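
As a rough illustration of the specialization described above, here is a minimal Python sketch (Python stands in for the compile-time interpreter; this is not how the Zig compiler does it). It walks a known format string once, ahead of time, and produces a flat list of operations, merging adjacent literal pieces along the way. The function name specialize_format and the tuple encoding of operations are hypothetical, chosen only for this example.

```python
def specialize_format(fmt, arg_count):
    """Walk fmt once, ahead of time, and return a flat list of operations.

    Each operation is ("write", text), ("int", arg_index) or
    ("float", arg_index). This encoding is hypothetical, for illustration.
    """
    ops = []
    next_arg = 0

    def emit_literal(text):
        # Merge with the previous literal so adjacent strings become one write.
        if not text:
            return
        if ops and ops[-1][0] == "write":
            ops[-1] = ("write", ops[-1][1] + text)
        else:
            ops.append(("write", text))

    i = 0
    start = 0
    while i < len(fmt):
        if fmt[i] != "%":
            i += 1
            continue
        # Flush the literal text seen since the last escape.
        emit_literal(fmt[start:i])
        if i + 1 >= len(fmt):
            raise ValueError("format string ends with a lone '%'")
        spec = fmt[i + 1]
        if spec == "%":
            emit_literal("%")
        elif spec == "d":
            ops.append(("int", next_arg))
            next_arg += 1
        elif spec == "f":
            ops.append(("float", next_arg))
            next_arg += 1
        else:
            raise ValueError("unknown percent escape: " + spec)
        i += 2
        start = i
    emit_literal(fmt[start:])
    # Mirrors the assert(args.len == next_arg) check, but done ahead of time.
    if next_arg != arg_count:
        raise ValueError("format expects %d arguments, got %d" % (next_arg, arg_count))
    return ops


ops = specialize_format("percent sign: %%, integer: %d, float: %f\n", 2)
for op in ops:
    print(op)
```

For the example format string this yields five operations rather than seven, because the first three literals collapse into the single write "percent sign: %, integer: ".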

One can easily imagine how this concept would also be used to have compile-time checked regular expressions. Mark both the regex_execute function and its regex string parameter inline, and the compiler does all the work.
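
A loose analogy in Python, offered only as a sketch: when the pattern is known early, it can be validated and specialized once, so later calls only run the already-checked matcher. Python does this at definition time rather than compile time, and make_matcher is a hypothetical helper, not anything from Zig.

```python
import re


def make_matcher(pattern):
    """Validate the pattern once, up front, and return a specialized matcher."""
    # re.compile raises re.error here if the pattern is malformed,
    # so a bad pattern fails before any input is ever matched.
    compiled = re.compile(pattern)

    def matcher(text):
        return compiled.fullmatch(text) is not None

    return matcher


is_int = make_matcher(r"-?[0-9]+")
print(is_int("-42"), is_int("4x2"))
```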

Semi-related: we currently have #attribute("noinline"), but let's move that to an actual noinline keyword that can be used in place of inline, for consistency.

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Jul 29, 2016
@andrewrk andrewrk added this to the 0.1.0 milestone Jul 29, 2016
@andrewrk andrewrk changed the title inline functions printf proof of concept Jan 16, 2017
andrewrk added a commit that referenced this issue Jan 24, 2017
See #167

Need to troubleshoot when we send 2 slices to printf. It goes
into an infinite loop.

This commit introduces 4 builtin functions:

 * `@isInteger`
 * `@isFloat`
 * `@canImplicitCast`
 * `@typeName`
@andrewrk

Here is a small example of what currently prevents this from working:

fn foo(b: bool) {
    inline while (true) {
        const result = if (b) false else true;
    }
}

The problem is that we generate basic blocks for the Then and Else parts of the if statement, since b is only known at runtime.

EndIf_14:
    #20 | (unknown)   | 1 | $Then_12:false $Else_13:true
    #21 | void        | - | const result = #20 // comptime = false
    #22 | void        | 0 | {}
    #23 | unreachable | - | goto $WhileCond_1 // comptime = true

When we analyze the EndIf basic block, we inline the goto back to the while condition since the while loop is inlined, so we put this in the new basic block:

WhileCond_1:
    #6  | bool        | 1 | true
    #7  | unreachable | - | if (true) $WhileBody_2 else $WhileEnd_3 // comptime = true

We follow and inline the while body:

WhileBody_2:
    #8  | bool        | 0 | false
    #9  | (unknown)   | 1 | &b
    #10 | (unknown)   | 2 | *#9
    #11 | (unknown)   | 3 | @testComptime(#10)
    #15 | unreachable | - | if (#10) $Then_12 else $Else_13 // comptime = #11

Now we're done inlining because we're going to branch to Then_12 or Else_13 based on a runtime value. And luckily we've already generated those basic blocks. So we're left with the analyzed IR:

Entry_0:
    #3  | &const bool | 1 | &b
    #4  | bool        | 1 | *#3
    #8  | unreachable | - | if (#4) $Then_6 else $Else_7
Then_6:
    #11 | unreachable | - | goto $EndIf_10
Else_7:
    #13 | unreachable | - | goto $EndIf_10
EndIf_10:
    #14 | bool        | 1 | $Then_6:false $Else_7:true
    #15 | void        | - | const result = #14 // comptime = false
    #17 | &const bool | 1 | &b
    #18 | bool        | 1 | *#17
    #20 | unreachable | - | if (#18) $Then_6 else $Else_7

So the EndIf block gets instructions inlined into it that probably shouldn't be there. At some point we need a clear separation between runtime and compile time.

@andrewrk

OK, here's the idea I'm going to try. When a basic block inlines another basic block, for example the goto $WhileCond_1 at the end of EndIf_14, it invalidates itself as the block that corresponds to the EndIf block from the older IR. The next time we want to go to this EndIf block, we regenerate it, which I believe is the behavior we want.

@andrewrk

This code works now:

const io = @import("std").io;

pub fn main(args: [][]u8) -> %void {
    %%io.stdout.printf("hello\n");
    %%io.stdout.printf("here is a number: {} and here is a string: {}\n", i32(1234), args[0]);
}


tuket commented Sep 17, 2019

What happened to printf? Has it been deleted? I can't find it in the source code

@andrewrk

std.fmt.format is the guts of the implementation. You'll typically use std.debug.warn the most, or std.io.OutStream.print.


tuket commented Sep 17, 2019

Thank you!
This seems to work:

const stdoutFile = try std.io.getStdOut();
const stdout = &stdoutFile.outStream().stream;
try stdout.print("hello!");

The problem I was having with warn is that I couldn't redirect the output to a file:

./hello > out.txt

@andrewrk

https://ziglang.org/documentation/master/#Hello-World

Using a stream for stdout is the right approach because it forces you to deal with errors when writing to stdout.


tuket commented Sep 17, 2019

That link doesn't mention anything about streams or print
