
Partial const evaluations! (targeted const recompilation at runtime) #16524

Open
asdfgasdfsafgsdfa opened this issue Jan 15, 2017 · 15 comments

Comments

@asdfgasdfsafgsdfa

asdfgasdfsafgsdfa commented Jan 15, 2017

Why?

The discussions and ideas around code generators and constexpr ( #16503 and
#15079 ) always leave me with some bland aftertaste... I always had the feeling that there is a LOT more that can be done to drastically improve performance. At first I thought that code generators would be the silver bullet to save us from bad performance, but it quickly became apparent to me that we are not at the core of the problem yet.
Fortunately I have figured out some good examples.

What?

Most of the time, as programmers, we are perfectly aware of what code is time-critical and will be the bottleneck in our applications, but improving it can be hard, really hard in practice.
I propose a combination of compiler and language features that have the potential to increase performance drastically.

By telling the compiler and runtime that some arguments to a function will remain constant for some time, the compiler and/or jit compiler can recompile a method (and all the methods it calls in turn).

Example 1 - Serialization

Here we'll serialize some type from/to JSON.
Traditionally we'd use Newtonsoft.Json or some other library and then JsonConvert.SerializeObject(obj).

This is bad. At every SerializeObject call the program executes the same analysis code again, even though there is no reason to.
Tons of if comparisons checking whether some options are set, null checks, ... all sorts of things that we as humans would instantly know to be always true or always false in a given context.

If there were a way to "specialize" code, it could improve performance a lot.
For example, imagine it were possible to do the following instead of the JsonConvert call above:

// We want to serialize the following class
class Person { public string Name; public int Age; }


var specializedSerializeMethod = Specialize(
                                    method: JsonConvert.SerializeObject,
                                    genericTypeArgs: new[] { typeof(Person) },
                                    methodArgs: null,
                                    hints: new[] { new NeverNullHint(".Name") }); 

var jsonText = specializedSerializeMethod(somePersonObject);

Now why is it not possible for the code inside specializedSerializeMethod to be something like

return "{\"Name\":\"" + obj.Name + "\",\"Age\":" + obj.Age + "}";

The NeverNullHint tells the Specialize method that the Name member will never be null, so if(... != null) checks can be safely replaced with if(false), which can be completely optimized away.
I imagine that 90% of the code inside SerializeObject could be removed just because the Type is known beforehand (plus some hints).

The "targeted const" part from the title of this issue would be the genericTypeArgs, methodArgs and hints in the Specialize method.

Example 2 - Regex

More general than just serialization would be regex, where you have patterns.

System.Text.RegularExpressions.Regex already does something just like that, exactly for the reason that interpreted code is too slow.
You provide 'code' in the form of a regex string, then pass RegexOptions.Compiled and it will use dynamic methods to generate a specialized version.
The only difference is that the specialized version is produced by a dedicated code generator rather than by specializing the regex interpreter with an "assumed to be constant" pattern parameter.
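To make this concrete, the existing RegexOptions.Compiled flag (plain, current .NET API, not the proposed Specialize mechanism) already bakes a constant pattern into generated code:

```csharp
using System;
using System.Text.RegularExpressions;

class RegexCompiledDemo
{
    static void Main()
    {
        string pattern = @"\d{4}-\d{2}-\d{2}";

        // Interpreted: a general-purpose matcher walks the pattern on every call.
        var interpreted = new Regex(pattern);

        // Compiled: the framework emits specialized IL for this one pattern,
        // i.e. the "assumed to be constant" pattern is baked into code.
        var compiled = new Regex(pattern, RegexOptions.Compiled);

        Console.WriteLine(interpreted.IsMatch("2017-01-15")); // True
        Console.WriteLine(compiled.IsMatch("2017-01-15"));    // True, but cheaper per call
    }
}
```

The construction cost of the compiled variant is higher, which mirrors the trade-off discussed later in this thread: you pay for specialization once and amortize it over many calls.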

Example 3 - Search and advanced pattern matching on data

Even more general than Regex in Example2 would be matching all sorts of patterns in all sorts of data!
Just like Regex searches for patterns in text, there are tons of other search patterns that people use every day.
For example searching for binary patterns when patching software by using delta-patches.
Also when compressing/decompressing data!
The same is true for image recognition.

To give an example without going into too much detail:
In classical image recognition and feature detection algorithms you often have loops that iterate over thousands or millions of points (pixels) and try to match patterns there.
Now if you could just pre-compile a known pattern into specialized code, you'd get enormous performance benefits. Just like in regex.
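A hand-written sketch (all names hypothetical) of the difference between a general byte-pattern matcher and the specialized form such a system would ideally emit for one constant pattern:

```csharp
static class PatternMatchSketch
{
    // General matcher: re-reads the pattern array on every comparison.
    public static bool MatchesAt(byte[] data, int i, byte[] pattern)
    {
        if (i + pattern.Length > data.Length) return false;
        for (int j = 0; j < pattern.Length; j++)
            if (data[i + j] != pattern[j]) return false;
        return true;
    }

    // Specialized form for the constant pattern { 0x89, 0x50, 0x4E, 0x47 }
    // (the PNG magic bytes): the loop is unrolled and the constants baked in.
    // This is what the proposed specializer would ideally derive automatically.
    public static bool MatchesPngMagicAt(byte[] data, int i) =>
        i + 4 <= data.Length &&
        data[i] == 0x89 && data[i + 1] == 0x50 &&
        data[i + 2] == 0x4E && data[i + 3] == 0x47;
}
```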

Example 4 - All sorts of interpreters!

The most general case I can think of at all, would be interpreting code, not just patterns (like regex).

Emulators for game consoles do it all the time (Emulators for the GameBoy, N64, Playstation1/2, Wii, and many many more).
They call it "dynamic recompilation".

But the same can be done for javascript and other classical programming languages.
And surprise, surprise: all major JS engines are doing just that, they generate code from known inputs.

The input is a string in JavaScript syntax, and the output is code instead of data (a pointer to code that you can call).

How?

Just like generics generate new "specialized" code for different generic types the .NET runtime could generate new optimized/specialized delegates when some parameters are known.
Or even when just parts of the parameters are known, for example new ConstantValueHint(".Age", 123).
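Something in this spirit is already possible by hand today with expression trees: you can build and JIT-compile a delegate whose member accesses and constants are baked in. The sketch below hand-writes the structure that, under this proposal, the runtime would derive from existing serializer code instead:

```csharp
using System;
using System.Linq.Expressions;

class Person { public string Name; public int Age; }

class ExpressionSpecializationDemo
{
    // Builds a delegate roughly matching the ideal specialized serializer from
    // Example 1. The shape of the generated code is written by hand here; the
    // proposal asks the runtime to discover it automatically.
    static Func<Person, string> BuildSerializer()
    {
        var p = Expression.Parameter(typeof(Person), "p");
        var concat = typeof(string).GetMethod("Concat", new[] { typeof(string[]) });
        var body = Expression.Call(concat, Expression.NewArrayInit(typeof(string),
            Expression.Constant("{\"Name\":\""),
            Expression.Field(p, "Name"),
            Expression.Constant("\",\"Age\":"),
            Expression.Call(Expression.Field(p, "Age"),
                            typeof(int).GetMethod("ToString", Type.EmptyTypes)),
            Expression.Constant("}")));
        return Expression.Lambda<Func<Person, string>>(body, p).Compile();
    }

    static void Main()
    {
        var serialize = BuildSerializer();
        Console.WriteLine(serialize(new Person { Name = "Ada", Age = 30 }));
        // {"Name":"Ada","Age":30}
    }
}
```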

Now, the API would be pretty difficult to design to allow for all sorts of hints.
When the created delegate is no longer in use it would eventually be collected by the GC, which would then free the JITed code as well...

Disclaimer

  • I don't say this is a good idea! I want your opinions, so maybe we can figure out together whether it is a good or a bad idea.
  • I don't have all the answers: It's quite possible that I missed some situations that would make awesome examples. And it is just as possible that I missed some critical points that would render the whole idea moot.

The only thing I do know for a fact is that there are a number of situations (some of which I listed above) where recompiling code with assumptions would help performance enormously,
especially in the image pattern-matching part. And I know that people get performance gains of 1000% and more by "compiling" code in emulators, or in JavaScript.
All things that require interpreters in some form can profit from this.

@mikedn

mikedn commented Jan 15, 2017

I think that the following example sums it all:

Now why is it not possible for the code inside specializedSerializeMethod to be something like

return "{\"Name\":\"" + obj.Name + "\",\"Age\":" + obj.Age + "}";

Basically you want a super smart compiler that can understand what the rather complicated serializer code is doing and simplify it. The main problem is that nobody has figured out how to write such a super smart compiler. Any takers?

@Pzixel

Pzixel commented Jan 15, 2017

Your example with JSON is really nice, but I don't see how it could be solved at compile time. I think RegexOptions.Compiled was not easy to write; it has support in both C# and the CLR. But I believe that generating such a specialized method in the general case is impossible because of the halting problem. You can write it manually (like it's done for regex in LLVM, see link), but the compiler cannot do it automatically.

Because internally JsonConvert.SerializeObject calls multiple internal methods, which create objects that can depend on current system settings (at least on CurrentCulture when generating the output string). So it's not just about a super smart compiler as @mikedn said; it's impossible even theoretically.

It makes even less sense because we are about to get Roslyn post-compile transformations (see link), which allow you to do such things without hacking C# internals.

@HaloFour

HaloFour commented Jan 15, 2017

I don't intend to distract or disagree with the above proposal. These are just a few of my more immediate thoughts on the subject.

I think that use cases 1 and 2 could be covered by source generators. For example, a specialized serializer could be defined as the following:

public partial class PersonSerializer : JsonSerializer<Person> { }

And the source generator would then interrogate the generic type argument Person and automatically generate all of the methods necessary to perform the serialization. That specialized code would be generated during the compilation pipeline and compiled directly into the assembly.
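For illustration, the generated half of that partial class might look roughly like this (JsonSerializer&lt;Person&gt; and the member layout are hypothetical, and a real generator would also handle escaping, nesting, and culture):

```csharp
// Hypothetical generator output: Person's members (Name: string, Age: int)
// were inspected at compile time, so the serializer is straight-line code
// with no reflection and no option checks left at runtime.
public partial class PersonSerializer
{
    public string Serialize(Person p)
    {
        var sb = new System.Text.StringBuilder();
        sb.Append("{\"Name\":");
        sb.Append(p.Name == null ? "null" : "\"" + p.Name + "\"");
        sb.Append(",\"Age\":");
        sb.Append(p.Age);
        sb.Append('}');
        return sb.ToString();
    }
}
```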

As for the compiler hints, that's a little further reaching and honestly I believe it belongs more to the runtime/JIT compiler than to Roslyn. There are already a number of attributes which can be used to influence the JIT compiler, I could see how that could be expanded with more attributes (or other mechanisms) that the runtime could use as hints to optimize the generated machine code for given IL.

@asdfgasdfsafgsdfa
Author

asdfgasdfsafgsdfa commented Jan 15, 2017

@mikedn @Pzixel
You guys are completely right, it is not an easy task, but it can be approximated!

Because internally JsonConvert.SerializeObject calls multiple internal methods, which create objects that can depend on current system settings (at least on CurrentCulture)

Correct, CurrentCulture would have to be considered a parameter to the "system" (where system means the function being specialized + all the functions it is calling + all the functions they are calling + ...).

But that does not mean that I could not write code in a way that has no "hidden parameters".
If I were writing a new JSON serializer library, and I knew that people would want to optimize my library code for their use case in the way I outlined above, I could just make all my functions "pure" (only referencing parameters and no "outside" values).
Then people could easily give all parameters to the Specialize function.

It is true that in this day and age, libraries are written in a way where they almost certainly use some kind of global state. But if library authors were aware of such a system, they could write their code in a different way.

as @mikedn said, but it's impossible even theoretically.

I am not convinced that is true for all cases. And I argue that people can engineer their code in ways that make it possible, even trivial to do this.

@asdfgasdfsafgsdfa
Author

@HaloFour

And the source generator would then interrogate the generic type argument Person and automatically generate all of the methods necessary to perform the serialization. That specialized code would be generated during the compilation pipeline and compiled directly into the assembly.

Exactly, that's why I'm such a big fan of code generators.
They could be (ab)used to solve the other problems as well, at least partially.
For example, if you have a list of regex patterns in some file at compile time, a code generator could open it and generate methods for them. Unfortunately, that would not work at runtime when new regex patterns are needed. Which is why I thought of this idea.

I'm sure it will most likely be too hard to implement this, but I'd still like to explore the idea.
Who knows, maybe a part of the problem can be solved easily. :)

As for the compiler hints, that's a little further reaching and honestly I believe it belongs more to the runtime/JIT compiler than to Roslyn.

Is there a different github project for the jit compiler / runtime?

There are already a number of attributes which can be used to influence the JIT compiler, I could see how that could be expanded with more attributes (or other mechanisms) that the runtime could use as hints to optimize the generated machine code for given IL.

"a number of attributes"?
I'm only aware of the MethodImpl attribute and AggressiveInlining.
Are there more?

@HaloFour

@asdfgasdfsafgsdfa

Is there a different github project for the jit compiler / runtime?

https://github.com/dotnet/coreclr

"a number of attributes"?
I'm only aware of the MethodImpl attribute and AggressiveInlining.
Are there more?

Well, 1 is a number. 😁

@mikedn

mikedn commented Jan 15, 2017

and I knew that people would want to optimize my library code for their use case in the way I outlined above, I could just make all my functions "pure" (only referencing parameters and no "outside" values).

That's practically impossible for a serializer, you have to use reflection and that counts as "outside".

Now, a compiler could treat reflection operations as intrinsics and then maybe, just maybe, the compiler could do something if the serializer code is simple enough. Trouble is, if you write a very simple serializer to please the compiler then that simple serializer won't perform well when the compiler fails to optimize the code as intended.

Wait a moment, didn't I just assume that the compiler can optimize the code? Yes, I did, but only when the object type is known at compile time. When the object type is not known (and that's a rather common situation) there's nothing that the compiler can do and your simple but inherently inefficient serializer will run at runtime.

@asdfgasdfsafgsdfa
Author

@mikedn

Now, a compiler could treat reflection operations as intrinsics and then maybe, just maybe, the compiler could do something if the serializer code is simple enough.

Exactly :)

When the object type is not known (and that's a rather common situation) there's nothing that the compiler can do and your simple but inherently inefficient serializer will run at runtime.

Waaaait a minute, we have a huge misunderstanding here! I'm sorry!

First, when a type is known at compile time, I could just use code generators! So that case (the type to serialize being known at compile time) is irrelevant.

The whole point of this thing is that the compiler (or something similar, maybe working on IL instead of c#-source-code) runs at runtime.

Now, at runtime you have some new Type SimpleInefficientSerializer.
Maybe that one was even loaded from an assembly which was just downloaded from the internet... (so zero compile time information).

The Specialize system would likely decompile the SimpleInefficientSerializer.SerializeObject method and analyze it.
It would then go and propagate constant values throughout the call tree.

While it is doing that, it will also check each property access against the known hints.
For instance, the .Name of a Type can always be considered constant.
The given parameters (generic and normal) are also considered constant,
and so on...

Sure, the code will most likely not be as pretty or as small as in my ideal example, but it will definitely be smaller, because many if and switch statements are now perfectly known.
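A tiny hand-written sketch of that branch folding (all names hypothetical): once a parameter is known to be a constant, the surrounding if collapses and the specialized method shrinks.

```csharp
static class BranchFoldingSketch
{
    // General form: both conditions are re-tested on every call.
    public static string WriteName(string name, bool escapeNonAscii)
    {
        if (name == null) return "null";
        if (escapeNonAscii) return EscapeNonAscii(name);
        return "\"" + name + "\"";
    }

    // After propagating escapeNonAscii == false and the NeverNullHint from
    // the first post, both branches fold away and only one line survives:
    public static string WriteNameSpecialized(string name) =>
        "\"" + name + "\"";

    // Stub standing in for a real escaping routine.
    static string EscapeNonAscii(string s) => s;
}
```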

@HaloFour

HaloFour commented Jan 15, 2017

@asdfgasdfsafgsdfa

The whole point of this thing is that the compiler (or something similar, maybe working on IL instead of c#-source-code) runs at runtime.

That takes it completely outside of the realm of the compiler as at runtime you're dealing strictly with IL and not C# (or any other specific language). Roslyn only exists to convert C# into IL, it doesn't have a function beyond that point.

That said, exactly when do you propose this happen at runtime? Trying to analyze the IL to determine all potential targets for serialization is probably no easier than doing so in C#. The serialization library is better off detecting when a new type is being serialized/deserialized and at that point emitting the specialized IL for handling that type which is then cached and reused. You'd take that hit at first serialization but then you'd follow the optimized path. I'm pretty sure that describes most of the commonly used serialization libraries already.
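The detect-and-cache pattern described above can be sketched like this (the builder body is a stand-in; a real library would reflect over the type's members and emit IL or a compiled expression tree there):

```csharp
using System;
using System.Collections.Concurrent;

static class SerializerCache
{
    static readonly ConcurrentDictionary<Type, Func<object, string>> Cache =
        new ConcurrentDictionary<Type, Func<object, string>>();

    // First serialization of a type pays the code-generation cost;
    // every later call for that type reuses the cached specialized delegate.
    public static string Serialize(object obj) =>
        Cache.GetOrAdd(obj.GetType(), BuildSerializerFor)(obj);

    // Stand-in for the expensive step: a real library would inspect t's
    // members here and generate specialized code for them.
    static Func<object, string> BuildSerializerFor(Type t) =>
        o => "{\"type\":\"" + t.Name + "\"}";
}
```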

@asdfgasdfsafgsdfa
Author

@HaloFour

That takes it completely outside of the realm of the compiler as at runtime you're dealing strictly with IL and not C# (or any other specific language). Roslyn only exists to convert C# into IL, it doesn't have a function beyond that point.

You are completely right. But I don't think there's any better place to talk about this than here, right? People here know enough about the details of the language, the compiler, and the possible advantages and disadvantages.

That said, exactly when do you propose this happen at runtime?

On request; the programmer would create their own specialized delegates as needed.
For an example application, see my mockup code in Example 1 in the first post.

You explicitly tell the system "generate specialized code for me, here are the assumptions you can make, the parameters you can treat as constants, ..."

And you get back a delegate that only works under the assumptions you have given.
Of course, the returned delegate will have a different type than the original function, because former parameters are now missing, as they are "baked into" the code already.

(Or the system would just return a delegate with the original signature, ignoring parameters that have assumptions made on them)

@HaloFour

@asdfgasdfsafgsdfa

But I don't think there's any better place to talk about this than here, right?

If you're looking for the JIT to specialize code based on hints provided at runtime then I think that the coreclr repo is the more appropriate place. What you're describing sounds like either something built into the JIT or something built on top of dynamic assemblies.

@mikedn

mikedn commented Jan 15, 2017

The Specialize system would likely decompile the SimpleInefficientSerializer.SerializeObject method and analyze it. It would then go and propagate constant values throughout the call tree.

Doing such a thing at runtime is very expensive. It's also problematic because the number of specializations can grow out of control, so there have to be some limits. But such limits may imply that the same scenario ends up being specialized multiple times, and that only adds to the runtime cost.

There are compilers that do this kind of stuff but on very small scale compared to what you suggest here. JavaScript compilers do this but they have a comparatively very small problem - how to access object members efficiently. They don't optimize arbitrary amounts of code in the way that you suggest here.

@asdfgasdfsafgsdfa
Author

@mikedn
Yeah, I can imagine how that would take a huge amount of time.
On the order of tens of milliseconds just for medium-sized functions (including all the functions that get called).

Maybe even on the order of seconds when the call tree gets big enough, I mean if you take a look at all the stuff that gets done in a good JsonSerializer...

It's also problematic because the number of specializations can grow out of control, so there have to be some limits. But such limits may imply that the same scenario ends up being specialized multiple times, and that only adds to the runtime cost.

But I am optimistic, because what I suggest will only create a new specialized method when the programmer calls the Specialize function.
It would only be used when it is known that there will be a performance benefit, not automatically.

For example, if I know my server application will need to serialize millions of objects to JSON, then I would want to let the system generate a specialized method for my use case from the existing code.

There are compilers that do this kind of stuff but on very small scale compared to what you suggest here. JavaScript compilers do this but they have a comparatively very small problem - how to access object members efficiently.

I am only vaguely aware of the optimizations in js engines, do you have some example / link / source on that? I'd love to see how that is done.

@mikedn

mikedn commented Jan 15, 2017

For example, if I know my server application will need to serialize millions of objects to JSON, then I would want to let the system generate a specialized method for my use case from the existing code.

The funny thing is that this happens already, but in a different way. Serializers (at least good ones) usually generate code at runtime that's specific to the type you're serializing. They know exactly what code needs to be generated and that puts them at an advantage compared to a compiler that would need to try to understand what the serializer code is doing and try to generate better code for a given type. Basically you have specialized code producing other specialized code compared to general purpose code attempting to produce specialized code.

I am only vaguely aware of the optimizations in js engines, do you have some example / link / source on that? I'd love to see how that is done.

I don't have much interest in JS optimizations so I don't have any handy links. The basic idea is that they attempt to create types on the fly and generate code specialized for those types. Such specialized code includes type checks to ensure that it only runs on the types it was generated for; otherwise non-specialized code needs to be run instead, or more specialized code needs to be generated.

In any case, the idea is that what JS engines do is very focused. They deal with random pieces of code but only look for and optimize certain aspects of the code.

@Pzixel

Pzixel commented Jan 15, 2017

@asdfgasdfsafgsdfa here is some information about it

@gafter gafter added this to the Unknown milestone Jan 16, 2017