Lessons learned

Marshalling

Interoperability in C# has hidden costs and gotchas. How can they be avoided?

For C#, the Common Language Runtime (CLR) marshals data between managed and unmanaged contexts (forwards and possibly backwards). In layman's terms, marshalling transforms the bit representation of a data structure so that it is correct for the target programming language. For best performance, marshalling should at worst be minimal and at best be pass-through.

Performance

Pass-through is the ideal situation when considering performance because both languages agree on the bit representation of data structures without any further processing. C# calls such data structures "blittable". (The word "blit" means the rapid copying of a block of memory; it comes from the bit-block transfer (bit-blit) operation commonly found in computer graphics.) However, to achieve blittable data structures in C#, the garbage collector (GC) has to be avoided. Why? Because class instances in C# are objects whose allocation and bit layout can't be controlled precisely by the developer; it's an "implementation detail."
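As a rough illustration of what "agreeing on the bit representation" means, here is a minimal sketch of the C side of such a structure. The struct `vec3_sample` and the function `vec3_sample_sum` are hypothetical names invented for this example; they do not come from C2CS or any real library. The compile-time checks assume a typical platform with a 4-byte `float`.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* int32_t */

/* Hypothetical struct for illustration only. Every field has a fixed size,
   so the layout is fully determined by the declaration order. */
typedef struct vec3_sample {
    float x;
    float y;
    float z;
    int32_t id;
} vec3_sample;

/* Assumption: float is 4 bytes, as on most current platforms. */
static_assert(sizeof(float) == 4, "this sketch assumes a 4-byte float");

/* The layout both sides of the interop boundary have to agree on. */
static_assert(offsetof(vec3_sample, x) == 0, "x starts at byte 0");
static_assert(offsetof(vec3_sample, y) == 4, "y starts at byte 4");
static_assert(offsetof(vec3_sample, z) == 8, "z starts at byte 8");
static_assert(offsetof(vec3_sample, id) == 12, "id starts at byte 12");
static_assert(sizeof(vec3_sample) == 16, "no padding is expected");

/* Hypothetical function taking the struct by value. A caller that already
   holds these 16 bytes in this layout can pass them through unchanged. */
float vec3_sample_sum(vec3_sample v)
{
    return v.x + v.y + v.z;
}
```

A C# struct that declares the same four fields in the same order with sequential layout would be the blittable counterpart: its bytes already match, so nothing needs to be converted at the boundary.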

Simplicity

The default marshalling rules documented by Microsoft and by Mono are unintuitive for C# classes. They require the developer to learn specific knowledge and follow the rules for correct interoperability of classes, such as avoiding a double-free. The problem is that most developers have not learned these nuances and likely never will, because (1) P/Invoke of classes does not come up that often and (2) it adds to the list of "all the little details to know to wield the power of C# properly", which leads to complexity for the developer. Pass-through marshalling of blittable C# structs is simpler, and thus reduces complexity for the developer, because it follows and leverages one's existing knowledge and expectations of how C works. The catch is that the developer is then responsible for the memory management of blittable C# structs as if they were coding in C. However, the cost of managing memory manually is worth it because there are no hidden allocations or control flow done by the Common Language Runtime (CLR) on your behalf. When something goes wrong with interoperability with a C library, making things as simple as possible to understand for as many people as possible is the fastest way to solve the problem.

For these two reasons, (1) performance and (2) simplicity, I recommend using only blittable structs for interoperability with C from C#. C2CS follows this recommendation when generating C# code.

The garbage collector is a software industry hack

The software industry's attitude, especially among business developers and web developers, of calling memory an "implementation detail" and then ignoring it is often justified without knowing or caring about the consequences; ultimately, it becomes dangerous.

A function call that changes the state of the system is a side effect. Humans are imperfect at reasoning about side effects, that is, at reasoning about non-linear systems. An example of a side effect is calling fopen in C because it leaves a file in an open state. malloc in C is another example of a side effect because it leaves a block of memory allocated. Notice that side effects come in pairs. To close a file, fclose is called. To deallocate a block of memory, free is called. Other languages have their versions of such function pairs. Some languages went as far as inventing language-specific features, some of which become part of our software programs, so we humans don't have to deal with such pairs of functions. In theory, this is a great idea. And thus, for the specific case of malloc and free, we invented garbage collection to take us to the promised land of never having to deal with this specific pair of functions.
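The pairing of side effects can be sketched in C as follows. This is a minimal example, not a recommended library design: `copy_text` and `write_text` are hypothetical helpers invented for illustration, and only the standard library calls (malloc, free, fopen, fclose, fputs, memcpy, strlen) are real.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL on failure.
   The malloc here is one half of a pair; the caller owes the matching free. */
char *copy_text(const char *text)
{
    assert(text != NULL);
    size_t size = strlen(text) + 1;  /* include the terminating '\0' */
    char *copy = malloc(size);       /* side effect: a block is now allocated */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, text, size);
    return copy;
}

/* Hypothetical helper: writes text to path. Returns 0 on success, -1 on failure.
   Both halves of the fopen/fclose pair live inside this one function. */
int write_text(const char *path, const char *text)
{
    FILE *file = fopen(path, "w");   /* side effect: a file is now open */
    if (file == NULL) {
        return -1;
    }
    int ok = fputs(text, file) >= 0;
    if (fclose(file) != 0) {         /* paired call: the file is closed again */
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

In `write_text` both halves of the pair sit in the same function, so the open state never escapes. In `copy_text` the second half is left to the caller, who must eventually call free on the returned pointer. That obligation is exactly what a garbage collector is meant to take away.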

In practice, using garbage collection to manage your memory automatically turns out to be a horrible idea. This becomes evident if you have ever worked on a large enough system that needs real-time responsiveness. In fairness, most applications don't require real-time responsiveness, and it is a lot easier to write safe programs with a garbage collector. However, this is where I think the problem starts. The problem is that developers have become ignorant of why good memory management is essential. This "Oh, the system will take care of it, don't worry" attitude is like a disease that spreads like wildfire in the industry. The reason is simple: it lowers the bar of experience + knowledge + time required to write safe software. The consequence is that a large number of developers have learned to use a Golden Hammer. (The world of finance also has a definition for Golden Hammer which is relatable.)

Developers have learned to ignore how the hardware operates when solving problems with software, even to the extreme point of denying that the hardware exists. Optimizing code for performance has become an exercise of stumbling around in the pitch-black dark until you find something of interest; it's an afterthought. Even if the developer does find something of interest, it likely opposes their worldview of understandable code because they have lost touch with the hardware, lost touch with reality. C# is a useful tool, but you and I have to admit that people mostly use it as a Golden Hammer. Just inspect the source code that this tool generates for native bindings as proof of this fact. From my experience, a fair number of C# developers don't spend their time with such code, don't know how to use structs properly, or don't even know what blittable data structures are. C# developers (including myself) may need to take a hard look in the mirror, especially if we are open to criticizing developers of other programming languages or other fields of business with their own Golden Hammers such as Java, JavaScript, or Electron (:scream:).