Introduction to IL (MSIL/CIL)

The Assembly Language of .NET

In my previous post, I underscored the importance of developing the skills necessary to create high-performance applications across all levels. Many of my upcoming posts will emphasize performance as a key focus.

When writing high-performance .NET code, understanding the underlying Intermediate Language (IL) can provide deep insights into how the .NET runtime executes your code. IL, also known as Microsoft Intermediate Language (MSIL) or Common Intermediate Language (CIL), is the low-level, stack-based language to which all .NET high-level languages compile before being JIT-compiled into native machine code.

For any .NET software engineer seeking optimal performance and efficiency in their applications, a solid understanding of Intermediate Language (IL) is indispensable. IL serves as the bridge between high-level .NET languages and the native machine code executed by the .NET runtime. By deciphering IL, developers can gain critical insights into the inner workings of their code, enabling them to pinpoint performance bottlenecks, understand execution flow, and validate the accuracy and efficiency of their implementations. As my upcoming posts will often delve deeply into performance-related topics, understanding IL will provide the foundational knowledge necessary to appreciate the complexities and nuances discussed, thus empowering engineers to craft high-performance, reliable applications.

In this post, I’ll introduce the basic concepts of IL, covering its stack-based nature, a couple of standard instructions, and how parameters and operations interact with the evaluation stack.

Before Going Further

For this article and any others that discuss Intermediate Language (IL) generation in the context of the build and execution pipeline, I am describing the typical build process used by most .NET applications. Unless explicitly stated otherwise, I deliberately exclude cases where native code is generated ahead of time rather than by the JIT compiler at run time, such as ReadyToRun (R2R), Native Ahead-of-Time (AOT), and similar techniques.

Core Concepts

Brief Definitions

The .NET infrastructure specification contains several definitions, but only a handful of them are needed to understand how the IL operates. These definitions help us understand how the components interact and the role IL has in the .NET ecosystem.

The Common Language Infrastructure (CLI) establishes the framework within which IL operates, defining the executable code format and the runtime environment that can execute this code. This ensures that IL remains consistent and interoperable across different .NET languages.

The Common Type System (CTS) further clarifies how types are declared, used, and managed across languages, promoting cross-language integration and type safety. This unified type system is essential for IL, allowing engineers to write code in multiple .NET languages while ensuring that the underlying IL maintains high performance and reliability.

The Common Language Specification (CLS) defines a subset of the CTS, together with a set of usage conventions, that languages and libraries agree on to ensure interoperability and consistency. This matters for IL because code that follows the CLS can be consumed from any CLS-compliant language, irrespective of the high-level language that originally produced the IL.

The Virtual Execution System (VES) implements and enforces the CTS model and is responsible for loading and running programs written for the CLI. By managing the IL evaluation stack, platform-specific compilation, and runtime services, the VES ensures that IL executes correctly and efficiently.

The Execution Engine (EE) is the concrete implementation of the Virtual Execution System (VES) within the .NET runtime, specifically designed to meet the needs of the .NET runtime environment.

.NET IL is often referred to as CIL and MSIL interchangeably. CIL (Common Intermediate Language) is the acronym used in the CLI ECMA standard. MSIL (“Microsoft Intermediate Language”) is a term commonly used to refer to the Microsoft implementation of the CIL for .NET, but they are used interchangeably nearly everywhere. The CIL is “The instruction set understood by the VES (Virtual Execution System).”

Intermediate Language

This post focuses on MSIL, the CIL implementation for .NET. .NET compiles source code written in languages such as C#, F#, and VB.NET into an intermediate language (IL). This keeps the compiled code portable across platforms and gives every language that follows the Common Language Specification (CLS) a common compilation target. When the Execution Engine executes a module containing IL on a supported platform, the Just-in-Time (JIT) compiler within the EE compiles that IL from its intermediate, portable form into machine-specific native code.

Figure: Two Compilation Stages

Understanding IL

With the core concepts and definitions covered, we now explore the basics of IL itself.

Evaluation Stack

In contrast to register-based machine code (I will explore machine code more in future posts), Intermediate Language (IL) is a stack-based language. This means that IL instructions do not name registers as operands; instead, they pop their operands from an evaluation stack and push their results back onto it. Apart from the instructions whose job is to load values onto the stack or store values from it, every operation works exclusively on values residing on the stack.

The stack serves a conceptual purpose, providing a unified understanding of the operations being executed, the operands required for those operations, and any resultant values. The typical steps for IL operations involve the following:

1. Push the necessary operands onto the stack for the operation.
2. Execute the operation.
3. Pop the result of the operation off the stack and store it in a designated location (e.g., a local variable), if:
   - The operation yields a result.
   - The result is not an intermediate operand for a subsequent operation. Otherwise, the result may be left on the stack as an operand for the next operation.

The Just-In-Time (JIT) compiler translates this conceptual stack model into machine code specific to the CPU’s architecture, typically assigning the stack values to CPU registers and memory locations.

For most purposes outside of compiler development, the evaluation stack concept serves primarily as a mental framework for understanding the operations and operands during the execution flow of Intermediate Language (IL) instructions. One essential aspect of that framework is the transitions occurring with each IL instruction between two distinct states of the evaluation stack.
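The three steps listed above can be sketched in C# by treating a Stack<int> as a stand-in for the evaluation stack. This is only a mental-model illustration under stated assumptions: the class name, the AddAndStore helper, and the locals array are hypothetical, and the runtime does not execute IL this way.

```csharp
using System;
using System.Collections.Generic;

public static class EvaluationStackSketch
{
    // Hypothetical helper: models "int sum = a + b" on a simulated evaluation stack.
    public static int AddAndStore(int a, int b)
    {
        var stack = new Stack<int>();   // simulated evaluation stack
        var locals = new int[1];        // simulated local variable slots

        // Step 1: push the operands.
        stack.Push(a);
        stack.Push(b);

        // Step 2: execute the operation (pop two operands, push one result).
        int value2 = stack.Pop();
        int value1 = stack.Pop();
        stack.Push(value1 + value2);

        // Step 3: pop the result and store it in local slot 0.
        locals[0] = stack.Pop();

        return locals[0];
    }

    public static void Main()
    {
        Console.WriteLine(AddAndStore(2, 3)); // prints 5
    }
}
```

If the sum were an intermediate operand for another operation, step 3 would be skipped and the result would simply stay on the stack.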

Stack Transitions

Every operation results in a discrete stack transition, which is documented with a simple stack transition description. A transition description looks something like this:

…, value1, value2  ⟶  …, result

This shows the before and after state of the evaluation stack. The left side of the arrow illustrates the state of the stack before the operation, and the right side naturally displays the state of the stack after the operation has executed. This example

…, value1, value2

indicates that the stack may already hold values that are not involved in this operation (depicted with ‘…’). On top of those, two operand values (value1 and value2) have been pushed onto the evaluation stack, and the operation pops them off the stack to operate on them.

The right side of the transition shows

…, result

When the operation completes, the result value is pushed onto the stack. The original values (if any, represented by ‘…’) will still be in place on the stack underneath the result, but the operands that were pushed for the operation have been removed.

Note that there is a NOP (no operation) instruction, and the stack transition appears precisely as you would expect.

…,  ⟶  …,

Understanding stack transitions is fundamental for engineers when reviewing IL produced by their high-level language code for performance and efficiency issues. Stack transitions reveal how data is managed and manipulated during operation execution, providing a clear basis for understanding what Intermediate Language (IL) operations do.
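The pop and push halves of a stack transition are also exposed as data by the System.Reflection.Emit.OpCode struct, through its StackBehaviourPop and StackBehaviourPush properties. The following sketch prints them for a few instructions; the class name and the Describe helper are hypothetical and exist only for illustration.

```csharp
using System;
using System.Reflection.Emit;

public static class StackTransitionSketch
{
    // Hypothetical helper: formats how an opcode pops from and pushes to the stack.
    public static string Describe(OpCode op) =>
        $"{op.Name}: pops {op.StackBehaviourPop}, pushes {op.StackBehaviourPush}";

    public static void Main()
    {
        Console.WriteLine(Describe(OpCodes.Nop));     // no operands, no result
        Console.WriteLine(Describe(OpCodes.Ldarg_0)); // no operands, one result
        Console.WriteLine(Describe(OpCodes.Add));     // two operands, one result
        Console.WriteLine(Describe(OpCodes.Ret));     // variable: depends on the return type
    }
}
```

For add, the reported behaviour (two values popped, one pushed) should line up with the transition …, value1, value2 ⟶ …, result.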

Stack Transition Example

For example, consider this simple C# method:

static int IntegerAdd(int a, int b)
{
    return a + b;
}

Let’s examine the IL code generated from this example by stepping through the stack transitions that occur with each instruction.

We start with an evaluation stack that contains preexisting values irrelevant to the current operation. Although this illustration shows a certain number of existing values, the count can vary from 0 to n, indicating that the stack could also be empty.

Figure: Existing evaluation stack with 0..n unknown values

In IL, arguments are passed in left-to-right order. The method caller pushes the arguments onto the evaluation stack in the order in which they appear in the method signature, and IL assigns an ordinal number to each argument in precisely this order, starting at 0. (For instance methods, argument 0 is the this reference; IntegerAdd is static, so its numbering starts with the first declared parameter.)

That means that for this IntegerAdd method, ‘a’ is argument 0, and ‘b’ is argument 1.

int IntegerAdd (/* argument 0 */ int a, /* argument 1 */ int b)

To implement the a + b operation, we will execute the IL add instruction. The description of this instruction is “the add instruction adds value2 to value1 and pushes the result on the stack,” and the stack transition is:

…, value1, value2  ⟶  …, result

To execute the add instruction, it is necessary to provide the two operand values. The values of the arguments passed to our IntegerAdd method are pushed onto the evaluation stack using the ldarg (load argument) instruction. This instruction is described as follows: “the ldarg num instruction pushes onto the evaluation stack the num’th incoming argument, where arguments are numbered 0 onwards.” The stack transition is as follows:

…,  ⟶  …, value

IL provides the short forms ldarg.0, ldarg.1, ldarg.2, and ldarg.3 for the first four arguments; higher argument numbers use ldarg.s num or ldarg num. Consequently, we load the IntegerAdd method arguments onto the stack as value1 and value2 using these instructions for the IL add operation.

ldarg.0      // Push argument 0 ('a') onto the stack as value1 for the add IL instruction.
ldarg.1      // Push argument 1 ('b') onto the stack as value2 for the add IL instruction.

Figure: ldarg.0 Stack Transition

Figure: ldarg.1 Stack Transition

Then, invoke the add IL instruction

add         // Add the values of the top two stack elements and push the result.

Figure: add Operation Stack Transition

After the evaluation stack contains the result, the final step is to pop the value off the stack and return it as the output of the IntegerAdd method.

ret         // Return from the method and transfer execution control to the caller.

The stack transition and the description of the ret instruction are where things get interesting. The stack transition description is:

retVal on callee evaluation stack (not always present)  ⟶
        …, retVal on caller evaluation stack (not always present)

and the ‘ret’ instruction description says, “Return from the current method. The return type, if any, of the current method determines the type of value to be fetched from the top of the stack and copied onto the stack of the method that called the current method. The evaluation stack for the current method shall be empty except for the value to be returned.”

In an upcoming post, I will explain the meaning of “the return type, if any, of the current method” in an IL context. However, the remaining part of this description may not be immediately intuitive. I will briefly clarify it below. For now, the IntegerAdd method in IL culminates with the following:

ldarg.0        // Push argument 0 (“a”) onto the stack
ldarg.1        // Push argument 1 (“b”) onto the stack
add            // Pop “a” and “b” from the stack, add them, and push the result
ret            // Return from the IntegerAdd method with the result on top of the stack
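The four instructions above can also be emitted and executed from C# with System.Reflection.Emit.DynamicMethod, which is one way to experiment with small IL sequences. This is a minimal sketch, assuming a runtime with a JIT compiler (it is not expected to work under Native AOT); the class name and the BuildIntegerAdd helper are hypothetical.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitIntegerAddSketch
{
    // Builds a delegate whose body is the four IL instructions of IntegerAdd.
    public static Func<int, int, int> BuildIntegerAdd()
    {
        var method = new DynamicMethod(
            "IntegerAdd",                          // name, used for diagnostics only
            typeof(int),                           // return type
            new[] { typeof(int), typeof(int) });   // parameter types: a, b

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);  // …        ⟶ …, a
        il.Emit(OpCodes.Ldarg_1);  // …, a     ⟶ …, a, b
        il.Emit(OpCodes.Add);      // …, a, b  ⟶ …, result
        il.Emit(OpCodes.Ret);      // return the value on top of the stack

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> integerAdd = BuildIntegerAdd();
        Console.WriteLine(integerAdd(2, 3)); // prints 5
    }
}
```

Removing or reordering an Emit call is a quick way to see how the runtime reacts to an IL sequence whose stack transitions do not line up.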

Stack transitions play a significant role in understanding IL and in interpreting the IL generated by your compiler. This simple example illustrates what the stack transition descriptions depict, to help you with that interpretation.

Empty Stack

The definition for the ret instruction states: “The evaluation stack for the current method shall be empty except for the value to be returned.” Note that this description refers to an evaluation stack for the current method. An observant reader will see that this implies each method has its own discrete evaluation stack, which starts empty at the beginning of method execution. This perspective helps when reasoning about a single method. However, it may not be immediately apparent how it fits with the code that calls the IntegerAdd method and the state of the caller’s stack at that call. This can be especially relevant for individuals with a background in native code, such as C, C++, or Assembly.

In a subsequent post, I will provide a detailed examination of the call IL instruction. For now, I will briefly outline the requirements for invoking a method in IL. As demonstrated earlier, when executing an IL instruction that requires operands, the operands are initially pushed onto the evaluation stack. The state of the stack then transitions according to the invoked operation. Similarly, when calling a method, the necessary argument values are pushed onto the evaluation stack before the call instruction for the desired method is invoked; upon return, the result (if any) is placed at the top of the stack.
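The calling sequence described above can be tried out with a small emitted caller. This is a hedged sketch rather than a full treatment of call: it assumes a runtime with a JIT compiler, the class name and the BuildCaller helper are hypothetical, and it uses ldc.i4 (load a 32-bit integer constant), an instruction this post has not introduced, to push the argument values.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CallSketch
{
    public static int IntegerAdd(int a, int b) => a + b;

    // Emits a parameterless caller that pushes two constants and calls IntegerAdd.
    public static Func<int> BuildCaller(int left, int right)
    {
        MethodInfo target = typeof(CallSketch).GetMethod(nameof(IntegerAdd))
            ?? throw new InvalidOperationException("IntegerAdd not found.");

        var caller = new DynamicMethod("CallIntegerAdd", typeof(int), Type.EmptyTypes);
        ILGenerator il = caller.GetILGenerator();
        il.Emit(OpCodes.Ldc_I4, left);   // …               ⟶ …, left
        il.Emit(OpCodes.Ldc_I4, right);  // …, left         ⟶ …, left, right
        il.Emit(OpCodes.Call, target);   // …, left, right  ⟶ …, retVal
        il.Emit(OpCodes.Ret);            // return retVal to our own caller

        return (Func<int>)caller.CreateDelegate(typeof(Func<int>));
    }

    public static void Main()
    {
        Console.WriteLine(BuildCaller(2, 3)()); // prints 5
    }
}
```

The arguments are pushed left to right before the call, and the only thing left on the caller's stack afterwards is the return value.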

Figure: call Operation Resulting Stack

At the start of the IntegerAdd method’s execution, it is tempting to assume that the evaluation stack reflects the current state of the entire method call chain, containing pre-existing values that serve as the initial context for the IntegerAdd method.

Figure: Two arguments are pushed onto the evaluation stack.

However, if this were the case, then the ldarg.0 and ldarg.1 instructions would have to work like this:

Figure: Conceptual Argument Stack Copy/Push

The ldarg operations can’t work this way because the CIL evaluation stack operates as a ‘pure stack’ data structure. Consequently, it does not support pulling values from within the stack and placing them at the top. Instead, a more correct way to consider this is that the arguments are popped off the caller’s stack and placed into an argument array within the scope of the called method before the called method begins execution.

While this may not be a universally used term, a ‘pure stack’ in this context is a stack data structure that supports simple operations, including push, pop, and possibly peek. It cannot enumerate the values in the stack or retrieve values that are not on the top of the stack.

Note: The peek stack operation can be implemented using a pop followed by a push, so it is reasonable to say a ‘pure stack’ only supports push and pop.

Figure: call Operation Moves Arguments

Now, the caller’s stack is in a state to receive the call’s result, and the called method’s stack is empty.

The evaluation stack transitions for the ldarg instructions in the IntegerAdd method now more accurately resemble the following. The first argument, taken from the first element of the argument array, is pushed onto the stack.

Figure: Parameter ‘a’ pushed from argument array element [0]

Then, the second argument, taken from the second element of the argument array, is pushed onto the stack.

Figure: Parameter ‘b’ pushed from argument array element [1]

Therefore, it is appropriate to consider a unique evaluation stack for each method, and the stack is initially empty when method execution begins. In future posts, as I examine some aspects of machine code, it will become clear how the IL evaluation stack is similar to and different from the stacks used by CPUs for executing thread call chains.
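The per-method view described in this section can be modeled in a few lines of C#. The sketch below is purely conceptual: Stack<int> stands in for an evaluation stack, an int[] stands in for the argument array, and the class and method names are hypothetical. It makes no claim about how the runtime implements calls.

```csharp
using System;
using System.Collections.Generic;

public static class PerMethodStackSketch
{
    // Simulated callee: it gets an argument array and its own, initially empty, stack.
    static int IntegerAdd(int[] args)
    {
        var stack = new Stack<int>();      // callee's evaluation stack starts empty
        stack.Push(args[0]);               // ldarg.0
        stack.Push(args[1]);               // ldarg.1
        int value2 = stack.Pop();          // add: pop both operands...
        int value1 = stack.Pop();
        stack.Push(value1 + value2);       // ...and push the result
        int retVal = stack.Pop();          // ret: take the return value
        if (stack.Count != 0)
            throw new InvalidOperationException("Stack must be empty at ret.");
        return retVal;
    }

    // Simulated call: moves the arguments off the caller's stack, then pushes the result.
    public static void Call(Stack<int> callerStack)
    {
        // Arguments were pushed left to right, so they come off in reverse order.
        int b = callerStack.Pop();
        int a = callerStack.Pop();
        callerStack.Push(IntegerAdd(new[] { a, b }));
    }

    public static void Main()
    {
        var callerStack = new Stack<int>();
        callerStack.Push(100);  // pre-existing value unrelated to the call
        callerStack.Push(2);    // argument 0
        callerStack.Push(3);    // argument 1
        Call(callerStack);
        // Stack<T> enumerates from the top down, so this prints "5, 100".
        Console.WriteLine(string.Join(", ", callerStack));
    }
}
```

The callee never touches the caller's pre-existing value, and the caller ends up with the result on top of whatever was there before the arguments were pushed.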

Scratching the Surface

This post primarily focuses on the evaluation stack employed in the Common Intermediate Language (CIL) and introduces the concepts of managing arguments, instructions, and results through this stack. Some additional material is required to gain a comprehensive understanding of the frequently used instructions in Intermediate Language (IL) and the way the Common Language Runtime (CLR) utilizes metadata.

Understanding IL is advantageous for optimizing performance-critical code. Some high-level language constructs (syntactic sugar in C#, for example) may produce inefficient Intermediate Language (IL), resulting in unnecessary stack operations, branching, method calls, or memory allocations. Analyzing the IL can help identify bottlenecks and make appropriate optimizations.

The primary objective is not to comprehend every detail and intricacy of IL but to develop a comfortable understanding of what to look for when reviewing generated code, so that you can pinpoint and resolve performance issues. While you will refer to specific instructions for unfamiliar details as needed, it is essential to build a foundational understanding from the outset.

Call to Action

As a .NET software engineer, one of the most effective ways to gain deeper insights into the efficiency of your code is to start examining the Intermediate Language (IL) generated from your .NET source code. Begin by familiarizing yourself with IL instructions and understanding how they are used to manipulate the evaluation stack. By doing so, you can identify how your high-level constructs translate to IL and pinpoint areas where performance can be optimized.

Take the time to review the IL code generated by your compiler for different parts of your application. Look for patterns and standard instructions, and note instances where unnecessary stack operations or memory allocations occur. This exercise will not only enhance your understanding of the .NET execution process but also empower you to write more efficient and high-performance code.

Additionally, explore tools and resources to help you disassemble and analyze IL code. By incorporating IL analysis into your development workflow, you’ll be better equipped to address performance bottlenecks and improve the overall efficiency of your .NET applications.
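Besides dedicated disassemblers, the raw IL of a compiled method can be read through reflection, which is enough for a first look. The sketch below uses MethodBase.GetMethodBody and MethodBody.GetILAsByteArray; the class name and the GetIlBytes helper are hypothetical, and the approach assumes a JIT-based runtime where method bodies are still available.

```csharp
using System;
using System.Reflection;

public static class IlBytesSketch
{
    public static int IntegerAdd(int a, int b)
    {
        return a + b;
    }

    // Returns the raw IL bytes of the named public method on this class.
    public static byte[] GetIlBytes(string methodName)
    {
        MethodInfo method = typeof(IlBytesSketch).GetMethod(methodName)
            ?? throw new ArgumentException($"No public method named {methodName}.");
        MethodBody body = method.GetMethodBody()
            ?? throw new InvalidOperationException("Method has no IL body.");
        return body.GetILAsByteArray() ?? Array.Empty<byte>();
    }

    public static void Main()
    {
        // The exact bytes depend on the compiler and the build configuration.
        Console.WriteLine(BitConverter.ToString(GetIlBytes(nameof(IntegerAdd))));
    }
}
```

In the output, 0x02 and 0x03 are the encodings of ldarg.0 and ldarg.1, 0x58 is add, and 0x2A is ret. An optimized build would typically show just those four bytes, while a debug build usually adds nop, local store/load, and branch instructions around them.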

There are many references available to help you better understand IL.

The documentation for the System.Reflection.Emit OpCodes class (https://learn.microsoft.com/en-us/dotnet/api/system.reflection.emit.opcodes) is a reference for IL opcodes that is friendly to C# readers. This class contains fields representing individual opcodes as OpCode struct instances. The reference page for each field offers a detailed description of the respective opcode.

The most detailed and low-level document is the CLI specification itself (ECMA 335) – https://ecma-international.org/publications-and-standards/standards/ecma-335/

You can find all the standards documents related to the CLI and C# on the MS ECMA Standard reference page: https://learn.microsoft.com/en-us/dotnet/fundamentals/standards

Summary

IL is central to .NET execution, and a comprehensive understanding of it can assist in writing efficient and predictable high-performance code. Familiarizing oneself with IL’s stack-based operations, standard instructions, and the stack transitions during method execution enhances the ability to analyze and optimize .NET applications.

In the next post, several additional scenarios will be presented to introduce a few more standard instructions and concepts frequently encountered. This will include an example of writing an entire assembly from scratch using only IL. Although this exercise is seldom used in practice, it offers significant insights into the code generated by C#, F#, or VB.NET compilers.

This knowledge serves as a foundation for later examining the machine code produced from the IL by the JIT compiler and executed by the CPU, offering a deeper understanding of potential performance issues in the code. Future posts will dig further into those topics.

References

Common Language Infrastructure Specification: https://ecma-international.org/publications-and-standards/standards/ecma-335/

OpCodes Class: https://learn.microsoft.com/en-us/dotnet/api/system.reflection.emit.opcodes

.NET ECMA Standards Reference: https://learn.microsoft.com/en-us/dotnet/fundamentals/standards/

ILSpy Visual Studio Extension: https://marketplace.visualstudio.com/items?itemName=SharpDevelopTeam.ILSpy