Monitoring .NET Applications

.NET is a managed runtime. This means it provides high-level facilities that manage your program for you (from Introduction to the Common Language Runtime (CLR), 2007):







The runtime provides many facilities, so it's convenient to divide them into the following categories:



  1. Fundamental facilities that affect the design of the others. These include:
    1. garbage collection;
    2. memory safety and type safety;
    3. high-level support for programming languages.
  2. Secondary facilities that build on the fundamental ones. Many useful programs can do without them. These include:
    1. application isolation using AppDomains;
    2. application security and sandboxing.
  3. Other facilities that every runtime needs, but that do not build on the fundamental CLR facilities. They reflect the desire to provide a complete programming environment. These include:
    1. versioning;
    2. debugging/profiling;
    3. interoperation.

Notice that although debugging and profiling are neither fundamental nor secondary facilities, they still make the list because of the 'desire to provide a complete programming environment'.













The rest of this post looks at what monitoring, observability, and introspection features exist in the CoreCLR, why they are useful, and how the runtime provides them.







Diagnostics



First, let's take a look at the diagnostic information the CLR makes available to us. Traditionally, Event Tracing for Windows (ETW) has been used for this.

The CLR provides information about a wide range of events. They relate to:









For example, here an event is raised when an AppDomain is loaded, here one is tied to an exception being thrown, and here to the garbage collector's allocation and collection activity.
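As an illustration (a minimal sketch, assuming .NET Core, where the runtime's events are also surfaced in-process), an EventListener can subscribe to the 'Microsoft-Windows-DotNETRuntime' provider and print GC-related events; the 0x1 value below is the GC keyword, and the surrounding Program class is just scaffolding for the example:

    using System;
    using System.Diagnostics.Tracing;
    using System.Threading;

    // Subscribes to the CLR's own event provider and prints GC-related events.
    class RuntimeEventListener : EventListener
    {
        protected override void OnEventSourceCreated(EventSource source)
        {
            if (source.Name == "Microsoft-Windows-DotNETRuntime")
            {
                // 0x1 is the GC keyword; Informational keeps the event volume low.
                EnableEvents(source, EventLevel.Informational, (EventKeywords)0x1);
            }
        }

        protected override void OnEventWritten(EventWrittenEventArgs eventData)
        {
            Console.WriteLine($"{eventData.EventName} (id {eventData.EventId})");
        }
    }

    class Program
    {
        static void Main()
        {
            var listener = new RuntimeEventListener();
            for (int i = 0; i < 5; i++)
            {
                GC.Collect();          // force some GC events to show up
                Thread.Sleep(500);
            }
            GC.KeepAlive(listener);
        }
    }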







PerfView



If you want to see the ETW events related to your .NET applications, I recommend the excellent PerfView tool; start with these training videos or the presentation PerfView: The Ultimate .NET Performance Tool. PerfView is widely recognized for surfacing invaluable information; for example, Microsoft engineers regularly use it to analyze performance.
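Your own application's events flow through the same pipeline. As a rough sketch (the provider name 'MyCompany-MyApp' and the event methods are made up for this example), a custom EventSource lets PerfView collect your events alongside the CLR's own, for instance via its 'Additional Providers' option:

    using System.Diagnostics.Tracing;

    // A custom event source; PerfView can collect these events together with
    // the runtime's own ETW events.
    [EventSource(Name = "MyCompany-MyApp")]
    sealed class AppEventSource : EventSource
    {
        public static readonly AppEventSource Log = new AppEventSource();

        [Event(1, Level = EventLevel.Informational)]
        public void RequestStarted(string url) => WriteEvent(1, url);

        [Event(2, Level = EventLevel.Informational)]
        public void RequestFinished(string url, long elapsedMs) => WriteEvent(2, url, elapsedMs);
    }

    // Usage from application code:
    //   AppEventSource.Log.RequestStarted("/home");
    //   ...handle the request...
    //   AppEventSource.Log.RequestFinished("/home", 42);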







Common infrastructure



In case the name doesn't make it obvious, Event Tracing for Windows is only available on Windows, which doesn't fit well with the cross-platform world of .NET Core. You can still collect performance data under Linux (using LTTng), but the command-line tool that does this, PerfCollect, only collects the data; the analysis capabilities and rich UI (including flamegraphs) are currently only available in the Windows tooling.







But if you still want to analyze .NET performance under Linux, there are other approaches:









The second link above leads to a discussion of the new EventPipe infrastructure being worked on in .NET Core (alongside EventSources and EventListeners). Its design goals are laid out in the Cross-Platform Performance Monitoring Design document. At a high level, this infrastructure gives the CLR a single place to send its diagnostics- and performance-related events; from there the events are routed to one or more loggers, which may include ETW, LTTng, and BPF, with the appropriate logger chosen according to the OS or platform the CLR is running on. For a detailed explanation of the pros and cons of the various logging technologies, see the .NET Cross-Plat Performance and Eventing Design.







Progress on EventPipes is tracked through the Performance Monitoring project and the related 'EventPipe' issues.







Future plans



Finally, there are plans to create a Performance Profiling Controller, with the following goal:







The controller manages the profiling infrastructure and exposes, in a simple and cross-platform way, the performance data produced by the .NET components responsible for performance diagnostics.







According to the plan, the controller will expose the following functionality over an HTTP server, getting all the data it needs from the EventPipe infrastructure:







REST APIs









Browsable HTML pages









I'm really looking forward to seeing what ultimately comes of the Performance Profiling Controller (PPC?). I think that having it built into the CLR will bring a lot of benefits to .NET. Similar functionality already exists in other runtimes.







Profiling



Another powerful facility the CLR offers is the Profiling API. It is (mainly) used by third-party tools to hook into the runtime at a low level. You can learn more about the API from this overview, but at a high level it lets you register callbacks that fire when:









Image from the BOTR Profiling API Page - Overview







It also has other powerful capabilities. First, you can install hooks that are called every time a .NET method is executed, whether in the runtime itself or in user code; these callbacks are known as Enter/Leave hooks, and here is a good example of how to use them. However, to make them work you need to understand the calling conventions of the different OSes and CPU architectures, which is not always easy. Also, remember that the Profiling API is exposed as a COM component that can only be accessed from C/C++ code, not from C#/F#/VB.NET.







Second, a profiler can rewrite the IL of any .NET method before it is JIT-compiled, using the SetILFunctionBody() API. This API is really powerful and underlies many .NET APM tools. You can learn more about using it in my post How to mock sealed classes and static methods and the code that accompanies it.







ICorProfiler API



It turns out that for the Profiling API to work, the runtime has to pull off all sorts of tricks under the hood. Just look at the discussion in the Allow rejit on attach issue (see ReJIT: A How-To Guide for more information on ReJIT).







The complete definition of all the Profiling API interfaces and callbacks can be found in \vm\inc\corprof.idl (see Interface Description Language). It is divided into two logical parts. One part is the Profiler -> Execution Engine (EE) interface, known as ICorProfilerInfo:







    // This module implements the ICorProfilerInfo* interfaces, which allow the
    // Profiler to communicate with the EE. This allows the Profiler DLL to get
    // access to private EE data structures and other things that should never be
    // exported outside of mscorwks.dll.





This is implemented in the following files:









The other main part is the Runtime -> Profiler callbacks, which are grouped under the ICorProfilerCallback interface:







    // This module implements wrappers around calling the profiler's
    // ICorProfilerCallaback* interfaces. When code in the EE needs to call the
    // profiler, it goes through EEToProfInterfaceImpl to do so.





These callbacks are implemented in the following files:









Finally, it is worth noting that the Profiling API may not work on every OS and architecture that .NET Core runs on. Here is one example: ELT call stub issues on Linux. See the Status of CoreCLR Profiler APIs for more information.







Profiling v. Debugging



As a small digression: profiling and debugging do overlap somewhat, so it is useful to understand what the different APIs provide in the context of the .NET runtime (taken from CLR Debugging vs. CLR Profiling).







The difference between debugging and profiling in the CLR







Debugging: designed to find code-correctness issues.
Profiling: designed to diagnose and troubleshoot performance issues.

Debugging: can be highly invasive.
Profiling: generally has low overhead; although a profiler can modify IL or install enter/leave hooks, this is for instrumentation, not radical code changes.

Debugging: the main task is complete control of the target, including inspection, execution control (for example, set-next-statement), and modification (Edit-and-Continue).
Profiling: the main task is inspection of the target, with instrumentation (IL rewriting, enter/leave hooks) provided for that purpose.

Debugging: an extensive API and a thick object model full of abstractions.
Profiling: a small API with few or no abstractions.

Debugging: highly interactive; the debugger's actions are driven by the user (or an algorithm), and editors and debuggers are often integrated (IDEs).
Profiling: not interactive; data is usually collected without user intervention and analyzed afterwards.

Debugging: few breaking changes when backward compatibility is required; migrating a debugger from v1.1 to v2.0 is expected to be straightforward or only moderately difficult.
Profiling: many breaking changes when backward compatibility is required; migrating a profiler from v1.1 to v2.0 is expected to be hard, practically a complete rewrite.


Debugging



Developers understand debugging in different ways. For example, I asked on Twitter "how do you debug .NET programs" and got a lot of different answers. That said, the replies contain a good list of tools and techniques, so I recommend taking a look at them. Thanks #LazyWeb!







I think that the best thing about debugging is reflected in this post:









The CLR provides an extensive set of debugging-related features. But why are these facilities needed? At least three reasons are given in the great post Why is managed debugging different than native-debugging?:







  1. Debugging unmanaged code can be abstracted at the hardware level, but debugging managed code must be abstracted at the IL-code level.
  2. Debugging managed code requires a lot of information that is not available before execution.
  3. The managed code debugger must coordinate with the garbage collector (GC).


Therefore, to make life easier, the CLR provides a high-level debugging API known as ICorDebug. It is shown in the figure below, which illustrates a typical debugging scenario (source: BOTR):







ICorDebug API



The following explanation of how it is implemented and what the various components do is taken from CLR Debugging, a brief introduction:







All of our debugging support in .NET is implemented on top of a DLL we call 'The Dac'. This file (usually named mscordacwks.dll) is the building block for both our public debugging API (ICorDebug) and our two private debugging APIs: the SOS-Dac API and IXCLR.

In a perfect world, everyone would use ICorDebug, our public debugging API. However, ICorDebug lacks many features that tool developers need. This is a problem we are trying to fix where we can, but those improvements only appear in the next version of the CLR, not in earlier versions. In fact, crash-dump debugging support was only added to the ICorDebug API with the release of CLR v4; anyone debugging crash dumps of CLR v2 cannot use ICorDebug at all.

(See SOS & ICorDebug for more information)







In fact, the ICorDebug API is split across more than 70 interfaces. I won't list them all here, but I will show the categories they fall into. For more information, see the Partition of ICorDebug, where this list was published.









As with the Profiling API, support for the debugging API varies by OS and processor architecture. For example, as of August 2018 there is still no solution for diagnosing and debugging managed code on Linux ARM. For more information on Linux support, see the Debugging .NET Core on Linux with LLDB post and the Microsoft Diagnostics repository, which aims to make debugging .NET programs on Linux easier.







Finally, if you want to see what the ICorDebug API looks like from C#, take a look at the wrappers in the CLRMD library, including all the available callbacks (more on CLRMD later in this post).







SOS and DAC



The Data Access Component (DAC) is covered in detail on its BOTR page. In essence, it provides out-of-process access to the CLR data structures, so that their contents can be read from another process. This means a debugger (via ICorDebug) or the 'Son of Strike' (SOS) extension can inspect a running CLR instance or a memory dump and find, for example:









A small digression: if you want to find out where these strange names came from and get a little lesson in the history of .NET, check out this answer on Stack Overflow.







The full list of SOS commands is impressive. Used together with WinDBG, it lets you see what is happening inside your program and the CLR at a very low level. To see how it all works, let's look at the !HeapStat command, which displays a breakdown of the sizes of the various heaps the .NET GC uses:







(Image taken from SOS: Upcoming release has a few new commands - HeapStat)







Here is the code flow that shows how SOS and the DAC work together:









3rd party debuggers



Because Microsoft published the debugging API, third-party developers have been able to build on the ICorDebug interfaces. Here is a list of the ones I managed to find:









Memory dumps



The last thing we will look at is memory dumps, which can be captured on a running system and analyzed offline. The .NET runtime has always supported creating memory dumps on Windows, and now that .NET Core is cross-platform, tools have appeared that do the same on other OSes.







When working with memory dumps, it is sometimes difficult to get the correct, matching versions of the SOS and DAC files. Fortunately, Microsoft recently released the dotnet symbol CLI tool, which:







can download all the files needed for debugging (symbols, modules, and the SOS and DAC files for the specific coreclr module) for any given core dump or minidump, on any supported platform and file format, including ELF, MachO, Windows DLLs, PDBs and portable PDBs

Finally, if you do any amount of memory-dump analysis, I recommend taking a look at the excellent CLRMD library that Microsoft released a few years ago. I have already written about what it can do. In short, it lets you work with memory dumps through an intuitive C# API, with classes that give access to the ClrHeap, GC roots, CLR threads, stack frames and much more. In fact, most (if not all) of the SOS commands could be re-implemented on top of CLRMD.
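As a rough sketch of what that looks like (written against the CLRMD 1.x API, so names may differ in later versions, and "crash.dmp" is just a placeholder path), here is a per-type heap summary in the spirit of the SOS !DumpHeap -stat command:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.Diagnostics.Runtime;   // the CLRMD NuGet package

    // Open a crash dump and print the 20 types using the most heap space.
    class HeapStats
    {
        static void Main()
        {
            using (DataTarget target = DataTarget.LoadCrashDump("crash.dmp"))
            {
                ClrRuntime runtime = target.ClrVersions[0].CreateRuntime();
                ClrHeap heap = runtime.Heap;

                var stats = new Dictionary<string, (int Count, ulong Size)>();
                foreach (ulong obj in heap.EnumerateObjectAddresses())
                {
                    ClrType type = heap.GetObjectType(obj);
                    if (type == null)
                        continue;   // unrecognised memory, skip it

                    ulong size = type.GetSize(obj);
                    stats.TryGetValue(type.Name, out var entry);
                    stats[type.Name] = (entry.Count + 1, entry.Size + size);
                }

                foreach (var kvp in stats.OrderByDescending(s => s.Value.Size).Take(20))
                    Console.WriteLine($"{kvp.Value.Count,8} {kvp.Value.Size,12:N0} {kvp.Key}");
            }
        }
    }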







You can find out how it works from this post:







The ClrMD managed library is a wrapper around the CLR's internal-only debugging APIs. Although these APIs are very powerful for diagnostics, we do not support them as a public, documented release, because they are complicated and tightly coupled with other implementation details of the CLR. ClrMD addresses this problem by providing an easy-to-use managed wrapper around these low-level debugging APIs.







