Monitoring .NET Applications

.NET is a managed runtime. This means it provides high-level facilities that manage your program for you (from Introduction to the Common Language Runtime (CLR), 2007):







The runtime provides many facilities, so it's convenient to divide them into the following categories:



  1. Fundamental facilities that affect the design of the others. These include:
    1. garbage collection;
    2. memory safety and type safety;
    3. high-level support for programming languages.
  2. Secondary facilities that build on the fundamental ones. Many useful programs can do without them. These include:
    1. application isolation using AppDomains;
    2. application security and sandboxing.
  3. Other facilities that every runtime needs, but that do not build on the fundamental CLR facilities. They reflect the desire to provide a complete programming environment. These include:
    1. versioning;
    2. debugging/profiling;
    3. interoperation.

Notice that although debugging and profiling are neither fundamental nor secondary facilities, they still make the list because of the 'desire to provide a complete programming environment'.













The rest of this post looks at what monitoring, observability, and introspection features exist in the CoreCLR, why they are useful, and how the runtime provides them.







Diagnostics



First, let's take a look at the diagnostic information the CLR makes available to us. Traditionally, Event Tracing for Windows (ETW) has been used for this.

The CLR provides information about a wide range of events. They relate to:









For example, here an event is raised when an AppDomain is loaded, here one is tied to an exception being thrown, and here to the garbage collector's allocation and collection activity.
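As an illustration (a minimal sketch, assuming .NET Core, where the runtime's events are also surfaced in-process), an EventListener can subscribe to the 'Microsoft-Windows-DotNETRuntime' provider and print GC-related events; the 0x1 value below is the GC keyword, and the surrounding Program class is just scaffolding for the example:

    using System;
    using System.Diagnostics.Tracing;
    using System.Threading;

    // Subscribes to the CLR's own event provider and prints GC-related events.
    class RuntimeEventListener : EventListener
    {
        protected override void OnEventSourceCreated(EventSource source)
        {
            if (source.Name == "Microsoft-Windows-DotNETRuntime")
            {
                // 0x1 is the GC keyword; Informational keeps the event volume low.
                EnableEvents(source, EventLevel.Informational, (EventKeywords)0x1);
            }
        }

        protected override void OnEventWritten(EventWrittenEventArgs eventData)
        {
            Console.WriteLine($"{eventData.EventName} (id {eventData.EventId})");
        }
    }

    class Program
    {
        static void Main()
        {
            var listener = new RuntimeEventListener();
            for (int i = 0; i < 5; i++)
            {
                GC.Collect();          // force some GC events to show up
                Thread.Sleep(500);
            }
            GC.KeepAlive(listener);
        }
    }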







PerfView



If you want to see the ETW events related to your .NET applications, I recommend the excellent PerfView tool; start with these training videos or the presentation PerfView: The Ultimate .NET Performance Tool. PerfView is widely recognized for surfacing invaluable information; for example, Microsoft engineers regularly use it to analyze performance.
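Your own application's events flow through the same pipeline. As a rough sketch (the provider name 'MyCompany-MyApp' and the event methods are made up for this example), a custom EventSource lets PerfView collect your events alongside the CLR's own, for instance via its 'Additional Providers' option:

    using System.Diagnostics.Tracing;

    // A custom event source; PerfView can collect these events together with
    // the runtime's own ETW events.
    [EventSource(Name = "MyCompany-MyApp")]
    sealed class AppEventSource : EventSource
    {
        public static readonly AppEventSource Log = new AppEventSource();

        [Event(1, Level = EventLevel.Informational)]
        public void RequestStarted(string url) => WriteEvent(1, url);

        [Event(2, Level = EventLevel.Informational)]
        public void RequestFinished(string url, long elapsedMs) => WriteEvent(2, url, elapsedMs);
    }

    // Usage from application code:
    //   AppEventSource.Log.RequestStarted("/home");
    //   ...handle the request...
    //   AppEventSource.Log.RequestFinished("/home", 42);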







Common infrastructure



In case the name doesn't make it obvious, Event Tracing for Windows is only available on Windows, which doesn't fit well with the cross-platform world of .NET Core. You can still collect performance data under Linux (using LTTng), but the command-line tool that does this, PerfCollect, only collects the data; the analysis capabilities and rich UI (including flamegraphs) are currently only available in the Windows tooling.







But if you still want to analyze .NET performance under Linux, there are other approaches:









The second link above leads to a discussion of the new EventPipe infrastructure being worked on in .NET Core (alongside EventSources and EventListeners). Its design goals are laid out in the Cross-Platform Performance Monitoring Design document. At a high level, this infrastructure gives the CLR a single place to send its diagnostics- and performance-related events; from there the events are routed to one or more loggers, which may include ETW, LTTng, and BPF, with the appropriate logger chosen according to the OS or platform the CLR is running on. For a detailed explanation of the pros and cons of the various logging technologies, see the .NET Cross-Plat Performance and Eventing Design.







Progress on EventPipes is tracked through the Performance Monitoring project and the related 'EventPipe' issues.







Future plans



Finally, there are plans to create a Performance Profiling Controller, with the following goal:







The controller manages the profiling infrastructure and exposes, in a simple and cross-platform way, the performance data produced by the .NET components responsible for performance diagnostics.







According to the plan, the controller will expose the following functionality over an HTTP server, getting all the data it needs from the EventPipe infrastructure:







REST APIs









Browsable HTML pages









I'm really looking forward to seeing what ultimately comes of the Performance Profiling Controller (PPC?). I think that having it built into the CLR will bring a lot of benefits to .NET. Similar functionality already exists in other runtimes.







Profiling



Another powerful facility the CLR offers is the Profiling API. It is (mainly) used by third-party tools to hook into the runtime at a low level. You can learn more about the API from this overview, but at a high level it lets you register callbacks that fire when:









Image from the BOTR Profiling API Page - Overview







It also has other powerful capabilities. First, you can install hooks that are called every time a .NET method is executed, whether in the runtime itself or in user code; these callbacks are known as Enter/Leave hooks, and here is a good example of how to use them. However, to make them work you need to understand the calling conventions of the different OSes and CPU architectures, which is not always easy. Also, remember that the Profiling API is exposed as a COM component that can only be accessed from C/C++ code, not from C#/F#/VB.NET.







Second, a profiler can rewrite the IL of any .NET method before it is JIT-compiled, using the SetILFunctionBody() API. This API is really powerful and underlies many .NET APM tools. You can learn more about using it in my post How to mock sealed classes and static methods and the code that accompanies it.







ICorProfiler API



It turns out that for the Profiling API to work, the runtime has to pull off all sorts of tricks under the hood. Just look at the discussion in the Allow rejit on attach issue (see ReJIT: A How-To Guide for more information on ReJIT).







The complete definition of all the Profiling API interfaces and callbacks can be found in \vm\inc\corprof.idl (see Interface Description Language). It is divided into two logical parts. One part is the Profiler -> Execution Engine (EE) interface, known as ICorProfilerInfo:







    // This module implements the ICorProfilerInfo* interfaces, which allow the
    // Profiler to communicate with the EE. This allows the Profiler DLL to get
    // access to private EE data structures and other things that should never be
    // exported outside of mscorwks.dll.





This is implemented in the following files:









The other main part is the Runtime -> Profiler callbacks, which are grouped under the ICorProfilerCallback interface:







    // This module implements wrappers around calling the profiler's
    // ICorProfilerCallaback* interfaces. When code in the EE needs to call the
    // profiler, it goes through EEToProfInterfaceImpl to do so.





These callbacks are implemented in the following files:









Finally, it is worth noting that the Profiling API may not work on every OS and architecture that .NET Core runs on. Here is one example: ELT call stub issues on Linux. See the Status of CoreCLR Profiler APIs for more information.







Profiling v. Debugging



As a small digression: profiling and debugging do overlap somewhat, so it is useful to understand what the different APIs provide in the context of the .NET runtime (taken from CLR Debugging vs. CLR Profiling).







The difference between debugging and profiling in the CLR







Debugging: designed to find code-correctness issues.
Profiling: designed to diagnose and troubleshoot performance issues.

Debugging: can be highly invasive.
Profiling: generally has low overhead; although a profiler can modify IL or install enter/leave hooks, this is for instrumentation, not radical code changes.

Debugging: the main task is complete control of the target, including inspection, execution control (for example, set-next-statement), and modification (Edit-and-Continue).
Profiling: the main task is inspection of the target, with instrumentation (IL rewriting, enter/leave hooks) provided for that purpose.

Debugging: an extensive API and a thick object model full of abstractions.
Profiling: a small API with few or no abstractions.

Debugging: highly interactive; the debugger's actions are driven by the user (or an algorithm), and editors and debuggers are often integrated (IDEs).
Profiling: not interactive; data is usually collected without user intervention and analyzed afterwards.

Debugging: few breaking changes when backward compatibility is required; migrating a debugger from v1.1 to v2.0 is expected to be straightforward or only moderately difficult.
Profiling: many breaking changes when backward compatibility is required; migrating a profiler from v1.1 to v2.0 is expected to be hard, practically a complete rewrite.


Debugging



Developers understand debugging in different ways. For example, I asked on Twitter "how do you debug .NET programs" and got a lot of different answers. That said, the replies contain a good list of tools and techniques, so I recommend taking a look at them. Thanks #LazyWeb!







I think that the best thing about debugging is reflected in this post:









The CLR provides an extensive set of debugging-related features. But why are these facilities needed? At least three reasons are given in the great post Why is managed debugging different than native-debugging?:







  1. Debugging unmanaged code can be abstracted at the hardware level, but debugging managed code must be abstracted at the IL-code level.
  2. Debugging managed code requires a lot of information that is not available before execution.
  3. The managed code debugger must coordinate with the garbage collector (GC).


Therefore, to make life easier, the CLR provides a high-level debugging API known as ICorDebug. It is shown in the figure below, which illustrates a typical debugging scenario (source: BOTR):







ICorDebug API



The following explanation of how it is implemented and what the various components do is taken from CLR Debugging, a brief introduction:







All of our debugging support in .NET is implemented on top of a DLL we call 'The Dac'. This file (usually named mscordacwks.dll) is the building block for both our public debugging API (ICorDebug) and our two private debugging APIs: the SOS-Dac API and IXCLR.

In a perfect world, everyone would use ICorDebug, our public debugging API. However, ICorDebug lacks many features that tool developers need. This is a problem we are trying to fix where we can, but those improvements only appear in the next version of the CLR, not in earlier versions. In fact, crash-dump debugging support was only added to the ICorDebug API with the release of CLR v4; anyone debugging crash dumps of CLR v2 cannot use ICorDebug at all.

(See SOS & ICorDebug for more information)







In fact, the ICorDebug API is split across more than 70 interfaces. I won't list them all here, but I will show the categories they fall into. For more information, see the Partition of ICorDebug, where this list was published.









As with the Profiling API, support for the debugging API varies by OS and processor architecture. For example, as of August 2018 there is still no solution for diagnosing and debugging managed code on Linux ARM. For more information on Linux support, see the Debugging .NET Core on Linux with LLDB post and the Microsoft Diagnostics repository, which aims to make debugging .NET programs on Linux easier.







Finally, if you want to see what the ICorDebug API looks like from C#, take a look at the wrappers in the CLRMD library, including all the available callbacks (more on CLRMD later in this post).







SOS and DAC



The Data Access Component (DAC) is covered in detail on its BOTR page. In essence, it provides out-of-process access to the CLR data structures, so that their contents can be read from another process. This means a debugger (via ICorDebug) or the 'Son of Strike' (SOS) extension can inspect a running CLR instance or a memory dump and find, for example:









A small digression: if you want to find out where these strange names came from and get a little lesson in the history of .NET, check out this answer on Stack Overflow.







The full list of SOS commands is impressive. Used together with WinDBG, it lets you see what is happening inside your program and the CLR at a very low level. To see how it all works, let's look at the !HeapStat command, which displays a breakdown of the sizes of the various heaps the .NET GC uses:







(Image taken from SOS: Upcoming release has a few new commands - HeapStat)







Here is the code flow that shows how SOS and the DAC work together:









3rd party debuggers



Because Microsoft published the debugging API, third-party developers have been able to build on the ICorDebug interfaces. Here is a list of the ones I managed to find:









Memory dumps



The last thing we will look at is memory dumps, which can be captured on a running system and analyzed offline. The .NET runtime has always supported creating memory dumps on Windows, and now that .NET Core is cross-platform, tools have appeared that do the same on other OSes.







When working with memory dumps, it is sometimes difficult to get the correct, matching versions of the SOS and DAC files. Fortunately, Microsoft recently released the dotnet symbol CLI tool, which:







can download all the files needed for debugging (symbols, modules, and the SOS and DAC files for the specific coreclr module) for any given core dump or minidump, on any supported platform and file format, including ELF, MachO, Windows DLLs, PDBs and portable PDBs

Finally, if you do any amount of memory-dump analysis, I recommend taking a look at the excellent CLRMD library that Microsoft released a few years ago. I have already written about what it can do. In short, it lets you work with memory dumps through an intuitive C# API, with classes that give access to the ClrHeap, GC roots, CLR threads, stack frames and much more. In fact, most (if not all) of the SOS commands could be re-implemented on top of CLRMD.
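As a rough sketch of what that looks like (written against the CLRMD 1.x API, so names may differ in later versions, and "crash.dmp" is just a placeholder path), here is a per-type heap summary in the spirit of the SOS !DumpHeap -stat command:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Microsoft.Diagnostics.Runtime;   // the CLRMD NuGet package

    // Open a crash dump and print the 20 types using the most heap space.
    class HeapStats
    {
        static void Main()
        {
            using (DataTarget target = DataTarget.LoadCrashDump("crash.dmp"))
            {
                ClrRuntime runtime = target.ClrVersions[0].CreateRuntime();
                ClrHeap heap = runtime.Heap;

                var stats = new Dictionary<string, (int Count, ulong Size)>();
                foreach (ulong obj in heap.EnumerateObjectAddresses())
                {
                    ClrType type = heap.GetObjectType(obj);
                    if (type == null)
                        continue;   // unrecognised memory, skip it

                    ulong size = type.GetSize(obj);
                    stats.TryGetValue(type.Name, out var entry);
                    stats[type.Name] = (entry.Count + 1, entry.Size + size);
                }

                foreach (var kvp in stats.OrderByDescending(s => s.Value.Size).Take(20))
                    Console.WriteLine($"{kvp.Value.Count,8} {kvp.Value.Size,12:N0} {kvp.Key}");
            }
        }
    }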







You can find out how it works from this post:







The ClrMD managed library is a wrapper around the CLR's internal-only debugging APIs. Although these APIs are very powerful for diagnostics, we do not support them as a public, documented release, because they are complicated and tightly coupled with other implementation details of the CLR. ClrMD addresses this problem by providing an easy-to-use managed wrapper around these low-level debugging APIs.







