A coin miner with a “Heaven's Gate”

Hello! Ahead of the launch of a new cohort of our reverse engineering course, we are sharing a translation of a very interesting article. Enjoy the read.








The last two years could be called the years of ransomware. Ransomware was, without a doubt, the most popular type of malware. However, at the end of last year, we began to observe ransomware losing popularity in favor of miners. It is possible that in 2018 this trend will only grow.



From the victim's point of view, this is a relief, since miners are not as dangerous as ransomware. Yes, they slow down the system, but once you get rid of them, you can continue to use your computer as before. Your data is not stolen or lost, as it is with ransomware.



From the perspective of a malware researcher, miners are disappointing. They do not provide enough new material for deeper analysis, mainly because they are based on well-known open source components with little or no obfuscation or stealth.



However, from time to time we find miners using interesting tricks. We recently observed a technique called “Heaven's Gate”, which makes it possible to inject into 64-bit processes from 32-bit loaders. The idea is not new: its first implementation dates back to 2009. Still, it is interesting to see how it was implemented in a new sample obtained directly “from the wild”.



If you are new to malware analysis, read on for a guide to what Heaven's Gate is and how to approach analyzing it.



Materials for analysis





This sample was found as part of the continuing Ngay campaign (more on this here). Looking into the background of similar samples led me to an article by @_qaz_qaz, which describes an earlier campaign with a similar sample. However, his analysis did not cover the Heaven's Gate technique.





Behavior analysis



To see the injection mentioned above, we must run the sample on a 64-bit system. We can see that it launches an instance of Notepad with parameters typical of cryptocurrency mining:







Looking at the strings in memory in Process Explorer, we see that this is not a real Notepad, but the XMRig Monero miner.
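As a rough illustration of what a strings view does, the sketch below pulls runs of printable ASCII out of a byte buffer. The buffer contents are made up for the example and are not taken from the sample; a real tool would read the memory of the target process instead.

```python
import re

def ascii_strings(buf: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII characters of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(buf)]

# Hypothetical memory fragment: binary noise mixed with readable text.
demo = b"\x00\x01XMRig miner\x00\xff\x02ab\x00notepad.exe\x00"
found = ascii_strings(demo)
print(found)
```

Short runs such as `ab` are dropped by the minimum length, which is how strings tools keep random byte noise out of the listing.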







So, at this point, we can be fairly sure that the Notepad image in memory was replaced, most likely by the RunPE (Process Hollowing) method.



The main dropper is 32-bit, but it injects the payload into a 64-bit Notepad:







Most interestingly, this type of injection is not supported by the official Windows API. We can read and write the memory of 32-bit processes from a 64-bit application (using the WoW64 API), but not the other way around.



However, there are some unofficial solutions, such as a technique called “Heaven's Gate.”



Heaven's Gate overview



The Heaven's Gate technique was first described in 2009 by a hacker known as Roy G. Biv. Many implementations followed, for example the Wow64ext library and W64oWoW64, which is based on it. In a 2015 blog post, Alex Ionescu described mitigations against this technique.

Let's look at how it works.



Running 32-bit processes on 64-bit Windows



Each 32-bit process on a 64-bit version of Windows runs in a special subsystem called WoW64, which emulates a 32-bit environment. You can think of it as a 32-bit sandbox created inside a 64-bit process. First, a 64-bit process environment is created, and then a 32-bit environment is created inside it. The application runs in this 32-bit environment and does not have access to the 64-bit part.



If we scan a 32-bit process from the outside using a 64-bit scanner, we will see that it contains both 32-bit and 64-bit DLLs. Most importantly, it has two versions of NTDLL: a 32-bit one (loaded from the SysWOW64 directory) and a 64-bit one (loaded from the System32 directory):







However, the 32-bit process itself does not see the 64-bit part and is limited to using 32-bit DLLs. To inject into a 64-bit process, you need to use the 64-bit versions of the corresponding functions.



Code segments



To gain access to the restricted part of the environment, we need to understand how the isolation is done. It turns out to be quite simple. 32-bit and 64-bit code execution is reached through different code segment selectors: 0x23 for 32-bit code and 0x33 for 64-bit code.



If we call an address in the usual way, the mode used to interpret the code is the default one. However, we can explicitly request a switch using assembly code.
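To make the two selector values less magic, here is a minimal sketch that splits an x86 segment selector into its standard fields: the descriptor table index (bits 3 and up), the table indicator (bit 2), and the requested privilege level (bits 0 to 1). The function name is hypothetical.

```python
def decode_selector(selector: int) -> dict:
    """Split an x86 segment selector into index, table, and privilege level."""
    return {
        "index": selector >> 3,                       # entry in the descriptor table
        "table": "LDT" if selector & 0b100 else "GDT",
        "rpl": selector & 0b11,                       # requested privilege level
    }

wow64_32 = decode_selector(0x23)  # 32-bit code segment
native_64 = decode_selector(0x33)  # 64-bit code segment
print(wow64_32, native_64)
```

Both selectors refer to user-mode (ring 3) entries in the GDT. They differ only in which descriptor they pick, and that descriptor decides whether the CPU decodes the code as 32-bit or 64-bit.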



Inside the miner: Heaven's Gate implementation



I will not do a full analysis of this miner, as it has already been described here. Let's go straight to the place where the fun begins. The malware checks its environment, and if it detects that it is running on a 64-bit system, it takes a different path to inject into a 64-bit process:







After some anti-analysis checks, it creates a new, suspended 64-bit process (in this case, Notepad):







This is the target into which the malicious payload will be injected.

As we learned earlier, in order to inject the payload into a 64-bit process, we need to use the appropriate 64-bit functions.



First, the loader obtains a handle to the 64-bit NTDLL:







What happens inside the get_ntdll function requires a more detailed explanation. As a reference, we can also take a look at similar code in the ReWolf library.



To access the 64-bit part of the process environment, we need to work with segment selectors. Let's see how the malware enters 64-bit mode.







It seems this code was copied directly from the open library: https://github.com/rwfpl/rewolf-wow64ext/blob/master/src/internal.h#L26



The 0x33 segment selector is pushed onto the stack. Then the malware makes a call to the next line, so that the address of that next line is also pushed onto the stack as the return address.







The address that was pushed onto the stack is then adjusted by adding 5 bytes, so that it points just past the retf:







At the end, the RETF instruction is executed. RETF is a “far return”, and unlike a regular RET, it lets you specify not only the address at which execution should continue, but also the segment. As arguments, it takes two DWORDs from the stack. Thus, when RETF is executed, the return address is:



0x33: 0x402A50
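The stack arithmetic of this sequence can be followed with a small simulation. This is a sketch under assumptions: the instruction lengths are those of the usual x86 encodings (`push imm8` is 2 bytes, `call rel32` is 5, `add dword [esp], imm8` is 4, `retf` is 1), and the start address is back-computed from the 0x402A50 target rather than read from the sample.

```python
PUSH_IMM8_LEN = 2   # push 0x33
CALL_REL32_LEN = 5  # call to the next instruction
ADD_ESP_LEN = 4     # add dword [esp], 5
RETF_LEN = 1        # retf

def simulate_gate(start: int, selector: int = 0x33) -> tuple[int, int]:
    """Model the stack while the 32-bit gate sequence runs; return (CS, EIP)."""
    stack = []
    eip = start

    stack.append(selector)   # push 0x33
    eip += PUSH_IMM8_LEN

    eip += CALL_REL32_LEN    # call pushes the address of the next instruction
    stack.append(eip)

    stack[-1] += 5           # add dword [esp], 5: skip the add and the retf
    eip += ADD_ESP_LEN

    new_eip = stack.pop()    # retf pops the offset first...
    new_cs = stack.pop()     # ...and then the segment selector
    assert new_eip == eip + RETF_LEN  # lands right after the retf
    return new_cs, new_eip

# Assumed start of the sequence, derived from the target address in the text.
START = 0x402A50 - (PUSH_IMM8_LEN + CALL_REL32_LEN + ADD_ESP_LEN + RETF_LEN)
cs, eip = simulate_gate(START)
print(f"{cs:#x}:{eip:#x}")
```

The added 5 is exactly the combined size of the `add` and `retf` instructions, which is why execution resumes at the first byte after the gate, now under the 0x33 selector.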



Thanks to the changed segment, the code starting at the specified address is interpreted as 64-bit. So the code that the debugger displays as 32-bit ...







... is actually 64-bit.



To switch between these views quickly, I use a feature of PE-bear:







And here is what this piece of code looks like if it is interpreted as 64-bit:







Thus, the code executed here moves the contents of the R12 register into a variable on the stack and then switches back to 32-bit mode. This is done in order to get the 64-bit Thread Environment Block (TEB), from which we then fetch the 64-bit Process Environment Block (PEB). Compare this with the similar code.
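Why the same bytes read so differently in the two views can be shown with a toy decoder. It handles only the one-byte opcodes 0x40 to 0x5F: in 32-bit mode 0x40 to 0x4F are `inc`/`dec`, while in 64-bit mode the same bytes are REX prefixes that modify the next instruction. The byte pair `41 54` is an illustrative encoding of `push r12` and is an assumption, not bytes copied from the sample.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode(code: bytes, bits: int) -> list[str]:
    """Decode a tiny opcode subset (0x40-0x5F) as 32-bit or 64-bit code."""
    out = []
    rex_b = 0  # REX.B extends the register number by 8 in 64-bit mode
    for b in code:
        if 0x40 <= b <= 0x4F:
            if bits == 64:
                rex_b = b & 1  # a prefix, not an instruction of its own
                continue
            out.append(f"{'inc' if b < 0x48 else 'dec'} {REG32[b & 7]}")
        elif 0x50 <= b <= 0x5F:
            op = "push" if b < 0x58 else "pop"
            if bits == 64:
                out.append(f"{op} {REG64[(b & 7) + 8 * rex_b]}")
                rex_b = 0
            else:
                out.append(f"{op} {REG32[b & 7]}")
        else:
            raise ValueError(f"opcode {b:#04x} is outside this toy subset")
    return out

as_32 = decode(b"\x41\x54", 32)
as_64 = decode(b"\x41\x54", 64)
print(as_32, as_64)
```

A 32-bit disassembler therefore shows two unrelated instructions where the CPU, running under the 64-bit selector, executes a single push of R12.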



The 64-bit PEB is used as a starting point for finding the 64-bit version of NTDLL. This part is implemented quite trivially (a “vanilla” implementation of this method can be found here), using the pointer to the loaded libraries, which is one of the fields of the PEB structure. So, from the PEB we get the field called Ldr:







Ldr is a structure of type _PEB_LDR_DATA. It contains an entry called InMemoryOrderModuleList:







This list contains all loaded libraries that are present in the memory of the examined process. We walk through the list until we find the library we are interested in, which in our case is NTDLL. This is exactly what the get_ntdll function mentioned above does. To find the matching name, it calls the following function, labeled is_ntdll_lib, which compares the library name with ntdll.dll character by character. It turns out to be equivalent to this code.
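The lookup logic can be sketched in a few lines. This is a simplified model, not the sample's code: the real InMemoryOrderModuleList is a circular doubly linked list inside the process, whereas the `Module` class here is a hypothetical singly linked stand-in, the base addresses are invented, and the case-insensitive comparison is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Module:
    """Hypothetical stand-in for one entry of the loaded-module list."""
    name: str
    base: int
    next: Optional["Module"] = None

def is_ntdll_lib(name: str) -> bool:
    """Compare the name with 'ntdll.dll' character by character."""
    target = "ntdll.dll"
    if len(name) != len(target):
        return False
    for got, want in zip(name, target):
        if got.lower() != want:  # assumption: the check ignores case
            return False
    return True

def get_ntdll(head: Optional[Module]) -> Optional[int]:
    """Walk the list and return the base address of NTDLL, if present."""
    node = head
    while node is not None:
        if is_ntdll_lib(node.name):
            return node.base
        node = node.next
    return None

# Invented list contents for the demonstration.
modules = Module("notepad.exe", 0x7FF600000000,
                 Module("NTDLL.DLL", 0x7FFA00000000,
                        Module("kernel32.dll", 0x7FFA10000000)))
ntdll_base = get_ntdll(modules)
print(hex(ntdll_base))
```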







If the names match, the address of the library is returned in a pair of registers:







Once we have found NTDLL, we just need to get the addresses of the functions we are interested in. We can do this by walking the library's export table:
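Resolving a function by name from an export table can be sketched as follows. The code treats the module as a flat byte buffer laid out as it is in memory, so an RVA is simply an offset from the start. To stay self-contained it runs against a tiny synthetic image built by the hypothetical `build_demo_image` helper; the RVAs are invented, and real code would read the 64-bit NTDLL image instead.

```python
import struct

def u16(buf, off): return struct.unpack_from("<H", buf, off)[0]
def u32(buf, off): return struct.unpack_from("<I", buf, off)[0]

def cstr(buf, off):
    """Read a NUL-terminated ASCII string at the given offset."""
    return buf[off:buf.index(b"\0", off)].decode("ascii")

def find_export(image: bytes, wanted: str):
    """Return the RVA of an exported function, or None if it is absent."""
    pe = u32(image, 0x3C)                        # e_lfanew
    assert image[pe:pe + 4] == b"PE\0\0"
    opt = pe + 4 + 20                            # skip signature and COFF header
    dirs = opt + (112 if u16(image, opt) == 0x20B else 96)  # PE32+ or PE32
    exp = u32(image, dirs)                       # data directory 0: exports
    n_names = u32(image, exp + 24)
    funcs, names, ords = (u32(image, exp + o) for o in (28, 32, 36))
    for i in range(n_names):
        if cstr(image, u32(image, names + 4 * i)) == wanted:
            ordinal = u16(image, ords + 2 * i)   # index into the function table
            return u32(image, funcs + 4 * ordinal)
    return None

def build_demo_image(exports) -> bytes:
    """Build a minimal fake PE32+ image holding only an export table."""
    img = bytearray(0x400)
    struct.pack_into("<H", img, 0, 0x5A4D)       # 'MZ'
    struct.pack_into("<I", img, 0x3C, 0x40)
    img[0x40:0x44] = b"PE\0\0"
    opt, exp, n = 0x40 + 24, 0x200, len(exports)
    struct.pack_into("<H", img, opt, 0x20B)
    struct.pack_into("<II", img, opt + 112, exp, 40)
    funcs, names, ords = exp + 40, exp + 40 + 4 * n, exp + 40 + 8 * n
    strs = exp + 40 + 10 * n
    struct.pack_into("<IIIII", img, exp + 20, n, n, funcs, names, ords)
    for i, (name, rva) in enumerate(exports):
        struct.pack_into("<I", img, funcs + 4 * i, rva)
        struct.pack_into("<I", img, names + 4 * i, strs)
        struct.pack_into("<H", img, ords + 2 * i, i)
        raw = name.encode("ascii") + b"\0"
        img[strs:strs + len(raw)] = raw
        strs += len(raw)
    return bytes(img)

# Invented RVAs; only the lookup logic matters here.
image = build_demo_image([("NtUnmapViewOfSection", 0x1000),
                          ("NtWriteVirtualMemory", 0x2000)])
rva = find_export(image, "NtUnmapViewOfSection")
print(hex(rva))
```

Adding the returned RVA to the module's base address gives the address that can actually be called.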







The following functions are retrieved:





As we know, these functions are typical of the RunPE technique. First, NtUnmapViewOfSection is used to unmap the original PE file. Then memory is allocated in the remote process and the new PE is written there. Finally, the context of the process is changed so that execution starts from the injected module.



Function addresses are stored and later called (similar to this code) to control the remote process.



Conclusion



So far, the authors of miners have not shown much creativity. They achieve their goals by relying on open source components. The case described here reflects this trend well, since a ready-made implementation was used.



Heaven's Gate has been around for several years. Some malware uses it to increase stealth. In the case of this miner, however, the authors were probably aiming to maximize performance by using the payload that best fits the target architecture.





