Embedding code and the danger of pirated software

This article is about how code can be embedded in the code section without a jmp, so that it goes unnoticed unless the disassembled code is studied thoroughly. If you are interested, read on.



I am writing a disassembler in C for Linux. I would not call myself a professional programmer. I do not work as a programmer; what I know comes from reading books, working things out myself, or studying the source code of other programs. So if the code seems childish to you, please do not be harsh.



To start with, I wrote the disassembler itself. The code currently looks like this: a byte is read and passed to the matching function.



void disasm_intel ( unsigned char *ptr, int size, int byte_order, int show )
{
    show_asm = show;
    virt = global_virt_text;
    unsigned char *start = ptr;
    start_op = ptr;

    for ( int index = 0; index < size; index++ ) {
        if ( show_asm == TRUE ) printf ( "%lx: ", virt );

        switch ( *ptr ) {
            case 0x30: intel_opcode_1_0x30 ( &ptr, &index, byte_order ); break;
            case 0x31: intel_opcode_1_0x31 ( &ptr, &index, byte_order ); break;
            case 0x66: intel_opcode_1_0x66 ( &ptr, &index, byte_order ); break;
            case 0x67: intel_opcode_1_0x67 ( &ptr, &index, byte_order ); break;
            case 0x83: intel_opcode_1_0x83 ( &ptr, &index, byte_order ); break;
            case 0x88: intel_opcode_1_0x88 ( &ptr, &index, byte_order ); break; // mov register to register byte
            case 0x89: intel_opcode_1_0x89 ( &ptr, &index, byte_order ); break;
            case 0x8a: intel_opcode_1_0x8a ( &ptr, &index, byte_order ); break;
            case 0x8b: intel_opcode_1_0x8b ( &ptr, &index, byte_order ); break; // mov esp, %x : mov ebp, %x
            case 0x8d: intel_opcode_1_0x8d ( &ptr, &index, byte_order ); break; // lea
            case 0xb0: intel_opcode_1_0xb0 ( &ptr, &index ); break; // mov al, %x
            case 0xb1: intel_opcode_1_0xb1 ( &ptr, &index ); break; // mov cl, %x
            case 0xb2: intel_opcode_1_0xb2 ( &ptr, &index ); break; // mov dl, %x
            case 0xb3: intel_opcode_1_0xb3 ( &ptr, &index ); break; // mov bl, %x
            case 0xb4: intel_opcode_1_0xb4 ( &ptr, &index ); break; // mov ah, %x
            case 0xb5: intel_opcode_1_0xb5 ( &ptr, &index ); break; // mov ch, %x
            case 0xb6: intel_opcode_1_0xb6 ( &ptr, &index ); break; // mov dh, %x
            case 0xb7: intel_opcode_1_0xb7 ( &ptr, &index ); break; // mov bh, %x
            case 0xb8: intel_opcode_1_0xb8 ( &ptr, &index, byte_order ); break; // mov eax, %x
            case 0xb9: intel_opcode_1_0xb9 ( &ptr, &index, byte_order ); break; // mov ecx, %x
            case 0xba: intel_opcode_1_0xba ( &ptr, &index, byte_order ); break; // mov edx, %x
            case 0xbb: intel_opcode_1_0xbb ( &ptr, &index, byte_order ); break; // mov ebx, %x
            case 0xbe: intel_opcode_1_0xbe ( &ptr, &index, byte_order ); break; // mov esi, %x
            case 0xbf: intel_opcode_1_0xbf ( &ptr, &index, byte_order ); break; // mov edi, %x
            case 0xc3: intel_opcode_1_0xc3 ( ); break; // ret
            case 0xcd: intel_opcode_1_0xcd ( &ptr, &index ); break; // int 0x%x
        }

        ptr++;
        virt += ptr - start;
        start = ptr;
        start_op = ptr;
    }

    show_asm = FALSE;
}
      
      





There is already a large number of such functions. In some places I left comments so that I can spot the relationships between machine instructions and perhaps write a more capable disassembler later. But in its current form, the code lets me easily set any condition for each instruction.
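The handler functions themselves are not shown in the article, so here is only a rough sketch of what one of them might do, using the "mov r32, imm32" group (opcodes 0xb8 to 0xbf) as the example. The function name decode_mov_r32_imm32, its signature, and the output format are my assumptions and not the author's code. It also assumes 32-bit operand size with no 0x66 prefix in front of the opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* 32-bit register names in x86 encoding order (opcode 0xb8 + index). */
static const char *const reg32_names[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/*
 * Hypothetical handler (not from the article): decodes "mov r32, imm32"
 * starting at code[0], writes the text form into out, and returns the
 * instruction length in bytes. Returns 0 if the bytes do not form such
 * an instruction or fewer than 5 bytes are available.
 */
size_t decode_mov_r32_imm32(const uint8_t *code, size_t avail,
                            char *out, size_t out_size)
{
    assert(code != NULL && out != NULL);

    if (avail < 5 || code[0] < 0xb8 || code[0] > 0xbf)
        return 0;

    /* The 4-byte immediate follows the opcode in little-endian order. */
    uint32_t imm = (uint32_t)code[1]
                 | (uint32_t)code[2] << 8
                 | (uint32_t)code[3] << 16
                 | (uint32_t)code[4] << 24;

    snprintf(out, out_size, "mov %s, 0x%x",
             reg32_names[code[0] - 0xb8], (unsigned)imm);
    return 5;
}
```

Returning the instruction length lets the caller advance its pointer and virtual address by the right amount, and a condition on any particular instruction can be placed directly inside its handler.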



While I was working on this, I had an idea: is it possible to add code in the middle of the code section? It turns out it is, though I am not yet sure it works in every case. For now, I add code as ready-made machine code. If I manage it, I will later write a translator from assembly to machine code to make adding code more convenient. In my case, you specify an offset in the code section and the bytes are copied to that place.

There was also a problem with memory addressing. I added code to the lea handler that saves the necessary data in a structure; when new instructions are inserted into the code section, all offsets are adjusted so that they point to the data at its new location. This is not very difficult: if you insert code, the code section grows by that many bytes, and every section after the code section moves to a new offset by the same amount. I made it so that wherever the code is pasted, all offsets remain correct. Then a problem arose with addressing of this kind:



 mov eax, [eax + eax + 0x100]
      
      





The point is that such an address may use ebp and point to the stack rather than to another section. I decided that if the address points into the data section, the offsets are adjusted when code is inserted; if it points to the stack, that is, not to an address in the data section, the offsets are left alone.
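The rule above, together with the offset adjustment described earlier, might look roughly like the sketch below. Every name in it (struct mem_ref, struct layout, insert_and_fix) is hypothetical, and it assumes 32-bit code in which the recorded fields hold absolute addresses. It does not handle relative jumps and calls that cross the insertion point, which would also need their displacements adjusted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record saved by a handler such as the one for lea. */
struct mem_ref {
    size_t   field_off; /* offset of the 32-bit address field in the code buffer */
    uint32_t target;    /* absolute address currently stored in that field */
};

/* Hypothetical description of the data section before the insertion. */
struct layout {
    uint32_t data_start; /* first address of the data section */
    uint32_t data_end;   /* one past its last address */
};

/* Writes a 32-bit value in little-endian byte order. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Inserts n bytes from patch at offset pos of code (len bytes in use,
 * cap bytes of capacity) and patches the recorded references.
 * Returns the new code length, or 0 if the arguments do not fit.
 */
size_t insert_and_fix(uint8_t *code, size_t len, size_t cap, size_t pos,
                      const uint8_t *patch, size_t n,
                      struct mem_ref *refs, size_t nrefs,
                      const struct layout *lay)
{
    assert(code != NULL && patch != NULL && lay != NULL);

    if (len > cap || pos > len || n > cap - len)
        return 0;

    /* Open a gap of n bytes at pos and copy the new machine code in. */
    memmove(code + pos + n, code + pos, len - pos);
    memcpy(code + pos, patch, n);

    for (size_t i = 0; i < nrefs; i++) {
        /* A field located at or after the gap has itself moved by n. */
        if (refs[i].field_off >= pos)
            refs[i].field_off += n;

        /* Only addresses inside the data section are shifted; anything
           else (for example a stack displacement) is left untouched. */
        if (refs[i].target >= lay->data_start &&
            refs[i].target < lay->data_end) {
            refs[i].target += (uint32_t)n;
            store_le32(code + refs[i].field_off, refs[i].target);
        }
    }
    return len + n;
}
```

After a successful call, the caller would also have to move the data section's own bounds in struct layout by n and update the section headers of the file; that part is left out here.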



This is something attackers can take advantage of. Malicious code can be inserted at the beginning of some function in the program, for example so that a child process is forked and a particular file is downloaded. On Linux this can be done without any trouble, because the headers in /usr/include list the operating system's system calls. In other words, the inserted code can use networking even if the program itself has no network functions. I do not know how this works on Windows, but I will try to add support for the PE format later. It may turn out that the same thing can be done there as on Linux. So far I only have a console version, but I plan to build a GTK interface later.



Thank you for spending your time on my article.


