Hiding code in an ELF binary

Written by aaSSfxxx

Since I started contributing to radare2, I have been learning how a disassembler works, and especially how disassemblers handle ELF files. I noticed that almost every disassembler (maybe even all of them?) relies on the ELF section headers (generally located at the end of the file), which are never actually used at run time by the Linux kernel or glibc, because the memory mapping of an ELF file is described by the program headers (another ELF structure, which I described in my article about the ELF packer).

So we can easily hide code from disassemblers by manipulating the fields of the ".text" section header that tie the section's contents to its virtual addresses. I'll use a hexadecimal editor and the latest git revision of radare2 (which fixes a bug related to virtual address calculation in ELF binaries), so I recommend you have those tools installed on your computer before reading on.

The trick

First, let's start with the following code:

#include <stdio.h>

int main(void) {
   printf("You will never see me !\n");
   return 0;
}

int foo() {
   return 1;
}

The goal of this article is to make disassemblers believe that the entry point of the binary is the "foo" function (and not the _start function added by gcc). First, compile the source code so we can work on the generated executable. Next, grab the offset of the "foo" symbol by running rabin2 -s a.out | grep foo, and note it somewhere. Finally, strip all symbols from the binary with the "strip" command (to avoid corrupted disassembly in IDA).

Then we need to retrieve the offset of the section header table, which is stored at offset 0x20 of the file (the e_shoff field of the ELF header; this holds for 32-bit executables, I don't have a 64-bit system yet so please tell me if it still works). Then we use rabin2 (the tool shipped with radare2) to find the index, in the array of section headers, of the .text section we'll spoof.
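If you prefer code to a hex editor, this lookup can be sketched as below. It is only an illustration: the function name elf32_section_table_offset is mine, and it works on a copy of the file already loaded in memory. The offsets are those of the 32-bit ELF header; in the 64-bit header the same field is 8 bytes wide and sits at 0x28, which this sketch does not handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of e_shoff inside a 32-bit ELF header (Elf32_Ehdr). */
#define E_SHOFF_POS 0x20

/* Decode 4 little-endian bytes into a host integer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: fetch the section header table offset from an
 * in-memory copy of the file. Returns 0 on success, -1 if the buffer
 * is not a little-endian 32-bit ELF image.
 */
int elf32_section_table_offset(const unsigned char *image, size_t len,
                               uint32_t *e_shoff)
{
    if (len < 0x34)                       /* size of Elf32_Ehdr */
        return -1;
    if (image[0] != 0x7f || image[1] != 'E' ||
        image[2] != 'L'  || image[3] != 'F')
        return -1;
    if (image[4] != 1 || image[5] != 1)   /* ELFCLASS32, ELFDATA2LSB */
        return -1;
    *e_shoff = read_le32(image + E_SHOFF_POS);
    return 0;
}
```

Read the whole binary into a buffer, call the function on it, and the value stored in e_shoff is the "section_offset" used in the rest of the article.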

So we run rabin2 -S a.out | grep .text and note the "idx" field somewhere. To find the section header, we just need to compute section_offset + (idx+1)*0x28.

0x28 is the size of a section header entry, and we need to add 1 to the idx we got from radare2 because it seems to ignore the null section at the beginning. Go to the offset computed above and modify the "sh_offset" field of the Elf_Shdr structure (at offset 0x10 from the beginning of the structure), replacing it with the file offset of foo noted earlier. This assumes _start sits at the very start of .text; otherwise, subtract the distance between the entry point and the start of the section. Don't forget that x86 is little endian when you edit the binary in your hex editor! Then save the program and execute it (it should still print "You will never see me !"); when you try to disassemble it, you will see the disassembly of the foo function at the entry point!
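The arithmetic and the little-endian write described above can be sketched as follows. This is a minimal illustration under a few assumptions: the helper names (spoofed_sh_offset, spoof_sh_offset) are made up for this example, the file is already loaded into a buffer, and the disassembler is assumed to translate an address as sh_offset + (address - sh_addr), which is how the trick is supposed to fool it.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    ELF32_SHDR_SIZE = 0x28,  /* size of one Elf32_Shdr entry   */
    SH_OFFSET_POS   = 0x10   /* sh_offset inside an Elf32_Shdr */
};

/*
 * Value to store in sh_offset so that a tool computing
 * sh_offset + (entry - sh_addr) lands on target_off (the file offset
 * of foo). When entry == sh_addr this is simply target_off.
 */
uint32_t spoofed_sh_offset(uint32_t entry, uint32_t sh_addr,
                           uint32_t target_off)
{
    return target_off - (entry - sh_addr);
}

/*
 * Overwrite sh_offset of the section that rabin2 lists at r2_idx.
 * The +1 skips the null section, as explained in the text.
 * Returns 0 on success, -1 if the field lies outside the buffer.
 */
int spoof_sh_offset(unsigned char *image, size_t len, uint32_t e_shoff,
                    uint32_t r2_idx, uint32_t new_offset)
{
    uint64_t pos = (uint64_t)e_shoff +
                   ((uint64_t)r2_idx + 1) * ELF32_SHDR_SIZE + SH_OFFSET_POS;

    if (pos + 4 > (uint64_t)len)
        return -1;

    /* x86 is little endian: least significant byte first. */
    image[pos]     = (unsigned char)(new_offset & 0xff);
    image[pos + 1] = (unsigned char)((new_offset >> 8) & 0xff);
    image[pos + 2] = (unsigned char)((new_offset >> 16) & 0xff);
    image[pos + 3] = (unsigned char)((new_offset >> 24) & 0xff);
    return 0;
}
```

After patching the buffer, write it back to disk; only section metadata has changed, so the program itself should run exactly as before.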

What happened and how to detect it?

As I said in the introduction, the kernel relies on the program header table (which generally follows the ELF header) and maps the PT_LOAD segments into memory (see the articles I wrote on the ELF packer). Section headers are therefore entirely optional in ELF executables and are just metadata, since everything the dynamic linker needs to know is stored in the segment of type PT_DYNAMIC. So we can spoof almost any section header without any impact on the program's execution, but disassemblers (even IDA ;)) rely on these unreliable section headers, so they will be fooled and will produce incoherent disassembly.
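To make the contrast concrete, here is a rough sketch of how an address can be translated to a file offset using only the PT_LOAD program headers, that is, the information the kernel actually uses. It is not code from radare2 or IDA: the function name is mine, and it only handles little-endian 32-bit files held in a memory buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD_TYPE 1  /* value of PT_LOAD in the ELF specification */

static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: map a virtual address to a file offset by
 * walking the program headers and ignoring section headers entirely.
 * Returns 0 on success, -1 if no PT_LOAD segment backs the address.
 */
int vaddr_to_offset_by_segments(const unsigned char *image, size_t len,
                                uint32_t vaddr, uint32_t *file_off)
{
    if (len < 0x34)                            /* size of Elf32_Ehdr */
        return -1;

    uint32_t e_phoff     = rd32(image + 0x1c);
    uint16_t e_phentsize = rd16(image + 0x2a);
    uint16_t e_phnum     = rd16(image + 0x2c);

    if (e_phentsize < 0x20)                    /* size of Elf32_Phdr */
        return -1;

    for (uint32_t i = 0; i < e_phnum; i++) {
        uint64_t pos = (uint64_t)e_phoff + (uint64_t)i * e_phentsize;
        if (pos + 0x20 > (uint64_t)len)
            return -1;

        const unsigned char *ph = image + pos;
        if (rd32(ph) != PT_LOAD_TYPE)          /* p_type */
            continue;

        uint32_t p_offset = rd32(ph + 0x04);
        uint32_t p_vaddr  = rd32(ph + 0x08);
        uint32_t p_filesz = rd32(ph + 0x10);

        if (vaddr >= p_vaddr && vaddr - p_vaddr < p_filesz) {
            *file_off = p_offset + (vaddr - p_vaddr);
            return 0;
        }
    }
    return -1;
}
```

Calling it with the entry point of the patched binary should still return the real offset of _start, whatever the section headers claim.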

Anyway, there are some ways to detect it. In almost all binaries generated by compilers, a virtual address and the matching file offset share the same last digits, because the virtual address and file offset of a loadable segment must be congruent modulo the segment alignment (the page size, 0x1000 on x86). For example, 0x08048130 will match offset 0x130 in the file, and 0x08049156 can match offset 0x156. After this manipulation, the section's offset no longer matches its virtual address at all, which can indicate that the binary was modified. Another way to deal with it is to erase the section header table offset and size in the ELF header (the e_shoff and e_shnum fields), which forces disassemblers (IDA and radare2, for instance) to rely on the program headers for disassembly; alternatively, you can try to fix the section header manually.
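The "same last digits" check can be automated along these lines. Again this is only a sketch with names I made up: it assumes a little-endian 32-bit file in memory and a 0x1000-byte page, and it skips sections that are not loaded (no SHF_ALLOC flag) or that occupy no file space (SHT_NOBITS, such as .bss). A spoofed offset that happens to share the low 12 bits of the address would go unnoticed, so treat the result as a hint, not a proof.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_MASK_LOW 0xfffu  /* assumes 0x1000-byte pages (x86) */
#define SHT_NOBITS_T  8       /* SHT_NOBITS in the ELF specification */
#define SHF_ALLOC_F   0x2u    /* SHF_ALLOC in the ELF specification  */

static uint16_t get16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: count loaded sections whose sh_addr and
 * sh_offset disagree in their low 12 bits. Returns -1 when the
 * section header table does not fit in the buffer.
 */
int count_suspicious_sections(const unsigned char *image, size_t len)
{
    if (len < 0x34)                             /* size of Elf32_Ehdr */
        return -1;

    uint32_t e_shoff     = get32(image + 0x20);
    uint16_t e_shentsize = get16(image + 0x2e);
    uint16_t e_shnum     = get16(image + 0x30);
    int suspicious = 0;

    if (e_shnum != 0 && e_shentsize < 0x28)     /* size of Elf32_Shdr */
        return -1;

    for (uint32_t i = 0; i < e_shnum; i++) {
        uint64_t pos = (uint64_t)e_shoff + (uint64_t)i * e_shentsize;
        if (pos + 0x28 > (uint64_t)len)
            return -1;

        const unsigned char *sh = image + pos;
        uint32_t sh_type   = get32(sh + 0x04);
        uint32_t sh_flags  = get32(sh + 0x08);
        uint32_t sh_addr   = get32(sh + 0x0c);
        uint32_t sh_offset = get32(sh + 0x10);

        if (!(sh_flags & SHF_ALLOC_F) || sh_type == SHT_NOBITS_T)
            continue;
        if ((sh_addr & PAGE_MASK_LOW) != (sh_offset & PAGE_MASK_LOW))
            suspicious++;
    }
    return suspicious;
}
```

A result above zero on a freshly stripped gcc binary is a good reason to stop trusting the section headers and look at the program headers instead.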