Saturday, 18 December 2010

ELF Format files

It's been an interesting couple of weeks trying to get the
ELF converter to create a universal binary for Linux. With
later GLIBC releases, the .hash section was replaced with
a .gnu.hash section, and the /lib/ runtime linker
understands this new section. (The new section is smaller and
faster to link, especially with C++ apps which have long
mangled names, and potentially hundreds or thousands of dynamic
symbols, as is typical for GUI applications).
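
For reference, the two sections hash symbol names differently. Here is a minimal sketch of the two hash functions as they are commonly documented (the classic System V ELF hash for .hash, and the "times 33" hash for .gnu.hash). The function names are my own and this is not code from the converter.

```python
def sysv_hash(name: str) -> int:
    """Classic ELF hash, used to index the .hash section."""
    h = 0
    for c in name.encode("utf-8"):
        h = ((h << 4) + c) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
        h &= ~g & 0xFFFFFFFF  # clear the top nibble again
    return h


def gnu_hash(name: str) -> int:
    """Hash used to index the .gnu.hash section (h = h * 33 + c)."""
    h = 5381
    for c in name.encode("utf-8"):
        h = (h * 33 + c) & 0xFFFFFFFF  # keep it to 32 bits
    return h


if __name__ == "__main__":
    for sym in ("printf", "exit"):
        print(f"{sym:8} sysv={sysv_hash(sym):08x} gnu={gnu_hash(sym):08x}")
```

A runtime linker that only knows about one of the two tables cannot look symbols up through the other, which is why a binary meant for both old and new systems has to care which section it carries.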

What I have uncovered is redundancy in the ELF specification. There
is a program header table which is used to bootstrap the executable
into memory. Here's an example program header table:

Program headers:
   Type          Offset VirtAddr MemSz/FileSz PhysAddr     Flgs Al
 0 PHDR          000040 00400040 1f8/1f8      000000400040 R-X  08
 1 INTERP        000238 00400238 1c/1c        000000400238 --X  01
 2 LOAD          000000 00400000 172c/172c    000000400000 R-X  200000
 3 LOAD          001e28 00600e28 210/200      000000600e28 -WX  200000
 4 DYNAMIC       001e50 00600e50 190/190      000000600e50 -WX  08
 5 NOTE          000254 00400254 44/44        000000400254 --X  04
 6 GNU_EH_FRAME  00068c 0040068c 24/24        00000040068c --X  04
 7 GNU_STACK     000000 00000000 0/0          000000000000 -WX  08
 8 GNU_RELRO     000e28 00600e28 1d8/1d8      000000600e28 --X  01

The LOAD segments are used to mmap the two key regions (code and data)
into the memory image. Note the offset field: this specifies the byte offset
into the ELF file where the segment resides. The runtime linker wants
the offset and the virtual address to line up at page granularity, i.e.
to be equal modulo the page size (as 0x1e28 and 0x600e28 are above).
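
A quick way to sanity-check a LOAD entry is to test that its file offset and virtual address agree modulo the page size, which is what mmap needs. The sketch below assumes a little-endian 64-bit ELF and 4K pages. The helper name load_segment_is_mappable is made up for illustration.

```python
import struct

PT_LOAD = 1
PAGE_SIZE = 0x1000  # assumption: 4K pages, as on x86 Linux

# Elf64_Phdr, little-endian: p_type, p_flags, p_offset, p_vaddr,
# p_paddr, p_filesz, p_memsz, p_align (56 bytes in total).
PHDR64 = struct.Struct("<IIQQQQQQ")


def load_segment_is_mappable(raw: bytes, page_size: int = PAGE_SIZE) -> bool:
    """Return True if a raw Elf64_Phdr entry could be mmap'ed as it stands.

    Only LOAD entries are checked; every other type passes trivially.
    """
    p_type, _flags, p_offset, p_vaddr, _paddr, _filesz, _memsz, _align = \
        PHDR64.unpack(raw)
    if p_type != PT_LOAD:
        return True
    # mmap can only place file offset X at address Y when both sit at
    # the same position within a page.
    return p_offset % page_size == p_vaddr % page_size
```

With the values from the second LOAD line above (offset 0x1e28, address 0x600e28) the check passes. Change the address to 0x600000 and it fails.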

The VirtAddr and PhysAddr control how the app can address these segments
whilst it is running. (I don't know why both fields are present or
what happens if they differ; I think one is redundant.)

In addition to the program headers, we have the section table. Think
of the program headers as what the runtime linker needs to mmap the
data into memory, and the section table as what the application uses
internally, although many of the sections are irrelevant to a running
program: the act of linking resolves addresses, so there is
little need to use the sections. (The runtime linker will
use some of the sections, e.g. to resolve dynamic symbols.)
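
The two tables are located independently from the ELF header, which is part of why they can drift apart. Below is a minimal sketch of reading both locations, assuming a little-endian 64-bit ELF. The function name table_locations and the returned dictionary layout are my own choices.

```python
import struct

# Elf64_Ehdr, little-endian (64 bytes): e_ident, e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize,
# e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")


def table_locations(header: bytes) -> dict:
    """Return (offset, entry size, count) for both ELF tables."""
    (ident, _type, _machine, _version, _entry, phoff, shoff, _flags,
     _ehsize, phentsize, phnum, shentsize, shnum, _shstrndx) = \
        EHDR64.unpack(header[:EHDR64.size])
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")
    return {
        "program_headers": (phoff, phentsize, phnum),
        "sections": (shoff, shentsize, shnum),
    }
```

Feeding it the first 64 bytes of an executable (read in binary mode) gives the two table positions. For the example above, the program header entry would come out as offset 0x40 with 9 entries of 56 bytes, which matches the 0x1f8 size of the PHDR line.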

Then we have the DYNAMIC section (.dynamic) which is an array of
values - some are pointers, some are integers, used to list things
like shared lib dependencies or pointers to the hash table.

Section .dynamic: 0x1e50, size=0x190, num=25
[ 0] 00000001 DT_NEEDED    0x00000010
[ 1] 0000000c DT_INIT      0x00400428
[ 2] 0000000d DT_FINI      0x00400668
[ 3] 00000004 DT_HASH      0x0040072c
[ 4] 00000005 DT_STRTAB    0x00400330
[ 5] 00000006 DT_SYMTAB    0x004002b8
[ 6] 0000000a DT_STRSZ     0x00000057
[ 7] 0000000b DT_SYMENT    0x00000018
[ 8] 00000015 DT_DEBUG     0x00000000
[ 9] 00000003 DT_PLTGOT    0x00600fe8
[10] 00000002 DT_PLTRELSZ  0x00000048
[11] 00000014 DT_PLTREL    DT_RELA
[12] 00000017 DT_JMPREL    0x004003e0
[13] 00000007 DT_RELA      0x004003c8
[14] 00000008 DT_RELASZ    0x00000018
[15] 00000009 DT_RELAENT   0x00000018
[16] 08000000 134217728    0x00400398
[17] 08000000 134217728    0x00000001
[18] 08000000 134217728    0x00400388
[19] 00000000 DT_NULL      0x00000000
[20] 00000000 DT_NULL      0x00000000
[21] 00000000 DT_NULL      0x00000000
[22] 00000000 DT_NULL      0x00000000
[23] 00000000 DT_NULL      0x00000000
[24] 00000000 DT_NULL      0x00000000
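
A dump like the one above can be produced by walking the .dynamic bytes as an array of (tag, value) pairs. This is a minimal sketch, assuming 64-bit little-endian entries of 16 bytes each (0x190 bytes / 25 entries). The name parse_dynamic is hypothetical, and the tag table only covers the standard tags that appear in the dump.

```python
import struct

# Elf64_Dyn: a signed 64-bit tag followed by a 64-bit value or pointer.
DYN64 = struct.Struct("<qQ")

# Standard ELF tag numbers for the names shown in the dump above.
DT_NAMES = {
    0: "DT_NULL", 1: "DT_NEEDED", 2: "DT_PLTRELSZ", 3: "DT_PLTGOT",
    4: "DT_HASH", 5: "DT_STRTAB", 6: "DT_SYMTAB", 7: "DT_RELA",
    8: "DT_RELASZ", 9: "DT_RELAENT", 10: "DT_STRSZ", 11: "DT_SYMENT",
    12: "DT_INIT", 13: "DT_FINI", 20: "DT_PLTREL", 21: "DT_DEBUG",
    23: "DT_JMPREL",
}


def parse_dynamic(data: bytes):
    """Decode .dynamic entries, stopping at the first DT_NULL."""
    usable = len(data) - len(data) % DYN64.size  # ignore a ragged tail
    entries = []
    for tag, value in DYN64.iter_unpack(data[:usable]):
        if tag == 0:  # DT_NULL terminates the array
            break
        entries.append((DT_NAMES.get(tag, hex(tag)), value))
    return entries
```

Unknown tags are reported as hex rather than guessed at, so anything the table does not cover is at least visible.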

So, we now have *three* places with similar information, and these
can disagree. Here are some things to contemplate when they agree/disagree:

1. The kernel will refuse to load the executable, or kill -9 it on startup,
if the LOAD program header segments are not well formed (e.g. not on a
4K boundary).

2. The executable may run, but gdb may core dump. Older gdb versions, e.g. 6.8,
will core dump on a malformed executable, even though the executable itself
runs correctly.

3. gdb may run, but core dump on hitting a breakpoint. Even the gdb 7.2
release on Ubuntu 10.10 does this.
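
One way to look for disagreements between the three views is to translate the virtual addresses held in .dynamic back to file offsets via the LOAD segments, and compare them with the offsets in the section table. A minimal sketch, with hypothetical helper names and plain tuples standing in for parsed headers:

```python
def vaddr_to_offset(vaddr, load_segments):
    """Map a virtual address to a file offset using LOAD segments.

    load_segments is a list of (p_offset, p_vaddr, p_filesz) tuples.
    Returns None if no segment's file-backed range covers the address.
    """
    for p_offset, p_vaddr, p_filesz in load_segments:
        if p_vaddr <= vaddr < p_vaddr + p_filesz:
            return p_offset + (vaddr - p_vaddr)
    return None


def pointer_matches_section(dt_value, sh_offset, load_segments):
    """True if a DT_* pointer and a section header name the same bytes."""
    return vaddr_to_offset(dt_value, load_segments) == sh_offset


if __name__ == "__main__":
    # The two LOAD lines from the example program headers.
    loads = [(0x0, 0x400000, 0x172C), (0x1E28, 0x600E28, 0x200)]
    # DYNAMIC is at address 0x600e50; .dynamic is at file offset 0x1e50.
    print(pointer_matches_section(0x600E50, 0x1E50, loads))
```

For the example binary the DYNAMIC address and the .dynamic section offset do agree. A mismatch here would be one candidate explanation for a tool that trusts the section table behaving differently from the runtime linker.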

Combine the above with Ubuntu 10.10 vs Centos 4.7 (RedHat AS4) and
the permutations are a bit unwieldy. Also, the runtime linker
on the earlier Linux releases uses different fields, which highlights the
redundancy and the problems in getting this to work.

So, at present I have an executable compiled on Ubuntu 10.10 which
runs fine on Centos 4.7, but I now need to determine why the
gdb on both systems is core dumping.

Post created by CRiSP v10.0.2c-b5917
