> Are there any numbers (benchmarks) about the performance penalty of
> pageexec and/or segmexec on intel x86 machines?

i only remember the kernel compile numbers on a P3: for SEGMEXEC the
slowdown was around 1-2%, for PAGEEXEC on 2.2/2.4 it was around 30-40%
and on 2.6 it was 2-3%. i recall spender benchmarking PAGEEXEC on an
athlon with 2.4 and it was something like 20-25%. on P4 PAGEEXEC is
very bad (maybe a 100x slowdown, i don't think anyone bothered to
benchmark it precisely ;-), you don't want to use it there.

> The idea that I have is that page-exec on x86 involves a page-fault
> for every (execute) access to a new page that will be treated by
> pax... and that is performance-wise .. bad..

not quite, you get an extra page fault for every data access to
a page that the DTLB doesn't yet have an entry for. the larger
the DTLB, the smaller the number of these extra page faults (that's
why the athlon is better than any intel).
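
to get a feel for why DTLB size matters here, a toy model can help. the sketch below is only an illustration under loud assumptions: it treats the DTLB as a fully associative LRU cache (real ones are not that simple), the entry counts and the random access trace are made up, and the names `count_dtlb_misses` and `make_trace` are hypothetical. every miss it counts stands for one of the extra page faults described above.

```python
import random
from collections import OrderedDict

PAGE_SIZE = 4096  # 4 KB pages, as on i386


def count_dtlb_misses(addresses, dtlb_entries):
    """Count data accesses whose page has no entry in a toy LRU DTLB."""
    dtlb = OrderedDict()  # page number -> None, least recently used first
    misses = 0
    for addr in addresses:
        page = addr // PAGE_SIZE
        if page in dtlb:
            dtlb.move_to_end(page)  # hit: mark as most recently used
        else:
            misses += 1  # miss: this is where the extra fault would occur
            dtlb[page] = None
            if len(dtlb) > dtlb_entries:
                dtlb.popitem(last=False)  # evict the least recently used page
    return misses


def make_trace(n_accesses=100_000, working_set_pages=256, seed=0):
    """Made-up trace: uniformly random data accesses over a fixed working set."""
    rng = random.Random(seed)
    return [
        rng.randrange(working_set_pages) * PAGE_SIZE + rng.randrange(PAGE_SIZE)
        for _ in range(n_accesses)
    ]


if __name__ == "__main__":
    trace = make_trace()
    for entries in (32, 64, 128, 256):  # hypothetical DTLB sizes
        print(entries, "entries:", count_dtlb_misses(trace, entries), "misses")
```

in this model the miss count can only go down as the DTLB grows, and once the whole working set fits it drops to one miss per page. it says nothing about the absolute cost of a fault on any real cpu.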

> And that segmexec is a different approach that involves mirroring the
> process address space on two segments with different "write"
> permissions, and comparing those two, to check if there was any
> overwrite of the code segment.

nope, the difference is not in writability (executable pages are
non-writable, regardless of where they are), it's about being present
or not in the 'upper' half of the address space (which happens to
be the code segment): being present there means being executable,
being absent means non-executable.
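
the 'present in the upper half equals executable' rule can be written down as a small model. this is a sketch, not kernel code: the `ToyAddressSpace` class and its methods are hypothetical, and the split point `HALF` assumes a 3 GB user address space cut into two 1.5 GB halves.

```python
PAGE_SIZE = 4096
HALF = 0x60000000  # assumed split point: 1.5 GB of a 3 GB user address space


class ToyAddressSpace:
    """Toy model: a page is executable iff its mirror exists in the upper half."""

    def __init__(self):
        self.present = set()  # page numbers that are mapped

    def map_page(self, addr, executable):
        if not 0 <= addr < HALF:
            raise ValueError("mappings are requested in the lower half only")
        page = addr // PAGE_SIZE
        self.present.add(page)  # data view, lower half
        if executable:
            # mirror the page into the upper half, where code is fetched from
            self.present.add(page + HALF // PAGE_SIZE)

    def can_read(self, addr):
        # data accesses only see the lower half
        return 0 <= addr < HALF and addr // PAGE_SIZE in self.present

    def can_execute(self, addr):
        # instruction fetches are shifted into the upper half
        return 0 <= addr < HALF and (addr + HALF) // PAGE_SIZE in self.present
```

a page mapped with `executable=False` is readable but `can_execute` returns False for it, simply because nothing is present at the mirrored address. note that writability plays no role anywhere in the model.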

> This would mean doubling the mem-usage, at least for the code-segment
> in segmexec mode.

what's doubled is the virtual memory usage (or you can say that the
address space is halved), the underlying physical memory usage doesn't
change (that's the whole point of vma mirroring - it creates two virtual
mappings for the same physical page).
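
the 'two virtual mappings, one physical page' idea can be tried from userland. the sketch below is only an analogy: it maps the same temporary file twice with Python's `mmap` module, which is not how vma mirroring is done inside the kernel, and the helper name `demo_two_mappings` is made up.

```python
import mmap
import tempfile


def demo_two_mappings():
    """Map one page-sized file twice; write via one mapping, read via the other."""
    with tempfile.TemporaryFile() as f:
        f.truncate(mmap.PAGESIZE)  # give the file one page of backing store
        first = mmap.mmap(f.fileno(), mmap.PAGESIZE)   # shared mapping #1
        second = mmap.mmap(f.fileno(), mmap.PAGESIZE)  # shared mapping #2
        try:
            first[:5] = b"hello"      # store through the first mapping
            return bytes(second[:5])  # load through the second mapping
        finally:
            first.close()
            second.close()


if __name__ == "__main__":
    print(demo_two_mappings())
```

the process uses two pages' worth of virtual address space, but the write through one mapping is visible through the other because both are backed by the same page.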

> And in arches that support no-exec pages (such as sparc or amd64), what
> are the performance penalties? Can anyone give me some pointers?

except for ppc there should be nothing measurable there (well, maybe
some contrived benchmark can show something on sparc/sparc64 because
the fast path of the TLB load handler is 2 instructions longer, but
i'd hardly call that a 'penalty').

> stuff like: kernel compiles, mysql benches, or... any other benchmark
> is good for me.. just to "grasp" an idea..

maybe http://www.grsecurity.net./grsecurity-slide_files/frame.htm helps
although it's quite old and benchmarks only PAGEEXEC.

--
gentoo-hardened@g.o mailing list