On 14.08.2012 16:00, Florian Philipp wrote:
> On 13.08.2012 20:18, Michael Hampicke wrote:
>> On 13.08.2012 19:14, Florian Philipp wrote:
>>> On 13.08.2012 16:52, Michael Mol wrote:
>>>> On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke
>>>> <mgehampicke@×××××.com> wrote:
>>>>
>>>> Have you indexed your ext4 partition?
>>>>
>>>> # tune2fs -O dir_index /dev/your_partition
>>>> # e2fsck -D /dev/your_partition
>>>>
>>>> Hi, the dir_index is active. I guess that's why delete operations
>>>> take as long as they take (index has to be updated every time)
>>>>
>>>>
>>>> 1) Scan for files to remove
>>>> 2) disable index
>>>> 3) Remove files
>>>> 4) enable index
>>>>
>>>> ?
>>>>
>>>> --
>>>> :wq
>>>
>>> Other things to think about:
>>>
>>> 1. Play around with data=journal/writeback/ordered. IIRC, data=journal
>>> actually used to improve performance depending on the workload, as it
>>> delays random IO in favor of sequential IO (when updating the journal).
>>>
>>> 2. Increase the journal size.
>>>
>>> 3. Take a look at `man 1 chattr`, especially the 'T' attribute. Of
>>> course this only helps after re-allocating everything.
>>>
>>> 4. Try parallelizing. Ext4 requires relatively few locks nowadays (since
>>> 2.6.39 IIRC). For example:
>>> find "$TOP_DIR" -mindepth 1 -maxdepth 1 -print0 | \
>>> xargs -0 -r -P 4 -I '{}' find '{}' -type f
>>>
>>> 5. Use a separate device for the journal.
>>>
>>> 6. Temporarily deactivate the journal with tune2fs, similar to MM's idea.
>>>
>>> Regards,
>>> Florian Philipp
>>>
>>
>> Trying out different journals and journal options was already on my
>> list, but the manpage on chattr regarding the T attribute is an
>> interesting read. Definitely worth trying.
>>
>> Parallelizing multiple finds was something I already did, but the only
>> thing that increased was the IO wait :) But now, having read all the
>> suggestions in this thread, I might try it again.
>>
>> A separate device for the journal is a good idea, but not possible atm
>> (the machine is abroad in a data center).
>>
>
> Something else I just remembered. I guess it doesn't help you with your
> current problem, but it might come in handy when working with such large
> cache dirs: I once wrote a script that sorts files by their starting
> physical block. This improved reading them quite a bit (2 minutes
> instead of 11 minutes for copying the portage tree).
>
> It's a terrible kludge, will probably fail when crossing FS boundaries or
> on a thousand other oddities, and requires root for some very scary
> programs. I never had the time to finish an improved C version. Anyway,
> maybe it helps you:
>
> #!/bin/bash
> #
> # Example below copies /usr/portage/* to /tmp/portage.
> # Replace /usr/portage with the input directory.
> # Replace `cpio` with whatever does the actual work. Input is a
> # \0-delimited file list.
> #
> # find writes one "bmap <inode> 0" request per file to the FIFO and the
> # \0-delimited file names to stdout. debugfs answers each request with
> # the physical block of the file's first logical block. Swapping \n and
> # \0 lets the line-based tools cope with newlines in file names.
> #
> FIFO=/tmp/$(uuidgen).fifo
> mkfifo "$FIFO"
> find /usr/portage -type f -fprintf "$FIFO" 'bmap <%i> 0\n' -print0 |
>     tr '\n\0' '\0\n' |
>     paste <(
>         debugfs -f "$FIFO" /dev/mapper/vg-portage |
>             grep -E '^[[:digit:]]+'
>     ) - |
>     sort -k 1,1n |
>     cut -f 2- |
>     tr '\n\0' '\0\n' |
>     cpio -p0 --make-directories /tmp/portage/
> unlink "$FIFO"
>

No, I don't think that's practical with the number of files in my
setup. To be honest, currently I am quite happy with the performance of
btrfs. Running through the directory tree only takes 1/10th of the time
it took with ext4, and deletes are pretty fast as well. I'm sure there's
still room for more improvement, but right now it's much better than it
was before.
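
For reference, the ordering idea from the quoted script can be tried without
root. What follows is only a minimal sketch, and it swaps in a different
technique: it sorts by inode number instead of by starting physical block
(the debugfs lookup is dropped entirely). Inode order is only a rough
stand-in for on-disk order, so whether it helps on a given filesystem is
something to measure. The function name files_in_inode_order is made up for
illustration.

```python
import os
import sys


def files_in_inode_order(top):
    """Return regular files under `top`, sorted by inode number.

    Assumption: inode order is used here as an unprivileged approximation
    of physical block order; it is not what the quoted script computes.
    """
    entries = []
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # entry.inode() is taken from the directory entry.
                    entries.append((entry.inode(), entry.path))
    entries.sort()
    return [path for _, path in entries]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Emit a \0-delimited list, the same input format the script expects.
    for path in files_in_inode_order(sys.argv[1]):
        sys.stdout.buffer.write(os.fsencode(path) + b"\0")
```

Saved under a hypothetical name such as inode_order.py, the output could be
piped into the same consumer as in the script, for example
`python3 inode_order.py /usr/portage | cpio -p0 --make-directories /tmp/portage/`.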