Gentoo Archives: gentoo-amd64

From: Mark Knecht <markknecht@×××××.com>
To: Gentoo AMD64 <gentoo-amd64@l.g.o>
Subject: Re: [gentoo-amd64] Re: Is my RAID performance bad possibly due to starting sector value?
Date: Sun, 23 Jun 2013 15:23:20
Message-Id: CAK2H+ee_Y5SSMwfXjT73GO+dVZpKNB71a0vLwd5Kh2f8Wb4j3g@mail.gmail.com
In Reply to: Re: [gentoo-amd64] Re: Is my RAID performance bad possibly due to starting sector value? by Rich Freeman
On Sun, Jun 23, 2013 at 4:43 AM, Rich Freeman <rich0@g.o> wrote:
> On Sat, Jun 22, 2013 at 7:04 PM, Mark Knecht <markknecht@×××××.com> wrote:
>> I've been rereading everyone's posts as well as trying to collect
>> my own thoughts. One question I have at this point, being that you and
>> I seem to be the two non-RAID1 users (but not necessarily devotees) at
>> this time, is what chunk size, stride & stripe width you are
>> using?
>
> I'm using 512K chunks on the two RAID5s which are my LVM PVs:
> md7 : active raid5 sdc3[0] sdd3[6] sde3[7] sda4[2] sdb4[5]
>       971765760 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
>       bitmap: 1/2 pages [4KB], 65536KB chunk
>
> md6 : active raid5 sda3[0] sdd2[4] sdb3[3] sde2[5]
>       2197687296 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
>       bitmap: 2/6 pages [8KB], 65536KB chunk
>
> On top of this I have a few LVs with ext4 filesystems:
> tune2fs -l /dev/vg1/root | grep RAID
> RAID stride:              128
> RAID stripe width:        384
> (this is root, bin, sbin, lib)
>
> tune2fs -l /dev/vg1/data | grep RAID
> RAID stride:              19204
> (this is just about everything else)
>
> tune2fs -l /dev/vg1/video | grep RAID
> RAID stride:              11047
> (this is mythtv video)
>
> Those were all the defaults picked, and with the exception of root I
> believe the array was quite different when the others were created.
> I'm pretty confident that none of these are optimized, and I'd be
> shocked if any of them are aligned unless this is automated (including
> across pvmoves, reshaping, and such).
>
> That is part of why I'd like to move to btrfs - optimizing raid with
> mdadm+lvm+mkfs.ext4 involves a lot of micromanagement as far as I'm
> aware. Docs are very spotty at best, and it isn't at all clear that
> things get adjusted as needed when you actually take advantage of
> things like pvmove or reshaping arrays. I suspect that having btrfs
> on bare metal will be more likely to result in something that keeps
> itself in-tune.
>
> Rich
>

Thanks Rich. I'm finding that helpful.

I completely agree on the micromanagement comment. At one level or
another that's sort of what this whole thread is about!

On your root partition I sort of wonder about the stripe width.
Assuming I fed it the right inputs (RAID5, 5 disks, 512K chunk, 4K
blocks), this little page calculates 128 for the stride and 512 for
the stripe width (4 data disks * 128, I think). Just a piece of info.

http://busybox.net/~aldot/mkfs_stride.html
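
For reference, my understanding of the arithmetic behind that page
(treat this as a sketch rather than gospel, and the 4K filesystem
block size below is just the usual default, not something I checked):

    stride       = chunk size / filesystem block size
    stripe-width = stride * (number of data disks)

    RAID5, 5 disks, 512K chunk, 4K blocks:
        stride = 512/4 = 128,  stripe-width = 128 * 4 = 512
    my RAID6, 5 disks, 16K chunk, 4K blocks:
        stride = 16/4  = 4,    stripe-width = 4 * 3 = 12

and it spits the result out as a mkfs command, something along the
lines of:

    mkfs.ext4 -b 4096 -E stride=128,stripe-width=512 /dev/md3

(not that I'm about to mkfs /dev/md3 - that's just showing where the
numbers would land. I believe tune2fs can also change stride and
stripe width on an existing filesystem, though I haven't tried it.)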

Returning to the title of the thread, which is essentially a question
about partition location, I woke up this morning having sort of
decided to just try changing the chunk size to something larger, like
your 512K. It seems I'm out of luck, as my partition size is not
(apparently) divisible by 512K:
c2RAID6 ~ # mdadm --grow /dev/md3 --chunk=512 --backup-file=/backups/ChunkSizeBackup
mdadm: component size 484088160K is not a multiple of chunksize 512K
c2RAID6 ~ # mdadm --grow /dev/md3 --chunk=256 --backup-file=/backups/ChunkSizeBackup
mdadm: component size 484088160K is not a multiple of chunksize 256K
c2RAID6 ~ # mdadm --grow /dev/md3 --chunk=128 --backup-file=/backups/ChunkSizeBackup
mdadm: component size 484088160K is not a multiple of chunksize 128K
c2RAID6 ~ # mdadm --grow /dev/md3 --chunk=64 --backup-file=/backups/ChunkSizeBackup
mdadm: component size 484088160K is not a multiple of chunksize 64K
c2RAID6 ~ #
c2RAID6 ~ # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md3 : active raid6 sdb3[9] sdf3[5] sde3[6] sdd3[7] sdc3[8]
      1452264480 blocks super 1.2 level 6, 16k chunk, algorithm 2 [5/5] [UUUUU]

unused devices: <none>
c2RAID6 ~ # fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes, 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x8b45be24

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63      112454       56196   83  Linux
/dev/sdb2          112455     8514449     4200997+  82  Linux swap / Solaris
/dev/sdb3         8594775   976773167   484089196+  fd  Linux raid autodetect
c2RAID6 ~ #
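
If I'm reading the error right, it's the component size (the
484088160K of each partition that md actually uses) that has to be a
multiple of the new chunk, not the partition itself. In principle
mdadm can shrink the components to the nearest multiple first with
--grow --size (the value is in KiB), something like the two commands
below, but only after shrinking whatever sits on top of the array and
with everything backed up. I haven't actually run this, so treat the
numbers as illustrative:

    mdadm --grow /dev/md3 --size=484087808
        (484088160 - 484088160 % 512 = 484087808)
    mdadm --grow /dev/md3 --chunk=512 --backup-file=/backups/ChunkSizeBackup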

I suspect I might be much better off if all the partitions were sized
in multiples of 2048 sectors and started on a 2048-sector (1 MiB)
boundary, as the newer fdisk tools enforce.
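
As a quick sanity check on that (assuming the usual 2048 sectors = 1
MiB convention), the start sector of sdb3 isn't on that boundary:
8594775 % 2048 = 1367, not 0. Something like

    echo $((8594775 % 2048))
    parted /dev/sdb align-check optimal 3

ought to confirm it, though I haven't run the parted one here.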

I am thinking I won't make much headway unless I completely rebuild
the system from bare metal up. If I'm going to do that, then I need to
get a good copy of the whole RAID onto some other drive, which is a
big, scary job, and then start over with an install disk, I guess.
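
(If it comes to that, I'm assuming something as simple as

    rsync -aHAX --sparse /mnt/raid/ /mnt/backup/

would do for the copy, with /mnt/raid and /mnt/backup just standing in
for wherever the array and the spare drive end up mounted, but I
haven't thought it through yet.)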

Not sure I'm up for that just yet on a Sunday morning...

Take care,
Mark