On Tue, July 30, 2013 23:34, Randy Westlund wrote:
> On Tue, Jul 30, 2013 at 07:52:11AM +0200, J. Roeleveld wrote:
>>
>> Will the server be internet-facing?
>> I would make sure you have a firewall and only open the port needed for
>> the front-end.
>> Don't update the kernel too often, but keep an eye out for security fixes
>> and apply them where necessary.
>> Keep a separate machine/VM where you build binary packages. This will
>> significantly reduce the time needed to upgrade the software.
>>
>
> No, it'll be LAN only. I'll filter out external connections. There's no
> wireless network and no adjacent businesses, so I'm not worrying too much
> about security. The only thing I'll need from the outside is SSH.
|
In that case, make sure it runs stably and take the time to test new versions.

> So your recommendation is to have a VM on the server with the same
> packages installed, compile things there, then move the binary package to
> the real server. I might set this up at some point, but I think I'll be
> okay with updating things in place, so long as I do it at night.
|
I wouldn't put the VM on the server itself, but instead on your
desktop/laptop.
That way you also have a development environment where you can test new
features and fix the inevitable bugs.
The binary packages from there can then be moved to the server when you
are ready to update.
I always stop applications when I update them. To minimize downtime, I
always ensure I have binary packages available.
|
>> That depends on your budget and requirements.
>> For databases, RAID-10 is generally considered the best for performance.
>> Also avoid filling the disks and try to use the first half of each disk
>> rather than the whole. (The first half is faster than the second half.)
>> RAID-10 in software (e.g. Linux software RAID in the kernel) easily
>> outperforms the cheaper RAID cards. If you have the budget, you could
>> invest in a dedicated hardware RAID card (but make sure it is 100%
>> hardware and doesn't use the CPU for the calculations).
>>
>
> Okay, RAID-10 sounds good. Thanks for the tip about the first half of the
> drives.
|
I got that from a book about PostgreSQL performance tuning :)
The first part is quite generic, covering how to test and optimize
performance at the hardware and OS level.
|
>> Depends on how much you want in there. If it's just a simple share, then it
>> will be simple. If you also want the MS Windows machines to authenticate
>> against it, things get a little more complicated.
>>
>
> Should just be a simple share, I don't think I'll need any authentication.

I would still put in authentication. MS Windows can be set to save the
password. That way, you can also set up personal home directories and
enable tracing to see who does what.
|
>> How mission-critical will this be?
>> For my server (which has become quite critical over the years), I
>> currently use a self-built server with good, reliable components:
>> a TYAN mainboard (with built-in iKVM), WD Red drives, and an Areca
>> hardware RAID card.
>>
>> When I started running my own server, it was on a cheap no-brand
>> mainboard with simple desktop disks connected via IDE. (Yes, ancient :) )
>>
>
> The server will be pretty important. If all goes according to plan, every
> employee that uses a computer (~15) will be interacting with it throughout
> the day. The goal is to replace paper records. Aside from the hard
> drives, are there any other components that are especially important for
> databases?
|
Yes, memory. Databases are happy with lots and lots of memory for caching.
Other than that, most components should work, but go for stability. Ignore
boards that are designed for gaming/overclocking; those are generally not
designed for 24/7 usage over a period of several years.
One of my mainboards is still 100% stable and has never crashed. The only
reason I stopped using it is that it only holds 4GB of memory.

Tyan mainboards are, in my experience, rock-solid. Others on the list will
probably have their own preferences as well.
You can also go for a pre-built server from the likes of Dell, HP,
Supermicro, ... Those generally use good-quality hardware as well, and they
often come with (optional) onsite warranty.
|
>> You want to try to keep the database design optimized for the usage
>> pattern of the client tools, which usually means not too much
>> normalization. That helps with reporting, not when you need to do mostly
>> inserts.
>>
>
> From what I've read so far, it sounded like everything should be
> normalized as much as possible even if there's a slight performance hit,
> because it makes the system easier to modify and expand later. In my
> prototype, I have it divided into as many tables as possible, and each
> SELECT has multiple joins. Is this a bad idea?
|
JOINs are heavy for a database. Normalizing a database is nice, but I
don't see it that often on transactional systems (like what you are
planning to build).
|
Modifying tables doesn't take much effort either; simply do an ALTER TABLE to
add/expand fields. (Do NOT reduce the size, or you will LOSE data.) If
necessary, fill the fields for existing records with default values.
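For example, something along these lines (PostgreSQL syntax; the table and
column names are only placeholders):

  -- add a new field; existing records get filled with the default value
  ALTER TABLE customers ADD COLUMN phone varchar(30) NOT NULL DEFAULT '';

  -- widen an existing field (widening is safe, shrinking can cost you data)
  ALTER TABLE customers ALTER COLUMN name TYPE varchar(200);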
|
I would suggest not over-normalizing the database. Start by separating the
different pieces of information into parts like:
- customers
- orders
- order details (a line on the order)
- products
- invoices
- documents/files

and other parts that are of interest.
Then see if you can fit all of that into separate tables.
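As a very rough sketch (just to show the idea; the columns are made up, so
adjust them to your own data):

  CREATE TABLE customers (
      customer_id serial PRIMARY KEY,
      name        text NOT NULL,
      address     text
  );

  CREATE TABLE orders (
      order_id    serial PRIMARY KEY,
      customer_id integer NOT NULL REFERENCES customers,
      ordered_on  date NOT NULL DEFAULT current_date
  );

  CREATE TABLE products (
      product_id  serial PRIMARY KEY,
      description text NOT NULL
  );

  -- one row per line on an order
  CREATE TABLE order_details (
      order_id    integer NOT NULL REFERENCES orders,
      line_no     integer NOT NULL,
      product_id  integer NOT NULL REFERENCES products,
      quantity    integer NOT NULL DEFAULT 1,
      PRIMARY KEY (order_id, line_no)
  );

Each table holds one kind of thing, and the REFERENCES lines are what your
JOINs will follow. For a system like this, that is usually normalized enough.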
|
>> How big will those documents be?
>> Either, as already mentioned, store them as blobs, or on a (Samba) share
>> and put metadata (file path, name, description, ...) in the database.
>>
>
> I'm expecting job orders to have at most a few images of the job site,
> blueprints, random things the customer/contractor emailed us, and a few
> scanned sheets of handwritten notes. Storing them outside the database
> sounds like asking for trouble. Binary blobs sound good.
|
It's a bit more difficult to organize, but if the application itself handles
storing the files and updating the metadata table, then it will work fine.
Look at some of the DMS (document management) systems to see how they handle
it.
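If you go the blob route, the table itself can stay simple. Again just a
sketch (column names are placeholders):

  CREATE TABLE documents (
      document_id serial PRIMARY KEY,
      order_id    integer,            -- link to whatever the file belongs to
      filename    text NOT NULL,
      description text,
      added_on    timestamp NOT NULL DEFAULT now(),
      content     bytea NOT NULL      -- the file itself, stored as a blob
  );

The application INSERTs and SELECTs the content column, and the other
columns are the metadata you search on.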
|
>> Advice:
>> 1) Backup
>> 2) Backup
>> 3) Did I mention backup? ;)
>>
>> A tip, when you decide to put the documents on a share, to ensure the
>> backups are in sync, do the following:
>> 1) stop access to the database
>> 2) snapshot the fileshare (LVM helps here)
>> 3) back up the database
>> 4) allow access to the database again
>> 5) back up the snapshot
>> 6) remove the snapshot
>>
>> Total downtime with this should be less than 1 minute. A full backup
>> using the PostgreSQL tools is really quick.
>> Step 5 can then take as long as it takes. The environment will still be
>> running.
>>
>
> How often should a small database like this be backed up? Once a day?
> Twice a day? I'm thinking that I should back up to another machine on the
> network, then copy that to at least one off-site machine.
|
That depends simply on the following question:
- How long a period of data can you afford to lose?

I have a daily backup, but that is because I can afford to lose one day of
data.
|
Ensure you can always restore the backups. If you have a machine where you
develop new versions, use that to restore the backups as well.
That way you also have a backup machine handy where the employees can
continue working.
|
> Thanks for your help.
|
You're welcome.

--
Joost |