Re: [gentoo-user] PostgreSQL Vs MySQL @Uber - gentoo-user

From:	"J. Roeleveld" <joost@××××××××.org>
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] PostgreSQL Vs MySQL @Uber
Date:	Mon, 01 Aug 2016 16:49:59
Message-Id:	`1954543.QCiV5ja7vZ@andromeda`
In Reply to:	Re: [gentoo-user] PostgreSQL Vs MySQL @Uber by james

On Monday, August 01, 2016 08:43:49 AM james wrote:
> On 08/01/2016 02:16 AM, J. Roeleveld wrote:
> > On Saturday, July 30, 2016 06:38:01 AM Rich Freeman wrote:
> >> On Sat, Jul 30, 2016 at 6:24 AM, Alan McKinnon <alan.mckinnon@×××××.com> wrote:
> >>> On 29/07/2016 22:58, Mick wrote:
> >>>> Interesting article explaining why Uber are moving away from
> >>>> PostgreSQL. I am running both DBs on different desktop PCs for
> >>>> akonadi and I'm also running MySQL on a number of websites.
> >>>> Let's see which one goes sideways first. :p
> >>>>
> >>>>  https://eng.uber.com/mysql-migration/
> >>>
> >>> I don't think your akonadi and some web sites compare in any way to
> >>> Uber and what they do.
> >>>
> >>> FWIW, my Dev colleagues support an entire large corporate ISP's
> >>> operational and customer data on PostgreSQL-9.3. With clustering. With
> >>> no db-related issues :-)
> >>
> >> Agree, you'd need to be fairly large-scale to have their issues,
> >
> > And also have your database designed by people who think MySQL actually
> > follows common SQL standards.
> >
> >> but I
> >> think the article was something anybody interested in databases should
> >> read.  If nothing else it is a really easy to follow explanation of
> >> the underlying architectures.
> >
> > Check the link posted by Douglas.
> > Uber's article has some misunderstandings about the architecture, with
> > conclusions drawn that are, at least in part, caused by their database
> > design and usage.
> >
> >> I'll probably post this to my LUG mailing list.  I think one of the
> >> Postgres devs lurks there so I'm curious about his impressions.
> >>
> >> I was a bit surprised to hear about the data corruption bug.  I've
> >> always considered Postgres to have a better reputation for data
> >> integrity.
> >
> > They do.
> >
> >> And of course almost any FOSS project could have a bug.  I
> >> don't know if either project does the kind of regression testing to
> >> reliably detect this sort of issue.
> >
> > Not sure either, I do think PostgreSQL does a lot with regression tests.
> >
> >> I'd think that it is more likely
> >> that the likes of Oracle would (for their flagship DB (not for MySQL),
> >
> > Never worked with Oracle (or other big software vendors), have you? :)
> >
> >> and they'd probably be more likely to send out an engineer to beg
> >> forgiveness while they fix your database).
> >
> > Only if you're a big (as in, spend a lot of money with them) customer.
> >
> >> Of course, if you're Uber
> >> the hit you'd take from downtime/etc isn't made up for entirely by
> >> having somebody take a few days to get everything fixed.
> >
> > --
> > Joost
>
> I certainly respect your skills and posts on Databases, Joost, as
> everything you have posted, in the past is 'spot on'.

Comes with a keen interest and long-term (think decades) experience working with
different databases.

> Granted, I'm no database expert, far from it.

Not many people are, nor do they need to be.

> But I want to share a few things with you,
> and hope you (and others) will 'chime in' on these comments.
>
> Way back, when the earth was cooling and we all had dinosaurs for pets,
> some of us hacked on AT&T "3B2" unix systems. They were known for their
> 'roll back and recovery', triplicated (or more) transaction processes
> and 'voters' system to ferret out if a transaction was complete and
> correct. There was no ACID, the current 'gold standard' if you believe
> what Douglas and others write about concerning databases.
>
> In essence, (from crusted up memories) a basic (SS7) transaction related
> to the local telephone switch was run on 3 machines. The results were
> compared. If they matched, the transaction went forward as valid. If 2/3
> matched,

And what about the likely case where only 1 was correct?
Have you seen the movie "Minority Report"?
If yes, think back to why Tom Cruise was found 'guilty' when he wasn't and how
often this actually occurred.

> and the switch was configured, then the code would
> essentially 'vote' and majority ruled. This is what led to phone calls
> (switched phone calls) having variable delays, often in the order of
> seconds, mis-connections and other problems we all encountered during
> periods of excessive demand.

Not sure if that was the cause in the past, but these days it can still
take a few seconds before the other end rings. This is due to the phone system
(all PBXs in the path) needing to set up the routing between both end-points
before the ring-tone actually starts.
When the system is busy, these lookups will take time and can even time out.
(Try wishing everyone you know a happy new year using a wired phone and you'll
see what I mean. Mobile phones have a separate problem at that time.)

> That scenario was at the heart of how old, crappy AT&T unix (SVR?) could
> perform so well and therefore established the gold standard for RT
> transaction processing, aka the "five  9s" 99.999% of up-time (about 5
> minutes per year of downtime).

"Unscheduled" downtime. Regular maintenance will require more than 5 minutes
per year.
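
As a rough check of the "about 5 minutes per year" figure quoted above, the following Python sketch converts an availability percentage into a yearly downtime budget. It is not part of the original message; the function name is made up for illustration and a 365-day year is assumed.

```python
# Downtime budget implied by an availability target (assumes a 365-day year).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime, in minutes per year, that a target still allows."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% -> {minutes:.2f} minutes/year")
```

For 99.999% this works out to a little over 5 minutes, which matches the quoted figure. Whether scheduled maintenance counts against that budget depends on how the target is defined.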

> Sure this part is only related to
> transaction processing as there was much more to the "five 9s" legacy,
> but imho, that is the heart of what was the precursor to ACID properties
> now so greatly espoused in SQL codes that Douglas refers to.
>
> Do folks concur or disagree at this point?

ACID is about data integrity. The "best 2 out of 3" voting was, in my opinion,
a work-around for unreliable hardware. It is based on a clever idea, but when
2 computers having the same data and logic come up with 2 different answers, I
wouldn't trust either of them.
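
To make the voting scheme under discussion concrete, here is a minimal Python sketch of a strict-majority voter. It is only an illustration: the name `majority_vote` and the string results are hypothetical, and this is not the code that ran on the 3B2 systems.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the result reported by a strict majority of replicas.

    Returns None when no single result is reported by more than half
    of the replicas (for example when all three disagree).
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None


if __name__ == "__main__":
    # Hypothetical results from three replicas of the same transaction.
    print(majority_vote(["connect", "connect", "connect"]))
    print(majority_vote(["connect", "connect", "drop"]))
    print(majority_vote(["connect", "drop", "busy"]))
```

Note that the voter only reports agreement. If two replicas share the same fault, the majority answer is the wrong one, and the voter has no way of knowing.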

> The reason this is important to me (and others?), is that, if this idea
> (granted there is much more detail to it) is still valid, then it can
> form the basis for building up superior-ACID processes, that meet or
> exceed, the properties of an expensive (think Oracle) transaction
> process on distributed (parallel) or clustered systems, to a degree of
> accuracy only limited by the limit of the number of odd numbered voter
> codes involved in the distributed and replicated parts of the
> transaction. I even added some code where replicated routines were
> written in different languages, and the results compared to add an
> additional layer of verification before the voter step. (gotta love
> assembler?).

You have seen how "democracies" work, right? :)
The more voters involved, the longer it takes for all the votes to be counted.
With a small number, it might actually still scale, but when you pass a magic
number (no clue what this would be), the counting time starts to exceed any
time you might have gained by adding more voters.

Also, this, to me, seems to counteract the whole reason for using clusters:
have different nodes handle different parts of the problem.

Clusters of multiple compute-nodes are a quick and "simple" way of increasing
the number of computational cores to throw at problems that can be broken down
into a lot of individual steps with minimal inter-dependencies.
I say "simple" because I think designing a 1,000-core chip is more difficult
than building a 1,000-node cluster using single-core, single-CPU boxes.

I would still consider the cluster to be a single "machine".

> I guess my point is 'Douglas' is full of stuffing, OR that is what folks
> are doing when they 'roll their own solution specifically customized to
> their specific needs' as he alludes to near the end of his commentary?

The response Douglas linked to is closer to what seems to work when dealing
with large amounts of data.

> (I'd like your opinion of this and maybe some links to current schemes
> how to have ACID/99.999% accurate transactions on clusters of various
> architectures.)  Douglas, like yourself, writes of these things in a
> very lucid fashion, so that is why I'm asking you for your thoughts.

The way Uber created the cluster is useful when having 1 node handle all the
updates and multiple nodes providing read-only access while also providing
failover functionality.
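
The single-writer layout described above can be sketched as a toy Python model. Everything here is an assumption made for illustration: the `Cluster` class, the node names and the "starts with SELECT" test are hypothetical, and none of it reflects Uber's actual setup or any PostgreSQL or MySQL API.

```python
class Cluster:
    """Toy model: one writable primary plus several read-only replicas."""

    def __init__(self, primary: str, replicas: list[str]) -> None:
        self.primary = primary
        self.replicas = list(replicas)
        self._reads = 0  # round-robin counter for read traffic

    def route(self, statement: str) -> str:
        """Send reads to the replicas in turn and everything else to the primary."""
        is_read = statement.lstrip().upper().startswith("SELECT")
        if is_read and self.replicas:
            node = self.replicas[self._reads % len(self.replicas)]
            self._reads += 1
            return node
        return self.primary

    def fail_over(self) -> str:
        """Promote the first replica to primary after the primary is lost."""
        if not self.replicas:
            raise RuntimeError("no replica left to promote")
        self.primary = self.replicas.pop(0)
        return self.primary


if __name__ == "__main__":
    cluster = Cluster("db1", ["db2", "db3"])  # hypothetical node names
    print(cluster.route("UPDATE trips SET status = 'done'"))
    print(cluster.route("SELECT * FROM trips"))
    print(cluster.fail_over())
```

All updates still pass through a single node, so this layout adds read capacity and failover but does not scale writes.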

> Robustness of transactions, in a distributed (clustered) environment is
> fundamental to the usefulness of most codes that are trying to migrate
> to a cluster based processes in (VM/container/HPC) environments.

Whereas I do consider clusters to be very useful, not all work-loads can be
redesigned to scale properly.

> I do
> not have the old articles handy but, I'm sure that many/most of those
> types of inherent processes can be formulated in the algebraic domain,
> normalized and used to solve decisions often where other forms of
> advanced logic failed (not that I'm taking a cheap shot at modern
> programming languages) (wink wink nudge nudge); or at least that's how
> we did it.... as young whipper_snappers back in the day...

If you know what you are doing, the language is just a tool. Sometimes a
hammer is sufficient, other times one might need to use a screwdriver.

> --an_old_farts_logic

Thinking back on how long I've been playing with computers, I wonder how long
it will be until I am in the "old fart" category?

--
Joost

Gentoo Archives: gentoo-user

Replies

Subject	Author
Re: [gentoo-user] PostgreSQL Vs MySQL @Uber	Rich Freeman <rich0@g.o>
Re: [gentoo-user] PostgreSQL Vs MySQL @Uber	james <garftd@×××××××.net>