On Monday, August 01, 2016 08:43:49 AM james wrote:
> On 08/01/2016 02:16 AM, J. Roeleveld wrote:
> > On Saturday, July 30, 2016 06:38:01 AM Rich Freeman wrote:
> >> On Sat, Jul 30, 2016 at 6:24 AM, Alan McKinnon <alan.mckinnon@×××××.com> wrote:
> >>> On 29/07/2016 22:58, Mick wrote:
> >>>> Interesting article explaining why Uber are moving away from
> >>>> PostgreSQL. I am running both DBs on different desktop PCs for
> >>>> akonadi and I'm also running MySQL on a number of websites. Let's
> >>>> see which one goes sideways first. :p
> >>>>
> >>>> https://eng.uber.com/mysql-migration/
> >>>
> >>> I don't think your akonadi and some web sites compare in any way to
> >>> Uber and what they do.
> >>>
> >>> FWIW, my Dev colleagues support an entire large corporate ISP's
> >>> operational and customer data on PostgreSQL-9.3. With clustering.
> >>> With no db-related issues :-)
> >>
> >> Agree, you'd need to be fairly large-scale to have their issues,
> >
> > And you'd also need to have your database designed by people who think
> > MySQL actually follows common SQL standards.
> >
> >> but I
> >> think the article was something anybody interested in databases should
> >> read. If nothing else it is a really easy-to-follow explanation of
> >> the underlying architectures.
> >
> > Check the link posted by Douglas.
> > Uber's article has some misunderstandings about the architecture, with
> > conclusions drawn that are, at least in part, caused by their database
> > design and usage.
> >
> >> I'll probably post this to my LUG mailing list. I think one of the
> >> Postgres devs lurks there so I'm curious about his impressions.
> >>
> >> I was a bit surprised to hear about the data corruption bug. I've
> >> always considered Postgres to have a better reputation for data
> >> integrity.
> >
> > They do.
> >
> >> And of course almost any FOSS project could have a bug. I
> >> don't know if either project does the kind of regression testing to
> >> reliably detect this sort of issue.
> >
> > Not sure either, but I do think PostgreSQL does a lot with regression
> > tests.
> >
> >> I'd think that it is more likely
> >> that the likes of Oracle would (for their flagship DB (not for MySQL),
> >
> > Never worked with Oracle (or other big software vendors), have you? :)
> >
> >> and they'd probably be more likely to send out an engineer to beg
> >> forgiveness while they fix your database).
> >
> > Only if you're a big (as in, spend a lot of money with them) customer.
> >
> >> Of course, if you're Uber
> >> the hit you'd take from downtime/etc isn't made up for entirely by
> >> having somebody take a few days to get everything fixed.
> >
> > --
> > Joost
>
> I certainly respect your skills and posts on Databases, Joost, as
> everything you have posted in the past is 'spot on'.

Comes with a keen interest and long-term experience (think decades) working
with different databases.

> Granted, I'm no database expert, far from it.

Not many people are, nor do they need to be.

> But I want to share a few things with you,
> and hope you (and others) will 'chime in' on these comments.
>
> Way back, when the earth was cooling and we all had dinosaurs for pets,
> some of us hacked on AT&T "3B2" unix systems. They were known for their
> 'roll back and recovery', triplicated (or more) transaction processes
> and 'voters' system to ferret out if a transaction was complete and
> correct. There was no ACID, the current 'gold standard' if you believe
> what Douglas and others write about concerning databases.
>
> In essence, (from crusted up memories) a basic (SS7) transaction related
> to the local telephone switch, was run on 3 machines. The results were
> compared. If they matched, the transaction went forward as valid. If 2/3
> matched,

And what about the likely case where only 1 was correct?
Have you seen the movie "Minority Report"?
If yes, think back to why Tom Cruise was found 'guilty' when he wasn't and how
often this actually occurred.

> and the switch was configured, then the code would
> essentially 'vote' and majority ruled. This is what led to phone calls
> (switched phone calls) having variable delays, often in the order of
> seconds, mis-connections and other problems we all encountered during
> periods of excessive demand.

Not sure if that was the cause in the past, but these days it can also still
take a few seconds before the other end rings. This is due to the phone-system
(all PBXs in the path) needing to set up the routing between both end-points
prior to the ring-tone actually starting.
When the system is busy, these lookups will take time and can even time out.
(Try wishing everyone you know a happy new year using a wired phone and you'll
see what I mean. Mobile phones have a separate problem at that time.)

> That scenario was at the heart of how old, crappy AT&T unix (SVR?) could
> perform so well and therefore established the gold standard for RT
> transaction processing, aka the "five 9s" 99.999% of up-time (about 5
> minutes per year of downtime).

"Unscheduled" downtime. Regular maintenance will require more than 5 minutes
per year.
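
For reference, the "about 5 minutes" figure is simple arithmetic on the
availability percentage. A minimal Python sketch, assuming an average year of
365.25 days (the function name is made up for illustration):

```python
# Assumption: an average year of 365.25 days, expressed in minutes.
MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes_per_year(availability_percent):
    """Downtime budget per year, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

# Compare three, four and five nines.
for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% -> {downtime_minutes_per_year(pct):.2f} minutes/year")
```

For 99.999% this comes out at roughly 5.26 minutes per year, which leaves no
room for regular maintenance windows.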

> Sure this part is only related to
> transaction processing as there was much more to the "five 9s" legacy,
> but imho, that is the heart of what was the precursor to ACID properties
> now so greatly espoused in SQL codes that Douglas refers to.
>
> Do folks concur or disagree at this point?

ACID is about data integrity. The "best 2 out of 3" voting was, in my opinion,
a work-around for unreliable hardware. It is based on a clever idea, but when
2 computers having the same data and logic come up with 2 different answers, I
wouldn't trust either of them.
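
To make the "best 2 out of 3" idea concrete, here is a minimal Python sketch
of a majority voter. It assumes each machine's result can be reduced to a
single comparable value; the function name and the sample results are
invented for illustration and say nothing about how the 3B2 systems actually
implemented it.

```python
from collections import Counter

def vote(results):
    """Return the value a strict majority of replicas agree on, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # Require more than half of the replicas to agree.
    return value if count > len(results) / 2 else None

print(vote(["connect", "connect", "connect"]))  # all three agree
print(vote(["connect", "connect", "drop"]))     # 2 out of 3 agree
print(vote(["connect", "drop", "busy"]))        # no majority at all
print(vote(["drop", "drop", "connect"]))        # majority wins, right or wrong
```

The last call shows the weakness: the voter only measures agreement, not
correctness, so two machines sharing the same fault will outvote the one that
got it right.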

> The reason this is important to me (and others?), is that, if this idea
> (granted there is much more detail to it) is still valid, then it can
> form the basis for building up superior-ACID processes, that meet or
> exceed, the properties of an expensive (think Oracle) transaction
> process on distributed (parallel) or clustered systems, to a degree of
> accuracy limited only by the number of odd-numbered voter
> codes involved in the distributed and replicated parts of the
> transaction. I even added some code where replicated routines were
> written in different languages, and the results compared to add an
> additional layer of verification before the voter step. (gotta love
> assembler?).

You have seen how "democracies" work, right? :)
The more voters involved, the longer it takes for all the votes to be counted.
With a small number, it might actually still scale, but when you pass a magic
number (no clue what this would be), the counting time starts to exceed any
time you might have gained by adding more voters.

Also, this, to me, seems to counteract the whole reason for using clusters:
have different nodes handle different parts of the problem.

Clusters of multiple compute-nodes are a quick and "simple" way of increasing
the number of computational cores to throw at problems that can be broken down
into a lot of individual steps with minimal inter-dependencies.
I say "simple" because I think designing a 1,000-core chip is more difficult
than building a 1,000-node cluster using single-core, single-CPU boxes.

I would still consider the cluster to be a single "machine".

> I guess my point is 'Douglas' is full of stuffing, OR that is what folks
> are doing when they 'roll their own solution specifically customized to
> their specific needs' as he alludes to near the end of his commentary?

The response Douglas linked to is closer to what seems to work when dealing
with large amounts of data.

> (I'd like your opinion of this and maybe some links to current schemes
> for how to have ACID/99.999% accurate transactions on clusters of various
> architectures.) Douglas, like yourself, writes of these things in a
> very lucid fashion, so that is why I'm asking you for your thoughts.

The way Uber created the cluster is useful when having 1 node handle all the
updates and multiple nodes providing read-only access while also providing
failover functionality.
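
As a rough illustration of that layout (one node for updates, several
read-only nodes, failover), here is a toy Python sketch. The class, method
and node names are all hypothetical, and it is not meant to describe Uber's
actual setup or any particular database's replication.

```python
import itertools

class Cluster:
    """Toy model: one primary takes all writes, replicas serve reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next_replica = itertools.cycle(self.replicas)

    def route(self, is_write):
        """Send writes to the primary; rotate reads over the replicas."""
        if is_write or not self.replicas:
            return self.primary
        return next(self._next_replica)

    def fail_over(self):
        """Promote the first replica after the primary is lost."""
        if not self.replicas:
            raise RuntimeError("no replica left to promote")
        self.primary = self.replicas.pop(0)
        self._next_replica = itertools.cycle(self.replicas)

cluster = Cluster("db1", ["db2", "db3"])
print(cluster.route(is_write=True))   # db1
print(cluster.route(is_write=False))  # db2
print(cluster.route(is_write=False))  # db3
cluster.fail_over()                   # db1 is gone, db2 takes over
print(cluster.route(is_write=True))   # db2
```

Reads scale by adding replicas, but every update still goes through the one
primary, so this only helps read-heavy work-loads.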

> Robustness of transactions, in a distributed (clustered) environment is
> fundamental to the usefulness of most codes that are trying to migrate
> to cluster-based processes in (VM/container/HPC) environments.

While I do consider clusters to be very useful, not all work-loads can be
redesigned to scale properly.

> I do
> not have the old articles handy but, I'm sure that many/most of those
> types of inherent processes can be formulated in the algebraic domain,
> normalized and used to solve decisions often where other forms of
> advanced logic failed (not that I'm taking a cheap shot at modern
> programming languages) (wink wink nudge nudge); or at least that's how
> we did it.... as young whipper_snappers back in the day...

If you know what you are doing, the language is just a tool. Sometimes a
hammer is sufficient, other times one might need to use a screwdriver.

> --an_old_farts_logic

Thinking back on how long I've been playing with computers, I wonder how long
it will be until I am in the "old fart" category?

--
Joost