Gentoo Archives: gentoo-user

From: Alec Ten Harmsel <alec@××××××××××××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] postgresql 9.5.2 versus Gentoo wiki install instructions?
Date: Sat, 21 May 2016 10:52:05
Message-Id: 944f4939-7520-ca06-6ba7-0c35e7585dc7@alectenharmsel.com
In Reply to: [gentoo-user] postgresql 9.5.2 versus Gentoo wiki install instructions? by waltdnes@waltdnes.org
Joost knows far more about databases than I do, so I mostly commented on
the workflow part.

On 2016-05-20 22:36, waltdnes@××××××××.org wrote:
> Yes, I did RTFM at https://wiki.gentoo.org/wiki/PostgreSQL/QuickStart
> and that's part of my problem. <G> I figured it would be a simple
> search and replace "9.3" ==> "9.5" in the wiki, but...
>
> 1) The wiki recommends...
> PG_INITDB_OPTS="--locale=en_US.UTF-8"
>
> ...but I get...
>
>> The database cluster will be initialized with locale "en_US.iso88591".
>> initdb: "en_US.UTF8" is not a valid server encoding name
> "locale -a" returns...
> C
> POSIX
> en_US
> en_US.iso88591
> en_US.utf8
>
> 2) The wiki says...
>> This time the focus is upon the files in the PGDATA directory,
>> /etc/postgresql-9.3 , instead with primary focus on the
>> postgresql.conf and pg_hba.conf files.
> "ls /etc/postgresql-9.5/" returns...
> postgresql.conf psqlrc
>
> but postgresql seems to want them in /var/lib instead...
>
>> mv: cannot stat '/var/lib/postgresql/9.5/data/pg_hba.conf': No such
>> file or directory
>> mv: cannot stat '/var/lib/postgresql/9.5/data/pg_ident.conf': No
>> such file or directory
>> mv: cannot stat '/var/lib/postgresql/9.5/data/postgresql.conf':
>> No such file or directory
> Can somebody please confirm the correct way to go?

I have never run postgresql on gentoo (hopefully soon :D), but on
Debian-derived distros and RPM-based distros, PGDATA is always somewhere
in /var. /etc seems wrong.

>
> Why I want postgresql... I've been keeping a bunch of data in a
> spreadsheet, and it's gotten too large. The spreadsheet locks up my
> system when I try to update it. I've used "top" and watched as
> gnumeric's memory consumption grows to eat all available ram. It locks
> up the system so I can't even ssh in. This is on an X86_64 with 8 gigs
> of RAM! Fortunately, "magic-sysrq" allows a relatively clean shutdown.
> While we're at it, is there a way for gnumeric to pull in data directly
> from postgresql? ODBC? I'm aware of copying from postgresql to a CSV
> file and importing that, but it's rather clunky.

`equery uses gnumeric' lists the `libgda' flag, which should pull in
database support. I've never used it, so I don't know whether or how
well it works. What is in this spreadsheet? If it is financial
stuff, you can use GnuCash, which supports using a database as a backend.

>
> My main problem is that columns of several thousand rows are functions
> based on other columns of several thousand rows. For the time-being,
> I've split up the spreadsheet into a few pieces, but a database is the
> best solution. If I could run the calculations in the database, and
> pull in the final results as static numbers for graphing, that would
> greatly reduce the strain on the spreadsheet. Or is it possible to
> graph directly from postgresql?

Here are my recommendations, in order of "least code" to "most code" (I
don't think PostgreSQL supports graphing):

1. Write some SQL scripts that compute the data you need and output CSV,
then import to Gnumeric and do the plots.
2. Write Python script(s) that run SQL commands and plot the data with
matplotlib.
3. Write a webapp so you don't have to run scripts by hand - the plots
are generated by opening a web page.
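
To make the first two options a bit more concrete, here is a minimal
sketch, not a finished tool. It assumes a made-up `readings' table with
made-up columns, and it uses Python's built-in sqlite3 module as a
stand-in for PostgreSQL purely so it runs without a server. The helper
only relies on the standard Python DB-API, so a connection from a
PostgreSQL driver such as psycopg2 should be usable in its place. The
point is that the derived column is computed by the database, and only
the finished numbers are written out as CSV.

```python
import csv
import io
import sqlite3  # stand-in for PostgreSQL so the sketch runs anywhere


def export_query_to_csv(conn, sql, out):
    """Run `sql` on a DB-API 2.0 connection; write header and rows as CSV."""
    cur = conn.cursor()
    cur.execute(sql)
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one entry per result column; item 0 is its name.
    writer.writerow([col[0] for col in cur.description])
    rows = cur.fetchall()
    writer.writerows(rows)
    return len(rows)


def demo():
    # Hypothetical table and values, invented for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (day INTEGER, litres REAL, km REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [(1, 40.0, 500.0), (2, 35.0, 420.0), (3, 50.0, 650.0)],
    )
    # The derived column is calculated in the database, not the spreadsheet.
    sql = "SELECT day, km / litres AS km_per_litre FROM readings ORDER BY day"
    out = io.StringIO()
    count = export_query_to_csv(conn, sql, out)
    conn.close()
    return count, out.getvalue()


if __name__ == "__main__":
    count, text = demo()
    print(text, end="")
```

For option 2, the same rows could be handed to matplotlib instead of
being written to a file; for option 1, the CSV output is what you would
import into Gnumeric.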

Depending on how much automation you want vs. how much time you want to
spend writing/debugging code, hopefully one of those helps. I help
researchers use an HPC cluster; some are very savvy programmers, some are
not. For working on "big data" projects, some will throw raw data into a
Hadoop cluster and happily do all their work using Hadoop, while some
will put in raw data, clean it up, and then pull it out and use MATLAB,
Stata, R, etc., so you just need to find the workflow that works best
for you. I personally would choose option 3, as it involves the least
amount of running scripts over and over, but to each his own.

I have actual free time now (done with school, finally), so I might be
able to help prototype if you would like as well.

Alec
