On Saturday, May 21, 2016 06:51:46 AM Alec Ten Harmsel wrote:
> Joost knows far more about databases than I do, so I mostly commented on
> the workflow part.
>
> On 2016-05-20 22:36, waltdnes@××××××××.org wrote:

<snipped>

> I have never run postgresql on gentoo (hopefully soon :D), but on
> Debian-derived distros and RPM-based distros, PGDATA is always somewhere
> in /var. /etc seems wrong.

There are symlinks from the /var location to /etc for the configuration files.
The data itself, e.g. PGDATA, sits by default in /var/.....

<snipped>
> `equery use gnumeric' gives the `libgda' flag, which should pull in
> database support. I've never used it, so I don't know whether or not it
> works/how well it works. What is in this spreadsheet? If it is financial
> stuff, you can use Gnucash, which supports using a database as a backend.

Does this finally work?
Last time I tried this, half the functionality didn't work at all and the
other half was buggy. (That was years ago.)

> > My main problem is that columns of several thousand rows are functions
> > based on other columns of several thousand rows. For the time-being,
> > I've split up the spreadsheet into a few pieces, but a database is the
> > best solution. If I could run the calculations in the database, and
> > pull in the final results as static numbers for graphing, that would
> > greatly reduce the strain on the spreadsheet. Or is it possible to
> > graph directly from postgresql?
>
> Here are my recommendations, in order of "least code" to "most code" (I
> don't think postgresql supports graphing):
>
> 1. Write some sql scripts that compute the data you need and output CSV,
> then import to Gnumeric and do the plots.

For script examples:
http://stackoverflow.com/questions/1517635/save-pl-pgsql-output-from-postgresql-to-a-csv-file
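A minimal sketch of that compute-in-the-database-then-export idea, using Python's stdlib sqlite3 as a stand-in for postgresql so it runs anywhere (the table and column names are invented; with postgres itself the same export is a one-liner in psql, along the lines of `\copy (SELECT ...) TO 'results.csv' WITH CSV HEADER`):

```python
# The heavy computation (here just an aggregate) lives in SQL, not in a
# spreadsheet formula; only the final numbers go to CSV for Gnumeric.
# sqlite3 stands in for postgresql purely for illustration.
import csv
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (day INTEGER, value REAL);
    INSERT INTO readings VALUES (1, 10.0), (1, 12.0), (2, 8.0), (2, 9.5);
""")

# Derived column computed in the database.
rows = conn.execute(
    "SELECT day, SUM(value) AS total FROM readings GROUP BY day ORDER BY day"
).fetchall()

# Static results, ready to import into Gnumeric for plotting.
with open("results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["day", "total"])
    writer.writerows(rows)
```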

> 2. Write python script(s) that run SQL commands and plot the data with
> matplotlib.
> 3. Write a webapp so you don't have to run scripts by hand - the plots
> are generated by opening a web page.
4. Write it all in C++ :)
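Option 2 above could look roughly like this (a sketch, not a finished tool: the DSN and query are whatever fits your schema, and psycopg2/matplotlib are assumed to be installed, so their imports are deferred into the plotting function):

```python
# Sketch of option 2: fetch rows with SQL, then plot every remaining
# column against the first one. split_columns() is plain Python;
# plot_query() assumes psycopg2 and matplotlib are available.

def split_columns(rows):
    """Turn [(x, y1, y2, ...), ...] into (x list, list of y-series lists)."""
    xs = [row[0] for row in rows]
    series = [[row[i] for row in rows] for i in range(1, len(rows[0]))]
    return xs, series

def plot_query(dsn, query):
    import psycopg2                  # assumption: installed
    import matplotlib.pyplot as plt  # assumption: installed
    with psycopg2.connect(dsn) as conn:
        cur = conn.cursor()
        cur.execute(query)
        rows = cur.fetchall()
    xs, series = split_columns(rows)
    for ys in series:
        plt.plot(xs, ys)
    plt.show()
```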

> Depending on how much automation you want vs. how much time you want to
> spend writing/debugging code, hopefully one of those helps. I help
> researchers use a HPC cluster; some are very savvy programmers, some are
> not. For working on "big data" projects, some will throw raw data into a
> Hadoop cluster and happily do all their work using Hadoop, while some
> will put in raw data, clean it up, and then pull it out and use MATLAB,
> stata, R, etc., so you just need to find the workflow that works best
> for you. I personally would choose option 3, as it involves the least
> amount of running scripts over and over, but to each his own.
>
> I have actual free time now (done with school, finally), so I might be
> able to help prototype if you would like as well.

Something I (and others) could use:
A simple PHP page which I can feed:
- connection parameters to a database
- a select query
- which result field to use for the horizontal axis
and which then plots the remaining fields on the vertical axis.
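Whatever the page is written in, its core step is the same: run the query, take the named horizontal-axis field, and separate it from every other field. A sketch of that step in Python (the request above was for PHP; the function and field names here are purely illustrative):

```python
# Given query column names, result rows, and the name of the x-axis field,
# return the x values plus one y-series per remaining field.

def split_on_field(colnames, rows, x_field):
    """Return (x values, {field name: y values}) for all non-x fields."""
    xi = colnames.index(x_field)
    xs = [row[xi] for row in rows]
    ys = {name: [row[i] for row in rows]
          for i, name in enumerate(colnames) if i != xi}
    return xs, ys
```

Each entry in the returned dict is then one curve to draw against `xs`.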

I haven't checked with Google yet, so if there is a decent example, I'd be
interested :)

--
Joost