Zhang Weiwu wrote:
> Hi.
>
> I got a datasheet from my colleague in MS Excel format and I intend to
> process that file with my awk/sed knowledge. The problem is: he sent me
> two Excel files, each with 2134 records. In fact there should be only one
> Excel file with 2134 rows and 295 columns, but MS Excel can only handle
> 256 data columns, so he split the datasheet vertically so that he could
> send it to me.
>
> Now I have saved both files in tab-separated-value format. How do I join
> them?
>
> I could have used join(1), but that requires a join field, an ID of some
> sort. I thought of this:
>
> $ grep -n '' left.tsv | sed 's/:/\t/' > left.forjoin
> $ grep -n '' right.tsv | sed 's/:/\t/' > right.forjoin
> $ join -t " " left.forjoin right.forjoin > result.tsv
> (note that for join's -t parameter somehow I need to manage to get a tab
> between the quotes)
>
> Yes I achieved what I want, but that looks complex. Is there a simpler
> way? Thanks in advance.
>
> I know OpenOffice 3.0 can handle up to 1024 data columns. It's difficult
> to convince anyone to switch to OOO because here in China MS Office
> costs only 0$. I also could use OOO3.0 for doing the join but I wish to
> know the commandline way:)
>
Got perl?

#!/usr/bin/perl
use strict;
use warnings;

if ($#ARGV < 1) {
    print "Arguments: <file1> <file2>\n";
    exit(1);
}

open(FIRSTFILE, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
open(SECONDFILE, '<', $ARGV[1]) or die "Cannot open $ARGV[1]: $!\n";
my @first = <FIRSTFILE>;
my @second = <SECONDFILE>;

# Walk every line of the first file (both files should have the same
# number of lines) and append the matching line of the second file.
for (my $i = 0; $i < @first; $i++) {
    my $tmp1 = $first[$i];
    $tmp1 =~ s/\r?\n//g;    # also drops a CR, in case the export used CRLF
    my $tmp2 = defined $second[$i] ? $second[$i] : '';
    $tmp2 =~ s/\r?\n//g;

    print $tmp1 . "\t" . $tmp2 . "\n";
}

close(FIRSTFILE);
close(SECONDFILE);

This is likely not the best or fastest way to do it, and I don't have a
dataset as large as yours readily available for testing, but it seems to
work.
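
Since row N of one file simply belongs next to row N of the other, the
standard paste(1) utility may be the simpler command-line route asked about:
it glues corresponding lines together with a tab and needs no join field.
Below is a minimal sketch, assuming both files have the same number of lines
in the same order. The sample data and the temporary directory are made up
for illustration; only the left.tsv, right.tsv and result.tsv names come
from the question.

```shell
# Build two tiny stand-ins for left.tsv and right.tsv
# (hypothetical data: two rows, two tab-separated columns each).
workdir=$(mktemp -d)
printf 'a1\ta2\nb1\tb2\n' > "$workdir/left.tsv"
printf 'a3\ta4\nb3\tb4\n' > "$workdir/right.tsv"

# paste writes line N of each input side by side, separated by a tab.
paste "$workdir/left.tsv" "$workdir/right.tsv" > "$workdir/result.tsv"

# Show the merged rows: each should now have four columns.
cat "$workdir/result.tsv"
```

On the real data this would presumably reduce to
"paste left.tsv right.tsv > result.tsv". If the files were saved with
Windows line endings, the carriage returns would likely need stripping
first, for example with tr -d '\r'.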

-Tim
--
gentoo-user@l.g.o mailing list