Gentoo Archives: gentoo-soc

From: A Schenck <lane_andrew@×××××××.com>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] Week 6 Report for Big Data Infrastructure and H2O ebuilds Project
Date: Tue, 20 Jul 2021 17:39:21
Message-Id: DM6PR07MB41567B1C5031BD2ABC013A399DE29@DM6PR07MB4156.namprd07.prod.outlook.com
In Reply to: [gentoo-soc] Week 6 Report for Big Data Infrastructure and H2O ebuilds Project by "Yuan Liao (Leo)"
On 7/18/21 10:52 PM, Yuan Liao (Leo) wrote:
> Hi folks,
>
> <snip/>
>
> As per my original project proposal, I am also adding a test case for
> the ebuild installation tests which will ensure every package in the
> Spark overlay can be installed at least once. Adding every package to
> the emerge command theoretically works, but the command would be too
> long. Invoking emerge separately for each package would resolve this
> problem, but the overhead of emerge's dependency calculation would
> seriously impact the test runtime. I came up with a solution that
> could address both issues: write a script to compute a list of leaf
> packages in the Spark overlay and pass the packages in the list to
> emerge, so every package in the overlay would be installed, and the
> emerge command can be simplified to have a shorter length too. The
> script can also act as a helpful tool for any ebuild repository's
> maintainers to find out all leaf packages in the repository for
> maintenance tasks like last-rite and package clean-up. After some
> initial optimization and tuning, the script (written in Python) can
> compute a list of leaf packages among about 500 packages in the Spark
> overlay within only a few minutes. The optimization and tuning is
> also the topic for this week's blog post of mine [1]. This post
> covers some knowledge and topics from computer science, including
> graph theory, graph algorithms, data structure, and time complexity.
> If you are interested in any of those subjects, make sure you don't
> miss it!

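For anyone reading along, here is a minimal sketch of the leaf-package idea from the quoted report. It is not Leo's script (see [1] for that). The package names and the `overlay` mapping are made up for illustration, and real dependency data would have to be extracted from the ebuilds. A "leaf" is assumed here to mean a package that no other package in the repository depends on.

```python
from typing import Dict, Iterable, Set


def find_leaves(deps: Dict[str, Set[str]]) -> Set[str]:
    """Return the packages that no other package in `deps` depends on."""
    depended_on: Set[str] = set()
    for pkg, pkg_deps in deps.items():
        # Ignore self-dependencies and dependencies outside the repository.
        depended_on.update(d for d in pkg_deps if d != pkg and d in deps)
    return set(deps) - depended_on


def closure(roots: Iterable[str], deps: Dict[str, Set[str]]) -> Set[str]:
    """Return `roots` plus every in-repository package they pull in."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen or pkg not in deps:
            continue
        seen.add(pkg)
        stack.extend(deps[pkg])
    return seen


# Hypothetical dependency data: package -> set of direct dependencies.
overlay = {
    "dev-java/app": {"dev-java/lib-a", "dev-java/lib-b"},
    "dev-java/lib-a": {"dev-java/core"},
    "dev-java/lib-b": {"dev-java/core"},
    "dev-java/core": set(),
    "dev-java/tool": {"dev-java/core", "sys-libs/external"},
}

leaves = find_leaves(overlay)
print(sorted(leaves))  # ['dev-java/app', 'dev-java/tool']

# Installing only the leaves should cover every package in the mapping.
assert closure(leaves, overlay) == set(overlay)
```

One caveat with this simple definition: a group of packages that only depend on each other in a cycle produces no leaf at all, so a real tool would need to handle that case separately.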
Thanks!  We actually don't really care much about Java (haven't used it
seriously since college), and haven't even been involved in Gentoo GSoC
in a decade, but we're glad we stay on this list for things like this.
It's really nice seeing someone who still has that spark of interest in
computer things.  We do happen to like graph theory and network analysis
and time complexity and such, but haven't really been able to apply it
in "the real world" of tech companies.  Every time we try to do things
"the right way" with real computer science, coworkers and bosses just
say "just hack something together".


Oh well, thanks for what you're doing,

-A

>
> <snip/>
>
> This concludes my work during the past week and this report. Thank
> you for reading it (and my blog post in case you are checking it out)!
>
> Best regards,
> Leo
>
> [1]: https://leo3418.github.io/2021/07/18/find-leaf-packages.html
> [2]: https://wiki.gentoo.org/wiki/User:Leo3418/Kotlin/Package_Maintainer_Guide
>
