Gentoo Archives: gentoo-dev

From: Kent Fredric <kentnl@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [RFC] Ideas for gentoostats implementation
Date: Sun, 26 Apr 2020 12:56:36
Message-Id: 20200427004547.093d40b2@katipo2.lan
In Reply to: [gentoo-dev] [RFC] Ideas for gentoostats implementation by "Michał Górny"
On Sun, 26 Apr 2020 10:08:32 +0200
Michał Górny <mgorny@g.o> wrote:

> A proper solution to cluster problem would probably involve some way to
> internally collect and combine data before submission. If you have
> large clusters of similar systems, I think you'd want to have all
> packages used on different systems reported as one entry.

For this, I'd suggest an overridable "STATS_SERVER" (or something)
ENV var holding a URI that tells the submission clients where to send
their reports.
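
As a rough sketch of the override (nothing here is an existing
gentoostats interface: the default URI and the function name are
placeholders I made up), the client side could be as small as:

```python
import os

# Hypothetical default; no real endpoint has been decided in this thread.
DEFAULT_STATS_SERVER = "https://stats.example.org/submit"


def resolve_stats_server(environ=os.environ):
    """Return the URI that reports should be sent to.

    A non-empty STATS_SERVER value overrides the built-in default.
    """
    override = environ.get("STATS_SERVER", "").strip()
    return override or DEFAULT_STATS_SERVER
```

An org would then export STATS_SERVER pointing at its own relay, and
nodes without the variable would keep talking to the default.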

Then ship some server in Gentoo that people can deploy, which submits
the aggregated data as a cron job, or lets you hand-review the
aggregated data before submission, potentially with tools to whittle
out data you don't want to share at the org level.
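
To illustrate the aggregation step, here is a minimal sketch that
assumes each node report is just a list of package names (the real
report format is undecided, and "withheld" is a name I invented for
the whittling list):

```python
def aggregate_reports(reports, withheld=frozenset()):
    """Combine per-node package lists into one de-duplicated entry.

    reports:  iterable of iterables of package names, one per node
    withheld: package names the org does not want to share upstream
    """
    combined = set()
    for packages in reports:
        combined.update(packages)
    # Drop anything the org chose to withhold, then sort for stable output.
    return sorted(combined - set(withheld))
```

The sorted list is what a cron job would submit, or what an admin
would review by hand first.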

Such a tool is potentially useful to an organisation even without its
"submit to Gentoo" capacity, since being able to analyse internally
what your organisation is using is useful in itself.

(eg: give an admin a single point of information showing which
packages they need to audit, if the nodes in the org are not all
controlled from the top level)

Though I think the overall principle of anonymity by design is useful,
I can see use cases, especially in the organisation model, where being
able to voluntarily self-identify a node could be useful without
inherently being a privacy concern.

You'd then configure your relay to suppress these node identities in
the submitted data, or map them to a different org-wide identity.
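
A sketch of that relay-side step, assuming a report is a dict and the
identity lives under a "node_id" key (both are assumptions of mine,
not a proposed format):

```python
def scrub_identity(report, org_identity=None):
    """Return a copy of a report that is safe to forward upstream.

    If the report carries a "node_id" (hypothetical field name), it is
    removed, or replaced by org_identity when one is given.
    """
    outgoing = dict(report)
    if "node_id" in outgoing:
        if org_identity is None:
            del outgoing["node_id"]
        else:
            outgoing["node_id"] = org_identity
    return outgoing
```

The relay keeps the original report for internal use, and only the
scrubbed copy leaves the org.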

Example:
I need to find somebody who is using <x> so I can ask them if <y>
works, or if <z> is important to this package.

Example:
Data indicates somebody within my org is using <x>, and I need to ask
them not to use <x>, as its licensing terms are not compatible with
our org.

Though for cases of voluntary identification, you'd need an interface
somewhere on the server node that lets you generate unique ident
tokens and associate data with them, possibly with a list of flags
dictating what the records associated with this identity may be used
for (eg: Contact [y/n] )
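
Something along these lines is what I have in mind for the token side.
It is only an in-memory sketch: the class name, the method names and
the single "contact" flag are all invented for illustration, and a
real server would need persistent storage.

```python
import secrets


class IdentRegistry:
    """In-memory map from ident tokens to usage flags (sketch only)."""

    def __init__(self):
        self._flags = {}

    def issue(self, contact=False):
        """Generate a new random token and record its flags."""
        token = secrets.token_urlsafe(16)
        self._flags[token] = {"contact": bool(contact)}
        return token

    def may_contact(self, token):
        """Report whether the owner of this token agreed to be contacted."""
        # Unknown tokens are treated as not contactable.
        return self._flags.get(token, {}).get("contact", False)
```

A node that opts in would attach its token to its reports, and the
admin tooling would check may_contact() before reaching out.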