Gentoo Archives: gentoo-portage-dev

From: Jeremy Maitin-Shepard <jbms@g.o>
To: gentoo-portage-dev@l.g.o
Subject: [gentoo-portage-dev] Multiple language use, `component-based' design, and other issues
Date: Mon, 12 Jan 2004 05:13:28
Message-Id: 877jzxg34n.fsf@jbms.ath.cx
1 It has been mentioned that portage-ng will be written using multiple
2 programming languages. I see a number of compelling reasons to avoid
3 using multiple languages, while I see no particular compelling advantage.
4
5 First, supporting multiple languages will mean that significant time and
6 energy will need to be spent developing and maintaining the
7 inter-language interfaces. The result will be a system that is less
8 efficient (inter-language interfaces tend to be less efficient than
9 intra-language ones) and more difficult to maintain and extend (because
10 extending will mean that these inter-language interfaces will need to be
11 extended correspondingly). Furthermore, inter-language interfaces tend
12 to be less elegant syntactically than intra-language ones, meaning that
13 the resulting code will generally be less elegant (and thus more
14 difficult to follow).
15
16 The Python-Bash inter-operation in the current portage is a clear
17 example of the inefficiency involved in language interoperability
18 (certainly, most will not be as inefficient as Bash). This problem can
19 be avoided in portage-ng in a number of ways, which include stricter
20 rules for how meta-data variables can be specified in the ebuild
21 files, separating ebuilds into multiple files, and developing a new
22 bash-like language specific to portage and writing a parser and
23 interpreter for this language which is included in portage. (In order
24 for the development of portage-ng to begin, it would be useful to
25 decide on issues like this.)
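
As a rough illustration of the first option (a stricter meta-data format), the sketch below reads simple assignments without invoking Bash at all. The restricted format it assumes (one top-level NAME="literal value" per line, with no shell expansion) and the name read_static_metadata are hypothetical; nothing here is taken from portage or a portage-ng design.

```python
import re

# Hypothetical restricted format: a top-level NAME="literal value" line.
# Values containing $, backticks, or backslashes would need a shell, so they
# are deliberately not matched here.
_ASSIGNMENT = re.compile(r'^([A-Z_][A-Z0-9_]*)="([^"$`\\]*)"\s*$')
_LOOKS_LIKE_ASSIGNMENT = re.compile(r'^[A-Z_][A-Z0-9_]*=')


def read_static_metadata(text):
    """Collect top-level NAME="value" assignments that need no shell.

    Returns (metadata, rejected). `rejected` holds the 1-based line numbers
    of top-level assignments that do not fit the restricted format.
    Indented lines (for example, inside function bodies) are ignored.
    """
    metadata = {}
    rejected = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = _ASSIGNMENT.match(line)
        if match:
            metadata[match.group(1)] = match.group(2)
        elif _LOOKS_LIKE_ASSIGNMENT.match(line):
            rejected.append(lineno)
    return metadata, rejected
```

Under such a format, a tool could report the rejected lines as errors instead of falling back to a shell, so reading meta-data would never cross a language boundary.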
26
27 A second disadvantage to multiple language use is that it will increase
28 the compile-time dependencies (and depending on the language, possibly
29 the runtime dependencies) of portage unnecessarily. If portage is
30 written using, for example, Python, Prolog, and Ruby, then _EVERY_
31 Gentoo system that compiles portage will need Python, Prolog, and Ruby
32 installed, and possibly will need them to run portage as well. Python is less
33 of a problem because it is relatively common, but Prolog and Ruby are
34 often not found on systems. Additional dependencies further complicate
35 the handling in portage of those dependency packages (as is the case
36 currently with python), especially if they are a runtime dependency.
37 Additional dependencies also bloat the size of the stage tarballs.
38
39 A third issue is that while we all have our pet language, and in
40 principle it might seem useful to support `all' languages, so that we
41 can each use our language of choice in writing portage, in practice
42 support for each language will likely require significant effort to
43 write the necessary interface code. Furthermore, and perhaps a greater
44 problem, in practice it will be necessary for portage developers to
45 learn all of the languages used in the various parts of portage. Thus,
46 instead of some people having to learn and use a single language that
47 they might prefer to avoid, all developers would need to learn and use
48 multiple languages which they would prefer to avoid.
49
50 The main advantage cited for using multiple languages has been that
51 certain languages are `better' for doing certain types of things. For
52 example, it has been argued that Prolog should be used for dependency
53 calculation because programs written in it can be proven correct. I
54 fail to understand why it is particularly important that the portage
55 dependency checker be ``provably correct.'' To my knowledge, there have
56 been no significant problems caused by bugs in the current
57 dependency checker. It is also useful to note that there are tens of
58 thousands or more software programs, far more critical than a portage
59 dependency checker, which are not written in languages in which they can
60 be proven correct, but which operate quite adequately. It is perhaps
61 useful for the software on the Spirit Mars rover to be provably correct,
62 but that is simply not a useful guideline for the portage dependency
63 checker. It has also been argued that it is more convenient to write a
64 dependency checker in Prolog, compared to other languages. I can
65 assure the reader that implementing a topological sort algorithm in any
66 language is not overly complex; certainly not sufficiently complex to
67 justify doing it in another language and dealing with language
68 interoperability problems and adding an additional compile-time and
69 possibly runtime dependency.
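
To give a sense of scale for the topological sort mentioned above, here is a minimal sketch in Python using Kahn's algorithm. The function name install_order and the input shape (a dict mapping each package to the packages it depends on) are assumptions made for illustration; a real dependency checker would also have to deal with versions, USE flags, and blockers, none of which are modelled here.

```python
from collections import deque


def install_order(deps):
    """Return packages ordered so each one comes after its dependencies.

    `deps` maps a package name to an iterable of the names it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    # Include packages that appear only as someone else's dependency.
    packages = set(deps)
    for needed in deps.values():
        packages.update(needed)

    # remaining[pkg]: number of dependencies of pkg not yet placed in the order.
    remaining = {pkg: len(set(deps.get(pkg, ()))) for pkg in packages}
    # dependents[pkg]: packages that depend directly on pkg.
    dependents = {pkg: [] for pkg in packages}
    for pkg, needed in deps.items():
        for dep in set(needed):
            dependents[dep].append(pkg)

    # Start with packages that have no dependencies; sort for a stable result.
    ready = deque(sorted(pkg for pkg, count in remaining.items() if count == 0))
    order = []
    while ready:
        pkg = ready.popleft()
        order.append(pkg)
        for user in sorted(dependents[pkg]):
            remaining[user] -= 1
            if remaining[user] == 0:
                ready.append(user)

    # Anything left unplaced is part of (or depends on) a cycle.
    if len(order) != len(packages):
        raise ValueError("dependency cycle detected")
    return order
```

For example, install_order({"app": ["libfoo", "libbar"], "libfoo": ["libc"], "libbar": ["libc"], "libc": []}) places libc first and app last.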
70
71 It has been mentioned that portage-ng will have a `component-based'
72 design. It is not clear what the term `component-based' means exactly,
73 and so it would be useful to get some clarification. To me that term
74 suggests that portage-ng will use some sort of complex runtime dynamic
75 linking model. I would argue that it may be simpler to handle
76 all optional functionality, if there is any, through the use of USE
77 flags, rather than going to the trouble of supporting such a complex
78 model. Clearly, some type of static linking could only be supported if
79 portage-ng did not depend on a runtime dynamic linking model.
80 Advantages of allowing static linking include greater robustness in cases
81 of failure of various shared libraries, and the dependencies of portage
82 will not need to be handled as carefully, because there will be no risk
83 of breaking a statically-linked portage.
84
85 --
86 Jeremy Maitin-Shepard