[gentoo-portage-dev] Re: How to have several gentoo repos on one machine? - gentoo-portage-dev

From:	Duncan <1i5t5.duncan@×××.net>
To:	gentoo-portage-dev@l.g.o
Subject:	[gentoo-portage-dev] Re: How to have several gentoo repos on one machine?
Date:	Thu, 22 Oct 2015 11:27:20
Message-Id:	`pan$10876$c7dac759$895e9925$b575df20@cox.net`
In Reply to:	Re: [gentoo-portage-dev] Re: How to have several gentoo repos on one machine? by Joakim Tjernlund

Joakim Tjernlund posted on Thu, 22 Oct 2015 06:48:06 +0000 as excerpted:

> On Thu, 2015-10-22 at 02:29 +0000, Duncan wrote:
>> Joakim Tjernlund posted on Wed, 21 Oct 2015 11:08:02 +0000 as
>> excerpted:
>>
>> > I need to have more than one gentoo repo on my computer.

>> > this did not work as "portageq repositories_configuration /"
>> > complains:
>> > !!! Section 'tm-cusfpv3' in repos.conf has name different from
>> > repository name 'gentoo' set inside repository
>> >
>> > I figured the name in repos.conf would just override
>> > /usr/local/portage/tm-cusfpv3/profiles/repo_name ?
>>
>> While it's not quite clear to me either why you'd need two identical
>> gentoo repos[...]
>
> I use one for my host and the other for cross building our product's root
> FS and they are not in sync. That rules out the aliases I guess?

I think so, yes.  However, as a user I'd really like to understand
aliases, their purpose, and at a high level how they work, and the
current manpage doesn't help much there.  Without that I really don't
know enough about aliases to say anything further.

But meanwhile, I was sort of in your situation for a while, as I was
building for my main amd64 system and, in a 32-bit chroot, for a
32-bit-only netbook, with a separate portage config for each.  In my
case they both pointed at the same gentoo repo and overlays, using
bind-mounts into the 32-bit chroot, but without those bind-mounts it
would have been two parallel and separate portage installations, one
configured for 32-bit x86 in the chroot, one configured for amd64
outside the chroot.

And that's what I'd use in your case, two separate portage installations,
which could then of course have separate configs.

That said, while I understand the principle of stability, and if it's
private there shouldn't be legal issues, I still wonder at the idea.  One
of the reasons I could and did use bind-mounts, and thus literally the
same repos in my case, was that the gentoo repo is the gentoo repo.
Other than the possibility of snapshotting it for archiving purposes (and
of using one of those snapshots should it be needed, say because I left
the netbook unupgraded for too long and it could no longer jump from the
version on it to current), a local copy that wasn't synced would no
longer represent the present state of the gentoo repo.

If I were to un-sync for other than very temporary recovery purposes, I'd
thus want to call the repo something other than gentoo, since it would no
longer represent the current state of the true gentoo repo.

And if I made changes to that unsynced repo, say to stabilize it further
(and if I wasn't doing so, what would be the purpose of keeping it
unsynced for so long), that'd be even /more/ reason to call it something
other than gentoo, because then it would no longer properly represent
the state of the true gentoo repo at /any/ time.

But having the git repo available changes the way that works
dramatically, see below...

> I don't plan on renaming anything in the repo_name file, it should just
> be ignored and the name I have selected in repos.conf should be used.
>
> I don't see any value in the repo_name file now that we have the new
> repos.conf, possibly it could be a fallback only for PORTDIR users.

The portage devs are welcome to contradict me if they like, but AFAIK, it
still serves the useful purpose of double-checking that you don't, for
instance, have two repos accidentally syncing to the same place, and that
the names used to refer to the repo stay consistent.  (Again, part of the
need for consistency would be due to the metadata and thus metadata cache
being repo-specific, automatically invalidating the cache if the remote
name and local name don't agree.  Locally regenerating the metadata cache
will go a long way to avoiding that problem, but it's an expensive
operation that most users won't want to do, and keeping the names in sync
helps avoid inadvertent cache invalidation.)
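
To make the name check concrete, here is a minimal Python sketch of the
comparison behind that warning.  It is not portage's own code: the
function names and the throwaway directory are made up for illustration,
and it only assumes what this thread already mentions, namely that a
repos.conf section points at a repo location and that the repo carries
its own name in profiles/repo_name.

```python
import configparser
import tempfile
from pathlib import Path


def find_name_mismatches(repos_conf_text):
    """Return (section, repo_name) pairs where the two names disagree."""
    # Interpolation is disabled so a '%' in a path is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repos_conf_text)
    mismatches = []
    for section in parser.sections():  # [DEFAULT] is not listed here
        location = parser.get(section, "location", fallback=None)
        if location is None:
            continue  # no path given, nothing to compare
        name_file = Path(location) / "profiles" / "repo_name"
        if not name_file.is_file():
            continue  # repo does not declare a name
        repo_name = name_file.read_text().strip()
        if repo_name != section:
            mismatches.append((section, repo_name))
    return mismatches


def demo():
    """Build a throwaway repo whose repo_name says 'gentoo' and check it."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "tm-cusfpv3"
        (repo / "profiles").mkdir(parents=True)
        (repo / "profiles" / "repo_name").write_text("gentoo\n")
        conf = f"[tm-cusfpv3]\nlocation = {repo}\n"
        return find_name_mismatches(conf)


if __name__ == "__main__":
    for section, repo_name in demo():
        print(f"Section {section!r} differs from repo_name {repo_name!r}")
```

The demo reproduces the situation quoted at the top of the thread: the
section is called tm-cusfpv3 while the copied repo still calls itself
gentoo, so the pair is reported as a mismatch.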

>> I actually use gentoo's git-based usersync
>> repo on github, now, and thus don't rsync any repos at all any more,
>> here, and git of course has its git-ignore feature/files, which I use
>> now.  But I used rsync's exclude as suggested above, for years.
>> Worked fine.
>> =:^)
>
> Nice, I am heading the same way, using git all the way, but I'm not
> there yet.
> One problem with using git is disk space, I think. Files are just
> ignored but still present in the repo so syncing to our embedded target
> will take a lot more space.
> Any thoughts on that?

Well, at least once your trailing target (presumably the embedded repo)
is safely past the git repo's epoch (the date imported from cvs, for our
purposes), git's flexibility will let you check out older versions
on-demand, then check out HEAD once again.

In a scenario where both copies aren't likely to be used at once, you can
use a single local git repo and just check out the version of it you want
dynamically.

In a concurrent-use scenario, there are a few ways you could go.  What I'd
probably do would be two git repos, one synced to gentoo-remote,
presumably with full git history (or at least git history back to the
other checkout), the other locally checked out from the "current" repo,
at the checkout of interest.
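
As a rough sketch of that two-repo layout, the Python below only
assembles the git command lines rather than running them, so it can be
read (or printed) without touching anything.  The remote URL, the host
repo path and the PINNED_COMMIT placeholder are hypothetical; the second
path is the one from the quoted repos.conf error.  The git subcommands
themselves (clone, checkout --detach, pull --ff-only) are standard ones.

```python
import shlex


def two_repo_commands(remote_url, current_dir, pinned_dir, pinned_commit):
    """Return the git command lines for a 'current' and a 'pinned' repo."""
    return [
        # Repo 1: full-history clone that keeps tracking the remote.
        ["git", "clone", remote_url, current_dir],
        # Repo 2: local clone of repo 1, so nothing is downloaded twice.
        ["git", "clone", current_dir, pinned_dir],
        # Park repo 2 on the older commit the embedded target builds from.
        ["git", "-C", pinned_dir, "checkout", "--detach", pinned_commit],
        # Later on, only repo 1 gets updated.
        ["git", "-C", current_dir, "pull", "--ff-only"],
    ]


if __name__ == "__main__":
    commands = two_repo_commands(
        "https://example.invalid/gentoo.git",  # hypothetical remote URL
        "/var/db/repos/gentoo",                # hypothetical host repo path
        "/usr/local/portage/tm-cusfpv3",       # path from the quoted error
        "PINNED_COMMIT",                       # placeholder, not a real id
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

When the second clone is made from a local path on the same filesystem,
git hard-links the object files where it can, which already takes some
of the sting out of the disk-space worry.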

If you're doing this sort of thing then the sort of space the git repo
takes up shouldn't be a big concern, but in case it is, it's worth noting
that given the right filesystem and dedup tools, there will only actually
be the one copy of "common" data on-storage, with each of those two git
repos reflinking (think a lower-level hard-link) data that's common
between them, which will be pretty much everything in the earlier one
since the current one will have the earlier one as history.

I'm a regular on the btrfs list, for instance, and on btrfs, a very
space-efficient solution would be to originally do an initial git checkout
of the older, presumably embedded target repo, create a btrfs snapshot out
of it, and then (in the working copy, not the snapshot) git-pull from the
remote to update to current.  The btrfs snapshot will have locked in
place the older version in the snapshot, while the git pull in the
working copy will create any new files, delete any remote-deleted ones
(but they'll still be in the btrfs snapshot), reflink any old files, and
reflink but then cow (copy-on-write) any updated files.  For this
scenario you wouldn't even need any additional dedup tools, though if you
had them, they'd probably save even more space (multiple versions of the
same package often have very nearly the same ebuilds, for instance,
differing in little more than name, and dedup would catch and dedup these
as well, while the pure native btrfs snapshot method probably wouldn't).
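
To get a feel for how much a dedup pass might save on a given pair of
checkouts, one rough approach is to hash file contents and count the
bytes that occur more than once.  The Python below is a minimal sketch
under that assumption; the function names and the tiny fake trees are
made up for the example.  It only counts byte-identical files, so
near-identical ebuilds will not register, and a real block-level dedup
tool may well report different numbers.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def duplicate_bytes(*roots):
    """Return bytes held by files whose content already appeared elsewhere."""
    sizes_by_digest = defaultdict(list)
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Skip git's own object store; only the checked-out tree counts.
            dirnames[:] = [d for d in dirnames if d != ".git"]
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                with open(path, "rb") as handle:
                    data = handle.read()
                digest = hashlib.sha256(data).hexdigest()
                sizes_by_digest[digest].append(len(data))
    # For each distinct content, every copy after the first is redundant.
    return sum(sum(sizes[1:]) for sizes in sizes_by_digest.values())


def demo():
    """Two tiny fake trees sharing one identical 'ebuild'."""
    with tempfile.TemporaryDirectory() as old, \
            tempfile.TemporaryDirectory() as new:
        for tree in (old, new):
            with open(os.path.join(tree, "foo-1.0.ebuild"), "wb") as handle:
                handle.write(b"EAPI=5\n")
        with open(os.path.join(new, "foo-1.1.ebuild"), "wb") as handle:
            handle.write(b"EAPI=5\nSLOT=0\n")
        return duplicate_bytes(old, new)


if __name__ == "__main__":
    print(demo(), "bytes are byte-identical duplicates")
```

Pointing duplicate_bytes() at the older and the current checkout gives a
lower bound on what the two trees have in common at the whole-file level.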

Of course I'm conservative enough that I only call btrfs "stabilizing and
maturing, but not fully stable or mature yet", for various reasons you'll
see enumerated in my posts on the btrfs list, but if you're following the
standard sysadmin backup rule, if it's not backed up, by definition you
value it less than the time/resource necessary for doing the backup,
factored against the risk of actually needing the backup (thus nicely
dealing with second and third and Nth level backups as well, since the
risk of actually needing them drops accordingly, but they may well be
worth it out to some higher value of N for very highly valued data), then
in general I and others have found it stable /enough/.

I guess xfs and ext4 both have dedup features as well, but I went
straight from reiserfs to btrfs and am thus not really familiar with
them.  (And zfs of course is the more mature btrfs, but there are some
downsides like needing loads of ecc-strongly-recommended ram, as well as
license concerns for people like me, that may eliminate it from
consideration even if it'd otherwise really be a stable and reliable
version of where btrfs is still headed, but hasn't yet arrived.)

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
