public inbox for gentoo-catalyst@lists.gentoo.org
* [gentoo-catalyst] catalyst changes for improving automation
@ 2020-11-03  3:44 Matt Turner
  2020-11-03 10:54 ` Daniel Cordero
  2020-11-03 13:04 ` Brian Dolbec
  0 siblings, 2 replies; 6+ messages in thread
From: Matt Turner @ 2020-11-03  3:44 UTC
  To: gentoo-catalyst, gentoo-releng

The catalyst-auto automation scripts live in a repo separate from
catalyst. That increases the difficulty of changing catalyst's
interface, and it doesn't seem to offer any advantages otherwise.
(Keeping build specs in a separate repo allows them to be updated
independently of catalyst, and that is valuable.) Additionally, since the
primary way catalyst is used is via this automation, it makes sense to
support this workflow in catalyst directly.

But to get there, there are some changes to catalyst that I think are
improvements on their own and simplify the path to integrating
automation capabilities directly into catalyst. That's what I'd like
to discuss here.

I'd like to:

 1) Replace the custom .spec file format with TOML

 2) Combine .spec file sequences (e.g., stage1 -> stage2 -> stage3 ->
livecd-stage1 -> livecd-stage2) into a single file. I suggest naming
this a ".build" file. This will also allow us to remove the redundant
information that currently has to be specified in stage1.spec,
stage2.spec, stage3.spec, like rel_type, version, profile, etc. It
also means that we remove the nonsensical ability to change settings
from one stage to the next that should not change (e.g., rel_type,
version).

 3) Add ability to denote which stage builds produce artifacts we care
about (and want to save and/or upload) and which are just temporary.
If they're temporary (e.g., a stage1 build) we can delete the artifact
after the build sequence has no further use of it, and we can skip
compressing the result, etc.
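
As a rough illustration of point 3, here is a minimal Python sketch of how a build sequence could work out when a temporary artifact is no longer needed. The per-stage "keep" flag and the plan_cleanup() helper are hypothetical names, and the sketch assumes each stage consumes only the artifact of the stage directly before it.

```python
# Hypothetical sketch: the per-stage "keep" flag is an assumed name,
# not an existing catalyst option.

def plan_cleanup(chain):
    """Map each stage to the artifacts that can be deleted once it finishes.

    `chain` is an ordered list of (stage_name, keep) pairs. Each stage is
    assumed to consume only the artifact of the stage directly before it.
    """
    cleanup = {}
    for (prev_name, prev_keep), (name, _) in zip(chain, chain[1:]):
        if not prev_keep:
            # `name` was the only consumer of the previous, temporary artifact.
            cleanup[name] = [prev_name]
    if chain and not chain[-1][1]:
        # A temporary artifact at the end of the chain has no consumer at all.
        last = chain[-1][0]
        cleanup.setdefault(last, []).append(last)
    return cleanup


chain = [
    ("stage1", False),         # temporary: only needed to seed stage3
    ("stage3", True),          # published
    ("livecd-stage1", False),  # temporary
    ("livecd-stage2", True),   # published
]
plan = plan_cleanup(chain)
# plan == {"stage3": ["stage1"], "livecd-stage2": ["livecd-stage1"]}
```

A temporary stage could then also skip compression, since nothing outside the sequence ever reads its output.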


To that end, I'm starting by figuring out what I would like the new
spec file format to look like. Below are some open questions and then
a strawman new-style spec file.

• The .spec files in releng.git are really templates that are not
directly usable without sed'ing @REPO_DIR@ and @TIMESTAMP@. It would
be nice if they were directly usable as that would reduce confusion
from users.
  • Can we make them directly usable?
  • Perhaps we can make catalyst handle the replacements directly?
    • Calculating @TIMESTAMP@ is trivially doable—we do it today (see below)
    • We could configure @REPO_DIR@ in catalyst.conf and let catalyst
do the replacement, or we could just make the field relative to some
path specified in catalyst.conf?

• In the current automation scripts, we generate a value for
@TIMESTAMP@ from the git HEAD used in creating the snapshot.
  • Would be nice to remove the dependence on the squashfs snapshot
generation—not difficult to do

• Can we generate and upload a .build file with replacements done to
make stage builds more easily reproducible? Seems easy.
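
As a rough illustration of letting catalyst handle the replacements itself, a minimal Python sketch follows. The expand() helper, the values dict, and the example path and timestamp are hypothetical; only the @REPO_DIR@ and @TIMESTAMP@ token names come from the existing templates.

```python
import re

# Assumption: placeholders look like @NAME@, as in the releng.git templates.
TOKEN = re.compile(r"@([A-Z_]+)@")


def expand(text, values):
    """Replace every @NAME@ token in `text`; fail loudly on unknown names."""
    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value configured for @{name}@")
        return values[name]
    return TOKEN.sub(lookup, text)


values = {
    "REPO_DIR": "/path/to/releng",    # hypothetical; could come from catalyst.conf
    "TIMESTAMP": "20201103T034400Z",  # hypothetical value and format
}
line = 'portage_confdir = "@REPO_DIR@/releases/portage/stages"'
expanded = expand(line, values)
# expanded == 'portage_confdir = "/path/to/releng/releases/portage/stages"'
```

Writing the expanded text back out next to the build results would give the fully resolved file mentioned in the last bullet.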

Thoughts? Are there advantages to the current system I'm not considering?

ppc.build:

[build]
arch = "ppc"
subarch = "ppc"

version = "@TIMESTAMP@"
rel_type = "default"
profile = "default/linux/powerpc/ppc32/17.0"
source_subpath = "default/stage3-ppc-latest"
portage_confdir = "@REPO_DIR@/releases/portage/stages"

[build.snapshot]
snapshot = "@TIMESTAMP@"

[build.stage1]
update_seed = "yes"
update_seed_command = "--update --deep --newuse @world"

[build.stage2]
skip = true # Stage3 will be built directly from stage1

[build.stage3]
compression = "xz"

[build.iso]
portage_confdir = "@REPO_DIR@/releases/portage/isos"

[build.iso.stage1]
use = [
    "compile-locales",
    "fbcon",
    "ipv6",
    "livecd",
    "modules",
    "ncurses",
    "nls",
    "nptl",
    "pam",
    "readline",
    "socks5",
    "ssl",
    "static-libs",
    "unicode",
    "xml",
]

packages = [
    "app-admin/pwgen",
    "app-admin/syslog-ng",
    "app-arch/unzip",
    "app-crypt/gnupg",
    "app-laptop/pbbuttonsd",
    "app-misc/livecd-tools",
    "app-misc/screen",
    "app-portage/mirrorselect",
    "app-text/wgetpaste",
    "net-analyzer/tcptraceroute",
    "net-analyzer/traceroute",
    "net-dialup/mingetty",
    "net-dialup/pptpclient",
    "net-dialup/rp-pppoe",
    "net-fs/cifs-utils",
    "net-fs/nfs-utils",
    "net-irc/irssi",
    "net-misc/dhcpcd",
    "net-misc/iputils",
    "net-misc/ntp",
    "net-misc/openssh",
    "net-misc/rdate",
    "net-misc/rsync",
    "net-wireless/wireless-tools",
    "net-wireless/wpa_supplicant",
    "sys-apps/busybox",
    "sys-apps/ethtool",
    "sys-apps/fxload",
    "sys-apps/hdparm",
    "sys-apps/hwsetup",
    "sys-apps/ibm-powerpc-utils",
    "sys-apps/iproute2",
    "sys-apps/lm-sensors",
    "sys-apps/memtester",
    "sys-apps/pcmciautils",
    "sys-apps/powerpc-utils",
    "sys-apps/sdparm",
    "sys-block/parted",
    "sys-boot/grub",
    "sys-firmware/b43-firmware",
    "sys-firmware/b43legacy-firmware",
    "sys-fs/btrfs-progs",
    "sys-fs/dosfstools",
    "sys-fs/e2fsprogs",
    "sys-fs/hfsplusutils",
    "sys-fs/hfsutils",
    "sys-fs/iprutils",
    "sys-fs/jfsutils",
    "sys-fs/lvm2",
    "sys-fs/mac-fdisk",
    "sys-fs/mdadm",
    "sys-fs/ntfs3g",
    "sys-fs/reiserfsprogs",
    "sys-fs/xfsprogs",
    "sys-libs/gpm",
    "www-client/links",
]

[build.iso.stage2]
fstype = "squashfs"
iso = "/var/tmp/catalyst/builds/default/install-powerpc-minimal-@TIMESTAMP@.iso"
type = "gentoo-release-minimal"

rcadd = [
    ["pbbuttonsd", "default"],
]

unmerge = [
    "app-admin/eselect",
    "app-admin/eselect-ctags",
    "app-admin/eselect-vi",
    "app-admin/perl-cleaner",
    "app-admin/python-updater",
    "app-arch/cpio",
    "dev-libs/gmp",
    "dev-libs/libxml2",
    "dev-libs/mpfr",
    "dev-libs/popt",
    "dev-python/pycrypto",
    "dev-util/pkgconfig",
    "perl-core/PodParser",
    "perl-core/Test-Harness",
    "sys-apps/debianutils",
    "sys-apps/diffutils",
    "sys-apps/groff",
    "sys-apps/man-db",
    "sys-apps/man-pages",
    "sys-apps/miscfiles",
    "sys-apps/sandbox",
    "sys-apps/texinfo",
    "sys-devel/autoconf",
    "sys-devel/autoconf-wrapper",
    "sys-devel/automake",
    "sys-devel/automake-wrapper",
    "sys-devel/binutils",
    "sys-devel/binutils-config",
    "sys-devel/bison",
    "sys-devel/flex",
    "sys-devel/gcc",
    "sys-devel/gcc-config",
    "sys-devel/gettext",
    "sys-devel/gnuconfig",
    "sys-devel/libtool",
    "sys-devel/m4",
    "sys-devel/make",
    "sys-devel/patch",
    "sys-libs/db",
    "sys-libs/gdbm",
    "sys-libs/libkudzu",
    "sys-kernel/genkernel",
    "sys-kernel/linux-headers",
]

empty = [
    "/boot",
    "/boot/initr*",
    "/boot/kernel*",
    "/boot/System*",
    "/etc/*-",
    "/etc/cron.daily",
    "/etc/cron.hourly",
    "/etc/cron.monthly",
    "/etc/cron.weekly",
    "/etc/default/audioctl",
    "/etc/dispatch-conf.conf",
    "/etc/env.d/05binutils",
    "/etc/env.d/05gcc",
    "/etc/etc-update.conf",
    "/etc/genkernel.conf",
    "/etc/hosts.bck",
    "/etc/issue*",
    "/etc/logrotate.d",
    "/etc/make.conf*",
    "/etc/make.globals",
    "/etc/make.profile",
    "/etc/man.conf",
    "/etc/modules.autoload.d",
    "/etc/*.old",
    "/etc/resolv.conf",
    "/etc/runlevels/single",
    "/etc/skel",
    "/lib64/dev-state",
    "/lib64/udev-state",
    "/lib*/*.a",
    "/lib*/cpp",
    "/lib/dev-state",
    "/lib*/*.la",
    "/lib/udev-state",
    "/root/.bash_history",
    "/root/.ccache",
    "/root/.viminfo",
    "/sbin/fsck.cramfs",
    "/sbin/fsck.minix",
    "/sbin/mkfs.bfs",
    "/sbin/mkfs.cramfs",
    "/sbin/mkfs.minix",
    "/sbin/*.static",
    "/tmp",
    "/usr/bin/addr2line",
    "/usr/bin/ar",
    "/usr/bin/as",
    "/usr/bin/audioctl",
    "/usr/bin/c++*",
    "/usr/bin/cc",
    "/usr/bin/cjpeg",
    "/usr/bin/cpp",
    "/usr/bin/djpeg",
    "/usr/bin/ebuild",
    "/usr/bin/egencache",
    "/usr/bin/elftoaout",
    "/usr/bin/emerge",
    "/usr/bin/emerge-webrsync",
    "/usr/bin/emirrordist",
    "/usr/bin/f77",
    "/usr/bin/g++*",
    "/usr/bin/g77",
    "/usr/bin/gcc*",
    "/usr/bin/genkernel",
    "/usr/bin/gprof",
    "/usr/bin/jpegtran",
    "/usr/bin/ld",
    "/usr/bin/libpng*",
    "/usr/bin/nm",
    "/usr/bin/objcopy",
    "/usr/bin/objdump",
    "/usr/bin/piggyback*",
    "/usr/bin/portageq",
    "/usr/bin/powerpc64-unknown-linux-gnu-*",
    "/usr/bin/powerpc-unknown-linux-gnu-*",
    "/usr/bin/ranlib",
    "/usr/bin/readelf",
    "/usr/bin/repoman",
    "/usr/bin/size",
    "/usr/bin/strings",
    "/usr/bin/strip",
    "/usr/bin/tbz2tool",
    "/usr/bin/xpak",
    "/usr/bin/yacc",
    "/usr/diet/include",
    "/usr/diet/man",
    "/usr/include",
    "/usr/lib64/awk",
    "/usr/lib64/ccache",
    "/usr/lib64/gcc-config",
    "/usr/lib64/gconv",
    "/usr/lib64/nfs",
    "/usr/lib64/perl5/site_perl",
    "/usr/lib64/portage",
    "/usr/lib64/python*/test",
    "/usr/lib64/X11/config",
    "/usr/lib64/X11/doc",
    "/usr/lib64/X11/etc",
    "/usr/lib*/*.a",
    "/usr/lib*/gcc-lib/*/*/libgcj*",
    "/usr/lib*/*.la",
    "/usr/lib*/perl5/site_perl",
    "/usr/local",
    "/usr/portage",
    "/usr/powerpc64-unknown-linux-gnu",
    "/usr/powerpc-unknown-linux-gnu",
    "/usr/sbin/archive-conf",
    "/usr/sbin/dispatch-conf",
    "/usr/sbin/emaint",
    "/usr/sbin/env-update",
    "/usr/sbin/etc-update",
    "/usr/sbin/fb*",
    "/usr/sbin/fixpackages",
    "/usr/sbin/quickpkg",
    "/usr/sbin/regenworld",
    "/usr/share/aclocal",
    "/usr/share/baselayout",
    "/usr/share/binutils-data",
    "/usr/share/consolefonts/1*",
    "/usr/share/consolefonts/7*",
    "/usr/share/consolefonts/8*",
    "/usr/share/consolefonts/9*",
    "/usr/share/consolefonts/a*",
    "/usr/share/consolefonts/A*",
    "/usr/share/consolefonts/c*",
    "/usr/share/consolefonts/C*",
    "/usr/share/consolefonts/dr*",
    "/usr/share/consolefonts/E*",
    "/usr/share/consolefonts/g*",
    "/usr/share/consolefonts/G*",
    "/usr/share/consolefonts/i*",
    "/usr/share/consolefonts/k*",
    "/usr/share/consolefonts/l*",
    "/usr/share/consolefonts/L*",
    "/usr/share/consolefonts/M*",
    "/usr/share/consolefonts/partialfonts",
    "/usr/share/consolefonts/r*",
    "/usr/share/consolefonts/R*",
    "/usr/share/consolefonts/s*",
    "/usr/share/consolefonts/t*",
    "/usr/share/consolefonts/v*",
    "/usr/share/consoletrans",
    "/usr/share/dict",
    "/usr/share/doc",
    "/usr/share/emacs",
    "/usr/share/et",
    "/usr/share/gcc-data",
    "/usr/share/genkernel",
    "/usr/share/gettext",
    "/usr/share/glib-2.0",
    "/usr/share/gnuconfig",
    "/usr/share/gtk-doc",
    "/usr/share/i18n",
    "/usr/share/info",
    "/usr/share/lcms",
    "/usr/share/libtool",
    "/usr/share/locale",
    "/usr/share/man",
    "/usr/share/misc/*.old",
    "/usr/share/rfc",
    "/usr/share/ss",
    "/usr/share/state",
    "/usr/share/texinfo",
    "/usr/share/unimaps",
    "/usr/share/zoneinfo",
    "/usr/src",
    "/var/cache",
    "/var/empty",
    "/var/lib/portage",
    "/var/log",
    "/var/spool",
    "/var/state",
    "/var/tmp",
]

# Just for bootloader ordering purposes
kernel = ["ppc64", "ppc32", "ibmpower",]

[build.iso.stage2.kernel.ibmpower]
sources = "sys-kernel/gentoo-sources"
config = "../../../kconfig/powerpc/installcd-ibm-4.19.config"
console = ["ttyS0,9600", "hvc0", "hvsi0",]
gk_kernargs = [
    "--kernel-cc='gcc -m64'",
    "--kernel-ld='ld -m elf64ppc'",
    "--kernel-as='as -a64'",
]

[build.iso.stage2.kernel.ppc64]
sources = "sys-kernel/gentoo-sources"
config = "../../../kconfig/powerpc/ppc64.config"
console = ["ttyS0,57600",]
gk_kernargs = [
    "--kernel-cc='gcc -m64'",
    "--kernel-ld='ld -m elf64ppc'",
    "--kernel-as='as -a64'",
]

[build.iso.stage2.kernel.ppc32]
sources = "sys-kernel/gentoo-sources"
config = "../../../kconfig/powerpc/ppc32.config"
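
To illustrate how the shared settings could flow from [build] into each stage, here is a minimal Python sketch. It works on an already-parsed dict (roughly what a TOML parser such as Python 3.11's tomllib would return for a trimmed version of the file above). stage_settings() and LOCKED are hypothetical names, and [build.snapshot] and [build.iso] are left out for brevity.

```python
# Assumption: a trimmed, already-parsed version of the strawman ppc.build.
parsed = {
    "build": {
        "arch": "ppc",
        "rel_type": "default",
        "version": "@TIMESTAMP@",
        "portage_confdir": "@REPO_DIR@/releases/portage/stages",
        "stage1": {"update_seed": "yes"},
        "stage2": {"skip": True},
        "stage3": {"compression": "xz"},
    }
}

LOCKED = {"rel_type", "version"}  # assumed: must not vary within one chain


def stage_settings(build):
    """Yield (stage_name, settings) with the chain-wide values merged in."""
    shared = {k: v for k, v in build.items() if not isinstance(v, dict)}
    for name, table in build.items():
        if not isinstance(table, dict) or table.get("skip"):
            continue  # not a stage table, or a stage marked skip = true
        clash = LOCKED.intersection(table)
        if clash:
            raise ValueError(f"{name} may not override {sorted(clash)}")
        # Remaining stage-specific keys take precedence over shared ones.
        yield name, {**shared, **table}


stages = dict(stage_settings(parsed["build"]))
# stages has entries for "stage1" and "stage3"; "stage2" is skipped.
```

Rejecting rel_type and version inside a stage table is one way to enforce that those values stay constant across a chain.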



* Re: [gentoo-catalyst] catalyst changes for improving automation
  2020-11-03  3:44 [gentoo-catalyst] catalyst changes for improving automation Matt Turner
@ 2020-11-03 10:54 ` Daniel Cordero
  2020-11-03 18:19   ` Matt Turner
  2020-11-03 18:36   ` Matt Turner
  2020-11-03 13:04 ` Brian Dolbec
  1 sibling, 2 replies; 6+ messages in thread
From: Daniel Cordero @ 2020-11-03 10:54 UTC
  To: gentoo-catalyst; +Cc: gentoo-releng

On Mon, Nov 02, 2020 at 10:44:07PM -0500, Matt Turner wrote:
> The catalyst-auto automation scripts live in a repo separate from
> catalyst. That increases the difficulty of changing catalyst's
> interface, and it doesn't seem to offer any advantages otherwise.
> (Keeping build specs in a separate repo allows them to be updated
> independent of catalyst and that is valuable). Additionally, since the
> primary way catalyst is used is via this automation, it makes sense to
> support this workflow in catalyst directly.
> 

Those who would be most heavily impacted are users who don't already
have infra set up to do builds, or who are just starting out with
catalyst and haven't written their own automation.

I suggest prioritising the collection of up-to-date documentation,
especially regarding running catalyst manually, since it'll be
completely different to the literature that's currently out there.


> But to get there, there are some changes to catalyst that I think are
> improvements on their own and simplify the path to integrating
> automation capabilities directly into catalyst. That's what I'd like
> to discuss here.
> 
> I'd like to:
> 
>  1) Replace the custom .spec file format with TOML
> 

Fine. Aside from the extra quotes and commas, I'd be happy with any well
defined format that can handle strings and lists properly.

>  2) Combine .spec file sequences (e.g., stage1 -> stage2 -> stage3 ->
> livecd-stage1 -> livecd-stage2) into a single file. I suggest naming
> this a ".build" file. This will also allow us to remove the redundant
> information that currently has to be specified in stage1.spec,
> stage2.spec, stage3.spec, like rel_type, version, profile, etc. It
> also means that we remove the nonsensical ability to change settings
> from one stage to the next that should not change (e.g., rel_type,
> version).
> 

How would a target that depends on a different rel_type work? Forks in
the dependency tree.

>  3) Add ability to denote which stage builds produce artifacts we care
> about (and want to save and/or upload) and which are just temporary.
> If they're temporary (e.g., a stage1 build) we can delete the artifact
> after the build sequence has no further use of it, and we can skip
> compressing the result, etc.
> 

This feature should (haven't tested) already exist - it's just not
documented.

compression_mode: rsync
options=['seedcache']

or don't call 'capture' and/or 'remove_chroot' in action_/finish_sequence.

> 
> To that end, I'm starting by figuring out what I would like the new
> spec file format to look like. Below are some open questions and then
> a strawman new-style spec file.
> 
> • The .spec files in releng.git are really templates that are not
> directly usable without sed'ing @REPO_DIR@ and @TIMESTAMP@. It would
> be nice if they were directly usable as that would reduce confusion
> from users.
>   • Can we make them directly usable?
>   • Perhaps we can make catalyst handle the replacements directly?
>     • Calculating @TIMESTAMP@ is trivially doable—we do it today (see below)

Maybe a strftime() template, or even fstring-like tokens?
(e.g. "{year}-{month}-{day}")
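
Both variants could be supported by one small helper. The following Python sketch is only an illustration: render_version() is a hypothetical name, and the token names {year}, {month}, {day} are taken from the example above.

```python
from datetime import datetime, timezone


def render_version(template, when):
    """Expand a version template using strftime codes or {tokens}."""
    if "%" in template:
        # Treat the template as a strftime() format string.
        return when.strftime(template)
    # Otherwise treat it as an fstring-like template with named tokens.
    return template.format(
        year=f"{when:%Y}", month=f"{when:%m}", day=f"{when:%d}"
    )


when = datetime(2020, 11, 3, 10, 54, tzinfo=timezone.utc)
a = render_version("%Y%m%dT%H%M%SZ", when)        # "20201103T105400Z"
b = render_version("{year}-{month}-{day}", when)  # "2020-11-03"
```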

>     • We could configure @REPO_DIR@ in catalyst.conf and let catalyst
> do the replacement, or we could just make the field relative to some
> path specified in catalyst.conf?
> 

While nice to have, I don't agree with locking users into a particular
repository layout.

> • In the current automation scripts, we generate a value for
> @TIMESTAMP@ from the git HEAD used in creating the snapshot.
>   • Would be nice to remove the dependence on the squashfs snapshot
> generation—not difficult to do
> 

I have no comment on this.

> • Can we generate and upload a .build file with replacements done to
> make stage builds more easily reproducible? Seems easy.
> 

These can just be artifacts from the build.




* Re: [gentoo-catalyst] catalyst changes for improving automation
  2020-11-03  3:44 [gentoo-catalyst] catalyst changes for improving automation Matt Turner
  2020-11-03 10:54 ` Daniel Cordero
@ 2020-11-03 13:04 ` Brian Dolbec
  1 sibling, 0 replies; 6+ messages in thread
From: Brian Dolbec @ 2020-11-03 13:04 UTC
  To: gentoo-catalyst

On Mon, 2 Nov 2020 22:44:07 -0500
Matt Turner <mattst88@gentoo.org> wrote:

> The catalyst-auto automation scripts live in a repo separate from
> catalyst. That increases the difficulty of changing catalyst's
> interface, and it doesn't seem to offer any advantages otherwise.
> (Keeping build specs in a separate repo allows them to be updated
> independent of catalyst and that is valuable). Additionally, since the
> primary way catalyst is used is via this automation, it makes sense to
> support this workflow in catalyst directly.
> 
> But to get there, there are some changes to catalyst that I think are
> improvements on their own and simplify the path to integrating
> automation capabilities directly into catalyst. That's what I'd like
> to discuss here.
> 


I have been thinking for the past few years that the automation could
benefit from using a buildbot to run the stages. That way it would set
the required variables and run the stages in sequence. Upon failure,
buildbot makes it easy to re-run failed steps from where they left off,
or to initiate unscheduled builds. It also makes it easy for anyone with
a browser (if the buildbot is publicly viewable) to see detailed logs of
the various steps for debugging, etc.


Perhaps with your new spec file, you add a variable that lists the
stages to run in sequence. That way it would preserve the old
capability of running single stages independently or a full sequence.
Or perhaps a CLI option to override the setting on an unedited spec
file.

i.e.:

[build]
stages = ["stage1", "stage2", "stage3", "livecd1", "livecd2"]


Sorry, not familiar with TOML.

That way a spec file could be restarted from any point with a single
variable edit, without removing the unused [build.???] definitions for
the full run. This would be particularly useful when troubleshooting
hidden/delayed stage build failures.
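
A minimal Python sketch of that selection logic, assuming the stage list has already been read from the spec file. stages_to_run() and its start_from/only arguments are hypothetical stand-ins for the variable edit and the CLI override, not existing catalyst options.

```python
ALL_STAGES = ["stage1", "stage2", "stage3", "livecd1", "livecd2"]


def stages_to_run(configured, start_from=None, only=None):
    """Pick the stages for this run from the ordered `configured` list.

    start_from: hypothetical override to resume a sequence mid-way.
    only:       hypothetical override to run one stage on its own.
    """
    if only is not None:
        if only not in configured:
            raise ValueError(f"unknown stage: {only}")
        return [only]
    if start_from is None:
        return list(configured)
    if start_from not in configured:
        raise ValueError(f"unknown stage: {start_from}")
    return configured[configured.index(start_from):]


resumed = stages_to_run(ALL_STAGES, start_from="stage3")
# resumed == ["stage3", "livecd1", "livecd2"]
single = stages_to_run(ALL_STAGES, only="stage2")
# single == ["stage2"]
```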


Overall, I do agree that the releng automation scripts' capabilities
should be part of the catalyst repo so that they stay up-to-date
with the catalyst code.

I have limited time and resources lately, so I can't help out much until
I get back home (probably Xmas), largely due to covid... I only have
this small laptop, and my eyes are not good enough to do lots of
work on a tiny 14 inch screen, with a stage run taking hours instead
of minutes... (yes, I got spoiled with two 28 inch 4K displays and a
16-core, 128k ram system at home...)

Brian Dolbec
<dolsen@gentoo.org>



* Re: [gentoo-catalyst] catalyst changes for improving automation
  2020-11-03 10:54 ` Daniel Cordero
@ 2020-11-03 18:19   ` Matt Turner
  2020-11-04 10:46     ` Daniel Cordero
  2020-11-03 18:36   ` Matt Turner
  1 sibling, 1 reply; 6+ messages in thread
From: Matt Turner @ 2020-11-03 18:19 UTC
  To: gentoo-catalyst; +Cc: gentoo-releng

On Tue, Nov 3, 2020 at 5:56 AM Daniel Cordero <gentoo.catalyst@xxoo.ws> wrote:
>
> On Mon, Nov 02, 2020 at 10:44:07PM -0500, Matt Turner wrote:
> > The catalyst-auto automation scripts live in a repo separate from
> > catalyst. That increases the difficulty of changing catalyst's
> > interface, and it doesn't seem to offer any advantages otherwise.
> > (Keeping build specs in a separate repo allows them to be updated
> > independent of catalyst and that is valuable). Additionally, since the
> > primary way catalyst is used is via this automation, it makes sense to
> > support this workflow in catalyst directly.
> >
>
> What would be more heavily impacted are those users who may not already
> have infra set up to do builds or just starting out using catalyst for
> the first time and haven't written their own automation.
>
> I suggest prioritising the collection of up-to-date documentation,
> especially regarding running catalyst manually, since it'll be
> completely different to the literature that's currently out there.

I'm a bit unsure what you mean. Do you suggest prioritizing
documenting the current method of running catalyst before changing it?

> > But to get there, there are some changes to catalyst that I think are
> > improvements on their own and simplify the path to integrating
> > automation capabilities directly into catalyst. That's what I'd like
> > to discuss here.
> >
> > I'd like to:
> >
> >  1) Replace the custom .spec file format with TOML
> >
>
> Fine. Aside from the extra quotes and commas, I'd be happy with any well
> defined format that can handle strings and lists properly.
>
> >  2) Combine .spec file sequences (e.g., stage1 -> stage2 -> stage3 ->
> > livecd-stage1 -> livecd-stage2) into a single file. I suggest naming
> > this a ".build" file. This will also allow us to remove the redundant
> > information that currently has to be specified in stage1.spec,
> > stage2.spec, stage3.spec, like rel_type, version, profile, etc. It
> > also means that we remove the nonsensical ability to change settings
> > from one stage to the next that should not change (e.g., rel_type,
> > version).
> >
>
> How would a target that depends on a different rel_type work? Forks in
> the dependency tree.
>
> >  3) Add ability to denote which stage builds produce artifacts we care
> > about (and want to save and/or upload) and which are just temporary.
> > If they're temporary (e.g., a stage1 build) we can delete the artifact
> > after the build sequence has no further use of it, and we can skip
> > compressing the result, etc.
> >
>
> This feature should (haven't tested) already exist - it's just not
> documented.
>
> compression_mode: rsync
> options=['seedcache']

Hah! I was completely unaware of this. Thanks.

> or don't call 'capture' and/or 'remove_chroot' in action_/finish_sequence.
>
> >
> > To that end, I'm starting by figuring out what I would like the new
> > spec file format to look like. Below are some open questions and then
> > a strawman new-style spec file.
> >
> > • The .spec files in releng.git are really templates that are not
> > directly usable without sed'ing @REPO_DIR@ and @TIMESTAMP@. It would
> > be nice if they were directly usable as that would reduce confusion
> > from users.
> >   • Can we make them directly usable?
> >   • Perhaps we can make catalyst handle the replacements directly?
> >     • Calculating @TIMESTAMP@ is trivially doable—we do it today (see below)
>
> Maybe a strftime() template, or even fstring-like tokens?
> (e.g. "{year}-{month}-{day}")

One goal I have is to make it more transparent what is actually in a
particular stage tarball or ISO and along with that to make it easier
to reproduce the result.

Obviously we'll want to keep the ability to specify a particular
version, as you describe, but I think for Gentoo releases we will want
to continue using a timestamp that's tied as unambiguously as possible
to the git SHA1 of gentoo.git.
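
For illustration, a minimal Python sketch of deriving such a timestamp from a git checkout. The helper names and the exact output format ("%Y%m%dT%H%M%SZ") are assumptions, not necessarily what the current automation scripts produce.

```python
import subprocess
from datetime import datetime, timezone


def format_timestamp(epoch_seconds):
    """Render a commit time (seconds since the epoch) as a UTC version string."""
    when = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return when.strftime("%Y%m%dT%H%M%SZ")  # assumed format


def head_commit(repo_path):
    """Return (sha1, timestamp) for HEAD of the git checkout at `repo_path`."""
    out = subprocess.run(
        ["git", "-C", repo_path, "show", "-s", "--format=%H %ct", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    return out[0], format_timestamp(int(out[1]))


stamp = format_timestamp(1604375047)  # "20201103T034407Z"
```

Recording the SHA1 returned by head_commit() alongside the timestamp would keep the link between the two explicit.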

> >     • We could configure @REPO_DIR@ in catalyst.conf and let catalyst
> > do the replacement, or we could just make the field relative to some
> > path specified in catalyst.conf?
> >
>
> While nice to have, I don't agree with locking users into a particular
> repository layout.

Can you explain what you mean? I don't know how what I said would
require a particular repository layout.

Perhaps you're confused by the @REPO_DIR@ name? It is the path to the
releng.git repository (containing the .specs and the /etc/portage/
files) on the build machine and is not in any way connected with the
ebuild repositories.

The name predates my involvement, so don't blame me :)

> > • In the current automation scripts, we generate a value for
> > @TIMESTAMP@ from the git HEAD used in creating the snapshot.
> >   • Would be nice to remove the dependence on the squashfs snapshot
> > generation—not difficult to do
> >
>
> I have no comment on this.
>
> > • Can we generate and upload a .build file with replacements done to
> > make stage builds more easily reproducible? Seems easy.
> >
>
> These can just be artifacts from the build.

Yes, that's what I'm thinking too.



* Re: [gentoo-catalyst] catalyst changes for improving automation
  2020-11-03 10:54 ` Daniel Cordero
  2020-11-03 18:19   ` Matt Turner
@ 2020-11-03 18:36   ` Matt Turner
  1 sibling, 0 replies; 6+ messages in thread
From: Matt Turner @ 2020-11-03 18:36 UTC
  To: gentoo-catalyst; +Cc: gentoo-releng

Sorry, I missed one of the questions, but it requires a longer answer anyway.

On Tue, Nov 3, 2020 at 5:56 AM Daniel Cordero <gentoo.catalyst@xxoo.ws> wrote:
> How would a target that depends on a different rel_type work? Forks in
> the dependency tree.

I haven't given that a lot of thought yet, but it's something I would
like to have a plan for.

We build 32-bit and 64-bit systemd and non-systemd stages on SPARC, as
well as a bootable ISO.

32-bit     systemd: stage1 -> stage3
32-bit non-systemd: stage1 -> stage3
64-bit     systemd: stage1 -> stage3
64-bit non-systemd: stage1 -> stage3 -> livecd-stage1 -> livecd-stage2
(We skip stage2)

This means that we have some build chains that are entirely
independent from one another and could actually run in parallel. E.g.,
a 32-bit build could happen at the same time a 64-bit build runs
without any conflicts. Our SPARC system has 256 threads, so we would
like to build in parallel if possible.

Similarly, a stage1 build from one of the 32-bit build chains could
happen in parallel with a stage3 build from the other. We wouldn't
want to run the same type of build concurrently if they share a binary
package cache, because we would inevitably spend CPU cycles doing
duplicate work. E.g., the systemd stage3 build running in parallel
with the non-systemd stage3.
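
The constraint above can be sketched as a simple greedy scheduler in Python. This is only an illustration: schedule() is a hypothetical helper, the grouping of chains into shared package caches is an assumption, and treating every build as one equal-length "round" is a simplification.

```python
def schedule(chains, cache_of):
    """Greedy round-based schedule.

    chains:   {chain_name: [stage, ...]}; each chain runs its stages in order.
    cache_of: {chain_name: cache_id}; equal ids share a binary package cache.
    Returns a list of rounds, each a list of (chain, stage) pairs that may
    run concurrently.
    """
    position = {name: 0 for name in chains}
    rounds = []
    while any(position[n] < len(chains[n]) for n in chains):
        busy = set()  # (cache_id, stage) pairs already taken this round
        this_round = []
        for name in chains:
            if position[name] >= len(chains[name]):
                continue  # chain already finished
            stage = chains[name][position[name]]
            key = (cache_of[name], stage)
            if key in busy:
                continue  # same stage type on a shared cache: wait a round
            busy.add(key)
            this_round.append((name, stage))
            position[name] += 1
        rounds.append(this_round)
    return rounds


chains = {
    "32-systemd": ["stage1", "stage3"],
    "32-non-systemd": ["stage1", "stage3"],
    "64-systemd": ["stage1", "stage3"],
    "64-non-systemd": ["stage1", "stage3", "livecd-stage1", "livecd-stage2"],
}
# Assumption: systemd and non-systemd chains of the same bitness share a cache.
cache_of = {
    "32-systemd": "sparc32", "32-non-systemd": "sparc32",
    "64-systemd": "sparc64", "64-non-systemd": "sparc64",
}
rounds = schedule(chains, cache_of)
```

With these inputs the first round builds one stage1 per cache, and the second round overlaps each systemd stage3 with the matching non-systemd stage1.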

Whether all of those build chains should be specified in the same
".build" file... I don't know. It seems like it could get a bit
unwieldy.

Maybe we could have a top-level ".build" file that references each of
these build chains, described in other files? If we did that, that
would certainly allow us to specify a different rel_type per chain.

I'm not aware of cases where we'd want different rel_types in the same
chain. Do you know of such a case?



* Re: [gentoo-catalyst] catalyst changes for improving automation
  2020-11-03 18:19   ` Matt Turner
@ 2020-11-04 10:46     ` Daniel Cordero
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Cordero @ 2020-11-04 10:46 UTC
  To: gentoo-catalyst; +Cc: gentoo-releng

On Tue, Nov 03, 2020 at 01:19:51PM -0500, Matt Turner wrote:
> On Tue, Nov 3, 2020 at 5:56 AM Daniel Cordero wrote:
> >
> > On Mon, Nov 02, 2020 at 10:44:07PM -0500, Matt Turner wrote:
> > > The catalyst-auto automation scripts live in a repo separate from
> > > catalyst. That increases the difficulty of changing catalyst's
> > > interface, and it doesn't seem to offer any advantages otherwise.
> > > (Keeping build specs in a separate repo allows them to be updated
> > > independent of catalyst and that is valuable). Additionally, since the
> > > primary way catalyst is used is via this automation, it makes sense to
> > > support this workflow in catalyst directly.
> > >
> >
> > What would be more heavily impacted are those users who may not already
> > have infra set up to do builds or just starting out using catalyst for
> > the first time and haven't written their own automation.
> >
> > I suggest prioritising the collection of up-to-date documentation,
> > especially regarding running catalyst manually, since it'll be
> > completely different to the literature that's currently out there.
> 
> I'm a bit unsure what you mean. Do you suggest prioritizing
> documenting the current method of running catalyst before changing it?
> 

I'm suggesting that documentation is more important than any trivial changes
to catalyst, especially with the large number of changes that have
happened recently. We'll still be running scripts on top of catalyst
that can handle these tasks on a day-to-day basis.

> > > But to get there, there are some changes to catalyst that I think are
> > > improvements on their own and simplify the path to integrating
> > > automation capabilities directly into catalyst. That's what I'd like
> > > to discuss here.
> > >
> > > I'd like to:
> > >
> > >  1) Replace the custom .spec file format with TOML
> > >
> >
> > Fine. Aside from the extra quotes and commas, I'd be happy with any well
> > defined format that can handle strings and lists properly.
> >
> > >  2) Combine .spec file sequences (e.g., stage1 -> stage2 -> stage3 ->
> > > livecd-stage1 -> livecd-stage2) into a single file. I suggest naming
> > > this a ".build" file. This will also allow us to remove the redundant
> > > information that currently has to be specified in stage1.spec,
> > > stage2.spec, stage3.spec, like rel_type, version, profile, etc. It
> > > also means that we remove the nonsensical ability to change settings
> > > from one stage to the next that should not change (e.g., rel_type,
> > > version).
> > >
> >
> > How would a target that depends on a different rel_type work? Forks in
> > the dependency tree.
> >
> 
> I haven't given that a lot of thought yet, but it's something I would
> like to have a plan for.
> 
> We build 32-bit and 64-bit systemd and non-systemd stages on SPARC, as
> well as a bootable ISO.
> 
> 32-bit     systemd: stage1 -> stage3
> 32-bit non-systemd: stage1 -> stage3
> 64-bit     systemd: stage1 -> stage3
> 64-bit non-systemd: stage1 -> stage3 -> livecd-stage1 -> livecd-stage2
> (We skip stage2)
> 
> This means that we have some build chains that are entirely
> independent from one another and could actually run in parallel. E.g.,
> a 32-bit build could happen at the same time a 64-bit build runs
> without any conflicts. Our SPARC system has 256 threads, so it would
> like to build in parallel if possible.
> 
> Similarly, a stage1 build from one of the 32-bit build chains could
> happen in parallel with a stage3 build from the other. We wouldn't
> want to run the same type of build concurrently if they share a binary
> package cache, because we would inevitably spend CPU cycles doing
> duplicate work. E.g., the systemd stage3 build running in parallel
> with the non-systemd stage3.
> 
> Whether all of those build chains should be specified in the same
> ".build" file... I don't know. It seems like it could get a bit
> unwieldy.
> 
> Maybe we could have a top-level ".build" file that references each of
> these build chains, described in other files? If we did that, that
> would certainly allow us to specify a different rel_type per chain.
> 
> I'm not aware of cases where we'd want different rel_types in the same
> chain. Do you know of such a case?
> 

Well, rel_type is just a text field. I use it to create a server
(non-GUI) systemd stage4 and also a full KDE Plasma/systemd stage4.

They're both systemd stages, but they would otherwise use the same
output tarball name, so they get separated out into their own rel_type.

https://wiki.gentoo.org/wiki/File:Substrate_Stage_Paths.svg

Do both target chains define the stage1/3 without rebuilding it multiple
times? I imagine that a single .spec file will still be runnable, but
I am not really in a position to implement a dependency graph calculator
in catalyst.
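
For what it's worth, the ordering part of such a calculator can be sketched with Python's standard graphlib module (3.9+). The target names below are hypothetical; the point is only that a shared stage1/stage3 appears once in the graph and is therefore built once.

```python
from graphlib import TopologicalSorter

# Hypothetical targets: each maps to the set of targets it is built from.
deps = {
    "systemd/stage1": set(),
    "systemd/stage3": {"systemd/stage1"},
    "server/stage4": {"systemd/stage3"},  # non-GUI server stage4
    "plasma/stage4": {"systemd/stage3"},  # KDE Plasma stage4
}

# static_order() yields every target exactly once, dependencies first.
order = list(TopologicalSorter(deps).static_order())
```

The relative order of the two stage4 targets is not fixed, which is also what would allow them to run in parallel.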

> > >  3) Add ability to denote which stage builds produce artifacts we care
> > > about (and want to save and/or upload) and which are just temporary.
> > > If they're temporary (e.g., a stage1 build) we can delete the artifact
> > > after the build sequence has no further use of it, and we can skip
> > > compressing the result, etc.
> > >
> >
> > This feature should (haven't tested) already exist - it's just not
> > documented.
> >
> > compression_mode: rsync
> > options=['seedcache']
> 
> Hah! I was completely unaware of this. Thanks.
> 

I only figured this out because I've been so deep into the compression
code.

> > or don't call 'capture' and/or 'remove_chroot' in action_/finish_sequence.
> >
> > >
> > > To that end, I'm starting by figuring out what I would like the new
> > > spec file format to look like. Below are some open questions and then
> > > a strawman new-style spec file.
> > >
> > > • The .spec files in releng.git are really templates that are not
> > > directly usable without sed'ing @REPO_DIR@ and @TIMESTAMP@. It would
> > > be nice if they were directly usable as that would reduce confusion
> > > for users.
> > >   • Can we make them directly usable?
> > >   • Perhaps we can make catalyst handle the replacements directly?
> > >     • Calculating @TIMESTAMP@ is trivially doable—we do it today (see below)
> >
> > Maybe a strftime() template, or even fstring-like tokens?
> > (e.g. "{year}-{month}-{day}")
> 
> One goal I have is to make it more transparent what is actually in a
> particular stage tarball or ISO and along with that to make it easier
> to reproduce the result.
> 
> Obviously we'll want to keep the ability to specify a particular
> version, as you describe, but I think for Gentoo releases we will want
> to continue using a timestamp that's tied as unambiguously as possible
> to the git SHA1 of gentoo.git.
> 
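
To make the token idea concrete, a minimal sketch. expand_version() and the
token names are hypothetical, and the datetime below stands in for the
commit date of the gentoo.git HEAD used for the snapshot (looking that up
is left out here):

```python
from datetime import datetime, timezone


def expand_version(template, when):
    """Expand fstring-like tokens in a version template.

    Supported tokens (all hypothetical): {year}, {month}, {day} and
    {timestamp}, the latter as a compact UTC stamp.
    """
    return template.format(
        year=f"{when.year:04d}",
        month=f"{when.month:02d}",
        day=f"{when.day:02d}",
        timestamp=when.strftime("%Y%m%dT%H%M%SZ"),
    )


# Stand-in for the commit date of the snapshot's git HEAD.
commit_time = datetime(2020, 11, 3, 3, 44, 0, tzinfo=timezone.utc)

print(expand_version("{year}-{month}-{day}", commit_time))  # 2020-11-03
print(expand_version("{timestamp}", commit_time))           # 20201103T034400Z
```

A fixed version string with no tokens passes through unchanged, so pinning
a particular version still works.
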
> > >     • We could configure @REPO_DIR@ in catalyst.conf and let catalyst
> > > do the replacement, or we could just make the field relative to some
> > > path specified in catalyst.conf?
> > >
> >
> > While nice to have, I don't agree with locking users into a particular
> > repository layout.
> 
> Can you explain what you mean? I don't know how what I said would
> require a particular repository layout.
> 
> Perhaps you're confused by the @REPO_DIR@ name? It is the path to the
> releng.git repository (containing the .specs and the /etc/portage/
> files) on the build machine and is not in any way connected with the
> ebuild repositories.
> 

I was just thinking that there could be more files outside of @REPO_DIR@
or /var/tmp/catalyst (or wherever) that may need to be referenced.
In practice, this might be limited; I have been wanting a feature like
this to exist - as long as it's configurable enough.
For me, I'd really just like paths to be relative to the current working
directory...
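
Something along these lines is what I have in mind, as a sketch only.
render_spec() is a hypothetical helper, the spec lines are illustrative,
and defaulting REPO_DIR to the current working directory is my preference
rather than anything catalyst does today:

```python
import os
import re


def render_spec(text, values):
    """Replace @NAME@ placeholders in `text` using the `values` mapping.

    Raises KeyError for a placeholder with no value, so a half-rendered
    spec never reaches a build.
    """
    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for @{name}@")
        return values[name]

    return re.sub(r"@([A-Z_]+)@", lookup, text)


# Illustrative template lines, not copied from releng.git.
template = (
    "portage_confdir: @REPO_DIR@/releases/portage/stages\n"
    "version_stamp: @TIMESTAMP@\n"
)

rendered = render_spec(
    template,
    {"REPO_DIR": os.getcwd(), "TIMESTAMP": "20201103T034400Z"},
)
print(rendered)
```

The rendered text could also be saved next to the build output, which would
cover the reproducibility point further down.
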

> The name predates my involvement, so don't blame me :)
> 
> > > • In the current automation scripts, we generate a value for
> > > @TIMESTAMP@ from the git HEAD used in creating the snapshot.
> > >   • Would be nice to remove the dependence on the squashfs snapshot
> > > generation—not difficult to do
> > >
> >
> > I have no comment on this.
> >
> > > • Can we generate and upload a .build file with replacements done to
> > > make stage builds more easily reproducible? Seems easy.
> > >
> >
> > These can just be artifacts from the build.
> 
> Yes, that's what I'm thinking too.
> 




end of thread, other threads:[~2020-11-04 10:47 UTC | newest]

Thread overview: 6+ messages
2020-11-03  3:44 [gentoo-catalyst] catalyst changes for improving automation Matt Turner
2020-11-03 10:54 ` Daniel Cordero
2020-11-03 18:19   ` Matt Turner
2020-11-04 10:46     ` Daniel Cordero
2020-11-03 18:36   ` Matt Turner
2020-11-03 13:04 ` Brian Dolbec
