From: catcream
To: gentoo-soc@lists.gentoo.org
Subject: [gentoo-soc] Weekly report 5, LLVM libc
Date: Tue, 04 Jul 2023 01:07:17 +0200
Message-ID: <873524zcq2.fsf@catcream.org>
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-soc@lists.gentoo.org
Reply-to: gentoo-soc@lists.gentoo.org

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Hey!

This week I've spent most of my time figuring out how to bootstrap an
LLVM cross compiler toolchain targeting a hosted Linux environment. I
have also resolved the wint_t issue from last week. Both of these things
took way longer than expected, but I also learned a lot more than
expected, so it was worth it.

I'll start with the LLVM cross compiler setup. My initial idea for
bootstrapping a toolchain was to simply specify LLVM_TARGETS for the
target architecture when building LLVM, then compile compiler-rt for the
target triple, and then the libc. That order is indeed correct, but the
official cross compilation instructions tell you to specify a sysroot
where the libc is already built, and that's not possible when
bootstrapping from scratch.

As the compiler-rt cross compilation documentation only tells you to use
an already set up sysroot, which I didn't have, I had to feel my way
forward. This actually took me a few days, and I tried things like
bootstrapping with a barebones build of compiler-rt, mixing in some GCC
things, and a lot of hacks. I then studied
https://github.com/firasuke/mussel for a while until I found out about
headers-only "builds" for glibc and musl.

It turns out that the only thing compiler-rt needs the sysroot for is
the libc headers, and those can be generated without a functioning
compiler for both musl and glibc. For musl, this is done by setting
CC=true so that all the configure tests pass, and then running
'make install-headers' into a temporary install directory to generate
the headers needed for bootstrapping compiler-rt.

># CC=true makes every configure compiler check "succeed" without a compiler
>export CC=true
>./configure \
>    --target=${CTARGET} \
>    --prefix="${MUSL_HEADERS}/usr" \
>    --syslibdir="${MUSL_HEADERS}/lib" \
>    --disable-gcc-wrapper
># installs only the headers into ${MUSL_HEADERS}/usr/include
>make install-headers

After this is done you can pass the following CFLAGS to the compiler-rt
build: '-nostdinc -I*path to temporary musl install dir*/usr/include'.

> -DCMAKE_ASM_COMPILER_TARGET="${CTARGET}"
> -DCMAKE_C_COMPILER_TARGET="${CTARGET}"
> -DCMAKE_C_COMPILER_WORKS=1
> -DCMAKE_CXX_COMPILER_WORKS=1
> -DCMAKE_C_FLAGS="--target=${CTARGET} -isystem ${MUSL_HEADERS}/usr/include -nostdinc -v"

Once compiler-rt is built you can export

>LIBCC="${COMPILER_RT_BUILDDIR}"/lib/linux/libclang_rt.builtins-aarch64.a

to the musl build to use the previously built compiler-rt builtins for
the actual libc build.

To then build actual binaries targeting the newly built libc you can do
something like this:

>clang --target="${CTARGET}" main.c -c -nostdinc -nostdlib -I"${MUSL_HEADERS}"/usr/include -v
>ld.lld -static main.o \
>    "${COMPILER_RT_BUILDDIR}"/lib/linux/libclang_rt.builtins-aarch64.a \
>    "${MUSLLIB}"/crti.o "${MUSLLIB}"/crt1.o "${MUSLLIB}"/crtn.o "${MUSLLIB}"/libc.a

Running the binary with qemu-user:

> $ cat /etc/portage/package.use/qemu
>> app-emulation/qemu static-user QEMU_USER_TARGETS: aarch64
> $ emerge qemu
> $ qemu-aarch64 a.out
>> hello, world

Afterwards it feels pretty obvious that the headers were needed, and I
could've probably figured it out a lot sooner by, for example, examining
crossdev a bit closer. But I am happy I did play with this, since I
learned things like what the different runtime libraries do, what's
needed to link a binary, and a lot more. Here's a complete script that
does everything:
https://gist.github.com/alfredfo/e6c65293eb210bcf58e7cbdc80db3d7c
Next I will integrate this into crossdev.

Another thing I need to think about is how to do a header-only install
of LLVM libc. Currently the headers get generated with libc-hdrgen and
installed with the install-libc target.
Probably this can be done by packaging a standalone libc-hdrgen binary
and using that for bootstrapping. I could also temporarily "cheat" and
do a compiler-rt+libc build to get going.

Next, I also figured out what the wint_t problem is, and why it occurs,
when building LLVM libc in fullbuild mode on a musl system (see last
week's report). The problem is that on a musl system, /usr/include comes
first in the include path, regardless of CFLAGS="-ffreestanding". (For
C++ the libc headers come after the standard C++ headers and are then
#include_next'ed, so it makes no difference.) I thought at first that
this was a bug, since you don't want to target an environment where the
libc is available (a hosted environment) when building in freestanding
mode. However, after asking in #musl on IRC, it turns out this is
actually fine, since the musl headers respect the __STDC_HOSTED__ macro,
which gets set to 0 when using -ffreestanding, and there is a clear
standard specifying what should be available in a freestanding
environment.

The problem arises because LLVM libc assumes that the Clang headers will
be used when passing -ffreestanding, and therefore relies on Clang
header internals, specifically the __need_wint_t macro for stddef.h,
which is in no way standardized and only an implementation detail. My
thought here was that instead of relying on CFLAGS="-ffreestanding" to
pull in the Clang headers, we should find another way to force the Clang
headers through the build system. Another way to solve this would be to
also rely on musl internals (__NEED_wint_t for stddef.h). After
discussing this we agreed to first actually get the libc built, and then
decide on a strategy once we know how often similar issues pop up. If
there are only a few instances, then more #defines are fine; otherwise
we could do something like the gcc buildbot target. My only worry is
that this will keep biting us in the ass as more things get added.
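
To see the two effects described above on a given machine, something
like the following sketch can help. It is only an illustration, not part
of the LLVM libc build: it assumes a GCC- or Clang-compatible driver
reachable as "cc" (or via CC), and the helper names hosted_value and
include_dirs are made up for this example.

```shell
#!/bin/sh
# Sketch: inspect how -ffreestanding affects __STDC_HOSTED__ and the
# header search order. CC and both helper names are assumptions made
# for this example.

CC="${CC:-cc}"

# Print the value of __STDC_HOSTED__ for the given compiler flags.
hosted_value() {
    "$CC" "$@" -dM -E -x c /dev/null |
        awk '$1 == "#define" && $2 == "__STDC_HOSTED__" { print $3 }'
}

# Print the "#include <...>" search directories in lookup order.
include_dirs() {
    "$CC" "$@" -E -x c -v /dev/null 2>&1 |
        sed -n '/^#include <...> search starts here:/,/^End of search list\./p' |
        sed '1d;$d'
}

if command -v "$CC" >/dev/null 2>&1; then
    echo "hosted:       __STDC_HOSTED__ = $(hosted_value)"
    echo "freestanding: __STDC_HOSTED__ = $(hosted_value -ffreestanding)"
    echo "search order with -ffreestanding:"
    include_dirs -ffreestanding
else
    echo "no C compiler found; set CC" >&2
fi
```

Comparing the directory list on a musl system and on a glibc system
shows whether /usr/include or the compiler's own resource directory is
searched first, which is what decides whose stddef.h gets picked up.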
https://github.com/llvm/llvm-project/issues/63510

Another thing worth noting is that my 'USE=emacs llvm-common' PR
inspired a new elisp-common.eclass function called elisp-make-site-file
https://github.com/gentoo/gentoo/commit/a4e8704d22916a96725e0ef819d912ae82270d28
because mgorny thought that my sitefiles were a waste of inodes :D.
https://github.com/gentoo/gentoo/pull/31635

I also got my __unix__->__linux__ CL merged into LLVM. I do however have
some worries that this could've broken some things on macOS, as seen in
my comment:

> done! I think there should be something addressing pthread_once_t and
> once_flag for other Unix platforms though. Both of these would've
> previously, before this commit, been valid on macOS, as __unix__ is
> defined and __futex_word is just an aligned 32 bit int. No internal
> Linux headers were used here before that would've caused an error.

https://reviews.llvm.org/D153729

Next week I will try to make Crossdev able to use LLVM/Clang by
integrating the things I did this week.

- -- 
catcream
-----BEGIN PGP SIGNATURE-----

iIcEARYKAC8WIQTrvBqrbtsVNc2oScop9g9HYPvztAUCZKN11REcY2F0QGNhdGNy
ZWFtLm9yZwAKCRAp9g9HYPvztLsRAQC4JpWzxareh4nP3agPVu9TkkMXDs+uIfTv
KQRlNxSneQD8CC6xCpmBdL1QYY3TQ4nSuEpjrCOGaL+jzjQw/ZMe7g0=
=0aDy
-----END PGP SIGNATURE-----