Subject: [gentoo-portage-dev] [PATCH] ecompress: optimize docompress -x precompressed comparison
From: Zac Medico
Date: 2020-06-28 19:54 UTC
To: gentoo-portage-dev
Cc: Zac Medico, Robin H. Johnson

Use sort and comm with temporary files in order to compare lists
of docompress -x and precompressed files, since the file lists
can be extremely large. Also strip ${D%/} from paths in order to
reduce length.

Bug: https://bugs.gentoo.org/721516
Suggested-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Zac Medico <zmedico@gentoo.org>
---
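
For readers unfamiliar with the technique, the following is a minimal
standalone sketch of the sort/comm filtering described above. It is not part
of the patch: the temporary directory, the file names (skip, had) and the
sample paths are hypothetical stand-ins for ${T}/.ecompress_skip_files and
${T}/.ecompress_had_precompressed, and it assumes GNU coreutils with support
for sort -z and comm -z.

```shell
#!/usr/bin/env bash
# Illustrative only: file names and sample paths are assumptions,
# not taken from bin/ecompress.
tmp=$(mktemp -d) || exit 1

# NUL-delimited lists, written in arbitrary order.
printf '%s\0' /usr/share/doc/b.gz /usr/share/doc/a.gz > "${tmp}/skip" || exit 1
printf '%s\0' /usr/share/doc/c.gz /usr/share/doc/a.gz /usr/share/doc/b.gz > "${tmp}/had" || exit 1

# comm requires both inputs to be sorted with the same collation order.
LC_COLLATE=C sort -zu "${tmp}/skip" > "${tmp}/skip_sorted" || exit 1
LC_COLLATE=C sort -zu "${tmp}/had" > "${tmp}/had_sorted" || exit 1

# -1 suppresses entries unique to the skip list and -3 suppresses entries
# common to both, so only precompressed paths that were not skipped remain.
remaining=$(LC_COLLATE=C comm -z13 "${tmp}/skip_sorted" "${tmp}/had_sorted" | tr '\0' '\n')
printf '%s\n' "${remaining}"

rm -rf "${tmp}"
```

Because sort and comm stream their input from files, neither list has to fit
into a shell array or into a single sed script, which is the motivation given
in the commit message.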
 bin/ecompress                                 | 29 ++++++++++---------
 .../tests/resolver/ResolverPlayground.py      |  1 +
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/bin/ecompress b/bin/ecompress
index 60b083834..983a4d1f7 100755
--- a/bin/ecompress
+++ b/bin/ecompress
@@ -19,29 +19,30 @@ while [[ $# -gt 0 ]] ; do
 		shift
 
 		skip_dirs=()
-		skip_files=()
+		> "${T}/.ecompress_skip_files" || die
 		for skip; do
 			if [[ -d ${ED%/}/${skip#/} ]]; then
 				skip_dirs+=( "${ED%/}/${skip#/}" )
 			else
 				rm -f "${ED%/}/${skip#/}.ecompress" || die
-				skip_files+=("${ED%/}/${skip#/}")
+				printf '%s\0' "${EPREFIX}/${skip#/}" >> "${T}/.ecompress_skip_files"
 			fi
 		done
 
 		if [[ ${#skip_dirs[@]} -gt 0 ]]; then
-			while read -r -d ''; do
-				skip_files+=("${REPLY%.ecompress}")
+			while read -r -d '' skip; do
+				skip=${skip%.ecompress}
+				printf '%s\0' "${skip#${D%/}}" >> "${T}/.ecompress_skip_files"
 			done < <(find "${skip_dirs[@]}" -name '*.ecompress' -print0 -delete || die)
 		fi
 
-		if [[ ${#skip_files[@]} -gt 0 && -s ${T}/.ecompress_had_precompressed ]]; then
-			sed_args=()
-			for f in "${skip_files[@]}"; do
-				sed_args+=("s|^${f}\$||;")
-			done
-			sed_args+=('/^$/d')
-			sed -f - -i "${T}/.ecompress_had_precompressed" <<< "${sed_args[@]}" || die
+		if [[ -s ${T}/.ecompress_skip_files && -s ${T}/.ecompress_had_precompressed ]]; then
+			# Filter skipped files from ${T}/.ecompress_had_precompressed,
+			# using temporary files since these lists can be extremely large.
+			LC_COLLATE=C sort -zu "${T}/.ecompress_skip_files" > "${T}/.ecompress_skip_files_sorted" || die
+			LC_COLLATE=C sort -zu "${T}/.ecompress_had_precompressed" > "${T}/.ecompress_had_precompressed_sorted" || die
+			LC_COLLATE=C comm -z13 "${T}/.ecompress_skip_files_sorted" "${T}/.ecompress_had_precompressed_sorted" > "${T}/.ecompress_had_precompressed" || die
+			rm -f "${T}/.ecompress_had_precompressed_sorted" "${T}/.ecompress_skip_files"{,_sorted}
 		fi
 
 		exit 0
@@ -81,7 +82,7 @@ while [[ $# -gt 0 ]] ; do
 								continue 2
 							fi
 						done
-						echo "${path}" >> "${T}"/.ecompress_had_precompressed
+						printf '%s\0' "${path#${D%/}}" >> "${T}"/.ecompress_had_precompressed || die
 						;;
 				esac
 
@@ -195,8 +196,8 @@ if [[ -s ${T}/.ecompress_had_precompressed ]]; then
 	eqawarn "(manpages, documentation) when automatic compression is used:"
 	eqawarn
 	n=0
-	while read -r f; do
-		eqawarn "  ${f#${D%/}}"
+	while read -r -d '' f; do
+		eqawarn "  ${f}"
 		if [[ $(( n++ )) -eq 10 ]]; then
 			eqawarn "  ..."
 			break
diff --git a/lib/portage/tests/resolver/ResolverPlayground.py b/lib/portage/tests/resolver/ResolverPlayground.py
index de80a0cc1..ec2e31ae9 100644
--- a/lib/portage/tests/resolver/ResolverPlayground.py
+++ b/lib/portage/tests/resolver/ResolverPlayground.py
@@ -91,6 +91,7 @@ class ResolverPlayground(object):
 				"chgrp",
 				"chmod",
 				"chown",
+				"comm",
 				"cp",
 				"egrep",
 				"env",
-- 
2.25.3


