From: Jeroen van Wolffelaar
Date: Tue, 14 Feb 2006 02:07:32 +0000 (+0000)
Subject: Document contents backend stuff loosely, still subject to change
X-Git-Tag: switch-to-templates~128
X-Git-Url: https://git.deb.at/w?a=commitdiff_plain;h=888a7de02572eb30f2b35aa6ea469ad7eb9f05b4;p=deb%2Fpackages.git

Document contents backend stuff loosely, still subject to change
(filenames_$suite.txt is inadequate)
---

diff --git a/BACKEND b/BACKEND
index cde9ef4..f3a5753 100644
--- a/BACKEND
+++ b/BACKEND
@@ -93,3 +93,24 @@ Generated by means of Sources.gz files:
 | - files: \01 separated list of "md5 size filename"
 
 Note: different key from packages_all, is that needed?
+*********************************************************
+Generated by means of Contents-$arch.gz files:
+*********************************************************
+
+This one is tricky, because it deals with about 1G of raw uncompressed data
+per suite. Not all data is updated every day though, so dealing with that
+efficiently pays off.
+
+Each sourcefile will create a filelists_$suite_$arch.db, with prefix
+compression. The last updated one will have a symlink from _all.db to it, to
+help filelist queries for 'all' packages.
+
+reverse_$suite_$arch.txt will be the reversed pathnames for that file,
+lowercased, sorted, with packagename:arch following it.
+
+For each suite, the suite-wide indices can then be updated by reading the 11
+or so reverse_$suite_$arch.txt in sorted order with sort -m. Same pathnames
+can be put together, and stored in reverse_$suite.db; filenames are then also
+incidentally coming by grouped uniquely (but reverse sorted, not normal sorted),
+and can be written out linearly to filenames_$suite.txt
+
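The merge step described in the added BACKEND text can be sketched roughly as follows. This is only an illustration, not the code used by the packages scripts. It assumes that "reversed pathnames" means a character-wise reversal of the lowercased path, and it keeps everything in memory as (reversed path, "package:arch") tuples instead of reading the reverse_$suite_$arch.txt files or writing reverse_$suite.db. The function names are hypothetical, and heapq.merge stands in for sort -m.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def reverse_entries(contents, arch):
    """Per-arch list: (reversed lowercased path, "package:arch"), sorted."""
    return sorted((path.lower()[::-1], f"{pkg}:{arch}") for path, pkg in contents)


def merge_suite(per_arch_lists):
    """Merge already-sorted per-arch lists (like sort -m) and group equal paths."""
    merged = heapq.merge(*per_arch_lists)
    for rev_path, group in groupby(merged, key=itemgetter(0)):
        yield rev_path, [owner for _, owner in group]


def filenames(grouped):
    """Unique base names, in the reverse-sorted order they arrive in."""
    last = None
    for rev_path, _ in grouped:
        # Undo the reversal, then keep only the part after the last "/".
        name = rev_path[::-1].rsplit("/", 1)[-1]
        if name != last:
            last = name
            yield name


# Made-up sample data for two architectures.
i386 = reverse_entries([("bin/ls", "coreutils"), ("usr/bin/Xorg", "xserver")], "i386")
amd64 = reverse_entries([("bin/ls", "coreutils"), ("usr/share/doc/ls", "docs")], "amd64")

grouped = list(merge_suite([i386, amd64]))
names = list(filenames(grouped))
```

Because a reversed path starts with the reversed file name, all paths that share a base name end up next to each other in the merged stream. That is why the unique file names can be written out in a single linear pass, although in reverse-sorted rather than normal order.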