X-Git-Url: https://git.deb.at/?a=blobdiff_plain;f=BACKEND;h=f3a57537e4372fa8b98eced5faaf0a6f0897b5ac;hb=27d33ebb54f354d5d0d1fb68c94ff6a5682b54c3;hp=cde9ef4d4e4fd554057022d7eb7934a1e1d590c8;hpb=bcbeaca96ba2c409e061b64807b986e4b8464192;p=deb%2Fpackages.git

diff --git a/BACKEND b/BACKEND
index cde9ef4..f3a5753 100644
--- a/BACKEND
+++ b/BACKEND
@@ -93,3 +93,24 @@ Generated by means of Sources.gz files:
 | - files: \01 separated list of "md5 size filename"
 
 Note: different key from packages_all, is that needed?
+*********************************************************
+Generated by means of Contents-$arch.gz files:
+*********************************************************
+
+This one is tricky, because it deals with about 1G of raw uncompressed data
+per suite. Not all data is updated every day though, so dealing with that
+efficiently pays off.
+
+Each sourcefile will create a filelists_$suite_$arch.db, with prefix
+compression. The last updated one will have a symlink from _all.db to it, to
+help filelist queries for 'all' packages.
+
+reverse_$suite_$arch.txt will be the reversed pathnames for that file,
+lowercased, sorted, with packagename:arch following it.
+
+For each suite, the suite-wide indices can then be updated by reading the 11
+or so reverse_$suite_$arch.txt in sorted order with sort -m. Same pathnames
+can be put together, and stored in reverse_$suite.db; filenames are then also
+incidentally grouped uniquely (but reverse sorted, not normal sorted),
+and can be written out linearly to filenames_$suite.txt
+
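
The reverse-and-merge step described in the added text can be sketched as
follows. This is only an illustration in Python, not the code used by the
packages backend: the function names, the in-memory lists, and the
(path, package) input pairs are all assumptions made for the example, and
heapq.merge stands in for the `sort -m` pass over the per-arch files.

```python
import heapq
from itertools import groupby


def reverse_entries(pairs, arch):
    """Hypothetical helper: build the records of one reverse_$suite_$arch.txt.

    Each (path, package) pair becomes (reversed lowercased path, "package:arch"),
    and the records are returned sorted.
    """
    return sorted((path.lower()[::-1], f"{pkg}:{arch}") for path, pkg in pairs)


def merge_suite(per_arch_lists):
    """Merge the already-sorted per-arch lists (the role `sort -m` plays)
    and put identical reversed pathnames together."""
    merged = heapq.merge(*per_arch_lists)
    for rev_path, group in groupby(merged, key=lambda rec: rec[0]):
        yield rev_path, [owner for _, owner in group]


def filenames(grouped):
    """Yield each distinct file name once, in the order the merge produces.

    Assumes every path contains a directory part, so the reversed path
    starts with the reversed file name followed by "/".
    """
    for rev_name, _ in groupby(grouped, key=lambda item: item[0].split("/", 1)[0]):
        yield rev_name[::-1]


amd64 = reverse_entries(
    [("usr/bin/ls", "coreutils"), ("usr/bin/Cat", "coreutils")], "amd64"
)
i386 = reverse_entries(
    [("usr/bin/ls", "coreutils"), ("bin/ls", "busybox")], "i386"
)
suite_index = list(merge_suite([amd64, i386]))
suite_filenames = list(filenames(suite_index))
```

Because the reversed pathname begins with the reversed file name, all entries
for one file name end up adjacent after the merge, which is why the file
names can be written out in a single linear pass.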