From 888a7de02572eb30f2b35aa6ea469ad7eb9f05b4 Mon Sep 17 00:00:00 2001
From: Jeroen van Wolffelaar
Date: Tue, 14 Feb 2006 02:07:32 +0000
Subject: [PATCH] Document contents backend stuff loosely, still subject to change (filenames_$suite.txt is inadequate)

---
 BACKEND | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/BACKEND b/BACKEND
index cde9ef4..f3a5753 100644
--- a/BACKEND
+++ b/BACKEND
@@ -93,3 +93,24 @@ Generated by means of Sources.gz files:
 | - files: \01 separated list of "md5 size filename"
 
 Note: different key from packages_all, is that needed?
+*********************************************************
+Generated by means of Contents-$arch.gz files:
+*********************************************************
+
+This one is tricky, because it deals with about 1G of raw uncompressed data
+per suite. Not all data is updated every day though, so dealing with that
+efficiently pays off.
+
+Each sourcefile will create a filelists_$suite_$arch.db, with prefix
+compression. The last updated one will have a symlink from _all.db to it, to
+help filelist queries for 'all' packages.
+
+reverse_$suite_$arch.txt will be the reversed pathnames for that file,
+lowercased, sorted, with packagename:arch following it.
+
+For each suite, the suite-wide indices can then be updated by reading the 11
+or so reverse_$suite_$arch.txt in sorted order with sort -m. Same pathnames
+can be put together, and stored in reverse_$suite.db; filenames are then also
+incidentally grouped uniquely (but reverse sorted, not normal sorted),
+and can be written out linearly to filenames_$suite.txt.
+
-- 
2.39.2
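
Illustration of the merge step described in the patch above. This is only a
rough sketch, not the backend's actual code: the function names, the
(path, package) input pairs and the in-memory lists are all assumptions made
for the example, standing in for parsed Contents-$arch.gz lines and the
reverse_$suite_$arch.txt files. heapq.merge plays the role of sort -m here.
The basename step also assumes every path has at least one directory
component.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def reverse_entries(contents, arch):
    """Build one arch's sorted list of (reversed lowercased path, 'package:arch').

    `contents` is a hypothetical iterable of (path, package) pairs.
    """
    return sorted(
        (path.lower()[::-1], f"{package}:{arch}") for path, package in contents
    )


def merge_suite(per_arch_lists):
    """Merge already-sorted per-arch lists and put equal pathnames together."""
    merged = heapq.merge(*per_arch_lists)  # streaming merge, like sort -m
    for rev_path, group in groupby(merged, key=itemgetter(0)):
        yield rev_path, [owner for _, owner in group]


def filenames(grouped):
    """Yield each basename once, in the (reverse-sorted) order it arrives."""
    last = None
    for rev_path, _ in grouped:
        # The basename is the part of the reversed path before the first "/".
        name = rev_path.split("/", 1)[0][::-1]
        if name != last:
            last = name
            yield name


if __name__ == "__main__":
    i386 = reverse_entries([("usr/bin/ls", "coreutils")], "i386")
    amd64 = reverse_entries([("usr/bin/ls", "coreutils")], "amd64")
    for rev_path, owners in merge_suite([i386, amd64]):
        print(rev_path, owners)
```

Because all reversed paths sharing a basename share a common prefix, they end
up adjacent after the merge, which is why the filenames can be written out in
a single linear pass.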