Version 1 (modified by skvidal, 8 years ago)

Ideas for New Repodata

filelists - break them up

Break the filelists out by path, so you don't have to download a huge glop of the complete filelists just to find out who owns /foo/baz.

Possible ways to break up the files. Info from rawhide as of June 2010:

  • 2.4 million files total in pkgs in rawhide
  • 2.3 million of those are in /usr
  • 1.8 million of those are in /usr/share

Top 3 dirs by file count under /usr/share:

  533046 /usr/share/doc
  120555 /usr/share/javadoc
  105591 /usr/share/icons

45 file-requires require something in /usr/share. None of those file-requires are in the top 3 /usr/share dirs.

  • most of them are fonts.

So in general, roughly 75% of the file entries we download for a filereq check (the 1.8 million of 2.4 million that live under /usr/share) are ones we'll almost never need.

So if we break the files up by 'top level dir' + 3 layers deep for /usr/*/*/, then we'll have reduced the number of unnecessary file-list downloads by about 75%.
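A minimal sketch of that split, in Python. The function name filelist_bucket and the exact depth rule are assumptions for illustration, not an agreed design: paths under /usr are keyed by up to three directory components (matching /usr/*/*/), and everything else is keyed by its top-level directory. The sample paths are made up.

```python
import posixpath
from collections import Counter

def filelist_bucket(path, usr_depth=3):
    """Return the directory bucket a file path would be filed under."""
    # Only directory components count; the filename itself is dropped.
    dirname = posixpath.dirname(posixpath.normpath(path))
    dirs = [part for part in dirname.split("/") if part]
    if not dirs:
        return "/"  # file sitting directly in the root directory
    # Assumed rule: 3 layers under /usr, 1 layer everywhere else.
    depth = usr_depth if dirs[0] == "usr" else 1
    return "/" + "/".join(dirs[:depth])

# Hypothetical sample paths, only to show how entries would be grouped.
paths = [
    "/usr/share/doc/foo-1.0/README",
    "/usr/share/doc/bar-2.1/COPYING",
    "/usr/share/fonts/dejavu/DejaVuSans.ttf",
    "/usr/bin/foo",
    "/etc/foo.conf",
]
buckets = Counter(filelist_bucket(p) for p in paths)
```

With this grouping, a file-requires on a font would only need the /usr/share/fonts bucket, and the /usr/share/doc, javadoc and icons buckets would never be fetched for it.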

complete repodata per-pkg

  • in a file or a directory of files so I can grab all of a certain kind of metadata for ONLY one pkg


- provide a way to break out summary/description into a structure that supports translations.
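One way the translatable summary/description could be shaped, sketched in Python. Everything here is an assumption for illustration: the per-locale mapping, the "C" key for the untranslated text, and the helper name localized. Nothing in the notes above fixes a format.

```python
def localized(texts, locale, default="C"):
    """Pick the best text: exact locale, then bare language, then default."""
    for key in (locale, locale.split("_")[0], default):
        if key in texts:
            return texts[key]
    return None

# Hypothetical per-package summary, keyed by locale.
summary = {
    "C": "A tool for doing foo",
    "de": "Ein Werkzeug, das foo macht",
}
```

Keeping translations in their own structure would let a client fetch only the locales it needs, instead of carrying every language in the main package list.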

potential structure

And some more specific ideas:


repomd.xml <-- same as before - the index for everything else - but making sure not to use any of the existing data type attributes

packagelist.sqlite - contains name, arch, epoch-ver-rel, checksum, summary, description, url, license, size (package, installed), location (baseurl, mirrorlist(optional), href), group, header byte-ranges?(maybe), has_conflicts?, has_obsoletes?, is_signed?, key_fingerprint?, location-style-path to per-pkg metadata and checksum.
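A rough sketch of what such a table could look like, using Python's sqlite3 module. The table and column names are guesses made up for illustration. The fields marked optional or "maybe" above (mirrorlist, header byte-ranges) are left out. The sample row is invented.

```python
import sqlite3

SCHEMA = """
CREATE TABLE packages (
    name            TEXT NOT NULL,
    arch            TEXT NOT NULL,
    epoch           TEXT,
    version         TEXT,
    release         TEXT,
    checksum        TEXT PRIMARY KEY,
    summary         TEXT,
    description     TEXT,
    url             TEXT,
    license         TEXT,
    size_package    INTEGER,
    size_installed  INTEGER,
    location_base   TEXT,
    location_href   TEXT,
    rpm_group       TEXT,
    has_conflicts   INTEGER,   -- 0/1 flag
    has_obsoletes   INTEGER,   -- 0/1 flag
    is_signed       INTEGER,   -- 0/1 flag
    key_fingerprint TEXT,
    pkg_md_href     TEXT,      -- location-style path to per-pkg metadata
    pkg_md_checksum TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO packages (name, arch, epoch, version, release, checksum,"
    " has_conflicts, has_obsoletes) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("foo", "x86_64", "0", "1.0", "1", "abc123", 0, 1),  # invented sample row
)
# Assumption: the has_* flags let a client skip the conflicts/obsoletes
# data entirely for packages that have none.
rows = conn.execute(
    "SELECT name, arch FROM packages WHERE has_obsoletes = 1"
).fetchall()
```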

provides.sqlite <-- provides: providename + flags + evr
requires.sqlite <-- requires: requiresname + flags + evr + prereq
conflicts_obsoletes.sqlite <-- conflicts and obsoletes (name, flags, evr)
files_by_path.xml <-- index file to point to the files-by-path

path_it_holds + filename + checksum, per file
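To make the files_by_path.xml idea concrete, here is a sketch of a possible index and a lookup against it, in Python. The element and attribute names (files_by_path, filelist, path, href, checksum), the file names and the checksums are all hypothetical. Only the three pieces of information per entry come from the notes above.

```python
import xml.etree.ElementTree as ET

# Hypothetical index: one entry per split-out filelist file.
INDEX_XML = """
<files_by_path>
  <filelist path="/usr/share/doc" href="filelists-usr-share-doc.xml.gz" checksum="aaa111"/>
  <filelist path="/usr/share/fonts" href="filelists-usr-share-fonts.xml.gz" checksum="bbb222"/>
  <filelist path="/etc" href="filelists-etc.xml.gz" checksum="ccc333"/>
</files_by_path>
""".strip()

def filelist_for(index_root, file_path):
    """Return (href, checksum) of the filelist covering file_path, or None."""
    best = None
    best_len = -1
    for entry in index_root.iter("filelist"):
        prefix = entry.get("path").rstrip("/")
        if file_path == prefix or file_path.startswith(prefix + "/"):
            # Prefer the most specific (longest) matching path.
            if len(prefix) > best_len:
                best, best_len = entry, len(prefix)
    if best is None:
        return None
    return best.get("href"), best.get("checksum")

root = ET.fromstring(INDEX_XML)
```

A client resolving a file-requires would read this small index first, then download only the one filelist whose path covers the file it is looking for.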