Minimize the need for operator intervention wherever appropriate, so the system scales up to the capacity of the automation rather than the availability of the operators' time.
In particular, all submissions and updates would be done through a Web form with upload capability, eliminating moderator intervention for routine updates.
Packages would be indexed through a rich multiple-keyword structure rather than topic directories.
Package data should include, at minimum: name, version, home page, and FTP and RPM sites (see the proposed schema).
Anyone should be able to sign up to be notified by email when a package is updated.
At the package uploader's option, the archive would either store a package in a local cache or point at the home site.
One of the deliverables should be a tool that collects everything in the local cache and indirects through remote-site pointers to collect the remote stuff as well, so that CD-ROM distributors can make distributable snapshots of the complete archive.
Schema elements such as keyword categories should be configurable, so the Trove software can be used for multiple archives with different policies (in particular, both son-of-Sunsite and the Python archive).
Must scale well, up to Sunsite's level of traffic and beyond. Verifying this scalability before launching will be important.
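To make the package record and the multiple-keyword index above more concrete, here is a minimal Python sketch. It is an illustration only, not the proposed schema: the class names `Package` and `KeywordIndex`, the field names, and the case-insensitive matching are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    # The minimum fields named in the objectives; field names are hypothetical.
    name: str
    version: str
    home_page: str
    ftp_site: str
    rpm_site: str
    keywords: frozenset = field(default_factory=frozenset)


class KeywordIndex:
    """Inverted index mapping each keyword to the names of packages tagged with it."""

    def __init__(self):
        self._by_keyword = defaultdict(set)
        self._packages = {}

    def add(self, package):
        # Drop any earlier entry of the same name so an update leaves no stale keywords.
        self.remove(package.name)
        self._packages[package.name] = package
        for kw in package.keywords:
            self._by_keyword[kw.lower()].add(package.name)

    def remove(self, name):
        old = self._packages.pop(name, None)
        if old is None:
            return
        for kw in old.keywords:
            self._by_keyword[kw.lower()].discard(name)

    def search(self, *keywords):
        """Return the sorted names of packages carrying every given keyword."""
        if not keywords:
            return sorted(self._packages)
        sets = [self._by_keyword.get(kw.lower(), set()) for kw in keywords]
        return sorted(set.intersection(*sets))
```

A package can carry any number of keywords, so the same entry turns up under every query it matches; a topic-directory layout would force it into a single place. For example, `index.search("editor", "x11")` returns only the packages tagged with both keywords.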
Secondary Objectives
It would be a good idea (for performance) if running CGIs was only required for searching and for modifying the database, and everything else was available as static HTML files. (Among other things, we can pre-generate the HTML for package metadata displays.)
Strong authentication for packages and package updates, like what Debian does.
Meta-archive functions -- queries to one Trove instance are automatically also forwarded to other Trove archives.
Define a plain-text tag format for rendering metadata. Allow email submissions in this format. Add a `trusted remote metadata' field to the schema and write a crawler that polls these for metadata updates.
Teach Trove to extract inter-resource dependencies by analyzing binaries. Long-term project!
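One possible shape for the plain-text tag format mentioned above is a mail-header-like layout of `Tag: value` lines. The sketch below assumes that layout; the tag names, the continuation-line rule, and the function names `parse_metadata` and `render_metadata` are hypothetical and are not part of any defined Trove format.

```python
def parse_metadata(text):
    """Parse 'Tag: value' lines into a dict keyed by lowercased tag.

    Assumed rule: a line starting with whitespace continues the previous value.
    """
    record = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        if raw[0] in " \t" and current is not None:
            record[current] += " " + raw.strip()
            continue
        # Split on the first colon only, so URLs in the value survive intact.
        tag, sep, value = raw.partition(":")
        if not sep or not tag.strip():
            raise ValueError(f"not a tag line: {raw!r}")
        current = tag.strip().lower()
        record[current] = value.strip()
    return record


def render_metadata(record):
    """Render a record back to the same plain-text form, one tag per line."""
    return "".join(f"{tag.title()}: {value}\n" for tag, value in record.items())
```

Because the same format is both parsed and rendered, an email submission and a crawler polling a remote metadata file could share one code path, and a record can be round-tripped: `parse_metadata(render_metadata(record))` gives back the original record.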