Basic Interaction Model
All interactions with a Trove database are mediated through HTML pages in a Web browser.
Users
Users (people looking for packages that match their requirements to download) are presented with a search form. The search form allows them to enter keyword search terms. The keywords may be selected with buttons from a controlled vocabulary defined by site policy, or entered as `roll-your-owns' in a text field.
Searches would yield all targets that are in the intersection set of the controlled-keyword hits, unioned with all hits from a search for roll-your-own keywords in package text descriptions. There is a more detailed proposal for the handling of controlled-vocabulary keywords.
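The search rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `packages` dictionary, the function name `search`, and the case-insensitive substring match for roll-your-own keywords are all hypothetical stand-ins, not part of the Trove design.

```python
def search(packages, controlled, free_text):
    """Return (controlled_hits, free_text_hits) as sorted lists of names.

    packages maps a package name to a dict with a "keywords" set and a
    "description" string (a hypothetical stand-in for the database).
    """
    controlled_hits = set()
    if controlled:
        # Intersection: a package must carry every selected controlled keyword.
        wanted = set(controlled)
        controlled_hits = {
            name for name, meta in packages.items()
            if wanted <= meta["keywords"]
        }

    free_hits = set()
    for word in free_text:
        # Union: any roll-your-own keyword found in a description is a hit.
        needle = word.lower()
        free_hits |= {
            name for name, meta in packages.items()
            if needle in meta["description"].lower()
        }

    # Keep the two groups apart so a catalog can label them separately.
    return sorted(controlled_hits), sorted(free_hits - controlled_hits)
```

Returning the two groups separately, rather than one merged set, matches the catalog layout, which distinguishes controlled-keyword hits from free-text hits.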
The result of a search is a generated HTML catalog listing. The body of a catalog listing consists of a series of one-line entries each beginning with a package-name hotlink and including a one-line package summary. The catalog has section headers indicating which lines are controlled-keyword hits and which are free-text hits.
Users looking at a catalog listing may either refine the search or look at individual entries that interest them (by chasing the package-name hotlinks). An individual entry displays all package metadata contained in the Trove database, possibly including resource links to a local cache of package resource files.
When an individual entry is selected, a user may take one of several actions:
- Chase a resource hotlink on the package metadata display (such as the package home page URL, or a mailto URL for the package contact person).
- Download package resources (e.g. by chasing FTP hotlinks on the package metadata display).
- Subscribe to or unsubscribe from the package's heads-up list (that is, the list of people automatically notified by email whenever package metadata or resources are changed). Unsubscription will be prohibited to an unauthenticated user; this is to prevent bad guys from masquerading as good guys in order to suppress notifications.
- Attach a review annotation to the package. (This is a future feature and has not yet been designed into the database schema.)
Contributors
Contributors (people updating package metadata or uploading new associated resources such as source tarballs) use a Web form to create or edit the metadata, and possibly to upload package resource copies to the site's local FTP cache.
Description fields are interpreted according to the following rules:
Text is plain text. Paragraphs are separated by one or more blank lines. No HTML tags are recognized; >, & and < mean themselves. Normal paragraphs are word-filled. Indented text is treated as-is and converted to <PRE>...</PRE> in HTML (tabs should be expanded to spaces here). A single word between *asterisks* means <b>bold</b> and a single word in _underscores_ means <i>italics</i> (even in indented text). Any text that looks sufficiently like a URL (e.g. http://www.python.org) is turned into a hyperlink with an <A...>...</A> tag pair (even in indented text).
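A minimal Python sketch of these rules follows. Several details are assumptions made for illustration, since the rules do not fix them: the function name `render_description`, the `<P>` wrapper for normal paragraphs, the definition of "a single word" as a run of letters and digits, the URL pattern (http, https and ftp schemes, with trailing punctuation left outside the link), and the choice to treat a block as indented only when every line of it starts with whitespace.

```python
import html
import re

# Assumed URL shape: a scheme, then non-space text not ending in punctuation.
URL_RE = re.compile(r'(?:https?|ftp)://[^\s<>"]*[^\s<>".,;:!?)]')
BOLD_RE = re.compile(r"\*([A-Za-z0-9]+)\*")
ITALIC_RE = re.compile(r"(?<![A-Za-z0-9])_([A-Za-z0-9]+)_(?![A-Za-z0-9])")


def _emphasis(escaped):
    """Apply *bold* and _italics_ markup to already-escaped text."""
    escaped = BOLD_RE.sub(r"<b>\1</b>", escaped)
    return ITALIC_RE.sub(r"<i>\1</i>", escaped)


def _inline(text):
    """Escape text, link URLs, and mark up emphasis outside the URLs."""
    out, pos = [], 0
    for match in URL_RE.finditer(text):
        out.append(_emphasis(html.escape(text[pos:match.start()], quote=False)))
        url = match.group(0)
        out.append('<A HREF="%s">%s</A>'
                   % (html.escape(url, quote=True), html.escape(url, quote=False)))
        pos = match.end()
    out.append(_emphasis(html.escape(text[pos:], quote=False)))
    return "".join(out)


def _blocks(text):
    """Yield lists of lines, split wherever one or more blank lines occur."""
    block = []
    for line in text.expandtabs().splitlines():  # tabs become spaces
        if line.strip():
            block.append(line)
        elif block:
            yield block
            block = []
    if block:
        yield block


def render_description(text):
    """Convert a plain-text description field to an HTML fragment."""
    parts = []
    for block in _blocks(text):
        if all(line[0] == " " for line in block):
            # Indented text is kept as-is inside <PRE>.
            body = "\n".join(_inline(line) for line in block)
            parts.append("<PRE>\n%s\n</PRE>" % body)
        else:
            # Normal paragraph: collapse whitespace; the browser fills it.
            parts.append("<P>%s</P>" % _inline(" ".join(" ".join(block).split())))
    return "\n".join(parts)
```

URLs are located in the raw text before escaping, so that an escaped `>` or `&` next to a URL is never swallowed into the link, and emphasis markup is applied only outside URLs so that underscores in a path are left alone.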
Administrators
Trove site administrators can use a web form to view a catalog of recently added entries, and delete or modify them if there appears to be some problem.
Administrators are also responsible for watching logs of roll-your-own keyword entries and noticing when keywords should be migrated into the core set described in site policy.
Security and Authentication
There are two levels of protection in the Trove design. Which will operate depends on whether a contributor is authenticated or not.
How to Authenticate Users
To be authenticated, a user must register a PGP public key with a Trove site. A user becomes authenticated by asking Trove to issue a challenge. The user must then return the challenge encrypted with the matching private key to become authenticated.
On success, Trove issues a timed cookie to the user. While the cookie remains valid, the user is authenticated.
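The challenge-and-cookie flow might look roughly like the sketch below. It makes several assumptions that the design does not state: the cookie is a `user|expiry|mac` string protected by an HMAC under a per-site secret, the lifetime is one hour, and the PGP check itself is delegated to a caller-supplied function `verify_response`, because the Python standard library has no PGP support. All names here are hypothetical.

```python
import hashlib
import hmac
import secrets
import time

SERVER_SECRET = secrets.token_bytes(32)  # hypothetical per-site secret
COOKIE_LIFETIME = 3600                   # seconds; an assumed value


def new_challenge():
    """Return a random challenge string for the user to process with PGP."""
    return secrets.token_hex(16)


def _mac(payload):
    return hmac.new(SERVER_SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_cookie(user, now=None):
    """Return a timed cookie of the form 'user|expiry|mac'."""
    start = time.time() if now is None else now
    payload = "%s|%d" % (user, int(start) + COOKIE_LIFETIME)
    return "%s|%s" % (payload, _mac(payload))


def cookie_user(cookie, now=None):
    """Return the authenticated user name, or None if forged or expired."""
    try:
        user, expiry_text, mac = cookie.rsplit("|", 2)
        expiry = int(expiry_text)
    except ValueError:
        return None
    expected = _mac("%s|%s" % (user, expiry_text))
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    return user if current < expiry else None


def authenticate(user, challenge, response, verify_response):
    """Issue a cookie if the caller-supplied PGP check accepts the response.

    verify_response(user, challenge, response) is a placeholder for checking
    the response against the user's registered public key.
    """
    if verify_response(user, challenge, response):
        return issue_cookie(user)
    return None
```

Embedding the expiry time in the cookie and protecting it with a MAC lets the site check validity without storing per-session state; a server-side session table would serve equally well.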
Security through Visibility
The contributor who creates a package entry, and anyone who changes the package metadata or resources after the fact, will be put on the package's heads-up list. Every subsequent modification notifies everybody on the heads-up list, including the contributor who has just been added.
The intent of this feature (and the prohibition on unsubscribing from a heads-up list unless you're validated) is to make sure that all metadata and resource changes are visible to everybody with a stake in the package. In particular, any modifications an unauthorized person succeeds in making will be visible to the real package owners.
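The heads-up mechanism can be illustrated with a small Python class. The class name, method names, and the `send` callback (standing in for whatever mail delivery a site uses) are hypothetical; only the behavior follows the description above.

```python
class HeadsUpList:
    """People notified whenever a package's metadata or resources change."""

    def __init__(self, creator):
        # The contributor who creates the entry is the first subscriber.
        self.subscribers = {creator}

    def subscribe(self, address):
        self.subscribers.add(address)

    def unsubscribe(self, address, authenticated):
        # Unauthenticated users may not remove anyone, so an impostor
        # cannot silence notifications to the real owners.
        if not authenticated:
            raise PermissionError("unsubscription requires authentication")
        self.subscribers.discard(address)

    def record_change(self, contributor, summary, send):
        """Add the updating contributor, then notify everybody on the list."""
        self.subscribers.add(contributor)
        for address in sorted(self.subscribers):
            send(address, summary)
```

Because the updater is added before notifications go out, even an unauthorized change reaches the original owners, and the person who made it stays on the list afterwards.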
Security through Authentication
Either a resource or a package may be locked. When an item (resource or package) is locked, you must be validated as an owner to modify it.
Here are the rules of ownership:
- The keeper of an item is the person who can add and delete owners.
- The keeper of an item is automatically an owner of the item.
- The person who creates an item is its first keeper.
- The keeper may pass the keeper role to another validated user.
- Any owner of an item (package or resource) can modify or delete the item.
- The owners of a package may delete associated resources.
- The owners of a package can modify its sticky bit:
  - If the sticky bit is off, anyone can attach resources to a package.
  - If the sticky bit is on, only owners of the package can attach resources to the package. (Owners of other attached resources have no automatic privilege.)
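The ownership rules can be modeled as a pair of small Python classes. This is a sketch, not a schema proposal: the class and method names are hypothetical, users are assumed to have been validated already, and two points the rules leave open are settled by assumption (a former keeper remains an owner after passing on the role, and the keeper cannot be removed from the owner set).

```python
class Item:
    """A package or resource; its creator is the first keeper and owner."""

    def __init__(self, creator, locked=False):
        self.keeper = creator
        self.owners = {creator}
        self.locked = locked

    def _require_keeper(self, actor):
        if actor != self.keeper:
            raise PermissionError("only the keeper may do this")

    def add_owner(self, actor, user):
        self._require_keeper(actor)
        self.owners.add(user)

    def remove_owner(self, actor, user):
        self._require_keeper(actor)
        if user != self.keeper:  # the keeper is always an owner
            self.owners.discard(user)

    def pass_keeper(self, actor, user):
        self._require_keeper(actor)
        self.keeper = user
        self.owners.add(user)  # assumption: the old keeper stays an owner

    def can_modify(self, user):
        # Unlocked items are open to anyone; locked ones to owners only.
        return not self.locked or user in self.owners


class Package(Item):
    """An item that also carries a sticky bit governing attachments."""

    def __init__(self, creator, locked=False):
        super().__init__(creator, locked)
        self.sticky = False

    def set_sticky(self, actor, value):
        if actor not in self.owners:
            raise PermissionError("only owners may change the sticky bit")
        self.sticky = value

    def can_attach(self, user):
        # Sticky off: anyone may attach. Sticky on: package owners only.
        return not self.sticky or user in self.owners

    def can_delete_resource(self, user, resource):
        # Package owners may delete attached resources, as may anyone
        # allowed to modify the resource itself.
        return user in self.owners or resource.can_modify(user)
```

Note that `can_attach` consults only the package's own owner set, which reflects the rule that owners of other attached resources gain no automatic privilege.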
Package Authentication
To be specified. Perhaps base it on the JAR approach suggested by Jeremy Hylton?
Sketch of Implementation
How will all this be done? Essentially, by replacing the meta-information now stored in LSMs with a Web-accessible database. A Trove site would consist of two parts:
- The metadata. Metadata (defined by the schema) would be stored in a database. The metadata would include pointers to...
- The resources. Resources are files that live in a site-local FTP tree. Some will be created by contributors; some will be generated by the Trove code itself (for example, each time the metadata is changed, it will pre-generate the HTML so that subsequent metadata display is faster).
The CGIs constituting the web-accessible front end of the database will be written in Python. A major open issue is what database to use as the back end.
Eric S. Raymond