Docco is a small personal document management system that we built on top of Lucene, Apache's indexing and search engine. It adds user interfaces for indexing and querying to Lucene; the query interface is enhanced with visualisation techniques from Formal Concept Analysis.
The tool can index local hard drives and everything mounted into the local file system, such as Windows or Unix network drives. It scans for a number of different document formats and creates a database that records which words occur in which documents. This allows very fast lookup of keywords and of other information such as author, title or location. The keywords are generated from the bodies of the documents, so no manual annotation is required.
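The word-to-document database described above is what is usually called an inverted index. The following is a minimal sketch of that idea in plain Java, not Docco's or Lucene's actual code: the class name `InvertedIndexSketch`, its methods and the sample paths are hypothetical, and the tokenisation is deliberately naive.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: these are not Docco's or Lucene's classes.
class InvertedIndexSketch {
    // Maps each lower-cased word to the sorted set of documents containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Splits the body on anything that is not a letter or digit and
    // records every resulting word as occurring in the given document.
    public void addDocument(String path, String body) {
        for (String token : body.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(path);
            }
        }
    }

    // Returns the documents containing the word. This is a single map
    // lookup, so no document has to be re-read at query time.
    public Set<String> lookup(String word) {
        Set<String> hits = postings.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.emptySet() : Collections.unmodifiableSet(hits);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addDocument("notes/lattice.txt", "Concept lattices in Formal Concept Analysis");
        index.addDocument("notes/search.txt", "Full text search with an inverted index");
        System.out.println(index.lookup("concept"));
        System.out.println(index.lookup("search"));
    }
}
```

Because the words are taken directly from the document bodies, the index is built without any manual annotation, which matches the behaviour described above. A real engine such as Lucene additionally handles file format parsing, stemming, ranking and on-disk storage, all of which this sketch leaves out.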
Docco supports the following formats:
Once an index has been created, the query interface lets you ask for all documents containing certain keywords and shows how these keywords combine. Once a set of interesting documents has been found, they can be selected and are displayed in a tree view, from which they can be opened in the default application.
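One simple way to picture how query keywords combine is to group the matching documents by the exact subset of keywords each one contains. The sketch below does only that; it is an assumption made for illustration, not Docco's actual algorithm, and the class `KeywordCombinationSketch` and its method are hypothetical names. A full Formal Concept Analysis diagram would additionally order these groups into a lattice.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: not taken from Docco's source code.
class KeywordCombinationSketch {
    // Groups documents by the exact subset of query keywords they contain.
    // Input: for every document, the set of words that occur in it.
    // Output: matched keyword subset -> documents matching exactly that subset.
    public static Map<Set<String>, Set<String>> groupByMatchedKeywords(
            Map<String, Set<String>> wordsByDocument, Collection<String> query) {
        Map<Set<String>, Set<String>> groups = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : wordsByDocument.entrySet()) {
            // Keep only those query keywords that occur in this document.
            Set<String> matched = new TreeSet<>(query);
            matched.retainAll(entry.getValue());
            // Documents that contain none of the keywords are left out.
            if (!matched.isEmpty()) {
                groups.computeIfAbsent(matched, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return groups;
    }
}
```

For a query with the two keywords "lucene" and "lattice", this yields up to three groups: documents containing only the first keyword, only the second, and both. Selecting one of these groups corresponds to picking the set of interesting documents that is then listed in the tree view.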
The program and the plugins can be downloaded from the Sourceforge download page. Docco requires a recent version of Java to be installed (at least JRE/JDK 1.4.0). To run the program, extract the main zip file and start the script for your platform ("run-docco.[bat/sh]"). To install the plugins, extract the plugin zip files into the "plugins" directory of the main installation, then restart the program. All indexes created afterwards will include the files supported by the plugins.