Automatic Discovery of Useful Facet Terms

Databases of text and text-annotated data constitute a significant fraction of the information available in electronic form.  Searching and browsing are the typical ways in which users locate items of interest in such databases.  Faceted interfaces represent a powerful new paradigm that has proven to be a successful complement to keyword searching.  Thus far, the generation of faceted interfaces has relied either on manual identification of the facets or on a priori knowledge of the facets that can potentially appear in the underlying database.  In this paper, we present our ongoing research toward the automatic identification of facets that can be used to browse a collection of free-text documents.  We present preliminary results on building facets on top of a news archive.  The results are promising and suggest directions for future research.