In this talk I will present joint work with Michael Benedikt on the specification and composition of XML subtree queries. A frequent task in XML processing is to filter an input document to produce a subdocument, that is, a document whose set of root-to-leaf paths is a subset of the root-to-leaf paths of the original document and which inherits the original's tree structure. Such filters are what we mean by subtree queries. They can be used to define views, in either data integration or access control, and these views may also be layered one on top of another. Representing subtree queries in a general-purpose query language leads not only to cumbersome query specifications but also to performance problems, since the query engine cannot exploit the subtree nature of the query. The problem is exacerbated when applications require the sequential composition of multiple subtree queries, since it is even less likely that the composition will be recognized as a subtree query and its evaluation optimized accordingly.
Specifically, I will present the XML Subtree Query Language, a simple language for specifying subtree queries, and show that the language is closed under composition. This closure property allows a sequence of XML subtree queries to be rewritten as a single subtree query.
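As a toy illustration of the closure idea only, the sketch below models a subtree query as a predicate on root-to-leaf label paths and shows that applying two such filters in sequence gives the same subdocument as applying one combined filter. This is not the XML Subtree Query Language or its composition algorithm; the tuple-based tree encoding and the names apply_query, compose, no_billing and no_phone are assumptions made up for the example.

```python
from typing import Callable, List, Optional, Tuple

Path = Tuple[str, ...]             # labels from the root down to a node
Tree = Tuple[str, List["Tree"]]    # (label, children); hypothetical encoding
Query = Callable[[Path], bool]     # decides which root-to-leaf paths survive


def apply_query(tree: Tree, keep: Query, prefix: Path = ()) -> Optional[Tree]:
    """Return the subdocument whose root-to-leaf paths satisfy `keep`."""
    label, children = tree
    path = prefix + (label,)
    if not children:
        # A leaf survives only if its full path is accepted.
        return (label, []) if keep(path) else None
    kept = [c for c in (apply_query(ch, keep, path) for ch in children)
            if c is not None]
    # An inner node that loses every child is pruned as well.
    return (label, kept) if kept else None


def compose(first: Query, second: Query) -> Query:
    """Combine two path filters into one (toy stand-in for composition)."""
    return lambda path: first(path) and second(path)


doc: Tree = ("profile", [
    ("name", []),
    ("contact", [("email", []), ("phone", [])]),
    ("billing", [("card", [])]),
])

no_billing: Query = lambda p: "billing" not in p   # e.g. an access control view
no_phone: Query = lambda p: p[-1] != "phone"       # e.g. a user query

sequential = apply_query(apply_query(doc, no_billing), no_phone)
single = apply_query(doc, compose(no_billing, no_phone))
print(sequential == single)
```

In this simplified model the combined filter is just the conjunction of the two predicates, so the document is traversed once rather than twice.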
The XML Subtree Query Language and the associated composition algorithms have been used in the GUPster and Incognito projects developed at Bell Labs. GUPster provides a single point of controlled access to user profile data. Incognito is a dedicated platform for XML access control. It uses the Vortex rules engine to resolve user context information and applies the composition algorithms to compose user queries with access control views (all of which are subtree queries), yielding the authorized user query that is evaluated against the XML documents.
Irini Fundulaki has been a Member of Technical Staff at the Network Data and Services Research Department of Bell Laboratories since June 2004. She received her PhD from the Conservatoire National des Arts et Metiers and INRIA (the French National Institute for Research in Computer Science and Automation) in January 2003, in the area of XML data integration for user communities. From January 2003 until May 2004 she was a postdoctoral researcher at the Network Data and Services Research Department of Bell Labs, working on user profile data management. Her interests are in the areas of XML data integration, XML access control, personalization, and user profile data management.