Home Research People Publications Seminar Admin


DB group publishes 6 papers in ACM SIGMOD/PODS 2005


The database group publishes 6 papers in ACM SIGMOD/PODS. Considered as the top database conferences, the International Conference on Management of Data (SIGMOD) and the Symposium on Principles of Database Systems (PODS) are held together covering database applications and theory, respectively.


SIGMOD 2005 Accepted Papers

Carrying on the group's tradition of innovating on the XML front, two of the three accepted papers in SIGMOD focus on querying XML documents. The third research paper is the outcome of an exciting project on verification of interactive web applications currently in progress in the DB group.

Extending XQuery for Analytics [.PDF]
Kevin Beyer, Don Chamberlin, Latha Colby, Fatma Ozcan, Hamid Pirahesh, Yu Xu

In this paper, DB group's graduating PhD student Yu Xu together with researchers from IBM Almaden Research Center propose an extension to XQuery - the language for querying XML data. With XML gaining importance as the standard for representing business data, XQuery must support the types of queries that are common in business analytics. This is why the authors provide a proposal for extending the XQuery FLWOR expression with explicit syntax for grouping and for numbering of results. The paper shows that these new XQuery constructs not only simplify the construction and evaluation of queries requiring grouping and ranking but also enable complex analytic queries (such as moving-window aggregation and rollups along dynamic hierarchies) to be expressed without additional language extensions.
Efficient Keyword Search for Smallest LCAs in XML Databases [.PDF]
Yu Xu, Yannis Papakonstantinou

In the second XML-related work published at ACM SIGMOD, DB group's professor Yannis Papakonstantinou and Yu Xu deal with the problem of searching XML documents by keywords. The proposed algorithms return the set of smallest trees containing all keywords, where a tree is designated as “smallest” if it contains no tree that also contains all keywords. The core contributions are two algorithms; the Indexed Lookup Eager algorithm and the Scan Eager algorithm. The first exploits key properties of smallest trees in order to outperform prior algorithms by orders of magnitude when the query contains keywords with significantly different frequencies. The second is a variant tuned for the case where the keywords have similar frequencies. A system for keyword search utilizing these two algorithms as well as other existing ones has been implemented and a demo of it is available at http://www.db.ucsd.edu/projects/xksearch.
A Verifier for Interactive, Data-driven Web Applications [.PS]
Alin Deutsch, Monica Marcus, Liying Sui, Victor Vianu, Dayou Zhou

In this work, DB group's professors Alin Deutsch and Victor Vianu, Postdoc Monica Marcus and PhD students Liying Sui and Dayou Zhou present "WAVE", a verifier for interactive, database-driven Web applications specified using high­level modeling tools such as WebML. WAVE can be used to verify temporal properties and is complete for a broad class of applications and temporal properties. For other applications, it can be used as an incomplete verifier, as commonly done in software verification. The experiments done by the authors on four representative data­driven applications and a battery of common properties yielded surprisingly good verification times, on the order of seconds. This suggests that interactive applications controlled by database queries may be unusually well suited to automatic verification. This also shows that the coupling of model checking with database optimization techniques used in the implementation of WAVE can be extremely effective. This is significant both to the database area and to automatic verification in general. A demo of the verifier can be found online at http://sashimi.ucsd.edu:8080/wave.

PODS 2005 Accepted Papers

The three papers presented at PODS provide theoretical insight on several core database topics with applications in many important areas, such as data integration or data exchange.

Views and Queries: Determinacy and Rewriting [.PDF] [.PS]
Luc Segoufin, Victor Vianu

In this paper, DB group's professor Victor Vianu with Luc Segoufin from INRIA investigate the question of whether a query over a database can be answered using a set of views over it. They first define the problem in information-theoretic terms by saying that a set of views "determines" a query if the views provide enough information to uniquely determine the answer to the query. Next, they look at the problem of rewriting a query in terms of a set of views using a specific language. Given a view language V and query language Q, they define a rewriting language R as being "complete for V-to-Q rewritings" if every query in Q can be rewritten in terms of a set of views in V using a query in R, whenever the set of views determines the query. While query rewriting using views has been extensively investigated for some specific languages, the connection to the information-theoretic notion of determinacy, and the question of completeness of a rewriting language, have received little attention. In this paper the authors investigate systematically the notion of determinacy and its connection to rewriting. The results concern decidability of determinacy for various view and query languages, as well as the power required of complete rewriting languages. The languages considered vary in expressibility, ranging from first-order to conjunctive queries.
Composition of Mappings Given by Embedded Dependencies [.PDF]
Alan Nash, Philip A. Bernstein, Sergey Melnik

The composition of mappings is the topic of this paper authored by DB group's PhD student Alan Nash together with Philip Bernstein and Sergey Melnik from Microsoft Research. Mappings between schemas express the relationship between these database schemas and their composition is essential to support schema evolution, data exchange, data integration, and other data management tasks. In this paper, the authors study the issues involved in composing mappings. The results presented extend previous studies by being applicable to more general classes of mappings. For each of the three classes of mappings studied in this work, the authors provide (a) an algorithm that attempts to compute the composition and (b) sufficient conditions on the input mappings that guarantee that the algorithm will succeed. Additionaly, they also provide several negative results regarding the composition problem.
Determining Source Contribution in Integration Systems [.PDF]
Alin Deutsch, Yannis Katsis, Yannis Papakonstantinou

Owners of sources registered in an information integration system, which provides answers to a (potentially evolving) set of client queries, need to know their contribution to the query results. In order to provide this information to source owners, DB group's professors Alin Deutsch and Yannis Papakonstantinou and PhD student Yannis Katsis define several degrees of source contribution to a client query and provide decidability results on them. Specifically the authors study the problem of deciding, given a client query and a source registration, whether the source registration is (i) “self-sufficient” (can contribute to the result of the query even if it is the only source in the system) or (ii) “now complementary” (can contribute, but only in cooperation with other specific existing sources), or (iii) “later complementary” (can contribute if in the future appropriate new sources join the system).

More news...



UCSD - Official web page of the University of California, San Diego