DATE: Friday, October 5, 2001
TIME: Noon - 1 pm
PLACE: Wean Hall 7220
SPEAKER: Vladimir I. Zadorozhny, University of Pittsburgh
TITLE: Efficient Query Processing in a Mediator for Web Data Sources
ABSTRACT:
I consider a mediator architecture for querying multiple Web data sources
in a wide area environment. One objective of this architecture is to support
declarative database-like queries to noisy Web sources with limited query
capability. I present a two-phase Web query optimizer that uses a capability-based
pre-optimizer and an extended relational optimizer. The pre-optimizer
generates a pre-plan for a mediator query. The pre-plan identifies Web
access patterns relevant to the mediator query, as well as restrictions
imposed by the capabilities of the Web sources. The extended relational
optimizer then uses the knowledge in the pre-plan to produce a good query
execution plan. I will show that the choice of Web access patterns strongly impacts
the cost of the query execution plan, and consider cost-based heuristics
that the optimizer should use to make a good choice.
Finally, I present a novel optimization strategy to meet performance targets
for queries in a noisy wide area environment. Using access cost distributions
for Web sources, the optimizer determines a cost-delay utility for a query
plan. The optimizer can behave optimistically, ignoring the expected delay
of accessing Web sources, or conservatively, taking this delay into account.
BIO:
Vladimir Zadorozhny is an Assistant Professor in the Department of Information
Science and Telecommunications, School of Information Sciences, University
of Pittsburgh. He received his Ph.D. in 1993 from the Institute for Problems
of Informatics, Russian Academy of Sciences in Moscow. Before coming to
the US he was a Principal Research Fellow in the Institute of System Programming,
Russian Academy of Sciences. From May 1998 he worked as a Research Associate,
and then Research Scientist, in the University of Maryland Institute for
Advanced Computer Studies at College Park. He joined the University of Pittsburgh
in September 2001. Vladimir's research interests include scalable architectures
for wide-area environments with heterogeneous information servers, Web-based
information systems, query optimization in distributed databases, semantic
interoperability in heterogeneous network environments, distributed object
systems and object metamodels.
SDI / LCS Seminar Questions? Contact Karen Lindenfelser, 86716, or visit
www.pdl.cmu.edu/SDI/