Semistructure from Relations
Speaker: Prof Tim Merrett, McGill University, Canada
When: 2006-04-20 10:00:00
Venue: 78-420
Host: Dr Heng Tao Shen
Abstract: This is a programming language talk disguised as a database talk. We
are developing a general-purpose programming language for data on
secondary storage, typically too big to fit into RAM. This is in the
tradition of database programming languages (as opposed to database
query languages). We start from secondary-storage data structures
and work upwards through programming language principles (such as no
second-class citizens) rather than starting with one of the many PL
paradigms (e.g., functional, logic or constraint programming) and
working down to implementation. Thus, rather than proving theorems
about what is expressible in the language, we test it empirically
against developing application areas. Semistructured data is one
such area, with most of the foundations laid over the past decade
but still in active development. The central characteristics of
semistructured data are irregular and incomplete structure, and
hierarchies navigated by path expressions. We have shown that all
significant aspects of semistructured data types and processing can
be expressed by a relational formalism which has been brought to
completion through programming-language considerations. This is one
of many tests that relational programming has passed: our
language is thus not restricted to semistructured data but can
integrate it with many other areas, such as spatial data, expert
systems and conventional administrative databases. This talk is an
introduction to programming with relations, and covers the basics of
semistructured data without needing to introduce specialized syntax,
apart from path expressions, which are syntactic "sugar" for more
general underlying operations already in the language. (We will not
have time to discuss the other application areas mentioned.)
Biography: Professor Tim Merrett's main research work is in databases, and in the
unification of database and programming language concepts such as
relations, functions and objects. His group at McGill has also
published new data structures and retrieval algorithms for secondary
storage. The Aldat Project at McGill is situated at the conflux of
databases and programming languages, linking the fundamental
concepts of both to achieve greater generality in principle and
greater flexibility in practice.
Type: EII
Contact: Dr Heng Tao Shen, seminar host (shenht@itee.uq.edu.au)
or Guido Governatori (ITEE seminar co-ordinator)
(guido@itee.uq.edu.au)
