Description of topics
Problem statement. Most collections of hyperdocuments
have one or more of the following problems: (a) structured search for documents
is not sufficiently supported; (b) proving properties (e.g. correctness
criteria or integrity constraints) is difficult or impossible; (c) maintenance
is not supported by formal mechanisms; (d) personalisation of information, or
adaptation to user groups, is difficult or impossible. As these problems are
mainly caused by the lack of suitable modelling techniques, this book is an
important step towards the next generation of hyperdocuments in general, and
web sites in particular. We have considered fundamentals in modelling (theory)
and concrete modelling techniques (practice), and hope that further research in this area will contribute to the Internet
being fully integrated into society.

We now give an overview of the book. It
consists of three parts: fundamentals of Internet information modelling,
elaboration of modelling techniques, and additional topics. These three parts
contain fourteen chapters. It is possible to start at a later chapter (e.g. in
part II) without reading all earlier chapters (e.g. the more theoretical chapters
in part I).

Part I of this book is about fundamentals and
consists of four chapters. The focus of chapter 1 is semistructured data: modelling and querying web data. In this
chapter a logic-based approach is presented, since algebraic approaches have already
been defined elsewhere in the literature. The approach uses a rule-based
constraint language with declarative and operational semantics (fix-point
theory). This is illustrated by a case study for XML.

Chapter 2 extends the logic approach
of chapter 1. Here verification of properties is considered: verifying web site properties using
computational logic. The properties of web sites include what is usually
called "integrity constraints". It is well known that we need
to be able to specify as well as check integrity constraints for information
systems. In the context of web-based systems, however, these topics have not yet
received much attention. In the logic approach in this chapter, web sites are
defined as hypergraphs.

It is important to treat the
contents and properties of web sites using existing database theory. We then
need mechanisms to generate hypertext views on databases. This is considered in
chapter 3: design and analysis of active
hypertext views on databases. This chapter gives an overview of existing
approaches to database publishing. The specification of hypertext
views is considered, along with a design methodology. Finally, specific kinds
of views are introduced, including active and adaptive views (necessary for
e.g. personalisation of web sites).

Before going into the details of
concrete modelling techniques in part II of this book, chapter 4
integrates several aspects considered in the first three chapters: an object-oriented hypermedia reference
model formally specified in UML. In this model, formal constraints on
hypermedia model elements are possible, such as invariants, pre-conditions, and
post-conditions. These are expressed in UML using the Object
Constraint Language (OCL).

Part II of this book is about concrete
modelling techniques and contains six chapters. When defining a modelling technique for
web-based information systems, we should not forget the considerable
experience with modelling techniques for "normal" information
systems. Therefore, in chapter 5 we consider systematic development of Internet sites - extending approaches of
conceptual modelling. This extension of conceptual modelling addresses a
number of new challenges, including full flexibility, support for tracing, and
just-in-time push-up of content. The site specification uses the following
modelling concepts: stories, scenarios, scenes, dialogue steps, and media
objects.

In the Webspace method in chapter 6, existing database techniques are
used to define advanced search
possibilities. This is done in three stages. First, multimedia web data is modelled; then extraction of meta-data is
performed; finally, collections of documents are queried. Webspace models
define concepts and allow for the derivation of document structures. Once such
a structure has been derived, content and presentation functions may be added.

For the modelling of web data, the Araneus data
model (ADM) can be used. This model is discussed in chapter 7: specification of web applications with ADM-2.
This chapter presents ADM, with some new extensions for the specification of
dynamic aspects. A basic dynamic aspect is interaction, which often takes place via
web pages by activating links or buttons. This activation results in the
execution of actions, and thus special attention is given to the question of how
actions should be specified.

Although modelling of web data is
essential in developing a web site, designing a suitable web interface is of
major relevance as well. This is discussed in chapter 8, "OO-H method: extending UML to model web interfaces". In the
Object-Oriented Hypermedia (OO-H) method, diagrams and their populations are
used for code generation. Several kinds of links are distinguished, such as
internal, traversal, requirement, exit, and service links. Other attributes
associated with links include visualization, user interaction, and application
scope.

In chapter 9, special attention is
given to the construction of models for existing web pages, also called extraction:
ontology extraction and conceptual
modelling for web information. A distinction is made between data
extraction and schema extraction. This approach works with HTML pages. These
pages are annotated with part-of-speech tags such as verb, noun, and adverb.
The conceptual model is based on extended entity-relationship models,
where meta-data are stored in a relational database.

The final chapter of part II is
called "OODM - an object-oriented design
methodology for development of web applications". Here, a comparison is made
with the Hypermedia Design Model, the Relationship Management Methodology, the
Object-Oriented Hypermedia Design Model, and the Object-Oriented Design Method for
Hypermedia Information Systems. The chapter contains an elaborate case study,
in which a number of page classes are identified and worked out.

In part III several additional topics are
treated. The focus of chapter 11 is maintenance and testing: web application quality - supporting
maintenance and testing. Attention is given to static analysis and
transformations. A distinction is made between several testing criteria: pages, hyperlinks,
definition-use, all-uses, and all-paths. In this context, generation of test
cases and statistical testing are considered. For example, the structure of a
web application can be tested by measuring the coverage that a set of test cases
achieves for a given set of features under consideration.

In chapter 12 the topics of
personalization and performance are discussed: modelling data intensive web sites for personalization, integrity, and
performance. The following analysis processes are defined: data, user
requirements, usability constraint, integrity constraint, and personalization
provision analysis. A distinction is made between active and passive
personalization. In discussing performance, attention is given to ranking.

It is generally recognized that the
Internet is an excellent environment for working in communities. In the
database area, this results in adaptive
web-based database communities. In chapter 13, organizing and modelling are
discussed, as well as advertising and querying database communities. Special
attention is given to inter-community relationships, particularly when these
relationships are changing. Then, monitoring and maintaining inter-community
relationships become necessary. It is proposed to perform monitoring by means
of statistics and agents providing recommendations for changing existing
relationships or creating new ones.

In the final chapter of this book,
several important concluding issues in building web applications are explained.
This chapter is entitled "designing
hypertext and the web with the heart and the mind". The topic of
internationalisation is addressed from the perspective of the user's native
language and cultural background. It is explained that cultural issues
may often come disguised as e.g. illiteracy problems or user faults, rather than as
surmountable cultural differences. Finally, several pressing ethical questions
are addressed, such as "how do business rules apply to the Internet"
and "who are the regulators and what activities do they regulate".
Part I: Fundamentals of Internet information modelling
Part II: Elaboration of modelling techniques
Part III: Additional topics