What Can the Rest of Us Learn From Research on Adaptive Hypermedia - and Vice-Versa?
[Full Text] [send contribution] [debate procedure] [copyright] |
| No | Comment(s) | Answer(s) | Continued discussion |
|---|---|---|---|
| 1 | 22.3.99 Anthony Jameson | 29.3.99 Annika Waern | |
Comments by Anthony Jameson on the book Adaptive Hypertext and Hypermedia, edited by Peter Brusilovsky, Alfred Kobsa, and Julita Vassileva (Dordrecht: Kluwer, 1998).
My intent with these comments is not to discuss each individual article in this collection; this task was performed well by the three editors in their Preface, and Peter Brusilovsky has kindly made this preface available on-line for the purpose of this discussion.
Nor will I offer an overview of the methods and techniques of adaptive hypermedia, since that job was done by Peter Brusilovsky in the thorough and insightful review article that begins this book. I can simply advise anyone who is at all interested in this area to acquire and read at least this overview article, even if they for some reason can't manage the whole book.
Note that the recommendation is to read and use the overview, not just to cite it. As the author of an earlier overview article in the journal in which this one originally appeared, I should warn that overview articles threaten to diminish the extent to which researchers are aware of relevant previous work - although, of course, they have the purpose and the potential of doing the opposite. The problem is that some authors think they can fulfill their obligation to cite "related work" by citing the overview article. So they don't need to read any of the previous works themselves - or even the overview article. Fortunately, the use of this strategy is usually betrayed by the presence of recognizably false statements in the work of the authors who use it (e.g., that they are the first ones in the history of the planet who have used a particular technique).
My aim with these comments is to place the work in this book in the broader context of work on user-adaptive systems, which of course encompasses research involving a lot of systems other than adaptive hypermedia systems: What can those who work on other types of user-adaptive systems learn from research on adaptive hypermedia--and what can adaptive hypermedia researchers perhaps learn from the others?
To this end, I've indexed the 7 system-specific papers in the book (i.e., all of them except Brusilovsky's overview) according to a general scheme that has proven useful for integrating the many lines of research in the broad area of user-adaptive systems. (For an application of this scheme to the whole area, see the online proceedings of UM97, the Sixth International Conference on User Modeling, which are accessible from sites in the U.S. and in Germany.)
We'll consider in turn the following questions, which can be asked about any
user-adaptive system:
Several general purposes of adaptation can be distinguished. The following table
shows the purposes that are served by the systems represented in the book.
(For each system, the authors of the article are listed; for more information about
the article, see the Preface.) The table also lists the general
purposes that are sometimes served by other user-adaptive systems, though not by any
systems in this book.
Not surprisingly, we see that in
adaptive hypermedia systems, the goals of helping users to access information and presenting
information in an appropriate way are in the foreground. But several examples also
show that similar methods are applied in systems whose main purpose is
different. Those who are working on types of systems not represented in this book
may want to consider adopting adaptive hypermedia techniques to serve particular
functions within their systems. For example, though the main emphasis in a system
for supporting collaboration may be on choosing appropriate collaborators
for U, adaptive hypermedia could be used for presenting information about the
collaborators or the task effectively.
Adaptive hypermedia systems, more frequently than other types, model the current task
or goal of the user, sometimes after having requested an explicit specification of it (see
below). This property has fairly obvious relevance when it comes to deciding what
information to supply or what links to recommend.
But designers of other types of adaptive systems may also get
some ideas by looking at the ways in which this aspect of a user model is acquired
and put to use
in adaptive hypermedia systems.
Assessments of U's general level of domain knowledge likewise have fairly obvious
uses in such systems, but they are less unique to adaptive hypermedia, being used in
many other types of adaptive system as well.
Adaptive hypermedia systems use just about every possible type of data about the user
as evidence on which to base their user models. But on the whole, explicit
elicitation of information about U seems to be relatively more frequent here than
in other areas of user modeling. As several of the authors note, many of the
naturally occurring actions of hypermedia users (e.g., selection of particular
links; time spent on a particular screen) are hard to interpret. On the whole, the
emphasis in this area is more on clever ways of adapting hypermedia, given
particular beliefs about the user, and less on sophisticated methods for deriving
such beliefs on the basis of indirect evidence.
One is struck here by the complete absence of general AI techniques for
uncertainty management and machine learning that have gained increasing prominence
in connection with other types of user-adaptive systems.
By contrast, stereotype-based methods for classifying users are strongly represented in this
collection, relative to the field of user-adaptive systems as a whole (though work
on the three systems that use these methods was completed by the early
1990s). Particular user actions (or combinations thereof) trigger the activation of
associated stereotypes, though these inferences would be hard to justify
theoretically.
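The triggering mechanism described above can be sketched in a few lines. This is a hypothetical illustration, not code from any of the systems in the book; all stereotype and action names are invented.

```python
# Stereotype activation: particular user actions (or combinations thereof)
# trigger the activation of associated stereotypes.
# All names here are invented for illustration.

STEREOTYPE_TRIGGERS = {
    "novice": {"opened_help", "used_glossary"},
    "expert": {"used_shortcut", "skipped_introduction"},
}

def active_stereotypes(observed_actions):
    """Return every stereotype whose full trigger set occurs in the action log."""
    observed = set(observed_actions)
    return [name for name, triggers in STEREOTYPE_TRIGGERS.items()
            if triggers <= observed]

print(active_stereotypes(["opened_help", "used_glossary", "clicked_link"]))
# -> ['novice']
```

The theoretical weakness noted above is visible even in this toy version: the trigger sets are stipulated, not justified, and a single stray action can activate (or fail to activate) a stereotype.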
Specifically developed formulas and rules, which are independent of any general
framework, may constitute the best inference methods when the inferences to be made
are straightforward and deterministic. But where
noisy data or uncertainty are involved, why not leverage the powerful techniques
that have been developed to deal with this sort of inference problem? Arguments are
found at several points in this collection against the use of "unnecessarily complex
and unwieldy" inference methods. But such arguments remain unconvincing in the light
of (a) the demonstrated utility of powerful AI techniques in connection with other
types of user-adaptive systems and (b) the lack of any attempt to apply such
techniques or even to consider how they might be applied.
Relevant examples of the application of such techniques include the various
types of recommender systems that have been developed in recent years - even though
most of these do not recommend hypermedia documents.
Here again, we see an emphasis on relatively specific insights about appropriate
adaptation, as opposed to general methods for making adaptation decisions. In
connection with the exploitation of an existing user model,
Bayesian and machine learning methods could be used for predicting the user's
reactions to a given system action (e.g., whether U would understand or enjoy a
given screen that S might present). Decision-theoretic methods could be used for
the systematic evaluation of possible system actions.
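A minimal sketch of the decision-theoretic idea, under invented numbers: S scores each candidate action by expected utility, using its user-model probability that U would understand the resulting screen. Nothing here comes from the book's systems.

```python
# Decision-theoretic evaluation of possible system actions: pick the
# candidate screen with the highest expected utility. The probabilities
# and utilities are invented for illustration.

def expected_utility(p_understand, utility_if_understood, utility_if_not):
    return (p_understand * utility_if_understood
            + (1 - p_understand) * utility_if_not)

# (P(U understands), utility if understood, utility if not)
candidate_screens = {
    "terse_technical_page": (0.4, 10.0, -5.0),
    "annotated_overview":   (0.9,  7.0, -1.0),
}

best = max(candidate_screens,
           key=lambda s: expected_utility(*candidate_screens[s]))
print(best)  # -> 'annotated_overview'
```

The annotated overview wins (expected utility 6.2 versus 1.0) despite paying off less when understood, which is exactly the kind of trade-off ad hoc rules handle poorly.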
Although Brusilovsky, in his overview, is rather pessimistic about the current empirical
basis of research into adaptive hypermedia, the articles in this collection do
contain some impressive examples along these lines. For example, Höök et al. and
Vassileva went to unusual lengths, in an early stage of system design, to study
potential users' work habits and needs so as to arrive at
a realistic set of system requirements. When reading these accounts, one wonders
how the system in question would ultimately have been received by users if these
preliminary studies had not been carried out.
Another aspect of this research that deserves imitation is represented by the
several thorough experimental studies of system prototypes. These studies did not
merely check whether the new system was better than a more traditional alternative;
they also yielded rich information about the strengths and limitations of the
approaches taken.
The research represented in this collection includes many relatively specific
insights about ways in which a hypermedia system can adapt its behavior given
particular assumptions about properties of the user. The demonstrated effectiveness
of some of these systems can be seen as proof that technical creativity and
attention to the requirements of a particular application scenario can produce
usefully adaptive systems, even in the absence of powerful inference techniques or a
general theoretical foundation.
At the same time, it would seem advisable to devote increased attention in this
area to machine learning and uncertainty management techniques. The reliance of the
systems on explicit user input - which users are in many contexts reluctant or
unable to
provide accurately - could be reduced, and the systems' behavior could become easier to justify
and to explain.
Needless to say, readers may disagree with the opinions expressed in these
comments. To encourage discussion, I hereby offer to send my shiny new copy
of this book, which was supplied to me for the purpose of this discussion (but which
I don't need, since I already possess all of the articles), to the person
who offers the most interesting contribution to this discussion, in the judgment of the
newsletter's editor, Elisabeth André, by April 15th, 1999.
Anthony Jameson makes an interesting observation about the papers in the
AHAH book: that they are more concerned with making useful adaptations to
user properties than with actually inferring these properties. Most of the
systems presented use explicit methods for acquiring the user model, where
the user explicitly states his or her goals, or some other property that is
used to adapt the hypermedia presentation. Some of the systems also use
implicit models (such as the PUSH system developed by myself and my
colleagues at SICS, [Höök et al 1996]), where the users' actions provide a
basis for inferences about user characteristics, in particular, their
information seeking goals. The PUSH system, as well as the other systems
presented in the book, used a hand-crafted rule base to infer
characteristics from user actions. Jameson asks an interesting question:
wouldn't it be a good idea to try out some machine learning or statistical
modelling techniques in the domain of adaptive hypermedia? This has been
used with great success in other systems, most notably in several
recommender systems [Resnick and Varian 1997].
I might not be the right person to answer this, because my immediate answer
is "Yes! Yes!" A large advantage of machine learning methods is
that we know what they do, when they perform well, and when they run risks.
Hand-crafting rules is difficult (even though I get a bit annoyed with the
term 'ad hoc' that Jameson uses), and to be trusted, hand-crafted rules
should be evaluated and benchmarked. This means that they must also go
through a validation process, which could instead be used to automatically train a
model. Also, some studies (see e.g. [Alspector et al 1997])
indicate that the models resulting from such automatic training may turn
out to be more correct than their hand-tailored colleagues.
So what I would like to do in my answer is to go through a number of issues
that affect this training or validation process. The bottom line is that it
is not entirely straightforward to construct such training or validation
situations for adaptive hypermedia. We must construct situations in which
we know what the system should model (and how it should adapt to this
model), and sometimes also negative examples of what should not be modelled.
One approach is to use expert judgements as correct answers. This solution
is similar to what actually was done with some of the systems presented in
the book. In this approach, you provide experts with examples of user
behaviour, and let them suggest what the system should present to users, or
what it should believe about users. This kind of training can be done
on-line, in Wizard of OZ experiments, or off-line, by collecting 'correct
answers' to partial problems that do not require a running system to be
tested. Both methods were used in the Lumière project [Horvitz et al 1998,
Heckerman and Horvitz 1998]. Such information can either be used as
background knowledge underpinning the hand-tailored inference rules for
user modelling and/or adaptation, or used to train a machine learning
model. In the Lumière project, a Bayesian network was trained from this
kind of information. The most serious limitation here is the expertise
involved: neither method allows the acquired knowledge to be better than
the expert judgements used as a basis for the systems.
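The off-line variant can be sketched as follows: experts attach "correct answers" (here, the user's goal) to logged behaviour, and conditional probabilities are estimated by simple counting. This stands in for the richer Bayesian-network training used in Lumière; all action and goal names are invented.

```python
# Training a user-modelling component from expert judgements:
# estimate P(goal | observed action) from expert-labelled behaviour.
# All data are invented for illustration.

from collections import Counter, defaultdict

expert_labelled_log = [
    ("opened_help", "learning"),
    ("opened_help", "learning"),
    ("used_search", "lookup"),
    ("opened_help", "lookup"),
]

counts = defaultdict(Counter)
for action, goal in expert_labelled_log:
    counts[action][goal] += 1

def p_goal_given_action(goal, action):
    total = sum(counts[action].values())
    return counts[action][goal] / total

print(p_goal_given_action("learning", "opened_help"))  # -> 2/3
```

The limitation stated above shows directly: the estimates can never be better than the expert labels they are counted from.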
I do not think we know if hand-tailoring or machine learning is inherently
better for this purpose. I know of no empirical study in which a machine-learned
model has been compared to a rule base, where both were based on
the same expert information. (I would be happy to learn about such
studies!) To venture a guess, I would expect that when the domain is
limited to a set of example expert judgements, a carefully hand-tailored
rule base should be able to excel over machine-learning methods, since the
hand-tailored rules can take both the actual examples into account as well
as a human interpretation of the examples. But I have no empirical evidence
either way, so this is just a guess.
The complementary approach to expert judgements is to base learning on user
feedback. Learning from user feedback has the great advantage that it
allows us to cope with changes over time, such as a changing set of users,
changing user behaviour, changes in the domain of information etc. Note
that learning from user feedback does not necessarily have to be done by
machine learning. For example, a human expert could review log data and
modify the rules used for user modelling to improve system performance.
However, this is an area where machine learning methods have been shown to
do very well, see e.g. [Alspector et al 1997].
In this setting, I think that the limits on the quality of user modelling
are determined by the quality of the feedback we can get from user behaviour.
Essentially, I think that the limitations depend on how close the system's
user modelling task is to the user's own actions. For example, in
recommender systems the task of the user model is usually seen to be the
same as the task for the user: rate documents. In adaptive hypermedia, the
tasks can be very different. In InterBook [Eklund et al. 1997], the user
modelling task of the system is to keep track of user knowledge. The user
task is to navigate between hypertext pages. In order to train such a
system from user feedback alone, we must be able to determine which user
actions indicate that the system has a correct model of user knowledge,
and which indicate that the system has an incorrect model. The
observable user actions can be stronger or weaker indicators. Machine
learning methods will, in this setting, learn no better than our model of user
feedback allows.
In general, I think that it is a good idea to design systems so that the
user modelling task is kept as close to the user task as possible, so that
the amount of interpretation needed for feedback is minimized. One way is
to limit user modelling to usage modelling [Pohl 1996], making inferences
about what users will do next. Recommender systems often do so, and I think
that this is a key factor in the success of recommender systems.
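Usage modelling in this sense can be sketched as predicting the user's next action directly from logged action sequences, with no hidden user properties in between. The bigram approach and all action names below are my own invention for illustration.

```python
# Usage modelling: predict what the user will do next from bigram counts
# over the action log, keeping the modelling task identical to the user task.

from collections import Counter, defaultdict

log = ["open_a", "open_b", "open_a", "open_b", "open_a", "close_a"]

bigrams = defaultdict(Counter)
for prev, nxt in zip(log, log[1:]):
    bigrams[prev][nxt] += 1

def predict_next(action):
    """Return the most frequent follower of `action` in the log."""
    return bigrams[action].most_common(1)[0][0]

print(predict_next("open_a"))  # -> 'open_b'
```

Feedback needs no interpretation here: the prediction is right or wrong exactly when the user's next action matches it.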
In PUSH [Höök et al 1996], we tried to short-circuit the user modelling
task in a similar way. Essentially, our analysis of the domain showed that
if the user was involved in certain tasks, certain parts of the information
were relevant. We used this knowledge both for user modelling and for
adaptation. If the user reviewed a certain piece of information, we assumed
that he or she had a task in which this information was relevant. And once
the system had inferred a task, it would present all pieces of
information relevant for this task. We used a subset of user actions for
feedback, mainly when users opened and closed pieces of information. If the
user closed a piece of information, this was viewed as negative feedback.
This way, the system modelling task and the user task were very similar.
When the mechanism was designed, I really believed that this would make it
possible to train the PUSH system from user feedback, by keeping track of
which pieces of information the user selected to view together.
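The PUSH-style short circuit described above can be sketched as one mapping driving both directions. This is a hypothetical reconstruction of the idea, not the actual PUSH rule base; task and information-piece names are invented.

```python
# One task -> relevant-information mapping serves both user modelling
# (opening a piece suggests a task for which it is relevant) and
# adaptation (an inferred task opens every piece relevant to it).
# All names are invented for illustration.

TASK_RELEVANCE = {
    "install": {"requirements", "setup_steps"},
    "debug":   {"error_codes", "log_format"},
}

def infer_tasks(opened_piece):
    """User modelling: which tasks make this piece relevant?"""
    return [task for task, pieces in TASK_RELEVANCE.items()
            if opened_piece in pieces]

def pieces_to_present(task):
    """Adaptation: present every piece relevant to the inferred task."""
    return sorted(TASK_RELEVANCE[task])

tasks = infer_tasks("setup_steps")   # -> ['install']
print(pieces_to_present(tasks[0]))   # -> ['requirements', 'setup_steps']
```

Closing a presented piece would count as negative feedback against the inferred task, which is what makes the modelling task and the user task nearly coincide.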
However, there is an additional complication with adaptive hypermedia that
made this training of dubious value for PUSH, and possibly for other AH
systems as well. The problem is that in adaptive hypermedia, the central
task is to select and order information based on the user model. This
selection and ordering will in turn affect what the user can do - and does
- with the information. This throws off the possibilities for feedback. A
simple example from the area of recommender systems is that a user cannot
rate a document unless it has been presented to him or her. Suppose that
the system incorrectly infers that a whole class of documents is
uninteresting to a particular user. Since none of the documents is ever
presented to the user, the user can never provide the feedback that the
assumption was incorrect. But an additional complicating factor is that
users tend to make do with what they get. In PUSH, we found that users did
much less opening and closing of information than we expected. They were
essentially quite happy with what they got from the system. I suspect that
this may happen in recommendation systems as well. Note that there is a
difference here between a test situation as used in [Alspector et al 1997,
Breese et al 1998], and real usage. In a test, the system is trained on a
set of user recommendations, and then used to predict recommendations for a
separate set of test recommendations. Here, the user ratings are not
influenced by system selections (the selections are random). In real usage,
the system will select items for the user to rate, based on its previous
experiences with users, and on the model of the current user. This way, the
set of items that get rated is biased by current system knowledge.
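The blind spot described above can be made concrete in a few lines: once the model rules out a class of items, no feedback about that class can ever arrive. The topics and beliefs below are invented for illustration.

```python
# Feedback bias: the set of items that can be rated is filtered by the
# system's current (possibly wrong) model, so some errors are invisible.
# All data are invented for illustration.

user_true_interest = {"sports": True, "opera": True}   # U actually likes both
model_belief = {"sports": True, "opera": False}        # S wrongly rules out opera

# Only items the model approves of are ever presented...
presented = [topic for topic, liked in model_belief.items() if liked]

# ...so feedback exists only for those items.
feedback = {topic: user_true_interest[topic] for topic in presented}

print(feedback)  # -> {'sports': True}; the opera error can never surface
```

No amount of further training on this feedback stream can correct the opera mistake, which is the structural difference from a test set of randomly selected items.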
One of our current research projects at SICS concerns how to build an
information navigation system in which human intelligence and machine
learning from aggregated user behaviour can be combined to provide user
control and support for human maintenance of services [Höök et al 1997].
This approach makes it possible for users to select documents on many
different properties, and not only on the aggregated recommendations of
similar users. It will be very interesting to see how this influences what
users actually do in the systems.
References
Joshua Alspector, Aleksander Kolcz, and N. Karunanithi. Feature-Based and
Clique-Based User Models for Movie Selection: A Comparative Study. UMUAI
vol 7: 279-304, 1997. Kluwer Academic Publishers.
John. S. Breese, David Heckerman and Carl Kadie. Empirical Analysis of
Predictive Algorithms for Collaborative Filtering. Technical report
MSR-TR-98-12, Microsoft Research. Available from
http://research.microsoft.com/scripts/pubDB/pubsasp.asp?RecordID=166
Eklund, J., Brusilovsky, P., and Schwarz, E. (1997) Adaptive Textbooks on
the WWW. In: H. Ashman, P. Thistlewaite, R. Debreceny and A. Ellis (eds.)
Proceedings of AUSWEB97, The Third Australian Conference on the World Wide
Web, Queensland, Australia, July 5-9, 1997, Southern Cross University
Press, pp. 186-192. Available as
http://ausweb.scu.edu.au/proceedings/eklund/paper.html
D. Heckerman and E. Horvitz. Inferring Informational Goals from Free-Text
Queries. Proceedings of the Fourteenth Conference on Uncertainty in
Artificial Intelligence, July 1998. Available at
http://research.microsoft.com/~horvitz/aw.HTM
E. Horvitz, J. Breese, D. Heckerman, D. Hovel, and K. Rommelse. The Lumiere
Project: Bayesian User Modeling for Inferring the Goals and Needs of
Software Users. Proceedings of the Fourteenth Conference on Uncertainty in
Artificial Intelligence, July 1998. Available at
http://research.microsoft.com/~horvitz/lumiere.HTM
Höök, K., Karlgren, J., Waern, A., Dahlbäck, N., Jansson, C-G., Karlgren,
K., and Lemaire, B. (1996) A Glass Box Approach to Adaptive Hypermedia.
User Modeling and User-Adapted Interaction 6(2/3): 157-184. Also in P.
Brusilovsky, A. Kobsa and J. Vassileva (eds.) (1998) Adaptive Hypertext and
Hypermedia, Kluwer Academic Publishers.
Kristina Höök, Åsa Rudström and Annika Waern. "Edited Adaptive Hypermedia:
Combining Human and Machine Intelligence to Achieve Filtered Information".
Presented at the Flexible Hypertext Workshop held in conjunction with The
Eighth ACM International Hypertext Conference (Hypertext'97). Available
from http://www.sics.se/~mark/EdInfo/papers.htm
W. Pohl. Learning about the user -- user modeling and machine learning. In
V. Moustakis and J. Herrmann, editors, Proc. ICML'96 Workshop "Machine Learning
meets Human-Computer Interaction", pages 29-40, 1996. Available as
http://fit.gmd.de/~pohl/Papers/UM-and-ML.ps
Paul Resnick and Hal R. Varian. "Recommender Systems". Introduction by Guest
Editors, Communications of the ACM, Vol. 40, No. 2, March 1997.
- In what way is the system S's adaptation to the user U intended to be beneficial to U?
- What sort of information about U is represented in S's user model?
- On the basis of what types of evidence does S construct its user model?
- According to what principles or inference techniques does S arrive at the hypotheses about U that are stored in the user model?
- According to what principles or inference techniques does S decide how to adapt its behavior on the basis of the information in its user model?
- What sorts of empirical data give us reason to believe that S's methods are valid and useful?
Section outline (each section contains an indexing of the system-specific articles and comments):
- Purposes of Adaptation
- Properties of users represented
- Input for user model construction
- Methods for constructing the user model
- Methods for exploiting the user model
- Empirical foundations
- Concluding Remarks
A1. Annika Waern (SICS) (29.3.1999):
To contribute, please click [send contribution] above and send your question
or comment as an E-mail message.
For additional details, please click [debate procedure] above.
This debate is moderated by Elisabeth André.