My original interest is the extraction of formal rules from
Recurrent Neural Networks (RNNs). RNNs are difficult to train and
difficult to analyse, and rule extraction may help us to get a better
understanding of them. I have created an algorithm, the
Crystallizing Substochastic Sequential Machine Extractor
(CrySSMEx), that seems to work better than all
previous approaches. The paper is available and an open source
(GPL) distribution is being prepared: cryssmex.sourceforge.net.
-
Jacobsson, H., Hawes, N., Kruijff, G-J., Wyatt, J.
Crossmodal Content Binding in Information-Processing Architectures.
Proceedings of the 3rd ACM/IEEE International Conference on Human-Robot Interaction (HRI).
March 2008.
Amsterdam, The Netherlands.
Operating in a physical context, an intelligent robot faces two fundamental problems. First, it needs to combine information from its different sensors to form a representation of the environment that is more complete than any representation a single sensor could provide. Second, it needs to combine high-level representations (such as those for planning and dialogue) with sensory information, to ensure that the interpretations of these symbolic representations are grounded in the situated context. Previous approaches to this problem have used techniques such as (low-level) information fusion, ontological reasoning, and (high-level) concept learning. This paper presents a framework in which these, and related approaches, can be used to form a shared representation of the current state of the robot in relation to its environment and other agents. Preliminary results from an implemented system are presented to illustrate how the framework supports behaviours commonly required of an intelligent robot.
@InProceedings{jacobsson+hawes+kruijff+wyatt_2008,
author = {Henrik Jacobsson and Nick Hawes and Geert-Jan
Kruijff and Jeremy Wyatt},
title = {Crossmodal Content Binding in Information-Processing
Architectures},
booktitle = {Proceedings of the 3rd ACM/IEEE International
Conference on Human-Robot Interaction (HRI)},
year = {2008},
month = {March 12--15},
address = {Amsterdam, The Netherlands}
}
-
Jacobsson, H., Hawes, N., Skocaj, D., Kruijff,
G-J. Interactive Learning and Cross-Modal Binding - A
Combined Approach. Presented at the Symposium on Language
and Robots, Aveiro, Portugal, 2007.
[pdf]
The paper is an extended abstract. Have a look at the
pdf.
@InProceedings{jacobsson+hawes+skocaj+kruijff_2007,
title = {Interactive Learning and Cross-Modal Binding - A
Combined Approach},
author = {Henrik Jacobsson and Nick Hawes and Danijel
Sko\v{c}aj and Geert-Jan Kruijff},
booktitle = {Symposium on Language and Robots},
address = {Aveiro, Portugal},
year = {2007}
}
-
Jacobsson, H., Hawes, N., Kruijff, G-J., Wyatt,
J. Crossmodal Content Binding in Information-Processing
Architectures. Submitted to HRI 2008 and presented at
the Symposium on Language and Robots, Aveiro, Portugal, 2007.
[pdf]
Operating in a physical context, an intelligent robot
faces two fundamental problems. First, it needs to
combine information from its different sensors to form a
representation of the environment that is more complete
than any of its sensors on its own could
provide. Second, it needs to combine high-level
representations (such as those for planning and
dialogue) with its sensory information, to ensure that
the interpretations of these symbolic representations
are grounded in the situated context. Previous
approaches to this problem have used techniques such as
(low-level) information fusion, ontological reasoning,
and (high-level) concept learning. This paper presents a
framework in which these, and other approaches, can be
combined to form a shared representation of the current
state of the robot in relation to its environment and
other agents. Preliminary results from an implemented
system are presented to illustrate how the framework
supports behaviours commonly required of an intelligent
robot.
@InProceedings{jacobsson+hawes+kruijff+wyatt_2007,
title = {Crossmodal Content Binding in Information-Processing
Architectures},
author = {Henrik Jacobsson and Nick Hawes and Geert-Jan
Kruijff and Jeremy Wyatt},
booktitle = {Symposium on Language and Robots},
address = {Aveiro, Portugal},
year = {2007},
note = {extended version of
jacobsson+hawes+kruijff+wyatt\_2008}
}
@Unpublished{jacobsson+hawes+kruijff+wyatt_2008_submitted,
author = {Henrik Jacobsson and Nick Hawes and Geert-Jan
Kruijff and Jeremy Wyatt},
title = {Crossmodal Content Binding in Information-Processing
Architectures},
note = {Submitted to HRI'08}
}
-
Jacobsson, H., Frank, S.L., Federici, D. Automated
abstraction of dynamic neural systems for natural language
processing. Proceedings of IJCNN 2007 (electronic
proceedings).
[pdf]
This paper presents a variant of the Crystallizing
Substochastic Sequential Machine Extractor (CrySSMEx),
an algorithm capable of extracting finite state
descriptions of dynamic systems such as recurrent neural
networks, without any regard to their topology
or weights. The algorithm is applied to a network
performing a language prediction task. The extracted
state machines provide a very detailed view of the
operations of the RNN by abstracting and discretizing
its functional behaviour. Here we extend previous work
also by extracting state machines in Moore, rather than
in Mealy, format. This subtle difference opens up the
rule extractor to more domains, including sensorimotor
modelling of autonomous robotic systems. Experiments
are also conducted on far more input symbols, providing
a greater insight into the behaviour of the algorithm.
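As a rough illustration of the Moore/Mealy distinction mentioned in the abstract, here is a toy sketch. It is not code from the paper or from CrySSMEx; the two machines, their state names (`q0`, `q1`) and the symbols are made up for illustration. In the Mealy format an output belongs to a transition, while in the Moore format it belongs to a state.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy machines over the inputs {"a", "b"}; not taken from the paper.

# Mealy format: each (state, input) transition carries its own output.
MEALY: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("q0", "a"): ("q1", "x"),
    ("q0", "b"): ("q0", "y"),
    ("q1", "a"): ("q1", "x"),
    ("q1", "b"): ("q0", "y"),
}

# Moore format: transitions only name the next state; outputs belong to states.
MOORE_NEXT: Dict[Tuple[str, str], str] = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q1",
    ("q1", "b"): "q0",
}
MOORE_OUT: Dict[str, str] = {"q0": "y", "q1": "x"}


def run_mealy(inputs: Sequence[str], start: str = "q0") -> List[str]:
    """Collect the output attached to each transition taken."""
    state, outputs = start, []
    for symbol in inputs:
        state, out = MEALY[(state, symbol)]
        outputs.append(out)
    return outputs


def run_moore(inputs: Sequence[str], start: str = "q0") -> List[str]:
    """Collect the output attached to each state entered."""
    state, outputs = start, []
    for symbol in inputs:
        state = MOORE_NEXT[(state, symbol)]
        outputs.append(MOORE_OUT[state])
    return outputs
```

For this particular pair the two formats describe the same behaviour, so `run_mealy(list("aba"))` and `run_moore(list("aba"))` return the same output sequence.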
@InProceedings{jacobsson+frank+federici_2007_ijcnn,
author = {Henrik Jacobsson and Stefan L. Frank and Diego
Federici},
title = {Automated Abstraction of Dynamic Neural Systems for
Natural Language Processing},
booktitle = {Proceedings of the International Joint Conference on Neural
Networks (IJCNN) 2007},
year = {2007},
url = {http://www.dfki.de/~henrikj/publications/ijcnn2007.pdf}
}
-
N. Hawes, A. Sloman, J. Wyatt, M. Zillich, H. Jacobsson,
G. Kruijff, M. Brenner, G. Berginc, and
D. Skocaj. Towards an integrated robot with multiple
cognitive functions. In Proceedings of the Twenty-Second
Conference on Artificial Intelligence (AAAI-07), 2007.
[pdf]
We present integration mechanisms for combining
heterogeneous components in a situated information
processing system, illustrated by a cognitive robot
able to collaborate with a human and display some
understanding of its surroundings. These mechanisms
include an architectural schema that encourages
parallel and incremental information processing, and a
method for binding information from distinct
representations that when faced with rapid change in
the world can maintain a coherent, though distributed,
view of it. Provisional results are demonstrated in a
robot combining vision, manipulation, language,
planning and reasoning capabilities interacting with a
human and manipulable objects.
@inproceedings{Hawes/etal:2007,
author = {Hawes, N. and Sloman, A. and Wyatt, J. and Zillich,
M. and Jacobsson, H. and Kruijff, G.J.M. and Brenner,
M. and Berginc, G. and Sko\v{c}aj, D.},
title = {Towards an Integrated Robot with Multiple Cognitive
Functions},
booktitle = {Proceedings of the Twenty-Second Conference on
Artificial Intelligence (AAAI-07)},
year = {2007}
}
-
Jacobsson, H., Kruijff, G-J., Staudte, M. From Rule Extraction to Active
Learning Symbol Grounding. Extended abstract presented at the
Concept
Learning for Embodied Agents workshop at ICRA 2007, Rome.
[pdf]
The paper focuses on a fundamental learning problem
in adaptive, embodied cognitive systems: Namely, how
to learn discrete models of situated, embodied
experience which can act as a mediation between
sensori-motoric experience and high-level cognitive
processes. The paper suggests addressing the problem
using a combination of bottom-up active learning of
embodied concepts solely on the basis of the actions
and perceptions of the robot, and top-down
information obtained through interaction with other
agents. The embodied concepts are constructed to be
informative for the robot in terms of its
sensorimotor prediction capability. From that point
the effort of constructing humanlike concepts is
shifted towards producing a translation between the
sensorimotor based bottom-up ontology and more
conventional top-down constructed ontologies. The
suggested framework is based on a parameter free
rule extraction algorithm that successfully has been
applied to the problem of creating finite state
descriptions of large, complex and even chaotic
simulated dynamic systems. We will briefly describe
how this algorithm can be ported to an autonomous
robot domain.
@InProceedings{jacobsson+kruijff+staudte_2007_icra,
author = {Jacobsson, H. and Kruijff, G-J. and Staudte, M.},
title = {From Rule Extraction to Active Learning Symbol
Grounding},
booktitle = {Concept Learning for Embodied Agents workshop at
ICRA},
year = {2007},
month = {May},
address = {Rome}
}
-
Jacobsson, H., Kruijff, G-J., Staudte, M. Language
Acquisition from Neural and Sensorimotor
Systems. Extended abstract presented at the PASCAL workshop on
Machine Learning and
Cognitive Science of Language Acquisition
[pdf]
[poster(pdf)]
The paper
is an abstract. Have a look at the
pdf instead.
@InProceedings{jacobsson+kruijff+staudte_2007_mlcsla,
author = {Jacobsson, H. and Kruijff, G-J. and Staudte, M.},
title = {Language Acquisition from Neural and Sensorimotor
Systems},
booktitle = {The PASCAL workshop on Machine Learning and
Cognitive Science of Language Acquisition},
year = {2007},
editor = {Alex Clark and Nick Chater},
month = {June}
}
-
Jacobsson, H. (2006). The Crystallizing Substochastic
Sequential Machine Extractor - CrySSMEx. Neural
Computation, 18(9), pp. 2211-2255.
[pdf]
This article presents an algorithm, CrySSMEx, for
extracting minimal finite state machine descriptions
of dynamic systems such as recurrent neural networks.
Unlike previous algorithms, CrySSMEx is parameter
free, deterministic, and it efficiently generates a
series of increasingly refined models. A novel finite
stochastic model of dynamic systems and a novel vector
quantization function have been developed to take into
account the state space dynamics of the system. The
experiments show that (a) extraction from systems that
can be described as regular grammars is trivial, (b)
extraction from high-dimensional systems is feasible
and (c) extraction of approximative models from
chaotic systems is possible. The results are
promising, but an analysis of shortcomings suggests
some possible further improvements. Some largely
overlooked connections, of the field of rule
extraction from recurrent neural networks, to other
fields are also identified.
@Article{jacobsson_2006,
author = {H. Jacobsson},
title = {The Crystallizing Substochastic Sequential Machine
Extractor - \texttt{CrySSMEx}},
journal = {Neural Computation},
volume = 18,
number = 9,
pages = {2211--2255},
year = {2006}
}
-
Jacobsson, H. (2005). Rule Extraction from Recurrent Neural
Networks: A Taxonomy and Review. In Neural
Computation, 17(6), 1223-1263.
[pdf]
Rule extraction (RE) from recurrent
neural networks (RNNs) refers to finding models of the
underlying RNN, typically in the form of finite state
machines, that mimic the network to a satisfactory degree
while having the advantage of being more transparent. RE
from RNNs can be argued to allow a deeper and more profound
form of analysis of RNNs than other, more or less ad
hoc methods. RE may give us understanding of RNNs in the
intermediate levels between quite abstract theoretical
knowledge of RNNs as a class of computing devices and
quantitative performance evaluations of RNN
instantiations. The development of techniques for extraction
of rules from RNNs has been an active field since the early
nineties. In this paper, the progress of this development is
reviewed and analysed in detail. In order to structure the
survey and to evaluate the techniques, a taxonomy,
specifically designed for this purpose, has been
developed. Moreover, important open research issues are
identified, that, if addressed properly, possibly can give
the field a significant push forward.
@Article{jacobsson_2005_survey,
author = {H. Jacobsson},
title = {Rule Extraction from Recurrent Neural Networks: A
Taxonomy and Review},
journal = {Neural Computation},
volume = 17,
number = 6,
pages = {1223--1263},
year = {2005}
}
-
Jacobsson, H. and Ziemke, T. (2005).
CrySSMEx, a Novel Rule Extractor for Recurrent Neural Networks:
Overview and Case Study. In W. Duch, J. Kacprzyk, E. Oja and S. Zadrozny (Eds.),
Artificial Neural Networks: Formal Models and Their Applications -
ICANN 2005 - Part II (pp. 503-508). Berlin: Springer.
[pdf]
In this paper, it will be shown that it is feasible to extract finite
state machines in a domain of, for rule extraction, previously
unencountered complexity. The algorithm used is called the
Crystallizing Substochastic Sequential Machine Extractor, or
CrySSMEx. It extracts the machine from sequence data
generated from the RNN in interaction with its
domain. CrySSMEx is parameter free, deterministic and
generates a sequence of increasingly deterministic extracted
stochastic models until a fully deterministic machine is found.
@InProceedings{jacobsson+ziemke_2005_icann,
author = {Henrik Jacobsson and Tom Ziemke},
title = {\texttt{CrySSMEx}, a Novel Rule Extractor for
Recurrent Neural Networks: Overview and Case Study},
pages = {503--508},
booktitle = {Artificial Neural Networks: Formal Models and Their
Applications - {ICANN} 2005 - Part {II} },
year = 2005,
editor = {W. Duch and J. Kacprzyk and E. Oja and S. Zadrozny},
publisher = {Springer},
address = {Berlin}
}
-
Jacobsson, H. and Ziemke, T. (2005).
Towards Automation of "Normal Science" through Empirical Machines.
Presented at ECAP'05: the European Computing And Philosophy
conference, Västerås, Sweden.
-
Stening, J., Jacobsson, H. & Ziemke, T. (2005). Imagination and
Abstraction of Sensorimotor Flow: Towards a Robot Model. In:
AISB'05: Proceedings of the Symposium on Next Generation Approaches
to Machine Consciousness - Imagination, Development, Intersubjectivity
and Embodiment (pp. 50-58). The Society for the Study of
Artificial Intelligence and the Simulation of Behavior, UK. ISBN
1-902956-46-8.
-
Jacobsson, H. and Ziemke, T. (2005).
Rethinking Rule Extraction from Recurrent Neural Networks. In A.
d'Avila Garcez and J. Elman and P. Hitzler (Eds.), IJCAI-05
Workshop on
Neural-Symbolic Learning and Reasoning NeSy-05.
[pdf]
We will in this paper identify some of the central
problems of current techniques for rule extraction from
recurrent neural networks (RNN-RE). Then we will raise
the expectations of future RNN-RE techniques
considerably and through this, hopefully guide the
research towards a common goal. Some preliminary results
based on work in line with these goals, will also be
presented.
@InProceedings{jacobsson+ziemke_2005_ijcai,
author = {Henrik Jacobsson and Tom Ziemke},
title = {Rethinking Rule Extraction from Recurrent Neural
Networks},
booktitle = {{IJCAI-05} Workshop on Neural-Symbolic Learning and
Reasoning },
year = 2005,
editor = {Artur {d'Avila Garcez} and Jeff Elman and Pascal
Hitzler}
}
-
Jacobsson, H. and Ziemke, T. (2003) Improving Procedures for Evaluation of Connectionist
Context-Free Language Predictors. In IEEE Transactions on Neural Networks,
14(4), 963-966.
[pdf]
This paper shows how seemingly minor differences in
training and evaluation procedures used in recent
studies of recurrent neural networks as context-free
language predictors can lead to significant differences
in apparent network performance. We therefore suggest
standard testing procedures whose use would facilitate
better reproducibility and comparability.
@Article{jacobsson+ziemke_2003_improving,
author = {H. Jacobsson and T. Ziemke},
title = {Improving Procedures for Evaluation of Connectionist
Context-Free Language Predictors},
journal = {{IEEE} Transactions on Neural Networks},
year = 2003,
volume = 14,
number = 4,
pages = {963--966}
}
-
Linåker, F. and Jacobsson, H. (2001).
Mobile Robot Learning of Delayed Response Tasks through Event Extraction:
A Solution to the Road Sign Problem and Beyond.
Seventeenth International Joint Conference on Artificial Intelligence (IJCAI 01),
pp. 777-782.
[pdf]
[ps]
We show how event extraction can be used for handling
delayed response tasks with arbitrary delay periods
between the stimulus and the cue for response. Our
approach is based on a number of information processing
levels, where the lowest level works on raw time-step
based sensory data. This data is classified using an
unsupervised clustering mechanism. The second level
works on this classified data, but still on the
individual time-step basis. An event extraction
mechanism detects and signals transitions between
classes; this forms the basis for the third level. As
this level only is updated when events occur, it is
independent of the time-scale of the lower level
interaction. We also sketch how an event filtering
mechanism could be constructed which discards irrelevant
data from the event stream. Such a mechanism would
output a fourth level representation which could be used
for delayed response tasks where irrelevant, or
distracting, events could occur during the delay.
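The levels described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the implementation from the paper: the nearest-prototype classifier with fixed prototypes is an assumed stand-in for the unsupervised clustering mechanism, and the function names and sample data are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def classify(sample: float, prototypes: Sequence[float]) -> int:
    """Label a raw sample with the index of its nearest prototype.

    Assumption: fixed prototypes stand in for a learned clustering.
    """
    return min(range(len(prototypes)), key=lambda i: abs(sample - prototypes[i]))


def extract_events(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Emit (time_step, new_class) only when the class label changes."""
    events: List[Tuple[int, int]] = []
    previous: Optional[int] = None
    for t, label in enumerate(labels):
        if label != previous:
            events.append((t, label))
            previous = label
    return events


# Lowest level: raw time-step based sensor readings (made-up values).
raw = [0.1, 0.0, 0.2, 0.9, 1.1, 1.0, 0.1, 0.05]
prototypes = [0.0, 1.0]

# Second level: one class label per time step.
labels = [classify(x, prototypes) for x in raw]

# Third level: updated only when an event (class transition) occurs.
events = extract_events(labels)
```

Eight time steps collapse into three events here, which is why the event level does not depend on how long the robot stays within one class.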
@INPROCEEDINGS{linaker+jacobsson_2001_mobile,
author = {F. Lin{\aa}ker and H. Jacobsson},
editor = {Bernhard Nebel},
booktitle = {Proceedings of the Seventeenth International Joint
Conference on Artificial Intelligence, {IJCAI}-2001},
title = {Mobile Robot Learning of Delayed Response Tasks
through Event Extraction: A Solution to the Road
Sign Problem and Beyond},
address = {San Francisco},
pages = {777--782},
publisher = {Morgan Kaufmann},
year = {2001}
}
-
Linåker, F. and Jacobsson, H. (2001).
Learning Delayed Response Tasks through Unsupervised Event Extraction.
In International Journal of Computational Intelligence and Applications,
1(4), 413-426. (an invited version of the IJCAI 2001 publication.)
[pdf]
We show how event extraction can be used for handling
delayed response tasks with arbitrary delay periods
between the stimulus and the cue for response. We use a
simple recurrent network for solving the task. Our
approach is based on a number of information processing
levels, where the lowest level works on raw time-step
based sensory data. This data is classified using an
unsupervised clustering mechanism. The second level
works on this classified data, but still on the
individual time-step basis. An event extraction
mechanism detects and signals transitions between
classes; this forms the basis for the third level. As
this level only is updated when events occur, it is
independent of the time-scale of the lower level
interaction. We also sketch how an event filtering
mechanism could be constructed which discards irrelevant
data from the event stream. Such a mechanism would
output a fourth level representation which could be used
for delayed response tasks where irrelevant, or
distracting, events could occur during the delay.
@Article{linaker+jacobsson_2001_learning,
author = {Fredrik Lin{\aa}ker and Henrik Jacobsson},
title = {Learning Delayed Response Tasks through Unsupervised
Event Extraction},
journal = {International Journal of Computational Intelligence
and Applications},
year = 2001,
pages = {413--426},
volume = 1,
number = 4
}
-
Bodén, M., Jacobsson, H. and Ziemke, T. (2000),
Evolving context-free language predictors.
In Proceedings of the Genetic and Evolutionary
Computation Conference. pp. 1033-1040.
[pdf]
[ps]
Recurrent neural networks can represent and process
simple context-free languages. However, the difficulty
of finding with gradient-based learning appropriate
weights for context-free language prediction motivates
an investigation on the applicability of evolutionary
algorithms. By empirical studies, an evolutionary
algorithm proves to be more reliable in finding
prediction solutions to a simple CFL. Moreover, the
evolutionary algorithm demonstrates greater diversity by
making use of a larger repertoire of dynamical behaviors
for solving the problem.
@INPROCEEDINGS{boden+jacobsson+ziemke_2000,
author = {M. Bod{\'e}n and H. Jacobsson and T. Ziemke},
title = {Evolving context-free language predictors},
booktitle = {Proceedings of the Genetic and Evolutionary
Computation Conference},
publisher = {Morgan Kaufmann},
editor = {D. Whitley and D. Goldberg and E. Cant\'u-Paz and
L. Spector and I. Parmee and Hans-Georg Beyer},
address = {San Francisco},
year = 2000,
pages = {1033--1040}
}
-
Jacobsson, H. and Olsson, B. (2000).
An Evolutionary Algorithm for Inversion of ANNs.
In Wang, P.P., ed., Proceedings of The Fifth Joint
Conference on Information Sciences, pp. 1070-1073.
[ps]
[pdf]
Before using a trained artificial neural network (ANN)
in an application it is important to identify inputs
which cause incorrect behaviours. We therefore propose
the use of an evolutionary algorithm (EA) to invert the
mappings of ANNs. The EA is used to search for input
patterns which produce strong (distinct) classifications
into one of the classes. Since the input space is
typically very large, multimodal, and poorly understood,
EAs are likely to be more robust than gradient methods,
with a lower probability of getting stuck on local
optima. Analysis of our results supports this
hypothesis. Our evolutionary algorithm also involves the
use of niching, which allows it to simultaneously
explore multiple regions of the search space. The
resulting population of input patterns therefore
typically represents a set of distinctly different
instances. This property is important, since the aim is
to identify inputs which are erroneously classified. We
show how analysis of the set of inputs found by
inversion can lead to detection of flaws in the ANN, and
we discuss the possibilities of using this inversion
method as a tool for validation and re-training.
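The basic idea of inverting a network with an evolutionary search can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm from the paper: the "network" is a hypothetical single logistic unit with fixed weights, selection is plain elitist truncation, and niching is left out entirely, so the population will tend to converge on one region of the input space.

```python
import math
import random
from typing import Tuple

# Hypothetical stand-in for a trained ANN: one logistic unit with fixed weights.
WEIGHTS = (3.0, -2.0)


def network(x: Tuple[float, ...]) -> float:
    """Return the unit's activation for input x (higher = stronger classification)."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def mutate(x: Tuple[float, ...], rng: random.Random, sigma: float = 0.2) -> Tuple[float, ...]:
    """Add Gaussian noise to every input and clip to the range [-1, 1]."""
    return tuple(min(1.0, max(-1.0, xi + rng.gauss(0.0, sigma))) for xi in x)


def invert(generations: int = 50, pop_size: int = 20, seed: int = 0) -> Tuple[float, ...]:
    """Search for an input pattern that produces a strong output."""
    rng = random.Random(seed)
    population = [
        tuple(rng.uniform(-1.0, 1.0) for _ in WEIGHTS) for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = [mutate(x, rng) for x in population]
        # Elitist selection: keep the pop_size inputs with the strongest output.
        population = sorted(population + children, key=network, reverse=True)[:pop_size]
    return population[0]


best = invert()
```

Calling `network(best)` shows how strongly the found pattern is classified; adding niching would be the step that keeps several distinct input patterns alive at once.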
@INPROCEEDINGS{jacobsson+olsson_2000,
author = {H. Jacobsson and B. Olsson},
title = {An Evolutionary Algorithm for Inversion of
Artificial Neural Networks},
booktitle = {Proceedings of The Fifth Joint Conference on
Information Sciences},
editor = {P. P. Wang},
publisher = {Association for Intelligent Machinery},
year = {2000},
pages = {1070--1073}
}
-
Bodén, M., Jacobsson, H. and Ziemke, T. (2000), Evolving
recurrent networks for context-free language prediction.
Extended abstract presented at the Workshop for Evolutionary Computation in
Cognitive Science (ECCS), Melbourne.
[ps]
[pdf]
These files come from a project where I wanted to fill
polygons with circles. [pdf]
Then I wanted to fill polygons with interlocking gears. I still have some work to do.