[mary-dev] testing a NLP component
Marc Schroeder
marc.schroeder at dfki.de
Thu Apr 28 08:53:52 CEST 2011
Hi Florent,
good point, thanks for raising it. Since this is about developing, let's
move this discussion to the mary-dev list.
First of all, I now think as much testing as possible should be done
automatically, and on a continuous basis. This way you can let the
machine verify, from now on until the end of time, that what was working
once is still working.
Conceptually one can distinguish two types of testing:
- "unit" testing, which automatically exercises a small piece of code
and asserts that, e.g., a method behaves as expected -- reacts to the
different kinds of possible input in the expected ways, throws
exceptions as promised in the javadoc, etc.
- "integration" testing, which automatically verifies whether the
processing carried out by a subsystem yields the expected result.
My "rule of thumb" test to distinguish one from the other is, do I need
to start up the MARY system (Mary.startup()) in order to run the test?
If so, I think it is an integration test, otherwise I treat it as a unit
test. It's a simplifying approach, but useful.
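
To illustrate the unit-test side of that distinction, here is a minimal
sketch in plain Java. DigitExpander and its expand() method are
hypothetical stand-ins for a small piece of French preprocessing; they are
not MARY classes. So that the sketch runs on a bare JDK, it uses a
hand-written check() helper; in a real JUnit 4 test class, each check
method would presumably carry the @Test annotation and use
org.junit.Assert.assertEquals instead. Nothing here needs Mary.startup(),
which is what makes it a unit test under the rule of thumb above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical preprocessing helper (not a MARY class):
// spells out a few single digits in French.
final class DigitExpander {
    private static final Map<String, String> WORDS = new HashMap<String, String>();
    static {
        WORDS.put("1", "un");
        WORDS.put("2", "deux");
        WORDS.put("3", "trois");
    }

    /**
     * Returns the French word for a known digit, or the token unchanged.
     * @throws IllegalArgumentException if token is null
     */
    static String expand(String token) {
        if (token == null) {
            throw new IllegalArgumentException("token must not be null");
        }
        String word = WORDS.get(token);
        return word != null ? word : token;
    }
}

// Unit-level checks: only the method under test is exercised.
final class DigitExpanderChecks {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Expected behaviour for a known input.
    static void expandsKnownDigit() {
        check("trois".equals(DigitExpander.expand("3")), "3 should become trois");
    }

    // Unknown input passes through unchanged.
    static void leavesOtherTokensAlone() {
        check("bonjour".equals(DigitExpander.expand("bonjour")), "words must pass through");
    }

    // The exception promised in the javadoc is actually thrown.
    static void rejectsNull() {
        boolean thrown = false;
        try {
            DigitExpander.expand(null);
        } catch (IllegalArgumentException expected) {
            thrown = true;
        }
        check(thrown, "null should raise IllegalArgumentException");
    }

    public static void main(String[] args) {
        expandsKnownDigit();
        leavesOtherTokensAlone();
        rejectsNull();
        System.out.println("all checks passed");
    }
}
```

Each check covers one kind of input (known, unknown, invalid), so a
failure points directly at the behaviour that broke.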
In practice, the difference between the two types may not matter much
for you at first; the key issue is to start writing tests at all.
The tool we use in MARY is JUnit 4. You can find some examples of tests
(not many yet, but that is going to change over the next few years I
hope) here:
- example of a unit test:
http://mary.opendfki.de/browser/branches/fr-branch/java/marytts/tests/junit4/ByteStringTranslatorTest.java
- example of an integration test:
http://mary.opendfki.de/browser/branches/fr-branch/java/marytts/tests/junit4/RequestTest.java
You can run all tests using "ant test" from the command line; to run a
single test, right-click on the class in Eclipse and select "Run As >
JUnit Test". If it is an integration test (i.e. it needs to start up
MARY), it will fail until you have provided -Dmary.base=... and probably
-Xmx1g or so in the VM arguments of the run configuration.
Now, to test your own code, all you need to do is to instantiate your
module, send it data from the JUnit test method, and automatically
compare the result with the expected result. I have tried to simplify
this step for MaryModules somewhat by providing a base class,
marytts.tests.modules.MaryModuleTestCase which you can extend.
See java/marytts/tests/junit4/language/de/JTokeniserTest.java for an
example (I confess it fails, which should never happen; I will fix this
but not now).
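
The "instantiate, send data, compare" pattern can be sketched as follows.
This is only an illustration under stated assumptions: TextModule,
AbbreviationModule and ModuleComparison are hypothetical names invented
for this example, and the interface is deliberately much simpler than
MARY's real module API. It does not show how MaryModuleTestCase works; it
only shows the idea of comparing a module's output against a table of
expected results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical module interface; MARY's real module API differs.
interface TextModule {
    String process(String input);
}

// Hypothetical French preprocessing module that expands two abbreviations.
final class AbbreviationModule implements TextModule {
    public String process(String input) {
        StringBuilder out = new StringBuilder();
        for (String token : input.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (token.equals("M.")) {
                out.append("monsieur");
            } else if (token.equals("Mme")) {
                out.append("madame");
            } else {
                out.append(token);
            }
        }
        return out.toString();
    }
}

final class ModuleComparison {
    // Feeds each input to the module and collects one message per mismatch.
    static List<String> mismatches(TextModule module, Map<String, String> expectedByInput) {
        List<String> failures = new ArrayList<String>();
        for (Map.Entry<String, String> e : expectedByInput.entrySet()) {
            String actual = module.process(e.getKey());
            if (!e.getValue().equals(actual)) {
                failures.add("input '" + e.getKey() + "': expected '"
                        + e.getValue() + "' but got '" + actual + "'");
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, String> cases = new LinkedHashMap<String, String>();
        cases.put("M. Dupont", "monsieur Dupont");
        cases.put("Mme Martin arrive", "madame Martin arrive");

        List<String> failures = mismatches(new AbbreviationModule(), cases);
        if (!failures.isEmpty()) {
            throw new AssertionError(failures.toString());
        }
        System.out.println("module output matches expectations");
    }
}
```

Keeping the input/expected pairs in one table makes it cheap to add a new
case each time the preprocessing learns a new rule.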
I hope this can get you started.
Best regards,
Marc
On 27.04.11 16:22, fxavier at ircam.fr wrote:
> Hi all,
>
> I'm trying to build NLP for French.
>
> Is there a way to test my .java (preprocessing) without coding all the
> NLPs, and of course without following the support for new language (that
> requires all the NLP ready and is pretty long)?
>
> By testing, I mean giving a simple text as input, and see the output if
> the preprocessing part is good. I would like to test whether my code is
> correct or not before going any further.
>
> Thanks in advance,
>
--
Dr. Marc Schröder, Senior Researcher at DFKI GmbH
Project leader for DFKI in SSPNet http://sspnet.eu
Team Leader DFKI TTS Group http://mary.dfki.de
Editor W3C EmotionML Working Draft http://www.w3.org/TR/emotionml/
Portal Editor http://emotion-research.net
Homepage: http://www.dfki.de/~schroed
Email: marc.schroeder at dfki.de
Phone: +49-681-85775-5303
Postal address: DFKI GmbH, Campus D3_2, Stuhlsatzenhausweg 3, D-66123
Saarbrücken, Germany