[mary-dev] testing a NLP component
fxavier at ircam.fr
Thu May 5 13:15:07 CEST 2011
Hi All,
Thank you very much Marc, it helps a lot. I did everything you just
mentioned.
So if I understood correctly, adding these simple lines to the method
shouldn't cause problems?
//show output
String output = new String(result.toString());
System.out.println(output);
The problem is that nothing is printed to the console, and (with or
without the added lines) I can't get the green bar showing that the test
has passed. I just get:
Run 0/1 Errors:1
There is no message, nothing at all, and the code compiles without
errors, so what is this error?
Florent
PS: so sorry I'm not able to find any solution on my own...
> Hi Florent,
>
> good start, but a few things need fixing.
>
> First, in order to test-run a MARY module, for most MARY modules you
> need to first start the entire MARY system. Also, for log4j to work
> properly you need to configure it; so you need something like:
>
> @BeforeClass
> public static void startMARY() throws Exception {
>     if (Mary.currentState() == Mary.STATE_OFF) {
>         Mary.startup();
>     }
>     if (!MaryUtils.isLog4jConfigured()) {
>         BasicConfigurator.configure();
>     }
> }
>
> This will avoid the warning message you got from log4j:
>
> > log4j:WARN No appenders could be found for logger (marytts.IO).
> > log4j:WARN Please initialize the log4j system properly.
>
>
> Second, you are trying to set plain text for input format TOKENS. But
> TOKENS is a RAWMARYXML format, so when the XML parser tries to make
> sense of your data, it fails:
>
> > [Fatal Error] :1:1: Content is not allowed in prolog.
>
> The way to avoid this is to enter proper XML data, either using
> md.setData() or using md.readFrom(). I'll describe below how I usually
> do this.
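
A minimal sketch of why plain text fails here, using only the JDK's own XML parser rather than MARY itself. The class and method names are made up for illustration; the point is simply that a document starting with anything other than XML markup is rejected at line 1, column 1, before any content is looked at.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of MARY: checks whether a string is well-formed XML.
class PrologCheck {

    /** Returns true if the given string parses as a well-formed XML document. */
    static boolean isWellFormedXml(String data) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(
                    data.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXParseException e) {
            // e.getLineNumber() and e.getColumnNumber() say where parsing stopped;
            // for plain text that is the very first character.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Plain text, as passed to the TOKENS input in the failing test.
        System.out.println(isWellFormedXml("M., 06.67.21.05.41, #, 423 Km"));
        // Any well-formed XML document is accepted by the parser.
        System.out.println(isWellFormedXml("<doc><t>423</t></doc>"));
    }
}
```

Whether the XML is also valid MaryXML is a separate question that this sketch does not check.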
>
>
> Third, you are blinding yourself by discarding the helpful exception.
> NEVER EVER do something like this:
>
> > try
> > {
> >     pr.process(md);
> > }
> > catch (Exception e)
> > {
> >     return;
> > }
>
> Take a look at some slides where I put down what I think are good
> practices regarding exception handling:
> http://mary.opendfki.de/repos/trunk/doc/ErrorHandling.pdf
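
To make the difference concrete, here is a small self-contained sketch with no MARY or JUnit dependency. The `process` method is a hypothetical stand-in for a module call that fails on bad input. The swallowing variant leaves the caller with nothing to go on, while the propagating variant hands the message and stack trace to whoever calls it, which in a JUnit test is the test runner.

```java
// Hypothetical demo class; names are illustrative only.
class SwallowDemo {

    /** Stand-in for a processing step that rejects non-XML input. */
    static String process(String input) {
        if (!input.startsWith("<")) {
            throw new IllegalArgumentException("input is not XML: " + input);
        }
        return "processed " + input;
    }

    /** Anti-pattern: the catch block hides the cause; the caller only sees null. */
    static String processSwallowing(String input) {
        try {
            return process(input);
        } catch (Exception e) {
            return null;
        }
    }

    /** Preferred: let the exception escape so that it can be reported. */
    static String processPropagating(String input) {
        return process(input);
    }

    public static void main(String[] args) {
        // No hint about what went wrong.
        System.out.println(processSwallowing("plain text"));
        try {
            processPropagating("plain text");
        } catch (IllegalArgumentException e) {
            // The message names the actual problem.
            System.out.println(e.getMessage());
        }
    }
}
```

In a test method declared with `throws Exception`, no try/catch is needed at all: an escaping exception is reported as an error together with its stack trace.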
>
>
> Fourth, the test should verify that the processing result matches
> expectations. A complex example like "M., 06.67.21.05.41, #, 423 Km, 30
> €. 20h14." will be difficult for this purpose; maybe here the expected
> state is not a specific outcome, but rather the fact that the module can
> process this without crashing at all!
>
> One way to write a test so that it is clean and readable is to
> write it backwards, starting with the verification, for example:
>
> @Test
> public void canProcessWildStuff() throws Exception {
>
>     ...
>     // verify expected result:
>     assertNotNull(result);
> }
>
> Then, how did we get there:
>
>
> @Test
> public void canProcessWildStuff() throws Exception {
>
>     ...
>     // exercise system under test:
>     MaryData result = preprocessor.process(input);
>     // verify expected result:
>     assertNotNull(result);
> }
>
>
> and finally, set up the system under test:
>
> @Test
> public void canProcessWildStuff() throws Exception {
>     // Set up system under test:
>     MaryModule preprocessor =
>         ModuleRegistry.getModule(marytts.language.de.Preprocess.class);
>     MaryData input = new MaryData(MaryDataType.TOKENS, Locale.GERMAN);
>     input.readFrom(this.getClass().getResourceAsStream("wildStuff.tokens"));
>     // exercise system under test:
>     MaryData result = preprocessor.process(input);
>     // verify expected result:
>     assertNotNull(result);
> }
>
>
> In this example, you see that I am reading the XML document to test from
> a classpath resource, in the same package as the test class, called
> "wildStuff.tokens". How to create that document? Well, assume we have
> components that can create this input format for the target language.
> For German, this is the case for the public demo, so we can go to
> http://mary.dfki.de:59125/documentation.html#synthesis and enter into
> the GET example:
> INPUT_TEXT: M., 06.67.21.05.41, #, 423 Km, 30 €. 20h14.
> INPUT_TYPE: TEXT
> OUTPUT_TYPE: TOKENS
> LOCALE: de
> ... and click "Submit Query". If that works, you should get a response
> containing your test document (see below); copy that into file
> "wildStuff.tokens" in the test package.
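
If the web form is inconvenient, the same four parameters can be put into a GET query string by hand. The sketch below only builds the URL and does not contact any server. The `/process` path is an assumption about the MARY HTTP interface, and the class name is made up; the parameter names and values are the ones listed above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of MARY: builds a TEXT -> TOKENS query URL.
class TokensQuery {

    /** Builds a URL-encoded GET request for the given endpoint, text and locale. */
    static String buildQuery(String endpoint, String text, String locale)
            throws UnsupportedEncodingException {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("INPUT_TEXT", text);
        params.put("INPUT_TYPE", "TEXT");
        params.put("OUTPUT_TYPE", "TOKENS");
        params.put("LOCALE", locale);

        StringBuilder url = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(p.getKey())
               .append('=')
               // Characters such as '#', ',' and the euro sign must be encoded.
               .append(URLEncoder.encode(p.getValue(), "UTF-8"));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) throws Exception {
        // "/process" is assumed here, not taken from the message above.
        System.out.println(buildQuery(
                "http://mary.dfki.de:59125/process",
                "M., 06.67.21.05.41, #, 423 Km, 30 \u20AC. 20h14.",
                "de"));
    }
}
```

The response body, if the server accepts the request, is what would be saved as "wildStuff.tokens".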
>
> Hope this helps. I know it's complex, but I think it's worth doing these
> things properly. It will pay off after the initial investment, I think.
>
>
> Good luck!
>
> Best,
> Marc
>
>