[mary-dev] testing a NLP component

fxavier at ircam.fr fxavier at ircam.fr
Thu May 5 13:15:13 CEST 2011


Hi All,

Thank you very much, Marc, that helps a lot. I did everything you just
mentioned.

So if I understood correctly, adding these simple lines to the method
shouldn't cause problems?

       //show output
       String output = new String(result.toString());
       System.out.println(output);

The problem is that nothing is printed to the console, and (with or without
the added lines) I can't get the green tag showing that the test has passed.
I just get:

Run 0/1   Errors:1

There is no message, nothing at all, and no errors are reported in the code,
so what is this error?



Florent



PS: so sorry I'm not able to find any solution on my own...



> Hi Florent,
>
> good start, but a few things need fixing.
>
> First, in order to test-run a MARY module, for most MARY modules you
> need to first start the entire MARY system. Also, for log4j to work
> properly you need to configure it; so you need something like:
>
> @BeforeClass
> public static void startMARY() throws Exception {
>      if (Mary.currentState() == Mary.STATE_OFF) {
>          Mary.startup();
>      }
>      if (!MaryUtils.isLog4jConfigured()) {
>          BasicConfigurator.configure();
>      }
> }
>
> This will avoid the warning message you got from log4j:
>
>  > log4j:WARN No appenders could be found for logger (marytts.IO).
>  > log4j:WARN Please initialize the log4j system properly.
>
>
> Second, you are trying to set plain text for input format TOKENS. But
> TOKENS is a RAWMARYXML format, so when the XML parser tries to make
> sense of your data, it fails:
>
>  > [Fatal Error] :1:1: Content is not allowed in prolog.
>
> The way to avoid this is to enter proper XML data, either using
> md.setData() or using md.readFrom(). I'll describe below how I usually
> do this.
>
>
> Third, you are blinding yourself by discarding the helpful exception.
> NEVER EVER do something like this:
>
>  >          try
>  >          {
>  >              pr.process(md);
>  >          }
>  >          catch (Exception e)
>  >          {
>  >              return;
>  >          }
>
> Take a look at some slides where I put down what I think are good
> practices regarding exception handling:
> http://mary.opendfki.de/repos/trunk/doc/ErrorHandling.pdf
>
>
> Fourth, the test should verify that the processing result matches
> expectations. A complex example like "M., 06.67.21.05.41, #, 423 Km, 30
> €. 20h14." will be difficult for this purpose; maybe here the expected
> state is not a specific outcome, but rather the fact that the module can
> process this without crashing at all!
>
> One way to write a test so that it is clean and readable is to
> write it backwards, starting with the verification... for example:
>
> @Test
> public void canProcessWildStuff() throws Exception {
>
>    ...
>    // verify expected result:
>    assertNotNull(result);
> }
>
> Then, how did we get there:
>
>
> @Test
> public void canProcessWildStuff() throws Exception {
>
>    ...
>    // exercise system under test:
>    MaryData result = preprocessor.process(input);
>    // verify expected result:
>    assertNotNull(result);
> }
>
>
> and finally, set up the system under test:
>
> @Test
> public void canProcessWildStuff() throws Exception {
>    // Set up system under test:
>    MaryModule preprocessor =
>        ModuleRegistry.getModule(marytts.language.de.Preprocess.class);
>    MaryData input = new MaryData(MaryDataType.TOKENS, Locale.GERMAN);
>    input.readFrom(this.getClass().getResourceAsStream("wildStuff.tokens"));
>    // exercise system under test:
>    MaryData result = preprocessor.process(input);
>    // verify expected result:
>    assertNotNull(result);
> }
>
>
> In this example, you see that I am reading the XML document to test from
> a classpath resource, in the same package as the test class, called
> "wildStuff.tokens". How to create that document? Well, assume we have
> components that can create this input format for the target language.
> For German, this is the case for the public demo, so we can go to
> http://mary.dfki.de:59125/documentation.html#synthesis and enter into
> the GET example:
> INPUT_TEXT: M., 06.67.21.05.41, #, 423 Km, 30 €. 20h14.
> INPUT_TYPE: TEXT
> OUTPUT_TYPE: TOKENS
> LOCALE: de
> ... and click "Submit Query". If that works, you should get a response
> containing your test document (see below); copy that into file
> "wildStuff.tokens" in the test package.
>
> Hope this helps. I know it's complex, but I think it's worth doing these
> things properly. It will pay off after the initial investment, I think.
>
>
> Good luck!
>
> Best,
> Marc
>
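Regarding Marc's second point, the "Content is not allowed in prolog" failure can be reproduced without MARY at all. The following is only a minimal sketch using the JDK's own XML parser; the class name `PrologDemo`, the helper `parsesAsXml`, and the tiny `<maryxml>` string are illustrative placeholders, not a real TOKENS document and not MARY's own parsing code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class PrologDemo {

    /** Returns true if the given string parses as a well-formed XML document. */
    static boolean parsesAsXml(String data) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Report the parser's complaint instead of hiding it.
            System.out.println("Not well-formed XML: " + e.getMessage());
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory string; surface it loudly.
            throw new IllegalStateException("XML parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        // Plain text: the parser rejects it at the very first character.
        System.out.println(parsesAsXml("M., 06.67.21.05.41, #, 423 Km"));
        // Placeholder XML (NOT a real TOKENS document): parses fine.
        System.out.println(parsesAsXml("<maryxml><s><t>M.</t></s></maryxml>"));
    }
}
```

Any plain-text input fails this way, because a well-formed XML document must begin with markup and not with character data.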
>
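To illustrate Marc's third point, here is a small self-contained sketch contrasting a swallowed exception with one that is rethrown with context. It does not use MARY: `process` is a hypothetical stand-in for a failing module call, and the class and method names are made up for the example. It is not taken from the slides linked above.

```java
public class SwallowDemo {

    /** Hypothetical stand-in for a module's process() call that fails. */
    static String process(String input) {
        throw new IllegalArgumentException("cannot parse input: " + input);
    }

    /** Anti-pattern: the exception is discarded, so the caller learns nothing. */
    static String processSwallowing(String input) {
        try {
            return process(input);
        } catch (Exception e) {
            return null;
        }
    }

    /** Alternative: add context and rethrow, keeping the original as the cause. */
    static String processReporting(String input) {
        try {
            return process(input);
        } catch (Exception e) {
            throw new IllegalStateException("processing failed for '" + input + "'", e);
        }
    }

    public static void main(String[] args) {
        // The swallowing variant gives no hint about what went wrong.
        System.out.println(processSwallowing("plain text"));
        // The reporting variant keeps both the context and the root cause.
        try {
            processReporting("plain text");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + " <- " + e.getCause().getMessage());
        }
    }
}
```

In a test method declared with `throws Exception`, the simplest option is usually not to catch at all, so that the test runner can show the full stack trace.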

