[mary-dev] WikipediaProcessor: Japanese Processing Exception

Marc Schroeder schroed at dfki.de
Mon Nov 9 08:52:42 CET 2009


Thanks for this update. Good to hear that more memory solves the
problem. Still, it seems odd that 2 GB of RAM should be required to
run this code; if anyone would like to try to reduce the memory
footprint, let me know.
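
One possible direction for reducing the footprint, sketched under
assumptions: the stack trace shows the heap filling up while every word
is put into a HashMap, so keeping only the N most frequent words while
iterating would bound memory by N rather than by the size of the word
table. The class BoundedWordList, its methods, and the maxWords
parameter are hypothetical names for illustration; this is not the
actual DBHandler code.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of MARY TTS: keeps only the maxWords most
// frequent entries from a sequence of (word, count) rows.
final class BoundedWordList {

    // Convenience factory for an immutable (word, count) pair.
    static Map.Entry<String, Integer> row(String word, int count) {
        return new AbstractMap.SimpleImmutableEntry<>(word, count);
    }

    static List<Map.Entry<String, Integer>> mostFrequent(
            Iterable<Map.Entry<String, Integer>> rows, int maxWords) {
        if (maxWords <= 0) {
            return new ArrayList<>();
        }
        // Min-heap ordered by count: the head is the weakest entry kept so far.
        PriorityQueue<Map.Entry<String, Integer>> heap =
                new PriorityQueue<>(Map.Entry.<String, Integer>comparingByValue());
        for (Map.Entry<String, Integer> r : rows) {
            if (heap.size() < maxWords) {
                heap.add(r);
            } else if (r.getValue() > heap.peek().getValue()) {
                heap.poll(); // evict the least frequent word to make room
                heap.add(r);
            }
        }
        // Return the survivors, most frequent first.
        List<Map.Entry<String, Integer>> result = new ArrayList<>(heap);
        result.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        rows.add(row("a", 5));
        rows.add(row("b", 1));
        rows.add(row("c", 9));
        rows.add(row("d", 3));
        for (Map.Entry<String, Integer> e : mostFrequent(rows, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

In real use the rows would come from the database result set rather
than from a list in memory; whether a cap on the word list is acceptable
for the later processing steps is an open question.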

Best,
Marc

Hind Abdul-Khaleq schrieb:
> The problem was solved by passing "-Xmx2000m" to the VM, without any
> other changes to the source.
> Thanks a lot and all the best.
> 
> 
> 
>     --- On *Wed, 10/28/09, Hind Abdul-Khaleq /<habdolkhaleq at yahoo.com>/*
>     wrote:
> 
> 
>         From: Hind Abdul-Khaleq <habdolkhaleq at yahoo.com>
>         Subject: Re: [mary-dev] WikipediaProcessor: Japanese Processing
>         Exception
>         To: mary-dev at dfki.de
>         Date: Wednesday, October 28, 2009, 11:45 AM
> 
>         I'm getting this exception while processing Japanese.
>         I changed the encoding to "EUC_JP" at the line
> 
>                     word = new String(wordBytes, "UTF8"); 
>         in 
>         marytts.tools.dbselection.DBHandler.getMostFrequentWords(DBHandler.java:1366)
> 
>         but it produced another exception at the next line:
>                  wordList.put(word, new Integer(rs.getInt(2)));
>         Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>             at java.util.HashMap.resize(HashMap.java:462)
>             at java.util.HashMap.addEntry(HashMap.java:755)
>             at java.util.HashMap.put(HashMap.java:385)
>             at marytts.tools.dbselection.DBHandler.getMostFrequentWords(DBHandler.java:1367)
>             at marytts.tools.dbselection.WikipediaMarkupCleaner.updateWordList(WikipediaMarkupCleaner.java:953)
>             at marytts.tools.dbselection.WikipediaMarkupCleaner.processWikipediaPages(WikipediaMarkupCleaner.java:1133)
>             at marytts.tools.dbselection.WikipediaProcessor.main(WikipediaProcessor.java:368)
> 
>            
>         I am also already running with "-Xmx1000m", so what should I do?
> 
>         --- On *Wed, 10/28/09, Hind Abdul-Khaleq
>         /<habdolkhaleq at yahoo.com>/* wrote:
> 
> 
>             From: Hind Abdul-Khaleq <habdolkhaleq at yahoo.com>
>             Subject: [mary-dev] WikipediaProcessor: Japanese Exception
>             To: mary-dev at dfki.de
>             Date: Wednesday, October 28, 2009, 11:34 AM
> 
>             Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>                 at java.util.Arrays.copyOf(Arrays.java:2882)
>                 at java.lang.StringCoding.safeTrim(StringCoding.java:75)
>                 at java.lang.StringCoding.access$100(StringCoding.java:34)
>                 at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:151)
>                 at java.lang.StringCoding.decode(StringCoding.java:173)
>                 at java.lang.String.<init>(String.java:443)
>                 at java.lang.String.<init>(String.java:515)
>                 at marytts.tools.dbselection.DBHandler.getMostFrequentWords(DBHandler.java:1366)
>                 at marytts.tools.dbselection.WikipediaMarkupCleaner.updateWordList(WikipediaMarkupCleaner.java:953)
>                 at marytts.tools.dbselection.WikipediaMarkupCleaner.processWikipediaPages(WikipediaMarkupCleaner.java:1133)
>                 at marytts.tools.dbselection.WikipediaProcessor.main(WikipediaProcessor.java:368)
> 
> 
> 
>             -----Inline Attachment Follows-----
> 
>             _______________________________________________
>             Mary-dev mailing list
>             Mary-dev at dfki.de
>             http://www.dfki.de/mailman/cgi-bin/listinfo/mary-dev
> 
> 
> 
> 
> 
> 
> 

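Regarding the encoding change in the quoted message: the charset name
passed to the String constructor only controls how the bytes are
decoded, so it has to match the encoding the bytes were stored in, and
changing it would not be expected to reduce heap usage. A minimal
sketch, assuming the stored word bytes are UTF-8 (the actual database
encoding is not stated in this thread); WordDecodingSketch and decode
are hypothetical names, not the DBHandler code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the actual DBHandler code.
final class WordDecodingSketch {

    // Decodes raw bytes with the named charset. The name must match the
    // encoding the bytes were written in, otherwise the text is garbled.
    static String decode(byte[] wordBytes, String charsetName) {
        return new String(wordBytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String original = "\u65e5\u672c\u8a9e"; // three kanji, written as escapes
        // Assumption: the bytes were stored as UTF-8.
        byte[] stored = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset recovers the word.
        System.out.println(decode(stored, "UTF-8").equals(original));

        // EUC-JP is not guaranteed to be present on every Java runtime.
        if (Charset.isSupported("EUC-JP")) {
            // Decoding UTF-8 bytes as EUC-JP yields a different string.
            System.out.println(decode(stored, "EUC-JP").equals(original));
        }
    }
}
```
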
-- 
Dr. Marc Schröder, Senior Researcher at DFKI GmbH
Coordinator EU FP7 Project SEMAINE http://www.semaine-project.eu
Portal Editor http://emotion-research.net
Team Leader DFKI Speech Group http://mary.dfki.de

Homepage: http://www.dfki.de/~schroed
Email: schroed at dfki.de
Phone: +49-681-302-5303
Postal address: DFKI GmbH, Campus D3_2, Stuhlsatzenhausweg 3, D-66123 
Saarbrücken, Germany
--
Official DFKI coordinates:
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany
Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313

