Package dataprocessor :: Package input :: Module genericreader :: Class GenericReader

Class GenericReader

Known Subclasses:

Abstract base class to describe the basic functionality of a reader, i.e. a mechanism that can import data from external entities (e.g. files) for use with the framework.

Instance Methods
 
__init__(self, input_xml_filename, load=True)
Constructor.
 
load(self)
 
unload(self)
 
get_parallelsentence(self, XMLEntry)
 
get_dataset(self)
Returns the contents of the parsed file as an object structure, represented by a DataSet object. Note that this loads all the data in the file into system memory at once.
 
get_parallelsentences(self)
Returns the contents of the parsed file as a list of ParallelSentence objects.
Method Details

__init__(self, input_xml_filename, load=True)
(Constructor)

Constructor. Creates an in-memory object that handles the file's data.

Parameters:
  • input_xml_filename (string) - the name of the input XML file
  • load (boolean) - if set to False, the instance is initialized without loading everything into memory. Loading can be done later by calling the load() method.
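
The deferred-loading behaviour of the load parameter can be illustrated with a minimal sketch. SketchReader below is a hypothetical stand-in written for this example, not the framework's GenericReader; it only mirrors the documented constructor signature and the load()/unload() methods. The root attribute, the use of xml.etree.ElementTree, and the <corpus>/<entry> element names are all assumptions made for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


class SketchReader:
    """Hypothetical stand-in for a reader class (not the framework's code)."""

    def __init__(self, input_xml_filename, load=True):
        self.input_xml_filename = input_xml_filename
        self.root = None  # assumed attribute: parsed tree root, None while unloaded
        if load:
            self.load()

    def load(self):
        # Parse the whole file into memory.
        self.root = ET.parse(self.input_xml_filename).getroot()

    def unload(self):
        # Drop the reference so the parsed tree can be garbage-collected.
        self.root = None


# Write a tiny XML file so the sketch runs on its own (element names are made up).
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write("<corpus><entry id='1'/><entry id='2'/></corpus>")
    demo_path = handle.name

reader = SketchReader(demo_path, load=False)
loaded_before = reader.root is not None  # nothing has been parsed yet
reader.load()                            # explicit, deferred load
entry_count = len(reader.root)           # number of child elements under the root
reader.unload()
loaded_after = reader.root is not None
os.remove(demo_path)
```

With load=False the constructor only records the filename, so the cost of parsing is paid when load() is called explicitly.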

get_dataset(self)

Returns the contents of the parsed file as an object structure, represented by a DataSet object. Note that this loads all the data in the file into system memory at once. For large data sets this may not be optimal, so consider sentence-by-sentence reading with SAX or cElementTree (e.g. saxjcml.py).

Returns: DataSet
  the formed data set

get_parallelsentences(self)

Returns the contents of the parsed file as a list of ParallelSentence objects. Note that this loads all the data in the file into system memory at once. For large data sets this may not be optimal, so consider sentence-by-sentence reading with SAX or cElementTree (e.g. saxjcml.py).

Returns: [ParallelSentence, ...]
  the list of parallel sentences
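
The sentence-by-sentence alternative mentioned above can be sketched with the standard library's incremental parser. This is not the code in saxjcml.py, and it does not use the framework's ParallelSentence class; the helper iter_sentences, the sample data, and the <corpus>/<sentence>/<src>/<tgt> element names are hypothetical and chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input; the real file format may differ.
SAMPLE = b"""<corpus>
  <sentence id="1"><src>Hallo Welt</src><tgt>Hello world</tgt></sentence>
  <sentence id="2"><src>Guten Tag</src><tgt>Good day</tgt></sentence>
</corpus>"""


def iter_sentences(source, tag="sentence"):
    """Yield (id, src, tgt) for each matching element, one at a time."""
    # iterparse reports each element as soon as its closing tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem.get("id"), elem.findtext("src"), elem.findtext("tgt")
            elem.clear()  # drop the element's contents once it has been consumed


pairs = list(iter_sentences(io.BytesIO(SAMPLE)))
```

Because the generator hands over one sentence at a time and clears it afterwards, a caller that processes the items in a loop, rather than collecting them in a list as the demonstration does, never holds the full contents of the file in memory.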