Source Code for Module featuregenerator.diff_generator

"""

@author: Eleftherios Avramidis
"""
from __future__ import division
from featuregenerator import FeatureGenerator


class DiffGenerator(FeatureGenerator):
    """
    Operates on a ParallelSentence with two target sentences (pairwise).
    Computes the difference of numerical features that have the same name
    in the two target sentences.
    """

    def get_features_parallelsentence(self, parallelsentence):
        """
        Gets executed once per parallel sentence. Subtracts the respective
        numerical features of the two target sentences.
        Features with the same name get subtracted, and the new feature is
        added at the level of the parallel sentence.
        This is because this feature generator is used in the last part of
        the generation process, upon pairwise comparison.
        @param parallelsentence: the parallelsentence object, already
            containing the simplesentences for the target translations
        @type parallelsentence: sentence.parallelsentence.ParallelSentence
        @return: the generated "<name>_diff" features, with string values
        @rtype: dict
        """
        translations = parallelsentence.get_translations()
        ps_attributes = {}
        # diff features make sense only for pairwise comparisons
        if len(translations) != 2:
            return ps_attributes
        tgt1_attributes = translations[0].get_attributes()
        tgt2_attributes = translations[1].get_attributes()
        for tgt1_attribute_key in tgt1_attributes:
            if tgt1_attribute_key in tgt2_attributes:
                # check whether the attribute is actually a float/int number
                try:
                    tgt1_value = float(tgt1_attributes[tgt1_attribute_key])
                    tgt2_value = float(tgt2_attributes[tgt1_attribute_key])
                except ValueError:
                    # if not possible, jump to the next attribute
                    continue
                # calculate difference (second target minus first target)
                diff = tgt2_value - tgt1_value
                att_name = "%s_diff" % tgt1_attribute_key
                ps_attributes[att_name] = str(diff)
        return ps_attributes
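
The subtraction loop in get_features_parallelsentence can be tried in isolation, without the featuregenerator package installed. The following is a minimal sketch: `diff_attributes` is a hypothetical stand-alone helper that is not part of the module, and the two plain dictionaries are assumed stand-ins for what `get_attributes()` returns on each target sentence. The attribute names and values are invented for illustration.

```python
def diff_attributes(tgt1_attributes, tgt2_attributes):
    """Hypothetical helper mirroring the loop in DiffGenerator.

    Takes two dicts of attribute name -> string value (assumed to be the
    shape returned by get_attributes()) and returns the "<name>_diff" dict.
    """
    ps_attributes = {}
    for key in tgt1_attributes:
        if key in tgt2_attributes:
            try:
                tgt1_value = float(tgt1_attributes[key])
                tgt2_value = float(tgt2_attributes[key])
            except ValueError:
                # non-numeric attribute (e.g. a system name): skip it
                continue
            # second target minus first target, stored as a string
            ps_attributes["%s_diff" % key] = str(tgt2_value - tgt1_value)
    return ps_attributes


# Invented example attributes for two target translations
tgt1 = {"length": "10", "system": "A", "score": "0.5"}
tgt2 = {"length": "12", "system": "B", "score": "0.25", "extra": "7"}

result = diff_attributes(tgt1, tgt2)
print(result)
```

Only attributes that appear in both dictionaries and parse as numbers produce a feature: here "system" is skipped because it is not numeric, and "extra" is skipped because the first target does not have it. The sign convention follows the original code, so a positive difference means the second target has the larger value.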