Module segment

This module calculates the basic rank metrics that are evaluated at the segment level (i.e. one ranking list at a time).

Created on Nov 25, 2012


Author: Eleftherios Avramidis

Functions
float
kendall_tau_prob(tau, pairs)
Calculation of Kendall tau hypothesis testing based on scipy calculation
namedtuple(float, float, int, int, int, int, int, int)
kendall_tau(predicted_rank_vector, original_rank_vector, **kwargs)
This is the refined calculation of segment-level Kendall tau of predicted vs human ranking according to WMT12 (Birch et al., 2012)
[float, ...]
_calculate_gains(predicted_rank_vector, original_rank_vector, verbose=True)
Calculate the gain for each one of the predicted ranks
float
idcg(gains, k)
Calculate the Ideal Discounted Cumulative Gain, for the given vector of ranking gains
tuple(float,float)
ndgc_err(predicted_rank_vector, original_rank_vector, k=None)
Calculate the normalized Discounted Cumulative Gain and the Expected Reciprocal Rank on a sentence level. This follows the definition of DCG and ERR, and the implementation of the Yahoo Learning to Rank challenge
 
reciprocal_rank(predicted_rank_vector, original_rank_vector, **kwargs)
Calculates the First Answer Reciprocal Rank according to Radev et al. (2002)
Variables
  __package__ = 'evaluation.ranking'
Function Details

kendall_tau_prob(tau, pairs)

Calculation of Kendall tau hypothesis testing based on scipy calculation

Parameters:
  • tau (float) - already calculated tau coefficient
  • pairs (int) - count of pairs
Returns: float
the probability for the null hypothesis of X and Y being independent
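
The source only states that the test is "based on scipy calculation". As a minimal sketch, assuming the usual large-sample normal approximation for Kendall tau without ties, the p-value can be derived as below. The function name `kendall_tau_pvalue` and the parameter `n` (taken here to be the number of ranked items) are hypothetical; how the module maps its `pairs` argument onto this formula is not documented here.

```python
import math


def kendall_tau_pvalue(tau, n):
    """Two-sided p-value for the null hypothesis of independence.

    Sketch only: uses the normal approximation without tie correction.
    `n` is assumed to be the number of ranked items (an assumption,
    not taken from the module).
    """
    if n < 2:
        raise ValueError("at least two items are needed")
    # Variance of tau under the null hypothesis: 2(2n + 5) / (9n(n - 1)).
    variance = (4.0 * n + 10.0) / (9.0 * n * (n - 1))
    z = tau / math.sqrt(variance)
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A tau of 0 gives a probability of 1, and the probability shrinks as the absolute value of tau or the number of items grows.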

kendall_tau(predicted_rank_vector, original_rank_vector, **kwargs)

This is the refined calculation of segment-level Kendall tau of predicted vs human ranking according to WMT12 (Birch et al., 2012)

Parameters:
  • predicted_rank_vector ([str, ..]) - a list of integers representing the predicted ranks
  • original_rank_vector ([str, ..]) - a list of integers containing the original (human) ranks
  • ties (string) - way of handling ties, passed to sentence.ranking.Ranking object
Returns: namedtuple(float, float, int, int, int, int, int, int)
a named tuple with the following fields, in order:
  • the Kendall tau score
  • the probability for the null hypothesis of X and Y being independent
  • the count of concordant pairs
  • the count of discordant pairs
  • the count of pairs used for calculating tau (excluding "invalid" pairs)
  • the count of original ties
  • the count of predicted ties
  • the count of all pairs
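
The pair counts listed above can be illustrated with a short sketch. It is not the module's implementation: the name `kendall_tau_sketch`, the `TauCounts` tuple and the tie handling are assumptions made for illustration. Here a pair tied in the original ranking is treated as "invalid" and skipped, and a pair tied only in the prediction is counted but left out of tau. The actual behaviour depends on the `ties` argument. The p-value field is omitted.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical result type; field names are illustrative only.
TauCounts = namedtuple(
    "TauCounts",
    "tau concordant discordant valid_pairs original_ties predicted_ties all_pairs",
)


def kendall_tau_sketch(predicted, original):
    """Count concordant and discordant pairs between two rank lists."""
    if len(predicted) != len(original):
        raise ValueError("rank vectors must have the same length")
    concordant = discordant = 0
    original_ties = predicted_ties = all_pairs = 0
    for i, j in combinations(range(len(original)), 2):
        all_pairs += 1
        o_diff = original[i] - original[j]
        p_diff = predicted[i] - predicted[j]
        if o_diff == 0:
            # Assumption: human ties make the pair invalid.
            original_ties += 1
        elif p_diff == 0:
            # Assumption: predicted ties are counted but not scored.
            predicted_ties += 1
        elif o_diff * p_diff > 0:
            concordant += 1
        else:
            discordant += 1
    valid_pairs = concordant + discordant
    # tau = (concordant - discordant) / (concordant + discordant)
    tau = float(concordant - discordant) / valid_pairs if valid_pairs else 0.0
    return TauCounts(tau, concordant, discordant, valid_pairs,
                     original_ties, predicted_ties, all_pairs)
```

With identical rank lists every valid pair is concordant and tau is 1.0; with a fully reversed prediction tau is -1.0.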

_calculate_gains(predicted_rank_vector, original_rank_vector, verbose=True)

Calculate the gain for each one of the predicted ranks

Parameters:
  • predicted_rank_vector ([int, ...]) - list of integers representing the predicted ranks
  • original_rank_vector ([int, ...]) - list of integers containing the original ranks
Returns: [float, ...]
a list of gains, relevant to the DCG calculation
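
The page does not say how a gain is derived from a rank. One possible reading, sketched below under stated assumptions, is to turn each original rank into a relevance grade (rank 1 is best, the worst rank gets grade 0), apply the exponential gain 2^grade - 1 that is common in DCG, and list the gains in the order given by the predicted ranks. The name `gains_in_predicted_order` and the grade mapping are hypothetical and may differ from what `_calculate_gains` does.

```python
def gains_in_predicted_order(predicted, original):
    """Return one gain per item, ordered by predicted rank (best first).

    Assumptions (not from the module): ranks are integers where a lower
    value is better, and the relevance grade of an item is the distance
    of its original rank from the worst original rank.
    """
    if len(predicted) != len(original):
        raise ValueError("rank vectors must have the same length")
    worst = max(original)
    grades = [worst - rank for rank in original]
    # Stable sort: items tied in the prediction keep their input order.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    return [float(2 ** grades[i] - 1) for i in order]
```

A perfect prediction yields gains in descending order, which is what a DCG computation rewards most.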

idcg(gains, k)

Calculate the Ideal Discounted Cumulative Gain, for the given vector of ranking gains

Parameters:
  • gains ([float, ...]) - a list of integers pointing to the ranks
  • k (int) - the DCG cut-off
Returns: float
the calculated Ideal Discounted Cumulative Gain
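
As a minimal sketch of the idea, the ideal DCG is the DCG of the same gains arranged in the best possible order, truncated at the cut-off `k`. The function name `ideal_dcg` and the base-2 logarithmic discount are assumptions based on the standard DCG definition, not on the module's source.

```python
import math


def ideal_dcg(gains, k):
    """DCG of the gains sorted in descending order, cut off at position k."""
    best_first = sorted(gains, reverse=True)[:k]
    # Position i (1-based) is discounted by log2(i + 1).
    return sum(g / math.log(i + 1, 2) for i, g in enumerate(best_first, start=1))
```

Because the gains are sorted first, the result does not depend on the order in which they are passed in.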

ndgc_err(predicted_rank_vector, original_rank_vector, k=None)

Calculate the normalized Discounted Cumulative Gain and the Expected Reciprocal Rank on a sentence level. This follows the definition of DCG and ERR, and the implementation of the Yahoo Learning to Rank challenge

Parameters:
  • predicted_rank_vector ([int, ...]) - list of integers representing the predicted ranks
  • original_rank_vector ([int, ...]) - list of integers containing the original ranks.
  • k (int) - the cut-off for the calculation of the gains. If not specified, the length of the ranking is used
Returns: tuple(float,float)
a tuple containing the values for the two metrics
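
The two metrics can be sketched from their standard definitions. The block below is an illustration, not the module's code: it takes relevance grades already listed in predicted order, and the names `ndcg_err_sketch`, `grades` and `max_grade` are hypothetical. DCG uses the gain 2^grade - 1 with a log2 discount; ERR uses the stop probability (2^grade - 1) / 2^max_grade.

```python
import math


def ndcg_err_sketch(grades, max_grade, k=None):
    """Return (nDCG, ERR) for relevance grades given in predicted order."""
    if k is None:
        k = len(grades)  # default cut-off: the full ranking

    def dcg(ordered):
        return sum((2 ** g - 1) / math.log(i + 1, 2)
                   for i, g in enumerate(ordered[:k], start=1))

    ideal = dcg(sorted(grades, reverse=True))
    ndcg = dcg(grades) / ideal if ideal > 0 else 0.0

    err = 0.0
    not_stopped = 1.0  # probability that the user reached this position
    for i, g in enumerate(grades[:k], start=1):
        stop = (2 ** g - 1) / float(2 ** max_grade)
        err += not_stopped * stop / i
        not_stopped *= 1.0 - stop
    return ndcg, err
```

For example, grades `[2, 1, 0]` with `max_grade=2` are already in ideal order, so nDCG is 1.0, and ERR is 0.75 + 0.25 * 0.25 / 2 = 0.78125.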

reciprocal_rank(predicted_rank_vector, original_rank_vector, **kwargs)

Calculates the First Answer Reciprocal Rank according to Radev et al. (2002)

Parameters:
  • predicted_rank_vector ([int, ...]) - list of integers representing the predicted ranks
  • original_rank_vector ([int, ...]) - list of integers containing the original ranks.
  • ties (string) - way of handling ties, passed to sentence.ranking.Ranking object
Returns:
the reciprocal rank value
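
As a rough illustration, the first-answer reciprocal rank can be read as the reciprocal of the predicted position at which the first "correct" item appears. The sketch below assumes that the correct items are those with the best (lowest) original rank and that ranks are positive integers starting at 1. The function name is hypothetical, and the module's own treatment of ties via the `ties` argument is not reproduced.

```python
def first_answer_reciprocal_rank(predicted, original):
    """Reciprocal of the best predicted rank among the originally best items."""
    if not original or len(predicted) != len(original):
        raise ValueError("rank vectors must be non-empty and of equal length")
    best_original = min(original)
    # Predicted rank of the highest-placed item that is best in the original.
    first_hit = min(p for p, o in zip(predicted, original) if o == best_original)
    return 1.0 / first_hit
```

If the item ranked best by humans is predicted at rank 2, the value is 0.5; if it is predicted at rank 1, the value is 1.0.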