
Module segment

This module calculates the basic rank metrics that are evaluated at the segment level (i.e. one ranking list at a time).

Created on Nov 25, 2012

Author: Eleftherios Avramidis

Functions


float

kendall_tau_prob(tau, pairs)
Hypothesis test for an already calculated Kendall tau coefficient, based on the SciPy calculation


namedtuple(float, float, int, int, int, int, int, int)

kendall_tau(predicted_rank_vector, original_rank_vector, **kwargs)
Refined calculation of segment-level Kendall tau of the predicted vs. the human ranking, according to WMT12 (Birch et al. 2012)


[float, ...]

_calculate_gains(predicted_rank_vector, original_rank_vector, verbose=True)
Calculate the gain for each one of the predicted ranks


float

idcg(gains, k)
Calculate the Ideal Discounted Cumulative Gain, for the given vector of ranking gains


tuple(float,float)

ndgc_err(predicted_rank_vector, original_rank_vector, k=None)
Calculate the Normalized Discounted Cumulative Gain (NDCG) and the Expected Reciprocal Rank (ERR) on a sentence level. This follows the definitions of DCG and ERR and the implementation used in the Yahoo Learning to Rank challenge


reciprocal_rank(predicted_rank_vector, original_rank_vector, **kwargs)
Calculates the First Answer Reciprocal Rank according to Radev et al. (2002)


Variables


__package__ = 'evaluation.ranking'

Function Details


kendall_tau_prob(tau, pairs)


Hypothesis test for an already calculated Kendall tau coefficient, based on the SciPy calculation

Parameters:

tau (float) - already calculated tau coefficient
pairs (int) - count of pairs

Returns: float

the probability for the null hypothesis of X and Y being independent
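The text above only says that the test is "based on scipy". A minimal sketch of one plausible reading follows, assuming the normal approximation that older SciPy releases used inside scipy.stats.kendalltau. The name kendall_tau_prob_sketch and its argument n are hypothetical; how the module's pairs argument enters the formula is not stated here and is not reproduced.

```python
import math


def kendall_tau_prob_sketch(tau, n):
    """Two-sided p-value for Kendall tau under a normal approximation.

    Assumption: under independence, tau is approximately normal with
    mean 0 and variance (4n + 10) / (9n(n - 1)), where n is the count
    plugged into the formula.
    """
    if n < 2:
        raise ValueError("need a count of at least 2")
    variance = (4.0 * n + 10.0) / (9.0 * n * (n - 1))
    z = tau / math.sqrt(variance)
    # erfc(|z| / sqrt(2)) is the two-sided tail probability of a standard normal
    return math.erfc(abs(z) / math.sqrt(2))
```

A tau of 0 gives a probability of 1.0 (no evidence against independence), while a tau close to 1 or -1 with a larger count gives a probability close to 0.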

kendall_tau(predicted_rank_vector, original_rank_vector, **kwargs)


This is the refined calculation of segment-level Kendall tau of the predicted vs. the human ranking, according to WMT12 (Birch et al. 2012)

Parameters:

predicted_rank_vector ([int, ...]) - a list of integers representing the predicted ranks
original_rank_vector ([int, ...]) - a list of integers containing the original (human) ranks
ties (string) - way of handling ties, given as a keyword argument and passed to the sentence.ranking.Ranking object

Returns: namedtuple(float, float, int, int, int, int, int, int)

a named tuple containing, in this order:
the Kendall tau score
the probability for the null hypothesis of X and Y being independent
the count of concordant pairs
the count of discordant pairs
the count of pairs used for calculating tau (excluding "invalid" pairs)
the count of original ties
the count of predicted ties
the count of all pairs
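To make the pair counting concrete, here is a minimal sketch of a segment-level Kendall tau. It rests on one assumed tie policy: pairs tied in the original ranking are excluded as invalid, and pairs tied only in the predicted ranking count as discordant. The module's ties option may select a different policy, and kendall_tau_sketch is a hypothetical name that returns only three of the eight documented values.

```python
from itertools import combinations


def kendall_tau_sketch(predicted, original):
    """Return (tau, concordant, discordant) for two equal-length rank lists."""
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(original)), 2):
        original_diff = original[i] - original[j]
        if original_diff == 0:
            # Assumption: a tie in the original ranking makes the pair invalid
            continue
        predicted_diff = predicted[i] - predicted[j]
        if original_diff * predicted_diff > 0:
            concordant += 1
        else:
            # Assumption: a predicted tie is penalised like a wrong order
            discordant += 1
    valid_pairs = concordant + discordant
    if valid_pairs == 0:
        return None, concordant, discordant
    tau = float(concordant - discordant) / valid_pairs
    return tau, concordant, discordant
```

For example, kendall_tau_sketch([1, 2, 2], [1, 2, 3]) counts two concordant pairs and one discordant pair (the predicted tie), so tau is 1/3.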

_calculate_gains(predicted_rank_vector, original_rank_vector, verbose=True)


Calculate the gain for each one of the predicted ranks

Parameters:

predicted_rank_vector ([int, ...]) - list of integers representing the predicted ranks
original_rank_vector ([int, ...]) - list of integers containing the original ranks

Returns: [float, ...]

a list of gains, relevant to the DCG calculation

idcg(gains, k)


Calculate the Ideal Discounted Cumulative Gain, for the given vector of ranking gains

Parameters:

gains ([float, ...]) - a list of gains, one for each rank
k (int) - the DCG cut-off

Returns: float

the calculated Ideal Discounted Cumulative Gain
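As an illustration, the ideal DCG can be sketched as the DCG of the same gains sorted from largest to smallest. This assumes the common log2 position discount; dcg_sketch and idcg_sketch are hypothetical names, and the module's own discount may differ.

```python
import math


def dcg_sketch(gains, k):
    """DCG of the first k gains, discounted by log2(position + 1)."""
    # enumerate() is 0-based, so the 1-based position is index + 1
    return sum(gain / math.log2(index + 2) for index, gain in enumerate(gains[:k]))


def idcg_sketch(gains, k):
    """Ideal DCG: the DCG of the best possible ordering of the gains."""
    return dcg_sketch(sorted(gains, reverse=True), k)
```

With gains [0, 1, 3] and k=2, the ideal ordering is [3, 1, 0], so the result is 3/log2(2) + 1/log2(3).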

ndgc_err(predicted_rank_vector, original_rank_vector, k=None)


Calculate the Normalized Discounted Cumulative Gain (NDCG) and the Expected Reciprocal Rank (ERR) on a sentence level. This follows the definitions of DCG and ERR and the implementation used in the Yahoo Learning to Rank challenge

Parameters:

predicted_rank_vector ([int, ...]) - list of integers representing the predicted ranks
original_rank_vector ([int, ...]) - list of integers containing the original ranks.
k (int) - the cut-off for the calculation of the gains. If not specified, the length of the ranking is used

Returns: tuple(float,float)

a tuple containing the values for the two metrics
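The following sketch shows how the two metrics could be computed from a pair of rank vectors. Several details are assumptions and not taken from the module: the relevance grade of an item is the worst original rank minus its own original rank, the gain is 2^grade - 1, the discount is log2, and ERR normalises by 2 raised to the highest grade in the list. ndcg_err_sketch is a hypothetical name.

```python
import math


def ndcg_err_sketch(predicted, original, k=None):
    """Return (ndcg, err) for one ranking list."""
    n = len(predicted)
    if k is None:
        k = n  # default cut-off: the length of the ranking
    worst_rank = max(original)
    # Assumption: a better (lower) original rank means a higher relevance grade
    relevance = [worst_rank - rank for rank in original]
    # Walk through the items in the order given by the predicted ranks
    order = sorted(range(n), key=lambda i: predicted[i])
    ranked_relevance = [relevance[i] for i in order]

    gains = [2 ** grade - 1 for grade in ranked_relevance]
    ideal_gains = sorted(gains, reverse=True)
    dcg = sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))
    idcg = sum(g / math.log2(pos + 2) for pos, g in enumerate(ideal_gains[:k]))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    max_grade = max(relevance)
    err = 0.0
    p_continue = 1.0  # probability that the user has not stopped yet
    for pos, grade in enumerate(ranked_relevance[:k]):
        p_stop = (2 ** grade - 1) / float(2 ** max_grade)
        err += p_continue * p_stop / (pos + 1)
        p_continue *= 1 - p_stop
    return ndcg, err
```

A prediction that matches the original order yields an NDCG of 1.0; reversing the prediction lowers both values.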

reciprocal_rank(predicted_rank_vector, original_rank_vector, **kwargs)


Calculates the First Answer Reciprocal Rank according to Radev et al. (2002)

Parameters:

predicted_rank_vector ([int, ...]) - list of integers representing the predicted ranks
original_rank_vector ([int, ...]) - list of integers containing the original ranks.
ties (string) - way of handling ties, given as a keyword argument and passed to the sentence.ranking.Ranking object

Returns:

the reciprocal rank value
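A minimal sketch of the idea, assuming that the "first answer" is the item ranked best in the original ranking and that the metric is the reciprocal of the predicted rank given to that item. The name reciprocal_rank_sketch is hypothetical, and the tie handling shown here (taking the best predicted rank among items tied for best) is an assumption and not the module's ties option.

```python
def reciprocal_rank_sketch(predicted, original):
    """Reciprocal of the predicted rank of the originally best item."""
    best_original = min(original)
    # Predicted ranks of all items that share the best original rank
    candidate_ranks = [
        predicted_rank
        for predicted_rank, original_rank in zip(predicted, original)
        if original_rank == best_original
    ]
    return 1.0 / min(candidate_ranks)
```

For example, if the originally best item is predicted at rank 2, the value is 0.5; if it is predicted at rank 1, the value is 1.0.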