Classification Metrics
This package provides a variety of classification metrics for analysing the performance of classification models based on the provided `y_pred` and `y_true`. The choice of metrics is important: it determines how the performance of machine learning algorithms is measured and compared. The expected format of `y_pred` and `y_true` (probabilities, one-hot encodings, or integer labels) varies between functions and is noted for each function below.
Functions
Metrics.binary_accuracy — Function

`binary_accuracy(y_pred, y_true; threshold=0.5)`

Calculates averaged binary accuracy based on `y_pred` and `y_true`. The argument `threshold` specifies the minimum predicted probability in `y_pred` required for a prediction to be labelled `1`. The default value is `0.5`.
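A minimal usage sketch (the vectors here are hypothetical; `binary_accuracy` is assumed to take a vector of predicted probabilities and a vector of 0/1 labels, as described above):

```julia
using Metrics

y_pred = [0.2, 0.9, 0.7, 0.4]   # predicted probabilities (hypothetical)
y_true = [0, 1, 1, 1]           # ground-truth binary labels

# With the default threshold of 0.5 the predictions become [0, 1, 1, 0],
# so 3 of 4 labels match.
binary_accuracy(y_pred, y_true)

# Raising the threshold changes which probabilities count as a 1.
binary_accuracy(y_pred, y_true; threshold=0.8)
```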
Metrics.categorical_accuracy — Function

`categorical_accuracy(y_pred, y_true)`

Calculates averaged categorical accuracy based on `y_pred` and `y_true`.
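A sketch for a hypothetical 3-class problem. The classes × samples layout (one column per sample) is an assumption, consistent with the `(2, length(y_pred))` shape that `bin_to_cat` produces below:

```julia
using Metrics

# Rows are classes, columns are samples (assumed layout).
y_pred = [0.7 0.2 0.1;
          0.2 0.5 0.3;
          0.1 0.3 0.6]
y_true = [1 0 0;          # one-hot encoded ground truth
          0 0 1;
          0 1 0]

# Only the first sample's argmax matches its true class,
# so the expected accuracy is about 1/3.
categorical_accuracy(y_pred, y_true)
```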
Metrics.cohen_kappa — Function

`cohen_kappa(y_pred, y_true)`

Measures the agreement between two raters (here, the predictions and the ground truth) who each classify N items into C mutually exclusive categories, using the observed data to calculate the probability of each rater randomly selecting each category. If the raters are in complete agreement then κ = 1; if there is no agreement among the raters other than what would be expected by chance, κ = 0.

Ref: Cohen's Kappa
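For reference, the statistic is computed as κ = (pₒ − pₑ) / (1 − pₑ), where pₒ is the observed agreement between the two raters and pₑ is the agreement expected by chance.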
Metrics.confusion_matrix — Function

`confusion_matrix(y_pred, y_true)`

Creates a confusion matrix for classification problems based on the provided `y_pred` and `y_true`. Expects `y_true` to be one-hot encoded already.
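A sketch reusing the hypothetical 3-class data from above; the orientation of the returned matrix (predictions along rows vs. columns) is package-specific:

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]   # one-hot encoded

cm = confusion_matrix(y_pred, y_true)   # C × C matrix of prediction counts
```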
Metrics.f_beta_score — Function

`f_beta_score(y_pred, y_true; β=1, avg_type="macro", sample_weights=nothing)`

Computes the F-beta score: the weighted harmonic mean of precision and recall, reaching its optimal value at 1 and its worst value at 0.

Arguments

- `y_pred`: predicted values.
- `y_true`: ground-truth values against which the predictions are assessed.
- `β=1`: the weight of precision in the combined score. If `β < 1`, more weight is given to precision, while `β > 1` favors recall.
- `avg_type="macro"`: type of average used when calculating the score for multiclass models. Can be `macro`, `micro`, or `weighted`. Defaults to `macro`.
- `sample_weights`: class weights to provide when `avg_type` is set to `weighted`. Useful for imbalanced classes.
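A sketch showing the effect of `β` and `avg_type` (hypothetical data as in the earlier examples):

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

f_beta_score(y_pred, y_true)                     # β=1 is the classic F1 score
f_beta_score(y_pred, y_true; β=2)                # β>1 weighs recall more heavily
f_beta_score(y_pred, y_true; avg_type="micro")   # micro-average across classes
```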
Metrics.false_alarm_rate — Function

`false_alarm_rate(y_pred, y_true; avg_type="macro", sample_weights=nothing)`

Computes the false alarm rate of the predictions with respect to the labels, as `1 - specificity(y_pred, y_true, avg_type, sample_weights)`.

Arguments

- `y_pred`: predicted values.
- `y_true`: ground-truth values against which the predictions are assessed.
- `avg_type="macro"`: type of average used when calculating the rate for multiclass models. Can be `macro`, `micro`, or `weighted`. Defaults to `macro`.
- `sample_weights`: class weights to provide when `avg_type` is set to `weighted`. Useful for imbalanced classes.

See also: `specificity`
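A quick sketch of the documented identity with `specificity` (hypothetical data):

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

# By definition the two quantities are complementary.
false_alarm_rate(y_pred, y_true) ≈ 1 - specificity(y_pred, y_true)
```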
Metrics.precision — Function

`precision(y_pred, y_true; avg_type="macro", sample_weights=nothing)`

Computes the precision of the predictions with respect to the labels.

Arguments

- `y_pred`: predicted values.
- `y_true`: ground-truth values against which the predictions are assessed.
- `avg_type="macro"`: type of average used when calculating the precision of multiclass models. Can be `macro`, `micro`, or `weighted`. Defaults to `macro`.
- `sample_weights`: class weights to provide when `avg_type` is set to `weighted`. Useful for imbalanced classes.
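A sketch (the call is qualified as `Metrics.precision` because Julia's `Base` also exports a `precision` function, for floating-point types):

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

Metrics.precision(y_pred, y_true)                    # macro-averaged by default
Metrics.precision(y_pred, y_true; avg_type="micro")  # micro-averaged
```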
Metrics.recall — Function

`recall(y_pred, y_true; avg_type="macro", sample_weights=nothing)`

Computes the recall of the predictions with respect to the labels.

Arguments

- `y_pred`: predicted values.
- `y_true`: ground-truth values against which the predictions are assessed.
- `avg_type="macro"`: type of average used when calculating the recall of multiclass models. Can be `macro`, `micro`, or `weighted`. Defaults to `macro`.
- `sample_weights`: class weights to provide when `avg_type` is set to `weighted`. Useful for imbalanced classes.

Aliases: `sensitivity` and `detection_rate`
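A sketch, also exercising the documented `sensitivity` alias:

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

recall(y_pred, y_true)        # macro-averaged recall
sensitivity(y_pred, y_true)   # alias; should give the same result
```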
Metrics.sparse_categorical — Function

`sparse_categorical(y_pred, y_true)`

Calculates sparse categorical accuracy based on `y_pred` and `y_true`. It evaluates whether the true value equals the index of the maximal predicted value. Here, `y_true` is expected to provide only an integer label (starting from index `0`) for each data element (i.e. not one-hot encoded).
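A sketch with 0-indexed integer labels, matching the one-hot example used above:

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [0, 2, 1]   # integer labels starting from 0 (not one-hot)

sparse_categorical(y_pred, y_true)
```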
Metrics.specificity — Function

`specificity(y_pred, y_true; avg_type="macro", sample_weights=nothing)`

Computes the specificity of the predictions with respect to the labels.

Arguments

- `y_pred`: predicted values.
- `y_true`: ground-truth values against which the predictions are assessed.
- `avg_type="macro"`: type of average used when calculating the specificity of multiclass models. Can be `macro`, `micro`, or `weighted`. Defaults to `macro`.
- `sample_weights`: class weights to provide when `avg_type` is set to `weighted`. Useful for imbalanced classes.
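A sketch of weighted averaging; the expected format of `sample_weights` (here a per-class weight vector) is an assumption:

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

# Hypothetical per-class weights for an imbalanced problem.
specificity(y_pred, y_true; avg_type="weighted", sample_weights=[0.5, 0.3, 0.2])
```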
Metrics.top_k_categorical — Function

`top_k_categorical(y_pred, y_true; k=3)`

Evaluates whether the index of the true value is among the indices of the top k predicted values. The default value of `k` is `3`.
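A sketch with four classes and `k=2`, so the true class only needs to rank among the top two predictions:

```julia
using Metrics

# Four classes, one sample per column (assumed layout).
y_pred = [0.1 0.4;
          0.2 0.3;
          0.3 0.2;
          0.4 0.1]
y_true = [0 0;
          0 1;
          1 0;
          0 0]

top_k_categorical(y_pred, y_true; k=2)
```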
Metrics.top_k_sparse_categorical — Function

`top_k_sparse_categorical(y_pred, y_true; k=3)`

Evaluates whether the true value is among the indices of the top k predicted values. The default value of `k` is `3`. Like `sparse_categorical`, it expects `y_true` to provide only an integer label (starting from index `0`) for each data element (i.e. not one-hot encoded).
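The same sketch with integer labels instead of one-hot columns:

```julia
using Metrics

y_pred = [0.1 0.4; 0.2 0.3; 0.3 0.2; 0.4 0.1]
y_true = [2, 1]   # 0-indexed integer labels

top_k_sparse_categorical(y_pred, y_true; k=2)
```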
Combined Stats
Some functions return an overall analysis of model performance in a single call. They are:
Metrics.statsfromTFPN — Function

`statsfromTFPN(TP, TN, FP, FN)`

Computes statistics for binary classification, or one-vs-all statistics for multiclass classification.

Arguments:

- `TP`: true positive values
- `TN`: true negative values
- `FP`: false positive values
- `FN`: false negative values

Returns the resulting stats as a dictionary.
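A sketch with hypothetical counts; the exact dictionary keys are determined by the package:

```julia
using Metrics

# Hypothetical confusion counts for a binary problem.
stats = statsfromTFPN(40, 30, 10, 20)

keys(stats)   # inspect which statistics were computed
```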
Metrics.classwise_stats — Function

`classwise_stats(y_pred, y_true)`

Computes statistics for each class of a multiclass classification problem, based on the provided `y_pred` and `y_true`.

Returns the resulting stats as a dictionary.
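A sketch (hypothetical one-hot data as above):

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

per_class = classwise_stats(y_pred, y_true)   # dictionary of per-class statistics
```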
Metrics.global_stats — Function

`global_stats(y_pred, y_true; avg_type="macro")`

Computes the overall statistics based on the provided `y_pred` and `y_true`. `avg_type` specifies the type of average used while evaluating the stats; currently it accepts "macro" or "micro".

Returns the resulting stats as a dictionary.
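A sketch comparing the two supported averaging modes:

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]

global_stats(y_pred, y_true)                    # macro-averaged summary
global_stats(y_pred, y_true; avg_type="micro")  # micro-averaged summary
```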
Utils
These are some utility functions to aid the overall performance analysis.
Metrics.bin_to_cat — Function

`bin_to_cat(y_pred, y_true)`

Converts binary data to categorical data with two categories. Returns `y_pred` and `y_true` of shape `(2, length(y_pred))` as a tuple. This utility supports performance metrics such as `precision` and `recall`, whose binary inputs first need to be converted to categorical form before the metric is applied.
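A sketch of the documented workflow: convert binary inputs to two-category form, then apply a multiclass metric:

```julia
using Metrics

y_pred = [0.2, 0.9, 0.7, 0.4]   # binary probabilities (hypothetical)
y_true = [0, 1, 1, 1]

# Returns a (2, 4) representation of each input as a tuple.
cat_pred, cat_true = bin_to_cat(y_pred, y_true)

recall(cat_pred, cat_true)
```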
Metrics.TFPN — Function

`TFPN(y_pred, y_true)`

Returns the confusion matrix and the true positives, true negatives, false positives, and false negatives for each class, based on `y_pred` and `y_true`. Expects `y_true` to be one-hot encoded already.
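A sketch; the precise return structure (ordering of the confusion matrix and the per-class counts) is package-specific, so the result is left packed:

```julia
using Metrics

y_pred = [0.7 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6]
y_true = [1 0 0; 0 0 1; 0 1 0]   # one-hot encoded

result = TFPN(y_pred, y_true)   # confusion matrix plus per-class TP/TN/FP/FN
```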