Classification Metrics

This package provides a variety of classification metrics for analysing the performance of classification models based on the provided y_true and y_pred. The metrics you choose to evaluate your machine learning model are important: they influence how the performance of machine learning algorithms is measured and compared. For most of these functions, the provided y_true is expected to be one-hot encoded (the sparse variants below are exceptions).

Functions

Metrics.binary_accuracy - Function
binary_accuracy(y_pred, y_true; threshold=0.5)

Calculates Averaged Binary Accuracy based on y_pred and y_true. The threshold argument specifies the minimum predicted probability y_pred required for a sample to be labelled as 1. The default value is 0.5.
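
A minimal usage sketch (the data values are illustrative, not from the package documentation):

    using Metrics

    y_pred = [0.2, 0.7, 0.9, 0.4]   # predicted probabilities
    y_true = [0, 1, 1, 1]

    binary_accuracy(y_pred, y_true)                 # threshold 0.5 labels [0, 1, 1, 0] -> 0.75
    binary_accuracy(y_pred, y_true; threshold=0.3)  # threshold 0.3 labels [0, 1, 1, 1] -> 1.0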

Metrics.cohen_kappa - Function
cohen_kappa(y_pred, y_true)

Measures the agreement between two raters (predicted and ground truth, here) who each classify N items into C mutually exclusive categories, using the observed data to calculate the probabilities of each observer randomly seeing each category. If the raters are in complete agreement then κ = 1. If there is no agreement among the raters other than what would be expected by chance, κ = 0.

Ref: Cohen's Kappa
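
Concretely, κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. A standalone sketch of that computation in plain Julia (not the package's internals):

    # Integer class labels from two "raters": predictions and ground truth
    y_pred = [1, 0, 1, 1, 0, 1]
    y_true = [1, 0, 0, 1, 0, 1]

    n   = length(y_true)
    p_o = sum(y_pred .== y_true) / n                  # observed agreement: 5/6
    p_e = sum(count(==(c), y_pred) / n * count(==(c), y_true) / n
              for c in unique(vcat(y_pred, y_true)))  # chance agreement: 1/2
    κ   = (p_o - p_e) / (1 - p_e)                     # 2/3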

Metrics.confusion_matrix - Function
confusion_matrix(y_pred, y_true)

Function to create a confusion matrix for classification problems based on the provided y_pred and y_true. Expects y_true to be one-hot encoded already.
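
A usage sketch, assuming the (n_classes, n_samples) column layout that bin_to_cat (under Utils) also produces:

    using Metrics

    y_true = [1 0 0;            # one-hot: 3 classes × 3 samples
              0 1 0;
              0 0 1]
    y_pred = [0.8 0.1 0.2;      # per-class scores, one column per sample
              0.1 0.7 0.3;
              0.1 0.2 0.5]

    confusion_matrix(y_pred, y_true)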

Metrics.f_beta_score - Function
f_beta_score(y_pred, y_true; β=1, avg_type="macro", sample_weights=nothing)

Computes the F-beta score: the weighted harmonic mean of precision and recall, reaching its optimal value at 1 and its worst value at 0 (a worked instance of the formula follows the argument list).

Arguments

  • y_pred: predicted values.
  • y_true: ground truth values on the basis of which predicted values are to be assessed.
  • β=1: determines the weight of recall relative to precision in the combined score. β < 1 gives more weight to precision, while β > 1 favors recall.
  • avg_type="macro": Type of average to be used while calculating the score for multiclass models. Accepts macro, micro, or weighted. Defaults to macro.
  • sample_weights: Class weights to be provided when avg_type is set to weighted. Useful in case of imbalanced classes.
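
A worked instance of the harmonic-mean formula, in plain arithmetic with illustrative values:

    prec, rec = 0.6, 0.8
    β  = 2                                              # favours recall
    fβ = (1 + β^2) * prec * rec / (β^2 * prec + rec)    # 0.75
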
Metrics.false_alarm_rate - Function
false_alarm_rate(y_pred, y_true; avg_type="macro", sample_weights=nothing)

Computes the false alarm rate of the predictions with respect to the labels, as 1 - specificity(y_pred, y_true, avg_type, sample_weights).

Arguments

  • y_pred: predicted values.
  • y_true: ground truth values on the basis of which predicted values are to be assessed.
  • avg_type="macro": Type of average to be used while calculating the rate for multiclass models. Accepts macro, micro, or weighted. Defaults to macro.
  • sample_weights: Class weights to be provided when avg_type is set to weighted. Useful in case of imbalanced classes.

See also: specificity
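
Checking the stated identity on a small example (shapes assumed one-hot, as elsewhere in these docs):

    using Metrics

    y_true = [1 0; 0 1]
    y_pred = [0.9 0.4; 0.1 0.6]

    false_alarm_rate(y_pred, y_true) ≈ 1 - specificity(y_pred, y_true)  # true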

Metrics.precision - Function
precision(y_pred, y_true; avg_type="macro", sample_weights=nothing)

Computes the precision of the predictions with respect to the labels.

Arguments

  • y_pred: predicted values.
  • y_true: ground truth values on the basis of which predicted values are to be assessed.
  • avg_type="macro": Type of average to be used while calculating precision for multiclass models. Accepts macro, micro, or weighted. Defaults to macro.
  • sample_weights: Class weights to be provided when avg_type is set to weighted. Useful in case of imbalanced classes.
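
A sketch of the averaging options, qualified as Metrics.precision to avoid clashing with Base.precision (shapes assumed one-hot; weights illustrative):

    using Metrics

    y_true = [1 0 0 1; 0 1 0 0; 0 0 1 0]
    y_pred = [0.7 0.2 0.1 0.4; 0.2 0.6 0.2 0.5; 0.1 0.2 0.7 0.1]

    Metrics.precision(y_pred, y_true)                    # macro average (default)
    Metrics.precision(y_pred, y_true; avg_type="micro")  # micro average
    Metrics.precision(y_pred, y_true; avg_type="weighted", sample_weights=[2, 1, 1])
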
Metrics.recall - Function
recall(y_pred, y_true; avg_type="macro", sample_weights=nothing)

Computes the recall of the predictions with respect to the labels.

Arguments

  • y_pred: predicted values.
  • y_true: ground truth values on the basis of which predicted values are to be assessed.
  • avg_type="macro": Type of average to be used while calculating recall for multiclass models. Accepts macro, micro, or weighted. Defaults to macro.
  • sample_weights: Class weights to be provided when avg_type is set to weighted. Useful in case of imbalanced classes.

Aliases: sensitivity and detection_rate

Metrics.sparse_categorical - Function
sparse_categorical(y_pred, y_true)

Calculates Sparse Categorical Accuracy based on y_pred and y_true. It evaluates whether the true label is equal to the index of the maximal predicted value. Here, y_true is expected to provide only an integer (starting from index 0) as the label for each data element (i.e. not one-hot encoded).
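
A usage sketch, with 0-based integer labels and one column of scores per sample (the column orientation is an assumption):

    using Metrics

    y_true = [0, 2, 1]            # integer labels, starting from 0
    y_pred = [0.8 0.1 0.3;        # row 1: scores for class 0
              0.1 0.2 0.6;        # row 2: scores for class 1
              0.1 0.7 0.1]        # row 3: scores for class 2

    sparse_categorical(y_pred, y_true)  # every column's argmax matches its label -> 1.0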

Metrics.specificity - Function
specificity(y_pred, y_true; avg_type="macro", sample_weights=nothing)

Computes the specificity of the predictions with respect to the labels.

Arguments

  • y_pred: predicted values.
  • y_true: ground truth values on the basis of which predicted values are to be assessed.
  • avg_type="macro": Type of average to be used while calculating specificity for multiclass models. Accepts macro, micro, or weighted. Defaults to macro.
  • sample_weights: Class weights to be provided when avg_type is set to weighted. Useful in case of imbalanced classes.

Metrics.top_k_categorical - Function
top_k_categorical(y_pred, y_true; k=3)

Evaluates whether the index of the true value is equal to any of the indices of the top k predicted values. The default value of k is 3.

Metrics.top_k_sparse_categorical - Function
top_k_sparse_categorical(y_pred, y_true; k=3)

Evaluates whether the true value is equal to any of the indices of the top k predicted values. The default value of k is 3. Like sparse_categorical, it expects y_true to provide only an integer (starting from index 0) as the label for each data element (i.e. not one-hot encoded).
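
A usage sketch, reusing the 0-based integer-label convention from sparse_categorical:

    using Metrics

    y_true = [2, 0]
    y_pred = [0.5 0.6;
              0.1 0.3;
              0.4 0.1]

    top_k_sparse_categorical(y_pred, y_true; k=2)  # both labels fall in their column's top 2 -> 1.0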


Combined Stats

Some functions return an overall analysis of the model's performance in a single call. They are:

Metrics.statsfromTFPN - Function
statsfromTFPN(TP, TN, FP, FN)

Computes statistics in the case of binary classification, or one-vs-all statistics in the case of multiclass classification.

Arguments:

  • TP: true positive values
  • TN: true negative values
  • FP: false positive values
  • FN: false negative values

Returns the result stats as a dictionary.
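
A sketch with scalar counts from a binary classifier (the dictionary's exact keys are not documented here, so inspect them rather than assume):

    using Metrics

    stats = statsfromTFPN(40, 45, 5, 10)  # TP, TN, FP, FN
    keys(stats)                           # see which statistics are reported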

Metrics.classwise_stats - Function
classwise_stats(y_pred, y_true)

Computes statistics for each class of a multiclass classification based on the provided y_pred and y_true.

Returns the result stats as a dictionary.

Metrics.global_stats - Function
global_stats(y_pred, y_true; avg_type="macro")

Computes the overall statistics based on the provided y_pred and y_true. avg_type specifies the type of average to be used while evaluating the stats. Currently, it can take the values "macro" or "micro".

Returns the result stats as a dictionary.
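
A sketch of both calls side by side (shapes assumed one-hot, as above):

    using Metrics

    y_true = [1 0 0 1; 0 1 0 0; 0 0 1 0]
    y_pred = [0.7 0.2 0.1 0.4; 0.2 0.6 0.2 0.5; 0.1 0.2 0.7 0.1]

    classwise_stats(y_pred, y_true)                 # per-class stats as a dictionary
    global_stats(y_pred, y_true; avg_type="micro")  # averaged overall stats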


Utils

These are some utility functions to aid the overall performance analysis.

Metrics.bin_to_cat - Function
bin_to_cat(y_pred, y_true)

Function to convert binary data to categorical data with two categories. Returns y_pred and y_true of shape (2, length(y_pred)) as a tuple. Utility function to support performance metrics like precision and recall, where the data first needs to be converted to categorical form before the metric is applied.
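
A sketch of the conversion feeding a multiclass metric:

    using Metrics

    y_pred = [0.9, 0.2, 0.7]
    y_true = [1, 0, 1]

    cat_pred, cat_true = bin_to_cat(y_pred, y_true)  # each of shape (2, 3)
    recall(cat_pred, cat_true)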

Metrics.TFPN - Function
TFPN(y_pred, y_true)

Returns the confusion matrix and the true positive, true negative, false positive and false negative counts for each class, based on y_pred and y_true. Expects y_true to be one-hot encoded already.
