cafaeval.evaluation

cafaeval.evaluation.solidify_prediction(pred, tau)

cafaeval.evaluation.compute_f(pr, rc)

cafaeval.evaluation.compute_s(ru, mi)

cafaeval.evaluation.compute_confusion_matrix(tau_arr, g, pred_matrix, toi, n_gt, ic_arr=None)

Perform the evaluation at the matrix level for all tau thresholds. The calculation is

cafaeval.evaluation.compute_confusion_matrix_exclude(tau_arr, g_perprotein, pred_matrix, toi_perprotein, n_gt, ic_arr=None)

Perform the evaluation at the matrix level for all tau thresholds. The calculation is

Here, g is the full ground truth matrix without filtering terms of interest (toi). Instead,

cafaeval.evaluation.compute_confusion_matrix_sparse(tau_arr, g, pred_matrix, toi, n_gt, ic_arr=None)

Sparse alternative to compute_confusion_matrix().

Scatters each non-zero prediction into the highest tau bin at which it is still active, then recovers per-threshold per-protein sums via a single right-to-left cumulative sum. Total cost is O(nnz + n_prot * n_tau) instead of O(n_tau * n_prot * n_toi) for the dense scan, which is the dominant win on real-world corpora where the prediction matrix is wide and mostly zero above typical thresholds.

Accepts the same inputs as compute_confusion_matrix() and returns an (n_tau, 6) array with the same column order (n, tp, fp, fn, pr, rc). tau_arr must be sorted in ascending order, which is already the case at all call sites (np.arange(th_step, 1, th_step)).
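The scatter and right-to-left cumulative sum described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper names sparse_counts and dense_counts are hypothetical, a prediction is assumed to be active at tau when score >= tau, and only the per-threshold, per-protein sums are returned rather than the final (n_tau, 6) array.

```python
import numpy as np

def sparse_counts(tau_arr, g, pred_matrix, toi, ic_arr=None):
    """Hypothetical helper: per-threshold, per-protein predicted and TP sums."""
    n_tau, n_prot = len(tau_arr), pred_matrix.shape[0]
    pred = pred_matrix[:, toi]
    gt = g[:, toi].astype(bool)

    # Visit only the non-zero predictions.
    rows, cols = np.nonzero(pred)
    scores = pred[rows, cols]
    # Highest index k with tau_arr[k] <= score; -1 means never active.
    bins = np.searchsorted(tau_arr, scores, side="right") - 1
    keep = bins >= 0
    rows, cols, bins = rows[keep], cols[keep], bins[keep]

    # Optional per-term weight (assumed to be indexed like the full term axis).
    w = np.ones(len(cols)) if ic_arr is None else np.asarray(ic_arr)[toi][cols]

    # Scatter each prediction into its single bin.
    n_pred = np.zeros((n_tau, n_prot))
    tp = np.zeros((n_tau, n_prot))
    np.add.at(n_pred, (bins, rows), w)
    np.add.at(tp, (bins, rows), w * gt[rows, cols])

    # Right-to-left cumulative sum: a prediction in bin k is active for all k' <= k.
    n_pred = n_pred[::-1].cumsum(axis=0)[::-1]
    tp = tp[::-1].cumsum(axis=0)[::-1]
    return n_pred, tp

def dense_counts(tau_arr, g, pred_matrix, toi, ic_arr=None):
    """Reference dense scan: one full pass over the matrix per threshold."""
    pred = pred_matrix[:, toi]
    gt = g[:, toi].astype(bool)
    w = np.ones(len(toi)) if ic_arr is None else np.asarray(ic_arr)[toi]
    n_pred = np.stack([((pred >= t) * w).sum(axis=1) for t in tau_arr])
    tp = np.stack([(((pred >= t) & gt) * w).sum(axis=1) for t in tau_arr])
    return n_pred, tp

# Toy data: 2 proteins, 4 terms, 3 ascending thresholds.
tau_arr = np.arange(0.25, 1, 0.25)
pred_matrix = np.array([[0.9, 0.0, 0.3, 0.6],
                        [0.0, 0.2, 0.0, 0.8]])
g = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1]])
toi = [0, 2, 3]

n_pred_sparse, tp_sparse = sparse_counts(tau_arr, g, pred_matrix, toi)
n_pred_dense, tp_dense = dense_counts(tau_arr, g, pred_matrix, toi)
```

On this toy input both helpers return the same arrays. The false-positive and false-negative sums follow as fp = n_pred - tp and fn = n_gt - tp, where n_gt is the per-protein ground-truth weight passed to the function.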

cafaeval.evaluation.compute_confusion_matrix_exclude_sparse(tau_arr, pred_sub, gt_sub, toi_mask, excluded_mask, n_gt, ic_arr=None)

Sparse alternative to compute_confusion_matrix_exclude().

Same single-pass scatter + right-to-left cumsum strategy as compute_confusion_matrix_sparse(), extended to the PK setting where the valid column set is protein-specific. A prediction at (row, col) contributes to protein row only if:

col belongs to the global toi (toi_mask), and
col is not in this protein’s exclude set (~excluded_mask).

Because the filter is expressed as a boolean AND over dense matrices, the sparse scatter sees only the surviving non-zeros — we never materialise the Python list of per-protein toi arrays. Cost drops from O(n_tau * sum_p |toi_p|) to O(nnz_valid + n_prot * n_tau).

Parameters:

pred_sub – (n_prot, n_terms) float — predictions restricted to proteins with at least one GT annotation in the toi.
gt_sub – (n_prot, n_terms) bool/int — ground truth for the same proteins.
toi_mask – (n_terms,) bool — global terms-of-interest flag.
excluded_mask – (n_prot, n_terms) bool — per-protein exclude flag.
n_gt – (n_prot,) float — pre-computed weight of valid GT annotations per protein (dense sum over gt_sub & toi_mask & ~excluded_mask, optionally multiplied by ic_arr).
ic_arr – (n_terms,) float or None — optional per-term weight.
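As an illustration of the filter and of how n_gt is described above, the following sketch builds the per-protein validity mask and applies it to toy arrays. The data and the intermediate names valid, pred_valid and gt_valid are hypothetical and chosen for this example only; the function's actual internals may differ.

```python
import numpy as np

# Toy inputs with the shapes listed in the parameter table (2 proteins, 4 terms).
pred_sub = np.array([[0.9, 0.4, 0.0, 0.7],
                     [0.0, 0.6, 0.8, 0.2]])
gt_sub = np.array([[1, 0, 0, 1],
                   [0, 1, 1, 0]], dtype=bool)
toi_mask = np.array([True, True, True, False])
excluded_mask = np.array([[False, True, False, False],
                          [False, False, True, False]])
ic_arr = None  # optional per-term weight

# A column is valid for a protein if it is in the global toi
# and not in that protein's exclude set.
valid = toi_mask[None, :] & ~excluded_mask

# Zero out filtered entries so a sparse scatter sees only the surviving non-zeros.
pred_valid = np.where(valid, pred_sub, 0.0)
rows, cols = np.nonzero(pred_valid)

# Weight of valid ground-truth annotations per protein.
gt_valid = gt_sub & valid
weights = np.ones(pred_sub.shape[1]) if ic_arr is None else np.asarray(ic_arr)
n_gt = (gt_valid * weights).sum(axis=1)
```

Here only the predictions at (0, 0) and (1, 1) survive the filter, and each protein keeps one valid ground-truth annotation.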

cafaeval.evaluation.compute_metrics(pred, gt_matrix, tau_arr, toi, gt_exclude=None, ic_arr=None, n_cpu=0)

Takes the prediction and the ground truth and, for each threshold in tau_arr, calculates the confusion matrix and returns the coverage, precision, recall, remaining uncertainty and misinformation. toi is the list of term indexes to be considered.
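To make the five quantities concrete, here is a small sketch that computes them for a single threshold. The formulas follow the usual CAFA conventions and are an assumption about what this function reports, not a copy of its code: precision is averaged over proteins with at least one prediction, the other quantities over all proteins, and every protein is assumed to have at least one ground-truth term in toi. The helper name metrics_at_tau is hypothetical.

```python
import numpy as np

def metrics_at_tau(pred, gt, toi, tau, ic_arr):
    """Hypothetical helper: coverage, precision, recall, ru and mi at one threshold."""
    p = pred[:, toi] >= tau            # predicted terms at this threshold
    g = gt[:, toi].astype(bool)        # ground-truth terms
    ic = np.asarray(ic_arr)[toi]       # per-term weight

    tp = (p & g).sum(axis=1)
    fp = (p & ~g).sum(axis=1)
    fn = (~p & g).sum(axis=1)

    predicted = p.any(axis=1)          # proteins with at least one prediction
    cov = predicted.sum() / pred.shape[0]
    if predicted.any():
        pr = np.mean(tp[predicted] / (tp[predicted] + fp[predicted]))
    else:
        pr = 0.0
    rc = np.mean(tp / (tp + fn))       # assumes tp + fn > 0 for every protein

    # Weighted false negatives and false positives, averaged per protein.
    ru = np.mean(((~p & g) * ic).sum(axis=1))
    mi = np.mean(((p & ~g) * ic).sum(axis=1))
    return {"cov": cov, "pr": pr, "rc": rc, "ru": ru, "mi": mi}

# Toy data: 2 proteins, 3 terms.
pred = np.array([[0.9, 0.3, 0.6],
                 [0.0, 0.8, 0.2]])
gt = np.array([[1, 1, 0],
               [0, 1, 0]])
result = metrics_at_tau(pred, gt, toi=[0, 1, 2], tau=0.5, ic_arr=[1.0, 2.0, 4.0])
```

For this input the sketch gives a coverage of 1.0, precision and recall of 0.75 each, a remaining uncertainty of 1.0 and a misinformation of 2.0.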

cafaeval.evaluation.normalize(metrics, ns, tau_arr, ne, normalization)

cafaeval.evaluation.evaluate_prediction(prediction, gt, ontologies, tau_arr, gt_exclude=None, normalization='cafa', n_cpu=0, weighted_only=False)

cafaeval.evaluation.cafa_eval(obo_file, pred_dir, gt_file, ia=None, no_orphans=False, norm='cafa', prop='max', exclude=None, toi_file=None, max_terms=None, th_step=0.01, n_cpu=1, weighted_only=False)

cafaeval.evaluation.write_results(df, dfs_best, out_dir='results', th_step=0.01)