API
- class effiara.annotator_reliability.Annotations(df: DataFrame, label_generator: LabelGenerator | None = None, agreement_metric: str = 'krippendorff', agreement_suffix: str = '_label', agreement_type: str = 'nominal', overlap_threshold: int = 15, merge_labels: dict | None = None, reliability_alpha: float = 0.5, reannotations: bool = False, strength: float = 1)[source]
Class to hold all annotation information for the EffiARA annotation framework. Methods include inter- and intra-annotator agreement calculations, as well as the overall reliability calculation and other utilities.
- label_generator
label generator to create individual annotation labels and soft/hard aggregations.
- Type:
effiara.LabelGenerator
- annotators
list of annotator names.
- Type:
list
- num_annotators
number of annotators
- Type:
int
- label_mapping
label mapping of what is in the dataframe to what should be used for agreement/training.
- Type:
dict
- num_classes
number of classes.
- Type:
int
- agreement_metric
agreement metric to be used.
- Type:
str
- agreement_suffix
label suffix to get the agreement from (such as “_label” as the default).
- Type:
str
- agreement_type
type of agreement (e.g. nominal, ordinal).
- Type:
str
- merge_labels
dict of labels to merge.
- Type:
dict
- reannotation
whether the dataframe contains re-annotations under the re_* columns.
- Type:
bool
- strength
strength of reliability calculations (higher strength will lead to more polarised reliability values).
- Type:
float
- calculate_annotator_reliability(alpha=0.5, epsilon=0.001)[source]
Recursively calculate annotator reliability, using intra-annotator agreement, inter-annotator agreement, or a mixture, controlled by the alpha and beta parameters. Alpha and beta must sum to 1.0.
- Parameters:
alpha (float) – Default 0.5. Value between 0 and 1 controlling weight of intra-annotator agreement.
beta (float) – Default 0.5. Value between 0 and 1 controlling weight of inter-annotator agreement.
epsilon (float) – Default 0.001. Controls the maximum change from the last iteration to indicate convergence.
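The convergence loop described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name `iterate_reliability` and the `inter`/`intra` input dictionaries are hypothetical, beta is assumed to be `1 - alpha`, the update rule (a weighted sum of intra-annotator agreement and a reliability-weighted average of inter-annotator agreement, renormalised to a mean of 1) is an assumption pieced together from the method descriptions in this reference, and a loop is used in place of recursion.

```python
def iterate_reliability(inter, intra, alpha=0.5, epsilon=0.001, max_iter=1000):
    """Hypothetical fixed-point iteration for annotator reliability.

    inter: dict mapping frozenset({a, b}) -> pairwise inter-annotator agreement.
    intra: dict mapping annotator -> intra-annotator agreement.
    """
    beta = 1.0 - alpha  # assumption: beta is the complement of alpha
    annotators = sorted(intra)
    reliability = {a: 1.0 for a in annotators}  # every annotator starts equal

    for _ in range(max_iter):
        updated = {}
        for a in annotators:
            others = [b for b in annotators
                      if b != a and frozenset((a, b)) in inter]
            weight_sum = sum(reliability[b] for b in others)
            # Average agreement with neighbours, weighted by their reliability.
            avg_inter = (
                sum(reliability[b] * inter[frozenset((a, b))] for b in others)
                / weight_sum
                if weight_sum else 0.0
            )
            updated[a] = alpha * intra[a] + beta * avg_inter

        mean = sum(updated.values()) / len(updated)
        if mean == 0:
            raise ValueError("cannot normalise reliabilities with a mean of 0")
        updated = {a: r / mean for a, r in updated.items()}  # mean of 1

        change = max(abs(updated[a] - reliability[a]) for a in annotators)
        reliability = updated
        if change < epsilon:  # converged
            break
    return reliability


# Toy data: A and B agree with each other far more than either does with C.
toy_inter = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.3,
    frozenset(("B", "C")): 0.2,
}
toy_intra = {"A": 0.8, "B": 0.8, "C": 0.8}
toy_reliability = iterate_reliability(toy_inter, toy_intra)
```

With this toy input, C ends up with the lowest score because both of its pairwise agreements are low.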
- calculate_avg_inter_annotator_agreement()[source]
Calculate each annotator’s average agreement using a weighted average from the annotators around them. The average is weighted by the overall reliability score of each annotator.
- calculate_inter_annotator_agreement()[source]
Calculate the inter-annotator agreement between each pair of annotators. Each agreement value will be represented on the edges of the graph between nodes that are representative of each annotator.
- calculate_overall_inter_annnotator_agreement()[source]
Calculate the overall inter-annotator agreement metric for the whole dataset. Currently only Krippendorff’s alpha is implemented.
- display_agreement_heatmap(annotators: list | None = None, other_annotators: list | None = None, display_upper=False)[source]
Plot a heatmap of agreement metric values for the annotators.
If both annotators and other_annotators are specified, compares users in annotators to those in other_annotators. Otherwise, compares all project annotators to each other.
- Parameters:
annotators (list) – Optional.
other_annotators (list) – Optional.
- Returns:
A matrix of the data displayed on the graph, and a list of annotators (List[str]) in the order of the matrix rows.
- Return type:
np.ndarray
- generate_final_labels_and_sample_weights()[source]
Generate the final labels and sample weights for the dataframe.
- get_agreement(user_1, user_2) → float | None[source]
Get the agreement between two annotators.
- Parameters:
user_1 (str) – the name of the first annotator.
user_2 (str) – the name of the second annotator.
- Returns:
agreement between the two annotators (or None).
- Return type:
Optional[float]
- get_reliability_dict()[source]
Get a dictionary of reliability scores per username.
- Returns:
dictionary of key=username, value=reliability.
- Return type:
dict
- get_user_reliability(username)[source]
Get the reliability of a given annotator.
- Parameters:
username (str) – username of the annotator.
- Returns:
reliability score of the annotator.
- Return type:
float
- init_annotator_graph()[source]
Initialise the annotator graph with an initial reliability of 1. This means each annotator will initially be weighted equally.
- normalise_edge_property(property)[source]
Normalise an edge property to have a mean of 1.
- Parameters:
property (str) – the name of the edge property to normalise.
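Normalising to a mean of 1 amounts to dividing every value by the current mean. A minimal sketch on a plain dictionary of edge values follows; the function name and the dictionary layout are hypothetical, and the library itself operates on its annotator graph.

```python
def normalise_to_unit_mean(edge_values):
    """Scale a mapping of edge -> value so that the values have a mean of 1."""
    mean = sum(edge_values.values()) / len(edge_values)
    if mean == 0:
        raise ValueError("cannot normalise values with a mean of 0")
    return {edge: value / mean for edge, value in edge_values.items()}


# Three hypothetical pairwise agreement values keyed by annotator pair.
edges = {("a", "b"): 0.25, ("a", "c"): 0.5, ("b", "c"): 0.75}
normalised = normalise_to_unit_mean(edges)
```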
- class effiara.preparation.SampleDistributor(annotators: List[str] | None = None, time_available: float | None = None, annotation_rate: float | None = None, num_samples: int | None = None, double_proportion: float | None = None, re_proportion: float | None = None)[source]
- annotators
- Type:
list
- num_annotators
- Type:
int
- time_available
- Type:
float
- annotation_rate
- Type:
float
- num_samples
- Type:
int
- double_proportion
- Type:
float
- re_proportion
- Type:
float
- distribute_samples(df: DataFrame, save_path: str | None = None, all_reannotation: bool = False) → Dict[str, DataFrame][source]
Distribute samples based on sample distributor settings.
- Parameters:
df (pd.DataFrame) – dataframe containing samples with each row being a separate sample - using a copy is recommended.
save_path (str) – (Optional) If not None, dir path to save all data to. If not supplied, a dict of allocations is returned. Default None.
all_reannotation (bool) – whether re-annotations should be sampled from all the user’s annotations rather than just single annotations. In this case, a double annotation project amount is sampled from all their annotations.
- Returns:
Mapping from usernames to assigned samples.
- Return type:
dict
- get_variables(num_annotators: int | None = None, time_available: float | None = None, annotation_rate: float | None = None, num_samples: int | None = None, double_proportion: float | None = None, re_proportion: float | None = None)[source]
Solves the annotation framework equation to find the missing variable. Only one of the available arguments should be omitted.
- Parameters:
num_annotators (int) – number of annotators available [n].
time_available (float) – time available for each annotator (assuming they all have the same time available) [t].
annotation_rate (float) – expected rate of annotation per unit time (same unit as time_available) [rho].
num_samples (int) – number of desired samples [k].
double_proportion (float) – proportion of the whole dataset that should be double-annotated samples (0 <= d <= 1) [d].
re_proportion (float) – proportion of single-annotated samples that should be re-annotated (0 <= r <= 1) [r].
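The equation itself is not reproduced in this reference. The sketch below assumes one plausible form, derived only from the parameter descriptions above: total annotation capacity `n * t * rho` equals the total number of annotations required, `k * (2d + (1 + r)(1 - d))`. Both the equation and the function name `solve_framework_equation` are assumptions made for illustration and may differ from what the library implements.

```python
def solve_framework_equation(n=None, t=None, rho=None, k=None, d=None, r=None):
    """Solve n * t * rho == k * (1 + r + d * (1 - r)) for the one omitted value.

    Note that 2d + (1 + r)(1 - d) expands to 1 + r + d * (1 - r).
    This equation is an assumption, not taken from the library source.
    """
    values = {"n": n, "t": t, "rho": rho, "k": k, "d": d, "r": r}
    missing = [name for name, value in values.items() if value is None]
    if len(missing) != 1:
        raise ValueError("exactly one variable must be omitted")
    name = missing[0]

    if name in ("d", "r"):
        # Annotations available per sample, then isolate the proportion.
        per_sample = n * t * rho / k
        other = r if name == "d" else d
        if other == 1:
            raise ValueError("the equation has no unique solution here")
        values[name] = (per_sample - 1 - other) / (1 - other)
    else:
        # Annotations required per sample.
        per_sample = 1 + r + d * (1 - r)
        if name == "k":
            values[name] = n * t * rho / per_sample
        else:
            product = 1.0
            for other in ("n", "t", "rho"):
                if other != name:
                    product *= values[other]
            values[name] = k * per_sample / product
    return values


# 7 annotators, 10 hours each, 50 samples per hour, half double-annotated,
# half of the single-annotated samples re-annotated.
solved = solve_framework_equation(n=7, t=10, rho=50, d=0.5, r=0.5)
```

Under the assumed equation, each sample here costs 1.75 annotations on average, so the 3500 available annotations cover 2000 samples.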
- class effiara.label_generators.LabelGenerator(annotators: list, label_mapping: dict, label_suffixes: List[str] | None = None)[source]
Abstract class for generation of labels for set of annotations.
This class should be subclassed for each individual annotation project. The subclass should override the following methods:
add_annotation_prob_labels, add_sample_prob_labels, add_sample_hard_labels.
That is, create a new file with the following:
    from effiara import LabelGenerator

    class MyLabelGenerator(LabelGenerator):

        def add_annotation_prob_labels(self, df):
            ...

        def add_sample_prob_labels(self, df, reliability_dict):
            ...

        def add_sample_hard_labels(self, df):
            ...
- abstractmethod add_annotation_prob_labels(df: DataFrame) → DataFrame[source]
Add probability distribution (soft) labels to each individual annotation.
- Parameters:
df (pd.DataFrame) – dataframe with all annotation data to add probability label column to.
- Returns:
dataframe with added labels.
- Return type:
(pd.DataFrame)
- abstractmethod add_sample_hard_labels(df) → DataFrame[source]
Implemented to give each sample a one-hot hard label for use in the classification task.
- Parameters:
df (pd.DataFrame) – dataframe with all annotation data to add probability label column to.
- Returns:
dataframe with added labels.
- Return type:
(pd.DataFrame)
- abstractmethod add_sample_prob_labels(df: DataFrame, reliability_dict: dict) → DataFrame[source]
Add probability distribution (soft) labels to each individual sample, likely using some combination of annotation probability labels. Can optionally add a sample_weight column to weight samples in training based on annotator reliability.
- Parameters:
df (pd.DataFrame) – dataframe with all annotation data to add probability label column to.
reliability_dict (dict) – dict of each annotator and their reliability score.
- Returns:
dataframe with added labels.
- Return type:
(pd.DataFrame)
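One way a subclass might combine per-annotation soft labels into a sample-level soft label is a reliability-weighted average, with the hard label taken as the one-hot argmax. The sketch below works on plain Python lists rather than a dataframe. The helper names `weighted_soft_label` and `one_hot` are hypothetical, and this aggregation rule is only one possible choice, not something the abstract class prescribes.

```python
def weighted_soft_label(annotation_probs, reliability_dict):
    """Average soft labels for one sample, weighted by annotator reliability.

    annotation_probs: dict mapping annotator -> probability distribution (list).
    reliability_dict: dict mapping annotator -> reliability score.
    """
    total = sum(reliability_dict[a] for a in annotation_probs)
    num_classes = len(next(iter(annotation_probs.values())))
    return [
        sum(reliability_dict[a] * probs[c]
            for a, probs in annotation_probs.items()) / total
        for c in range(num_classes)
    ]


def one_hot(soft_label):
    """Turn a soft label into a one-hot hard label (first maximum wins)."""
    best = max(range(len(soft_label)), key=soft_label.__getitem__)
    return [1 if i == best else 0 for i in range(len(soft_label))]


# Two annotators disagree; the more reliable one dominates the soft label.
probs = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]}
reliability = {"a": 1.5, "b": 0.5}
soft = weighted_soft_label(probs, reliability)
hard = one_hot(soft)
```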
Functions for computing agreement metrics.
- effiara.agreement.calculate_krippendorff_alpha_per_label(pair_df, annotator_1_col, annotator_2_col, agreement_type='nominal')[source]
Calculate Krippendorff’s alpha for each label and return the average.
Requires the data in the given columns to be a binarised array of each label (i.e. whether the label is present in the given sample).
- Parameters:
annotator_1_col (str) – column containing the binarised annotations for the first annotator.
annotator_2_col (str) – column containing the binarised annotations for the second annotator.
agreement_type (str) – type of agreement: nominal, ordinal, interval, or ratio.
- Returns:
average Krippendorff’s alpha across labels.
- Return type:
float
- effiara.agreement.cosine_similarity(vector_a, vector_b)[source]
Calculate the cosine similarity between two vectors.
- Parameters:
vector_a (np.ndarray)
vector_b (np.ndarray)
- Returns:
cosine similarity between the two vectors.
- Return type:
float
- Raises:
ZeroDivisionError – when vector_a or vector_b is the zero vector.
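For reference, the computation can be written in a few lines of plain Python. This is a stand-in using lists rather than np.ndarray, and the name `cosine_similarity_sketch` is hypothetical. It also raises ZeroDivisionError for a zero vector, since the product of the norms is then 0.

```python
import math


def cosine_similarity_sketch(vector_a, vector_b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(vector_a, vector_b))
    norm_a = math.sqrt(sum(a * a for a in vector_a))
    norm_b = math.sqrt(sum(b * b for b in vector_b))
    # Division by 0.0 raises ZeroDivisionError when either vector is all zeros.
    return dot / (norm_a * norm_b)


orthogonal = cosine_similarity_sketch([1.0, 0.0], [0.0, 1.0])
parallel = cosine_similarity_sketch([0.2, 0.8], [0.1, 0.4])
```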
- effiara.agreement.inter_annotator_agreement_krippendorff(df, label_cols, label_mapping)[source]
Calculate overall Krippendorff’s alpha inter-annotator agreement metric.
- Parameters:
df (pd.DataFrame) – dataframe containing all labels.
label_cols (List[str]) – annotators’ label columns to calculate agreement among.
label_mapping (dict) – mapping between labels in datasets to numeric label.
- Returns:
Krippendorff’s alpha agreement metric.
- Return type:
float
- effiara.agreement.pairwise_agreement(df, user_x, user_y, label_mapping, num_classes, metric='krippendorff', agreement_type='nominal', label_suffix='_label')[source]
Get the pairwise annotator agreement given the full dataframe.
- Parameters:
df (pd.DataFrame) – full dataframe containing the whole dataset.
user_x (str) – name of the user in the form user_x.
user_y (str) – name of the user in the form user_y.
metric (str) –
agreement metric to use for inter-/intra-annotator agreement.
krippendorff: nominal Krippendorff’s alpha similarity metric on hard labels only.
cohen: nominal Cohen’s kappa similarity metric on hard labels only.
fleiss: nominal Fleiss’ kappa similarity metric on hard labels only.
multi_krippendorff: Krippendorff’s alpha similarity by label for multilabel classification.
cosine: the cosine similarity metric to be used on soft labels.
percentage: simple percentage agreement between the two annotators.
agreement_type (str) –
type of agreement.
nominal
ordinal
interval
ratio
NOTE: currently only working for multi_krippendorff.
label_suffix (str) – suffix for the label being compared.
- Returns:
agreement between user_x and user_y.
- Return type:
float
- effiara.agreement.pairwise_cohens_kappa_agreement(pair_df, heading_1, heading_2, label_mapping)[source]
Cohen’s kappa agreement metric between two annotators, given two headings for each annotator column containing their primary label for each sample.
Does not require any specific formatting of labels within the columns heading_1 and heading_2.
- Parameters:
pair_df (pd.DataFrame) – dataframe filtered to contain only the samples that allow agreement calculations.
heading_1 (str) – heading of the first column required to calculate agreement.
heading_2 (str) – heading of the second column required to calculate agreement.
label_mapping (dict) – mapping of labels to numeric values.
- Returns:
Cohen’s Kappa.
- Return type:
float
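As a reminder of what the statistic measures, Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each annotator's label frequencies. The sketch below computes it on two plain lists of labels; the function name is hypothetical and the library's own implementation may delegate to another package.

```python
from collections import Counter


def cohens_kappa_sketch(labels_1, labels_2):
    """Cohen's kappa for two equal-length sequences of hard labels."""
    if len(labels_1) != len(labels_2) or not labels_1:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_1)
    observed = sum(x == y for x, y in zip(labels_1, labels_2)) / n
    counts_1, counts_2 = Counter(labels_1), Counter(labels_2)
    # Chance agreement: probability both annotators pick the same label
    # independently, given their individual label frequencies.
    expected = sum(counts_1[label] * counts_2[label] for label in counts_1) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


perfect = cohens_kappa_sketch([0, 0, 1, 1], [0, 0, 1, 1])
chance_level = cohens_kappa_sketch([0, 1, 0, 1], [0, 0, 1, 1])
```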
- effiara.agreement.pairwise_cosine_similarity(pair_df, heading_1, heading_2, num_classes=3)[source]
Calculate the cosine similarity between two columns of soft labels.
Requires the two headings to be formatted as a soft label (list or np.array filled with floats summing to 1).
- Parameters:
pair_df (pd.DataFrame) – data frame containing annotation data.
heading_1 (str) – heading of first column containing soft labels.
heading_2 (str) – heading of second column containing soft labels.
- Returns:
average cosine similarity between the two sets of soft labels.
- Return type:
float
- effiara.agreement.pairwise_fleiss_kappa_agreement(pair_df, heading_1, heading_2, label_mapping)[source]
Fleiss’ kappa agreement metric between two annotators, given two headings for each annotator column containing their primary label for each sample.
Does not require any specific formatting of labels within the columns heading_1 and heading_2.
- Parameters:
pair_df (pd.DataFrame) – dataframe filtered to contain only the samples that allow agreement calculations.
heading_1 (str) – heading of the first column required to calculate agreement.
heading_2 (str) – heading of the second column required to calculate agreement.
label_mapping (dict) – mapping of labels to numeric values.
- Returns:
Fleiss’ Kappa.
- Return type:
float
- effiara.agreement.pairwise_nominal_krippendorff_agreement(pair_df, heading_1, heading_2, label_mapping)[source]
Get the nominal Krippendorff agreement between two annotators, given two headings for each annotator column containing their primary label for each sample.
Does not require any specific formatting of labels within the columns heading_1 and heading_2.
- Parameters:
pair_df (pd.DataFrame) – dataframe filtered to contain only the samples that allow agreement calculations.
heading_1 (str) – heading of the first column required to calculate agreement.
heading_2 (str) – heading of the second column required to calculate agreement.
label_mapping (dict) – mapping of labels to numeric values.
- Returns:
Krippendorff’s Alpha.
- Return type:
float
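For two annotators with no missing labels, nominal Krippendorff's alpha reduces to 1 - D_o / D_e, where D_o is the observed disagreement and D_e the disagreement expected from the pooled label frequencies. The sketch below covers only that restricted case. The function name is hypothetical, and the library may rely on an external package that handles missing data and other agreement types.

```python
from collections import Counter


def nominal_krippendorff_alpha_sketch(labels_1, labels_2):
    """Nominal Krippendorff's alpha for two annotators and complete data."""
    if len(labels_1) != len(labels_2) or not labels_1:
        raise ValueError("label sequences must be non-empty and equal in length")
    n_values = 2 * len(labels_1)  # total number of labels assigned
    pooled = Counter(labels_1) + Counter(labels_2)

    # Each disagreeing sample contributes two ordered mismatching pairs.
    disagreements = sum(x != y for x, y in zip(labels_1, labels_2))
    d_observed = 2 * disagreements / n_values

    # Ordered pairs of differing values drawn without replacement from the pool.
    mismatching_pairs = n_values ** 2 - sum(c * c for c in pooled.values())
    d_expected = mismatching_pairs / (n_values * (n_values - 1))
    if d_expected == 0:
        raise ValueError("alpha is undefined when only one label value occurs")
    return 1 - d_observed / d_expected


alpha_perfect = nominal_krippendorff_alpha_sketch([0, 1, 0, 1], [0, 1, 0, 1])
alpha_mixed = nominal_krippendorff_alpha_sketch([0, 1, 0, 1], [0, 0, 1, 1])
```

For the second toy pair, the annotators agree on two of four samples, giving D_o = 0.5 and D_e = 32/56, so alpha is 0.125.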
- effiara.agreement.pairwise_percentage_agreement(pair_df, heading_1, heading_2)[source]
Pairwise percentage agreement between two annotators, given two headings for each annotator column containing their primary label for each sample.
Does not require any specific formatting of labels within the columns heading_1 and heading_2.
- Parameters:
pair_df (pd.DataFrame) – dataframe filtered to contain only the samples that allow agreement calculations.
heading_1 (str) – heading of the first column required to calculate agreement.
heading_2 (str) – heading of the second column required to calculate agreement.
- Returns:
Percentage agreement.
- Return type:
float
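Percentage agreement is the share of samples on which both annotators chose the same label. A minimal sketch on two plain lists follows; the function name is hypothetical, and returning a fraction between 0 and 1 rather than a value out of 100 is an assumption about the scale.

```python
def percentage_agreement_sketch(labels_1, labels_2):
    """Fraction of positions where the two label sequences match."""
    if len(labels_1) != len(labels_2) or not labels_1:
        raise ValueError("label sequences must be non-empty and equal in length")
    matches = sum(x == y for x, y in zip(labels_1, labels_2))
    return matches / len(labels_1)


agreement = percentage_agreement_sketch(
    ["pos", "neg", "neu", "pos"],
    ["pos", "neg", "pos", "pos"],
)
```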