Accuracy

class Accuracy(df, test_params)

A subclass of Dimension that assesses the accuracy of a dataset by comparing its values against equivalent, trusted values in a predefined gold standard.

Parameters:
  • df (pandas.DataFrame) – The dataset to be evaluated, imported via pandas’ read_csv() function.

  • test_params (pandas.DataFrame) – The parameters defining how tests should be conducted.

  • gold_standard (pandas.DataFrame) – The DataFrame that serves as the gold standard for comparison. It is not passed to the constructor; set it with set_gold_standard() before running tests.

  • tests (dict) – A dictionary mapping test names to their relevant information and methods. Currently supports the gold standard comparison test (gold_standard_comparison).
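
The ordering requirement described above (construct with df and test_params, then set the gold standard, then run tests) can be illustrated with a small stand-in. This is only a sketch: AccuracySketch, its run() method, and the column names in test_params are hypothetical and are not the library's Accuracy class or its real test_params schema. The error raised when no gold standard is set is likewise an assumption about reasonable behaviour.

```python
import pandas as pd


class AccuracySketch:
    """Hypothetical stand-in, NOT the library's Accuracy class."""

    def __init__(self, df, test_params):
        self.df = df
        self.test_params = test_params
        self.gold_standard = None  # stays unset until set_gold_standard() is called

    def set_gold_standard(self, gs):
        self.gold_standard = gs

    def run(self):
        # Assumed behaviour: refuse to run without a gold standard.
        if self.gold_standard is None:
            raise ValueError("call set_gold_standard() before running tests")
        return True


df = pd.DataFrame({"id": [1, 2], "city": ["Oslo", "Rome"]})
# Hypothetical test_params layout; the real columns are not documented here.
test_params = pd.DataFrame({"Test": ["gold_standard_comparison"], "Columns": ["city"]})

acc = AccuracySketch(df, test_params)
try:
    acc.run()
    ran_without_gold = True
except ValueError:
    ran_without_gold = False

acc.set_gold_standard(df.copy())
ran_with_gold = acc.run()
```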

gold_standard_comparison(test)

Compares the dataset against the gold standard for the specified columns.

Parameters:

test (dict) – The test configuration.
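
To make the idea of a column-wise comparison concrete, the following sketch computes, for each listed column, the fraction of rows in which the dataset agrees with the gold standard. It is an illustration under stated assumptions, not the method's actual implementation: compare_to_gold is a hypothetical helper, the "Columns" key in the test dict is invented for the example, and rows are assumed to be aligned by position with equal row counts in both DataFrames.

```python
import pandas as pd


def compare_to_gold(df, gold, columns):
    """Return, per column, the fraction of rows where df equals gold.

    Hypothetical helper; assumes both frames have the same number of rows
    and that row i in df corresponds to row i in gold.
    """
    result = {}
    for col in columns:
        left = df[col].reset_index(drop=True)
        right = gold[col].reset_index(drop=True)
        # Treat two missing values in the same row as a match.
        matches = (left == right) | (left.isna() & right.isna())
        result[col] = float(matches.mean())
    return result


df = pd.DataFrame({"city": ["Oslo", "Rome", "Pariss", "Bern"],
                   "pop_k": [700, 2800, 2100, 130]})
gold = pd.DataFrame({"city": ["Oslo", "Rome", "Paris", "Bern"],
                     "pop_k": [700, 2800, 2100, 134]})

# Hypothetical test configuration; the real keys are not documented here.
test = {"Columns": ["city", "pop_k"]}
scores = compare_to_gold(df, gold, test["Columns"])
print(scores)
```

Each column in this sample data has one mismatching row out of four, so both scores come out at 0.75.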

set_gold_standard(gs)

Sets the gold standard DataFrame against which the dataset will be compared.

Parameters:

gs (pandas.DataFrame) – The DataFrame to set as the gold standard.
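
Before handing a DataFrame to set_gold_standard, it can help to confirm that it lines up with the dataset. The sketch below loads both frames with pandas.read_csv from in-memory CSV text and runs a simple check. check_alignment is a hypothetical helper, and the rules it applies (every dataset column present in the gold standard, identical row counts) are assumptions for illustration; the library may require less or more.

```python
import io

import pandas as pd

# In-memory CSV text stands in for real files.
dataset_csv = "id,city\n1,Oslo\n2,Rome\n"
gold_csv = "id,city\n1,Oslo\n2,Roma\n"

df = pd.read_csv(io.StringIO(dataset_csv))
gs = pd.read_csv(io.StringIO(gold_csv))


def check_alignment(df, gs):
    """Hypothetical helper: list problems that would make a row-by-row comparison unreliable."""
    problems = []
    missing = [c for c in df.columns if c not in gs.columns]
    if missing:
        problems.append(f"gold standard lacks columns: {missing}")
    if len(df) != len(gs):
        problems.append(f"row counts differ: {len(df)} vs {len(gs)}")
    return problems


problems = check_alignment(df, gs)
# An empty list means gs looks safe to pass to set_gold_standard(gs).
```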