UtilitiesDQMaRC

overall_quality_fx(avg_prop_good)

Determines the overall quality level from the average proportion of ‘good’ data.

Parameters: avg_prop_good (float) – The average proportion of ‘good’ quality data across all metrics, expressed as a percentage.
Returns: The overall quality level: “Outstanding”, “Good”, “Requires Improvement”, or “Inadequate”, with corresponding background and text colours.
Return type: str
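The reference above does not state the cut-offs that separate the four levels. As a minimal sketch of the idea, the hypothetical function below maps a percentage to a label using assumed thresholds of 90, 80, and 60; these numbers and the name overall_quality_sketch are illustrative only, and the colour information mentioned above is left out.

```python
def overall_quality_sketch(avg_prop_good: float) -> str:
    """Map an average percentage of 'good' data to a rating label.

    The thresholds (90, 80, 60) are assumptions for illustration and are
    not taken from overall_quality_fx.
    """
    if avg_prop_good >= 90:
        return "Outstanding"
    if avg_prop_good >= 80:
        return "Good"
    if avg_prop_good >= 60:
        return "Requires Improvement"
    return "Inadequate"


if __name__ == "__main__":
    for value in (95.0, 85.0, 60.0, 12.5):
        print(value, overall_quality_sketch(value))
```

Checking the thresholds from highest to lowest keeps each branch to a single comparison and makes the boundary behaviour easy to read.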

class DonutChartGenerator(data)

Bases: object

A class for generating donut charts to visualise data quality metrics.

data

The data containing quality metrics to be visualised.

Type: pandas.DataFrame

plot_donut_charts()

Generates a figure with one donut chart per quality metric in the data.

Returns: A Plotly Figure object containing the donut-chart subplots.
Return type: plotly.graph_objs._figure.Figure

class BarPlotGenerator(data, chosen_metric)

Bases: object

A class for generating a bar plot that visualises data quality results for a chosen metric.

data

The data containing quality metrics to be visualised.

Type: pandas.DataFrame

chosen_metric

The metric for which to generate the bar plot.

Type: str

plot_bar()

Generates a bar plot for the chosen metric.

Returns: A Plotly Figure object containing the bar plot.
Return type: plotly.graph_objs._figure.Figure

class MetricCalculator(data)

Bases: object

A class designed to calculate and compile data quality metrics from a provided dataset.

data

The input dataset containing various quality metrics and fields.

Type: pandas.DataFrame

result

A DataFrame initialised to store the calculated metrics, including counts and proportions of good, bad, and N/A data.

Type: pandas.DataFrame

calculate_metrics()

Calculates aggregate metrics for each field and metric combination present in the input data, updating the result attribute.
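The layout of the input data is not described here, so the following is only a sketch of the kind of aggregation involved. It assumes a hypothetical long-format table with one row per check and columns Field, Metric, and Status (values "good", "bad", or "na"), and it returns counts and percentages per field and metric combination instead of updating a result attribute.

```python
import pandas as pd


def summarise_quality_sketch(data: pd.DataFrame) -> pd.DataFrame:
    """Count good/bad/na outcomes per (Field, Metric) and add percentages.

    The column names and status labels are assumptions for illustration.
    """
    counts = (
        data.groupby(["Field", "Metric", "Status"])
        .size()
        .unstack("Status", fill_value=0)
        # Guarantee all three status columns exist, even if one never occurs.
        .reindex(columns=["good", "bad", "na"], fill_value=0)
    )
    total = counts.sum(axis=1)
    result = counts.assign(
        prop_good=100 * counts["good"] / total,
        prop_bad=100 * counts["bad"] / total,
        prop_na=100 * counts["na"] / total,
    )
    result.columns.name = None  # drop the leftover 'Status' axis label
    return result.reset_index()
```

Each group contains at least one row, so the total is never zero and the divisions are safe.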

col_bad(row)

Assigns a colour code to a data quality metric to indicate a “bad” quality status.

Parameters: row (pandas.Series) – A row from a DataFrame, expected to contain a ‘Metric’ column specifying the data quality metric.
Returns: A hexadecimal colour code associated with the “bad” quality status of the specified metric.
Return type: str

Notes

The function maps each data quality metric to a specific colour code so that metrics are visually distinct in graphical representations.

col_good(row)

Assigns a colour code to a data quality metric to indicate a “good” quality status.

Parameters: row (pandas.Series) – A row from a DataFrame, expected to contain a ‘Metric’ column specifying the data quality metric.
Returns: A hexadecimal colour code associated with the “good” quality status of the specified metric.
Return type: str

Notes

Like col_bad, this function maps each data quality metric to a specific colour code, here for the “good” quality status, so that metrics are visually distinct in graphical representations.
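Both helpers amount to a lookup from the row's ‘Metric’ value to a hex string. The sketch below shows that pattern with made-up metric names, colour values, and a fallback colour; none of these are taken from col_good or col_bad. Because it only uses row["Metric"], it works with a plain dict as well as a pandas.Series.

```python
# Hypothetical palettes: the metric names and hex codes are illustrative only.
GOOD_COLOURS = {"Completeness": "#2ca02c", "Validity": "#1f77b4"}
BAD_COLOURS = {"Completeness": "#98df8a", "Validity": "#aec7e8"}


def colour_for_metric_sketch(row, palette, default="#808080"):
    """Return the hex colour for row['Metric'], or `default` if unmapped."""
    return palette.get(row["Metric"], default)


if __name__ == "__main__":
    row = {"Metric": "Validity"}
    print(colour_for_metric_sketch(row, GOOD_COLOURS))
    print(colour_for_metric_sketch(row, BAD_COLOURS))
```

With a DataFrame, a function like this could be applied row by row via DataFrame.apply with axis=1 to build a colour column.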
