UtilitiesDQMaRC
- overall_quality_fx(avg_prop_good)
Determines the overall quality level based on the average proportion of ‘good’ data.
- Parameters:
avg_prop_good (float) – The average proportion of ‘good’ quality data across all metrics, expressed as a percentage.
- Returns:
A string representing the overall quality level: one of “Outstanding”, “Good”, “Requires Improvement”, or “Inadequate”. Each level has corresponding colours for background and text.
- Return type:
str
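Example
The mapping from a percentage to a quality label can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name overall_quality_sketch and the thresholds (90, 80, 60) are assumptions, and the background and text colours are left out.

```python
def overall_quality_sketch(avg_prop_good: float) -> str:
    """Map an average percentage of 'good' data to a quality label.

    Hypothetical helper: the thresholds below are assumptions chosen
    for illustration and are not taken from the library.
    """
    if avg_prop_good >= 90:
        return "Outstanding"
    if avg_prop_good >= 80:
        return "Good"
    if avg_prop_good >= 60:
        return "Requires Improvement"
    return "Inadequate"


print(overall_quality_sketch(95.0))
print(overall_quality_sketch(42.5))
```

Checking the thresholds from highest to lowest means each value falls into exactly one band.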
- class DonutChartGenerator(data)
Bases:
object
A class for generating donut charts to visualise data quality metrics.
- data
The data containing quality metrics to be visualised.
- Type:
pandas.DataFrame
- class BarPlotGenerator(data, chosen_metric)
Bases:
object
A class for generating bar plots to visualise data quality metrics for a chosen metric.
- data
The data containing quality metrics to be visualised.
- Type:
pandas.DataFrame
- chosen_metric
The metric for which to generate the bar plot.
- Type:
str
- class MetricCalculator(data)
Bases:
object
A class designed to calculate and compile data quality metrics from a provided dataset.
- data
The input dataset containing various quality metrics and fields.
- Type:
pandas.DataFrame
- result
A DataFrame initialised to store the calculated metrics, including counts and proportions of good, bad, and N/A data.
- Type:
pandas.DataFrame
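Example
The kind of summary held in result (counts and proportions of good, bad, and N/A data) can be sketched as follows. This is an illustration under stated assumptions, not the class's actual logic: the function summarise_flags, the flag encoding (0 = good, 1 = bad, None = N/A), the record keys, and the choice to exclude N/A values from the proportion denominator are all hypothetical. Plain Python lists are used so the sketch runs without pandas; the returned records could be passed to pandas.DataFrame.

```python
def summarise_flags(flags_by_field):
    """Return one summary record per field.

    Assumed encoding (hypothetical): 0 = good, 1 = bad, None = N/A.
    Proportions are percentages of the assessed (non-N/A) values.
    """
    records = []
    for field, flags in flags_by_field.items():
        good = sum(1 for f in flags if f == 0)
        bad = sum(1 for f in flags if f == 1)
        na = sum(1 for f in flags if f is None)
        assessed = good + bad
        records.append({
            "Field": field,
            "Count_Good": good,
            "Count_Bad": bad,
            "Count_NA": na,
            # Leave proportions undefined when nothing was assessed.
            "Prop_Good": 100 * good / assessed if assessed else None,
            "Prop_Bad": 100 * bad / assessed if assessed else None,
        })
    return records


example = {"age": [0, 0, 1, None], "postcode": [0, 1, 1, 1]}
for record in summarise_flags(example):
    print(record)
```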
- col_bad(row)
Assigns a colour code to a data quality metric indicating a “bad” quality status.
- Parameters:
row (pandas.Series) – A row from a DataFrame, expected to contain a ‘Metric’ column specifying the data quality metric.
- Returns:
A hexadecimal colour code associated with the “bad” quality status of the specified metric.
- Return type:
str
Notes
The function maps each data quality metric to a specific colour code so that metrics are visually distinct in graphical representations.
- col_good(row)
Assigns a colour code to a data quality metric indicating a “good” quality status.
- Parameters:
row (pandas.Series) – A row from a DataFrame, expected to contain a ‘Metric’ column specifying the data quality metric.
- Returns:
A hexadecimal colour code associated with the “good” quality status of the specified metric.
- Return type:
str
Notes
Like col_bad, this function maps each data quality metric to a specific colour code, here for the “good” quality status, so that metrics can be told apart in graphical representations.
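Example
The row-to-colour lookup described for col_bad and col_good can be sketched as a pair of dictionary lookups keyed on the ‘Metric’ value. This is a minimal sketch, not the library's code: the function names, the metric names, the hexadecimal codes, and the fallback colour are all assumptions for illustration. A plain dict stands in for the pandas.Series row, since both support row["Metric"].

```python
# Hypothetical palettes: metric names and hex codes are illustrative only.
GOOD_COLOURS = {"completeness": "#1f78b4", "validity": "#33a02c"}
BAD_COLOURS = {"completeness": "#a6cee3", "validity": "#b2df8a"}
FALLBACK_COLOUR = "#999999"  # assumed default for unrecognised metrics


def col_good_sketch(row):
    """Return the 'good' colour for the metric named in row['Metric']."""
    return GOOD_COLOURS.get(row["Metric"], FALLBACK_COLOUR)


def col_bad_sketch(row):
    """Return the 'bad' colour for the metric named in row['Metric']."""
    return BAD_COLOURS.get(row["Metric"], FALLBACK_COLOUR)


row = {"Metric": "completeness"}
print(col_good_sketch(row), col_bad_sketch(row))
```

With a real DataFrame, functions of this shape can be applied row by row using DataFrame.apply with axis=1, which passes each row to the function as a Series.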