034 — QC + window audit (visualization)

Visual audit of QC flags and calibration-window selection results across the full dataset, used as a periodic sanity check before XPalm calibration runs.

Runs on the bundled synthetic sample.

import pandas as pd
import matplotlib.pyplot as plt

from palmwtc.config import DataPaths
from palmwtc.viz import set_style

set_style()
paths = DataPaths.resolve()
print(paths.describe())

DataPaths (source=sample (bundled synthetic), site=libz):
  raw_dir       = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/synthetic
  processed_dir = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/Data/Integrated_QC_Data
  exports_dir   = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/exports
  config_dir    = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/config
  extras        = <none>

# Read the QC flag columns from the bundled synthetic parquet.
flag_cols = ["CO2_C1_qc_flag", "CO2_C2_qc_flag", "H2O_C1_qc_flag", "H2O_C2_qc_flag"]
qc_path = paths.raw_dir / "QC_Flagged_Data_synthetic.parquet"
df = pd.read_parquet(qc_path, columns=["TIMESTAMP", *flag_cols])

# Assuming flags are 0/1 (1 = fail), each column sum is its number of failing rows.
counts = df[flag_cols].sum()
print("QC flag totals (1 = fail):")
print(counts.to_string())

QC flag totals (1 = fail):
CO2_C1_qc_flag    94
CO2_C2_qc_flag     0
H2O_C1_qc_flag     1
H2O_C2_qc_flag     2
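
The totals above equal the number of failing rows only if every flag column contains nothing but 0 and 1. A minimal sketch of that check follows, run on a small hypothetical frame rather than the parquet file. The helper name `non_binary_flags` and the demo values are assumptions for illustration, not part of palmwtc.

```python
import pandas as pd


def non_binary_flags(frame: pd.DataFrame, flag_cols: list[str]) -> dict[str, list]:
    """Return, per column, any values other than 0 or 1 (missing values included)."""
    bad = {}
    for col in flag_cols:
        values = frame[col]
        # isin() is False for NaN, so missing flags are reported as unexpected too.
        unexpected = values[~values.isin([0, 1])].unique().tolist()
        if unexpected:
            bad[col] = unexpected
    return bad


# Hypothetical stand-in data: one column contains an unexpected value of 2.
demo = pd.DataFrame(
    {
        "CO2_C1_qc_flag": [0, 1, 0, 2],
        "H2O_C1_qc_flag": [0, 0, 1, 0],
    }
)
problems = non_binary_flags(demo, list(demo.columns))
print(problems)
```

An empty result means a plain `.sum()` can be read as a row count. Otherwise, counting with `df[flag_cols].eq(1).sum()` avoids inflating the totals.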

fig, ax = plt.subplots(figsize=(8, 3))
counts.plot(kind="barh", ax=ax, color="#a23b72")
ax.set_xlabel("Flagged rows")
ax.set_title("QC flag counts per variable (synthetic sample, 1 week)")
plt.tight_layout()
plt.show()

[Figure: horizontal bar chart of flagged rows per QC flag column, titled "QC flag counts per variable (synthetic sample, 1 week)".]

Real LIBZ data typically shows fewer than 5% of rows flagged per variable. A sudden spike in any column usually points to a specific instrument event; check docs/measurement_log/<sensor>.md to attribute it.
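
The 5% rule of thumb and the spike check can be sketched as below. This is an illustration under stated assumptions, not the project's own audit code. The helper names `flagged_fraction` and `daily_flag_counts`, the `THRESHOLD` constant, and the demo frame are all hypothetical; only the column names come from the cells above.

```python
import pandas as pd

THRESHOLD = 0.05  # the "fewer than 5% flagged" rule of thumb from the text


def flagged_fraction(frame: pd.DataFrame, flag_cols: list[str]) -> pd.Series:
    """Fraction of rows with flag == 1, per column."""
    return frame[flag_cols].eq(1).mean()


def daily_flag_counts(
    frame: pd.DataFrame, flag_cols: list[str], time_col: str = "TIMESTAMP"
) -> pd.DataFrame:
    """Number of flagged rows per calendar day, per column."""
    flags = frame[flag_cols].eq(1)
    days = pd.to_datetime(frame[time_col]).dt.floor("D")
    return flags.groupby(days).sum()


# Hypothetical stand-in data: four rows spread over two days.
demo = pd.DataFrame(
    {
        "TIMESTAMP": pd.to_datetime(
            [
                "2024-01-01 00:00",
                "2024-01-01 12:00",
                "2024-01-02 00:00",
                "2024-01-02 12:00",
            ]
        ),
        "CO2_C1_qc_flag": [0, 0, 1, 1],
        "H2O_C1_qc_flag": [0, 0, 0, 0],
    }
)
flag_cols = ["CO2_C1_qc_flag", "H2O_C1_qc_flag"]

fractions = flagged_fraction(demo, flag_cols)
over = fractions[fractions > THRESHOLD].index.tolist()  # columns above the threshold
daily = daily_flag_counts(demo, flag_cols)

print(fractions.to_string())
print("Above threshold:", over)
print(daily)
```

A day whose count stands well above its neighbours in `daily` gives a date to look up in the measurement log for that sensor.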