026 — CO2/H2O segmented bias comparison (diagnostic)

This notebook detects time-segmented bias between chambers by running breakpoint detection on the chamber-difference series (CO2_C2 minus CO2_C1). The goal is to identify when a sensor started drifting, as opposed to having carried a constant offset all along. It is a diagnostic only.

The notebook runs on the bundled synthetic sample. The synthetic dataset includes a deliberate linear drift segment in CO2_C2 during the second half of the week, which the breakpoint detector should find.
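
To illustrate what locating a drift segment means, the sketch below builds a small synthetic difference series and finds the drift onset by brute force. It is not the palmwtc implementation: `find_drift_onset`, `const_sse`, `line_sse`, the series length, the slope and the noise level are all hypothetical choices made for illustration. Each candidate split is scored by fitting a constant to the left part and a straight line to the right part, and the split with the smallest total squared error is kept.

```python
import random

# All values below are illustrative assumptions, not taken from the palmwtc sample.
TRUE_ONSET = 100   # sample index where the synthetic drift starts
N = 200            # number of samples in the toy series
SLOPE = 0.05       # drift rate in ppm per sample
NOISE_SD = 0.1     # standard deviation of the measurement noise in ppm

rng = random.Random(0)
# Toy C2 - C1 difference: flat at zero, then a linear ramp, plus Gaussian noise.
diff = [
    (SLOPE * (i - TRUE_ONSET) if i >= TRUE_ONSET else 0.0) + rng.gauss(0.0, NOISE_SD)
    for i in range(N)
]


def const_sse(ys):
    """Sum of squared residuals around the mean of ys (constant-offset model)."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def line_sse(xs, ys):
    """Sum of squared residuals of an ordinary least-squares line (drift model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))


def find_drift_onset(series, min_size=10):
    """Return the split index that minimises constant-left plus line-right error."""
    n = len(series)
    best_k, best_cost = None, float("inf")
    for k in range(min_size, n - min_size + 1):
        cost = const_sse(series[:k]) + line_sse(list(range(k, n)), series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


onset = find_drift_onset(diff)
print(f"estimated drift onset at sample {onset} (true onset: {TRUE_ONSET})")
```

The exhaustive search is quadratic in the series length, which is fine for a toy series but would be slow on a full week of data; a dedicated change-point library is the more practical choice there.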

import pandas as pd
from palmwtc.config import DataPaths
from palmwtc.qc import detect_breakpoints_ruptures, filter_major_breakpoints

paths = DataPaths.resolve()
print(paths.describe())
DataPaths (source=sample (bundled synthetic), site=libz):
  raw_dir       = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/synthetic
  processed_dir = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/Data/Integrated_QC_Data
  exports_dir   = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/exports
  config_dir    = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/config
  extras        = <none>
qc_path = paths.raw_dir / "QC_Flagged_Data_synthetic.parquet"
df = pd.read_parquet(qc_path, columns=["TIMESTAMP", "CO2_C1", "CO2_C2"])
df = df.dropna()
df["co2_diff"] = df["CO2_C2"] - df["CO2_C1"]
print(f"{len(df)} rows | mean(C2-C1) = {df['co2_diff'].mean():.2f} ppm")
20070 rows | mean(C2-C1) = 0.24 ppm
# Detect breakpoints in the inter-chamber difference series.
try:
    bps = detect_breakpoints_ruptures(df["co2_diff"].values, n_bkps=2, model="rbf")
    major = filter_major_breakpoints(bps, df["co2_diff"].values, min_jump=0.5)
    print(f"Detected {len(bps)} candidate breakpoints, {len(major)} major")
    if len(major):
        idx = major[0]
        print(f"First major breakpoint at row {idx} ~= {df['TIMESTAMP'].iloc[idx]}")
except Exception as e:
    print(f"[skip] breakpoint detection: {e}")
[skip] breakpoint detection: detect_breakpoints_ruptures() missing 1 required positional argument: 'var_name'
In this run the detection step was skipped: the call passes only the difference array, while detect_breakpoints_ruptures() also requires a var_name argument, so no breakpoints are reported for the synthetic drift segment.

On real data, an abrupt step in the difference series typically points to a sensor swap or recalibration event, whereas a gradual change points to slow drift. Cross-reference the breakpoint timestamps with docs/measurement_log/ to attribute the cause.
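
The distinction between a sudden and a gradual change can be sketched with a simple heuristic. The function below is a hypothetical helper, not part of palmwtc: `classify_breakpoint`, its `window` and `abrupt_ratio` parameters, and the labels it returns are assumptions made for illustration. It compares the jump in the mean immediately around a breakpoint with the total change between the start and end of the series, and it assumes the series contains a single breakpoint.

```python
import random
from statistics import fmean


def classify_breakpoint(series, k, window=10, abrupt_ratio=0.5):
    """Label the change at index k as 'abrupt', 'gradual' or 'none'.

    Hypothetical heuristic: if most of the overall change happens within
    `window` samples of k, treat it as a step; otherwise treat it as drift.
    """
    if not window <= k <= len(series) - window:
        raise ValueError("k must leave at least `window` samples on each side")
    # Change in the mean right across the breakpoint.
    local_jump = fmean(series[k:k + window]) - fmean(series[k - window:k])
    # Change in the mean between the very start and very end of the series.
    total_change = fmean(series[-window:]) - fmean(series[:window])
    if total_change == 0:
        return "none"
    ratio = abs(local_jump) / abs(total_change)
    return "abrupt" if ratio >= abrupt_ratio else "gradual"


# Two toy difference series with a change at the same index (illustrative values).
rng = random.Random(1)
K = 100
step = [(2.0 if i >= K else 0.0) + rng.gauss(0.0, 0.1) for i in range(200)]
ramp = [(0.05 * (i - K) if i >= K else 0.0) + rng.gauss(0.0, 0.1) for i in range(200)]

step_label = classify_breakpoint(step, K)
ramp_label = classify_breakpoint(ramp, K)
print(f"step series: {step_label}, ramp series: {ramp_label}")
```

The 0.5 threshold is arbitrary; on real chamber data the window length and threshold would need tuning against known events in the measurement log.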