026 — CO2/H2O segmented bias comparison (diagnostic)
Detects time-segmented between-chamber bias by running breakpoint detection on the chamber-difference series. This identifies when a sensor started drifting, as opposed to having been offset all along. Diagnostic only.
Runs on the bundled synthetic sample. The synthetic dataset includes a
deliberate linear drift segment in CO2_C2 during the second half of the
week, which the breakpoint detector should find.
import pandas as pd
from palmwtc.config import DataPaths
from palmwtc.qc import detect_breakpoints_ruptures, filter_major_breakpoints
paths = DataPaths.resolve()
print(paths.describe())
DataPaths (source=sample (bundled synthetic), site=libz):
raw_dir = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/synthetic
processed_dir = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/Data/Integrated_QC_Data
exports_dir = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/exports
config_dir = /home/runner/work/palmwtc/palmwtc/src/palmwtc/data/sample/config
extras = <none>
qc_path = paths.raw_dir / "QC_Flagged_Data_synthetic.parquet"
df = pd.read_parquet(qc_path, columns=["TIMESTAMP", "CO2_C1", "CO2_C2"])
df = df.dropna()
df["co2_diff"] = df["CO2_C2"] - df["CO2_C1"]
print(f"{len(df)} rows | mean(C2-C1) = {df['co2_diff'].mean():.2f} ppm")
20070 rows | mean(C2-C1) = 0.24 ppm
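A single overall mean can hide a bias that exists in only part of the record. As a rough illustration (using made-up values, not the bundled sample), splitting a difference series into segments and comparing their means shows whether the offset is constant or confined to one segment. The helper name `segment_means` is hypothetical and is not part of palmwtc.

```python
from statistics import fmean

def segment_means(diff, n_segments=2):
    """Mean of each of n_segments roughly equal chunks of diff (hypothetical helper)."""
    size = len(diff) // n_segments
    means = []
    for i in range(n_segments):
        # The last segment absorbs any remainder rows.
        end = (i + 1) * size if i < n_segments - 1 else len(diff)
        means.append(fmean(diff[i * size:end]))
    return means

# Illustrative series: no offset in the first half, linear drift in the second.
n = 200
diff = [0.0] * (n // 2) + [0.01 * k for k in range(n // 2)]

overall = fmean(diff)
first_half, second_half = segment_means(diff, 2)
print(f"overall={overall:.3f} first={first_half:.3f} second={second_half:.3f}")
```

Here the overall mean sits between the two segment means, so on its own it would suggest a small constant offset rather than a drift confined to the second half.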
# Detect breakpoints in the inter-chamber difference series.
try:
    bps = detect_breakpoints_ruptures(df["co2_diff"].values, n_bkps=2, model="rbf")
    major = filter_major_breakpoints(bps, df["co2_diff"].values, min_jump=0.5)
    print(f"Detected {len(bps)} candidate breakpoints, {len(major)} major")
    if len(major):
        idx = major[0]
        print(f"First major breakpoint at row {idx} ~= {df['TIMESTAMP'].iloc[idx]}")
except Exception as e:
    print(f"[skip] breakpoint detection: {e}")
[skip] breakpoint detection: detect_breakpoints_ruptures() missing 1 required positional argument: 'var_name'
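The `min_jump` argument suggests that a candidate is kept only when the level changes by enough across the breakpoint. The palmwtc implementation is not shown on this page, so the following is only a sketch of one plausible rule, using a hypothetical `keep_large_jumps` helper: compare the mean of the segment before each candidate with the mean of the segment after it. It assumes every breakpoint index lies strictly inside the series.

```python
from statistics import fmean

def keep_large_jumps(values, breakpoints, min_jump):
    """Keep breakpoints whose adjacent segment means differ by >= min_jump.

    Hypothetical stand-in, not palmwtc's filter_major_breakpoints.
    Assumes each breakpoint index is strictly between 0 and len(values).
    """
    # Segment edges: start of series, each candidate, end of series.
    edges = [0] + sorted(breakpoints) + [len(values)]
    kept = []
    for left, bp, right in zip(edges, edges[1:], edges[2:]):
        before = fmean(values[left:bp])
        after = fmean(values[bp:right])
        if abs(after - before) >= min_jump:
            kept.append(bp)
    return kept

# Illustrative series: a 0.1 step at index 50 and a 1.0 step at index 100.
values = [0.0] * 50 + [0.1] * 50 + [1.1] * 50
major = keep_large_jumps(values, [50, 100], min_jump=0.5)
print(major)
```

With these illustrative values only the larger step survives the filter; the 0.1 step falls below the 0.5 threshold.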
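In this run the detection step was skipped: the call to
detect_breakpoints_ruptures omits the required var_name argument, so no
breakpoints are reported above.
On real data, a sudden breakpoint typically points to a sensor swap or
recalibration event, while a gradual one points to slow drift.
Cross-reference with docs/measurement_log/ to attribute the cause.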
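One rough way to tell the two cases apart is to look at how much the difference series changes immediately around the breakpoint. The sketch below uses a hypothetical `classify_breakpoint` helper with illustrative window and threshold values; none of these names or defaults come from palmwtc.

```python
from statistics import fmean

def classify_breakpoint(values, bp, window=10, min_step=0.5):
    """Label a breakpoint 'sudden' or 'gradual' (hypothetical heuristic).

    A step change shows up as a large difference between short windows
    immediately before and after bp; slow drift changes little over that span.
    """
    before = fmean(values[max(0, bp - window):bp])
    after = fmean(values[bp:bp + window])
    return "sudden" if abs(after - before) >= min_step else "gradual"

# A step of 1.0 at index 100 versus a drift of 0.01 per row from index 100.
step = [0.0] * 100 + [1.0] * 100
drift = [0.0] * 100 + [0.01 * k for k in range(100)]

print(classify_breakpoint(step, 100))
print(classify_breakpoint(drift, 100))
```

The window length and threshold would need tuning to the sensor noise level and sampling interval before applying this to real chamber data.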