palmwtc.flux.chamber#

Chamber data preparation, WPL correction, and per-cycle flux batch computation.

This module is the entry point for the full CO₂ and H₂O flux calculation pipeline for automated whole-tree chambers instrumented with a LI-COR LI-850 gas analyzer. It ties together sensor-stream preparation, WPL dilution correction, cycle identification, regression-based flux extraction, QC scoring, and tree biophysical data.

Pipeline overview#

  1. prepare_chamber_data() — selects the correct sensor columns for one chamber (C1 or C2), applies QC flag filtering, runs WPL correction, and returns a clean DataFrame ready for cycle identification.

  2. calculate_flux_cycles() — identifies measurement cycles inside the prepared DataFrame and runs palmwtc.flux.cycles.evaluate_cycle() on every cycle (in parallel when the dataset is large), returning one row per cycle with slope, R², AICc, monotonicity, flux, and QC fields.

  3. calculate_h2o_flux_cycles() — H₂O analogue: uses Theil-Sen + OLS regression with relaxed QC thresholds appropriate for water-vapour noise levels.

  4. compute_closure_confidence() — converts per-cycle R², NRMSE, and global radiation into a 0–1 confidence score for chamber closure quality.

Tree biophysics helpers#

  • load_tree_biophysics() — reads Vigor_Index_PalmStudio.xlsx and returns palm geometry time series (height, radius, estimated volume).

  • get_tree_volume_at_date() — time-interpolates the Vigor Index (m³) for a specific tree and date from the biophysics table.

WPL diagnostic helpers#

  • summarize_wpl_correction() — returns a dataset-level summary of WPL correction statistics.

  • build_cycle_wpl_metrics() — aggregates WPL correction metrics per measurement cycle.

  • apply_wpl_qc_overrides() — applies WPL-specific checks and upgrades QC tiers if needed.

Configuration constants#

All functions accept explicit parameters. Use the constants below as starting points and override what you need:

  • DEFAULT_CONFIG — default pipeline configuration for cycle detection, regression, and WPL.

  • DEFAULT_CO2_QC_THRESHOLDS, NIGHTTIME_CO2_QC_THRESHOLDS — daytime and relaxed nighttime CO₂ QC grading thresholds.

  • DEFAULT_H2O_QC_THRESHOLDS, NIGHTTIME_H2O_QC_THRESHOLDS — daytime and relaxed nighttime H₂O QC grading thresholds.

  • DEFAULT_WPL_QC_THRESHOLDS — per-cycle WPL correction validity thresholds.

Quick start:

from palmwtc.flux.chamber import (
    DEFAULT_CONFIG,
    prepare_chamber_data,
    calculate_flux_cycles,
    calculate_h2o_flux_cycles,
)

# Start from the default config and override only the keys you need:
cfg = {**DEFAULT_CONFIG, "min_points": 10, "cycle_gap_sec": 240}
# raw_df is the full multi-chamber DataFrame (as loaded by palmwtc.io).
chamber_df = prepare_chamber_data(raw_df, "C1", **cfg)
flux_df    = calculate_flux_cycles(chamber_df, "Chamber 1", **cfg)
h2o_df     = calculate_h2o_flux_cycles(chamber_df, "Chamber 1", **cfg)

Attributes#

DEFAULT_CONFIG

Default pipeline configuration for cycle detection, regression, and WPL.

DEFAULT_CO2_QC_THRESHOLDS

Daytime CO₂ QC grading thresholds for palmwtc.flux.cycles.score_cycle().

NIGHTTIME_CO2_QC_THRESHOLDS

Relaxed CO₂ QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²).

DEFAULT_H2O_QC_THRESHOLDS

Daytime H₂O QC grading thresholds for score_h2o_flux_qc().

NIGHTTIME_H2O_QC_THRESHOLDS

Relaxed H₂O QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²).

DEFAULT_WPL_QC_THRESHOLDS

Per-cycle WPL correction validity thresholds used by apply_wpl_qc_overrides().

Functions#

apply_wpl_correction(co2_wet, h2o_mmol_mol)

Convert wet CO₂ (ppm) to dry CO₂ using the WPL dilution correction.

_choose_h2o_column(data, chamber_suffix[, ...])

Pick the best available H2O column for chamber_suffix.

prepare_chamber_data(data, chamber_suffix[, ...])

Select, filter, and WPL-correct sensor streams for one chamber.

summarize_wpl_correction(chamber_df)

Return a dataset-level summary of WPL correction statistics.

build_cycle_wpl_metrics(chamber_df, chamber_name[, ...])

Aggregate WPL correction metrics per measurement cycle.

calculate_flux_cycles(chamber_df, chamber_name[, ...])

Identify measurement cycles and compute CO₂ flux for each cycle.

calculate_h2o_flux_for_cycle(cycle_data[, gas_col, ...])

Compute H₂O slope and fit statistics for a single measurement cycle.

score_h2o_flux_qc(h2o_metrics[, h2o_qc_thresholds, ...])

Assign a QC grade to a single H₂O flux cycle.

calculate_h2o_flux_cycles(chamber_df, chamber_name[, ...])

Compute H₂O flux for every cycle in chamber_df.

load_tree_biophysics(base_dir)

Load palm tree biophysical parameters from the PalmStudio spreadsheet.

get_tree_volume_at_date(df_vigor, tree_id, target_date)

Time-interpolate the Vigor Index (m³) for a tree at a specific date.

apply_wpl_qc_overrides(row, model_qc, flux_qc, reason_text)

Apply WPL-specific checks and upgrade QC tiers if needed.

compute_closure_confidence(r2, nrmse, global_radiation)

Compute a chamber closure confidence score between 0 and 1.

Module Contents#

palmwtc.flux.chamber.DEFAULT_CONFIG#

Default pipeline configuration for cycle detection, regression, and WPL.

Build a merged dictionary with {**DEFAULT_CONFIG, "key": new_value} and unpack it into any pipeline function as keyword arguments (**cfg) to override individual settings without losing the other defaults.
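The override pattern can be sketched as follows. This is a minimal illustration, not the package's code: the dictionary is a stand-in that copies a few documented defaults instead of importing the real DEFAULT_CONFIG, and pipeline_step is a hypothetical function that mimics how the pipeline functions absorb unused keys through **kwargs.

```python
# Stand-in for palmwtc.flux.chamber.DEFAULT_CONFIG (subset of documented defaults).
DEFAULT_CONFIG = {
    "cycle_gap_sec": 300,
    "start_cutoff_sec": 50,
    "min_points": 20,
    "min_duration_sec": 180,
}


def pipeline_step(cycle_gap_sec=300, min_points=20, **kwargs):
    """Hypothetical pipeline function: uses two keys and ignores the rest."""
    return {
        "cycle_gap_sec": cycle_gap_sec,
        "min_points": min_points,
        "ignored": sorted(kwargs),
    }


# Later keys win in the merge, and DEFAULT_CONFIG itself is left untouched.
cfg = {**DEFAULT_CONFIG, "min_points": 10}
result = pipeline_step(**cfg)
print(result)
```

Because the merge creates a new dictionary, the same DEFAULT_CONFIG can be reused for several chambers with different overrides.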

Keys#

cycle_gap_secint

Minimum gap in seconds between two successive measurements that triggers the start of a new measurement cycle (default 300).

start_cutoff_secint

Number of seconds to skip from the beginning of a cycle before starting the regression window search (default 50). Skips the initial chamber-mixing transient.

start_search_secint

How far into the cycle (seconds) the window-start search extends (default 60).

min_pointsint

Minimum number of valid data points required to attempt a flux regression (default 20).

min_duration_secint

Minimum regression window length in seconds (default 180).

outlier_zfloat

Z-score threshold for iterative outlier removal before re-fitting (default 2).

max_outlier_refit_fracfloat

Maximum fraction of points that may be removed as outliers; if exceeded the original fit is kept (default 0.2).

noise_eps_ppmfloat

Noise floor in ppm used when computing the monotonicity fraction: steps smaller than this are treated as noise rather than as evidence of signal direction (default 0.5).

accepted_co2_qc_flagslist of int

Only rows whose CO2_{suffix}_qc_flag is in this list are kept (default [0]; None keeps all rows).

accepted_h2o_qc_flagslist of int

Same for H2O_{suffix}_qc_flag (default [0, 1]; H₂O flag 1 is a minor sensor warning that still produces usable data).

prefer_corrected_h2obool

When True, use H2O_{suffix}_corrected over raw H2O_{suffix} if the corrected column is present (default True).

require_h2o_for_wplbool

When True, prepare_chamber_data() raises ValueError if no H₂O column is found and WPL correction is requested (default True). Set to False to fall back to wet CO₂.

h2o_valid_rangetuple of float

Physical validity bounds for H₂O in mmol mol⁻¹ as (lo, hi) (default (0.0, 60.0)). Values outside this range are set to NaN before WPL correction.

max_abs_wpl_rel_changefloat

Maximum plausible absolute relative WPL correction (default 0.12, i.e. 12 %). Rows with larger corrections have their Flag raised to 2 (bad).

use_multiprocessingbool

Use multiprocessing.Pool for cycle batches larger than 50 cycles (default True).

n_jobsint

Number of parallel worker processes (default min(8, cpu_count)).

palmwtc.flux.chamber.DEFAULT_CO2_QC_THRESHOLDS#

Daytime CO₂ QC grading thresholds for palmwtc.flux.cycles.score_cycle().

Each threshold has an _A (Grade A boundary) and _B (Grade B boundary) variant. A cycle that passes all _A tests is Grade A (tier 0). A cycle that fails one or more _A tests but passes all _B tests is Grade B (tier 1). Failing any _B test downgrades to Grade C (tier 2).
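The A/B rule above can be sketched in a few lines. This is an illustrative reading only: grade_cycle is a hypothetical helper, it covers just three of the documented threshold pairs, and treating the boundaries as inclusive is an assumption. The real grading is done by palmwtc.flux.cycles.score_cycle().

```python
# Documented daytime CO2 threshold values (subset; other keys omitted).
THRESHOLDS = {
    "r2_A": 0.90, "r2_B": 0.70,            # minimum R²
    "nrmse_A": 0.10, "nrmse_B": 0.20,      # maximum NRMSE
    "outlier_A": 0.05, "outlier_B": 0.15,  # maximum outlier fraction
}


def grade_cycle(r2, nrmse, outlier_frac, thr=THRESHOLDS):
    """Return (tier, label) following the two-tier A/B rule (assumed logic)."""

    def passes(suffix):
        # A cycle passes a tier only if every test of that tier passes.
        return (
            r2 >= thr[f"r2_{suffix}"]
            and nrmse <= thr[f"nrmse_{suffix}"]
            and outlier_frac <= thr[f"outlier_{suffix}"]
        )

    if passes("A"):
        return 0, "A"
    if passes("B"):
        return 1, "B"
    return 2, "C"


print(grade_cycle(0.95, 0.05, 0.00))  # passes every _A test
print(grade_cycle(0.80, 0.15, 0.10))  # fails _A, passes every _B test
print(grade_cycle(0.95, 0.05, 0.30))  # fails the outlier _B test
```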

Keys#

r2_A, r2_Bfloat

Minimum R² of the OLS linear fit. Daytime photosynthesis and respiration cycles have large, clean CO₂ signals so the bar is high (0.90 / 0.70).

nrmse_A, nrmse_Bfloat

Maximum normalized RMSE (RMSE divided by CO₂ concentration range). Low values (0.10 / 0.20) indicate a clean linear trend.

snr_A, snr_Bfloat

Minimum signal-to-noise ratio, defined as (|slope| × duration) / RMSE. Measures whether the CO₂ trend is distinguishable from measurement noise.

monotonic_A, monotonic_Bfloat

Minimum fraction of consecutive concentration steps that move in the direction of the fitted slope (steps smaller than noise_eps_ppm are ignored). Daytime CO₂ should rise or fall steadily inside a closed chamber.

outlier_A, outlier_Bfloat

Maximum fraction of points removed as statistical outliers before re-fitting (0.05 / 0.15).

curvature_aiccfloat

AICc difference (quadratic minus linear) threshold. Values more negative than this indicate significant curvature, flagging possible leaks or mixing issues. Note: this key is read by palmwtc.flux.cycles.score_cycle(), not by functions in this module directly.

slope_diff_A, slope_diff_Bfloat

Maximum relative difference between OLS slope and Theil-Sen slope (|slope_ols - slope_ts| / |slope_ols|). Large differences indicate leverage points or non-linearity.

signal_ppm_guardfloat

Total CO₂ change (ppm) below which the monotonic_A/B thresholds are scaled down proportionally. Prevents mass rejection of low-flux cycles where noise-to-signal ratio is inherently higher.
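To make the metric definitions in the keys above concrete, the following pure-Python sketch computes R², NRMSE, SNR, and the monotonic fraction for one synthetic cycle. The helper names ols_metrics and monotonic_fraction are hypothetical, and details such as using the population RMSE are assumptions; the package's own implementation may differ.

```python
import math


def ols_metrics(t, y):
    """OLS fit of y against t; returns slope, r2, rmse, nrmse and snr."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    ss_res = sum(r ** 2 for r in resid)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    rmse = math.sqrt(ss_res / n)
    conc_range = max(y) - min(y)
    duration = t[-1] - t[0]
    return {
        "slope": slope,
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "rmse": rmse,
        # NRMSE: RMSE divided by the concentration range.
        "nrmse": rmse / conc_range if conc_range > 0 else float("nan"),
        # SNR: (|slope| x duration) / RMSE.
        "snr": abs(slope) * duration / rmse if rmse > 0 else float("inf"),
    }


def monotonic_fraction(y, slope, noise_eps=0.5):
    """Fraction of steps larger than noise_eps that follow the slope sign."""
    steps = [b - a for a, b in zip(y, y[1:]) if abs(b - a) > noise_eps]
    if not steps:
        return float("nan")
    direction = 1.0 if slope >= 0 else -1.0
    return sum(1 for s in steps if s * direction > 0) / len(steps)


# Synthetic 90 s cycle sampled every 10 s, rising by about 0.1 ppm per second.
t = list(range(0, 100, 10))
co2 = [400.0, 401.2, 401.8, 403.1, 404.0, 404.9, 406.2, 406.8, 408.1, 409.0]
metrics = ols_metrics(t, co2)
mono = monotonic_fraction(co2, metrics["slope"])
print(metrics, mono)
```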

See Also#

NIGHTTIME_CO2_QC_THRESHOLDS : Relaxed version for dark/respiration cycles.

palmwtc.flux.cycles.score_cycle : Function that consumes these thresholds.

palmwtc.flux.chamber.NIGHTTIME_CO2_QC_THRESHOLDS#

Relaxed CO₂ QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²).

This is an alias for palmwtc.flux.cycles.NIGHTTIME_QC_THRESHOLDS. It is exposed here so callers that work only with palmwtc.flux.chamber do not need to import from the lower-level palmwtc.flux.cycles module.

Why nighttime cycles need relaxed thresholds#

During the day, photosynthesis drives a strong, fast CO₂ drawdown inside the closed chamber (often 20–100 ppm over 5 minutes). This yields high R², SNR, and monotonicity, making the daytime _A thresholds easy to meet.

At night, only leaf + soil respiration remain. CO₂ rise rates are typically 3–15 ppm over 5 minutes — a much smaller signal that sits closer to instrument noise (~0.2–0.5 ppm RMS for LI-COR LI-850). Applying daytime thresholds to these cycles rejects most valid nighttime measurements.

Relaxed values (compared to DEFAULT_CO2_QC_THRESHOLDS)#

  • r2_A 0.90 → 0.70, r2_B 0.70 → 0.40 — lower R² is expected when the signal is small relative to noise.

  • snr_A 10.0 → 5.0, snr_B 3.0 → 2.0 — smaller CO₂ trends mean lower SNR even in well-sealed chambers.

  • monotonic_A 0.80 → 0.50, monotonic_B 0.45 → 0.30 — a 5 ppm respiration signal with 0.5 ppm noise gives ~50 % monotonicity even when the signal is real.

  • signal_ppm_guard 5.0 → 3.0 — the guard activates earlier for the smaller nighttime signals.

See Also#

DEFAULT_CO2_QC_THRESHOLDS : Daytime thresholds.

palmwtc.flux.cycles.NIGHTTIME_QC_THRESHOLDS : Canonical source of these values.

palmwtc.flux.chamber.DEFAULT_H2O_QC_THRESHOLDS#

Daytime H₂O QC grading thresholds for score_h2o_flux_qc().

H₂O thresholds are systematically looser than the CO₂ counterparts in DEFAULT_CO2_QC_THRESHOLDS. Two reasons:

  1. The LI-COR LI-850 H₂O channel has higher absolute noise (~0.1–0.2 mmol mol⁻¹ RMS) than the CO₂ channel, reducing R² and SNR for the same physical signal size.

  2. Transpiration signals in humid tropical conditions are often 0.5–3 mmol mol⁻¹ over a 5-minute cycle — smaller fractional change than CO₂ during active photosynthesis.

Keys#

r2_A, r2_Bfloat

Minimum R² of the OLS linear fit (0.70 / 0.50).

nrmse_A, nrmse_Bfloat

Maximum normalized RMSE (0.15 / 0.25).

snr_A, snr_Bfloat

Minimum SNR, computed as (|Theil-Sen slope| × duration) / residual std (5.0 / 3.0).

monotonic_A, monotonic_Bfloat

Minimum fraction of H₂O steps larger than 0.05 mmol mol⁻¹ that move in the fitted-slope direction (0.70 / 0.40). The 0.05 mmol mol⁻¹ noise floor prevents sensor jitter from deflating the fraction.

outlier_A, outlier_Bfloat

Maximum fraction of outlier points allowed before downgrading (0.15 / 0.25). Looser than CO₂ because H₂O droplets on the optical path can cause isolated spikes.

signal_mmol_guardfloat

H₂O concentration range (mmol mol⁻¹) below which nrmse_B and monotonic_B are relaxed proportionally (default 0.3). Prevents mass rejection of valid but low-transpiration cycles.

See Also#

NIGHTTIME_H2O_QC_THRESHOLDS : Relaxed version for nocturnal cycles.

score_h2o_flux_qc : Function that consumes these thresholds.

palmwtc.flux.chamber.NIGHTTIME_H2O_QC_THRESHOLDS#

Relaxed H₂O QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²).

Why nighttime H₂O needs the most relaxed thresholds#

Stomata close at night, so transpiration drops to near zero. A typical nighttime H₂O slope is 0.0–0.1 mmol mol⁻¹ min⁻¹ — often indistinguishable from sensor drift. Applying daytime thresholds to these cycles would grade nearly all of them C, making nighttime water-balance closure impossible. The physical expectation at night is a flat or very slowly rising H₂O trace, not a steep linear increase.

Relaxed values (compared to DEFAULT_H2O_QC_THRESHOLDS)#

  • r2_A 0.70 → 0.50, r2_B 0.50 → 0.25 — a flat trace has R² ≈ 0 by definition; low R² at night is not a data-quality failure.

  • nrmse_A 0.15 → 0.25, nrmse_B 0.25 → 0.45 — when the H₂O range is 0.1–0.2 mmol mol⁻¹, sensor noise dominates NRMSE.

  • snr_A 5.0 → 3.0, snr_B 3.0 → 1.5 — near-zero signal means SNR is near noise floor even in a well-sealed chamber.

  • monotonic_A 0.70 → 0.50, monotonic_B 0.40 → 0.30 — random-walk noise on a flat trace produces ~50 % monotonicity by chance.

  • signal_mmol_guard 0.30 → 0.15 — the guard activates at even smaller H₂O changes to protect valid low-transpiration cycles.

See Also#

DEFAULT_H2O_QC_THRESHOLDS : Daytime thresholds.

score_h2o_flux_qc : Function that applies these thresholds.

palmwtc.flux.chamber.DEFAULT_WPL_QC_THRESHOLDS#

Per-cycle WPL correction validity thresholds used by apply_wpl_qc_overrides().

These thresholds check whether the WPL correction was well-conditioned for a given cycle, not whether the underlying CO₂ flux regression was good. A cycle can have perfect R² but still have a poor WPL correction if many H₂O readings were out-of-range or the humidity was unusually high.

Keys#

valid_frac_A, valid_frac_Bfloat

Minimum fraction of points in the cycle for which a valid WPL factor could be computed (i.e. H₂O was within h2o_valid_range and non-NaN). Grade A requires 98 % coverage; Grade B requires 95 %.

rel_change_p95_A, rel_change_p95_Bfloat

95th percentile of the absolute relative WPL correction (|wpl_delta_ppm / CO2_raw|) within the cycle. Values above 4 % fail the Grade A test and are flagged as moderate (Grade B). Values above 7 % fail the Grade B test and indicate unusually large humidity-driven adjustments that can distort the flux.

factor_max_Bfloat

Maximum WPL multiplication factor (1 + χ_w / (1000 - χ_w)) seen in the cycle. A factor above 1.08 corresponds to approximately 74 mmol mol⁻¹ H₂O, which is above the default h2o_valid_range upper bound of 60 mmol mol⁻¹ and well above saturation at ~30 °C at sea level (about 42 mmol mol⁻¹). Such values are outside the normal operating range and may indicate a wet-sensor event.
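The relationship between the WPL factor and the H₂O mole fraction, and one possible way of turning the three threshold families into a tier, can be sketched as below. The threshold values are the ones quoted in the key descriptions (98 % / 95 %, 4 % / 7 %, 1.08). The helper names and the exact tier logic are assumptions for illustration, not the behaviour of apply_wpl_qc_overrides().

```python
WPL_THRESHOLDS = {
    "valid_frac_A": 0.98, "valid_frac_B": 0.95,
    "rel_change_p95_A": 0.04, "rel_change_p95_B": 0.07,
    "factor_max_B": 1.08,
}


def h2o_from_factor(factor):
    """Invert factor = 1 + chi / (1000 - chi) = 1000 / (1000 - chi).

    Returns chi, the H2O mole fraction in mmol/mol.
    """
    return 1000.0 * (1.0 - 1.0 / factor)


def wpl_tier(valid_frac, rel_change_p95, factor_max, thr=WPL_THRESHOLDS):
    """Return 0, 1 or 2 for a cycle's WPL conditioning (assumed logic)."""
    fails_b = (
        valid_frac < thr["valid_frac_B"]
        or rel_change_p95 > thr["rel_change_p95_B"]
        or factor_max > thr["factor_max_B"]
    )
    if fails_b:
        return 2
    fails_a = (
        valid_frac < thr["valid_frac_A"]
        or rel_change_p95 > thr["rel_change_p95_A"]
    )
    return 1 if fails_a else 0


print(round(h2o_from_factor(1.08), 1))  # H2O level implied by a factor of 1.08
print(wpl_tier(0.99, 0.03, 1.03), wpl_tier(0.96, 0.03, 1.03))
```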

See Also#

apply_wpl_qc_overrides : Function that applies these thresholds.

DEFAULT_CONFIG : Contains h2o_valid_range and max_abs_wpl_rel_change, which are checked at the point level (before cycle aggregation) by prepare_chamber_data().

palmwtc.flux.chamber.apply_wpl_correction(co2_wet, h2o_mmol_mol)#

Convert wet CO₂ (ppm) to dry CO₂ using the WPL dilution correction.

The Webb-Pearman-Leuning (WPL) correction removes the apparent dilution of CO₂ caused by the simultaneous presence of water vapour in the air sample. The formula is:

\[CO_{2,dry} = CO_{2,wet} \times \left(1 + \frac{\chi_w}{1000 - \chi_w}\right)\]

where \(\chi_w\) is the H₂O mole fraction in mmol mol⁻¹.

This is a simplified single-pass WPL for closed-chamber systems where temperature and pressure are treated as constant within a cycle.
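The formula can be tried out with a scalar sketch like the one below. It is only an illustration of the equation: wpl_correct is a hypothetical helper that works on single floats, whereas the documented function accepts array-like inputs and returns pandas Series.

```python
import math


def wpl_correct(co2_wet_ppm, h2o_mmol_mol):
    """Return (co2_dry_ppm, factor, valid) for one sample."""
    inputs_ok = not (math.isnan(co2_wet_ppm) or math.isnan(h2o_mmol_mol))
    # A non-positive denominator (1000 - chi_w) is treated as invalid.
    if not inputs_ok or h2o_mmol_mol >= 1000.0:
        return float("nan"), float("nan"), False
    factor = 1.0 + h2o_mmol_mol / (1000.0 - h2o_mmol_mol)
    return co2_wet_ppm * factor, factor, True


dry, factor, valid = wpl_correct(400.0, 25.0)
print(dry, factor, valid)
```

With 400 ppm wet CO₂ and 25 mmol mol⁻¹ H₂O the factor equals 1000 / 975, so the dry value is about 2.6 % higher than the wet reading.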

Parameters#

co2_wetarray-like

Wet CO₂ mole fraction in ppm (µmol mol⁻¹).

h2o_mmol_molarray-like

Water vapour mole fraction in mmol mol⁻¹. Values that would make the denominator (1000 - χ_w) non-positive are treated as invalid.

Returns#

co2_drypd.Series

Dry CO₂ in ppm. NaN where either input is NaN or H₂O ≥ 1000 mmol mol⁻¹ (physically impossible, but guarded against).

factorpd.Series

WPL multiplication factor 1 + χ_w / (1000 - χ_w). NaN where inputs are invalid.

validpd.Series of bool

True for rows where both inputs were valid and a WPL factor could be computed.

Notes#

The WPL factor for typical tropical conditions (25 mmol mol⁻¹ H₂O, ~60 % RH at 30 °C and sea-level pressure) is approximately 1.026, adding ~2.6 % to the raw CO₂ reading. At 40 mmol mol⁻¹ (high humidity), the factor is ~1.042.

See Also#

prepare_chamber_data : Calls this function and attaches outputs as columns.

palmwtc.flux.chamber._choose_h2o_column(data, chamber_suffix, prefer_corrected=True)#

Pick the best available H2O column for chamber_suffix.

palmwtc.flux.chamber.prepare_chamber_data(data, chamber_suffix, accepted_co2_qc_flags=(0,), accepted_h2o_qc_flags=(0, 1), prefer_corrected_h2o=True, require_h2o_for_wpl=False, apply_wpl=False, h2o_valid_range=(0.0, 60.0), max_abs_wpl_rel_change=0.12, **kwargs)#

Select, filter, and WPL-correct sensor streams for one chamber.

This is the first step in the flux pipeline. It takes the full multi-chamber dataset (as loaded by palmwtc.io), extracts the columns for a single chamber, applies QC flag row-filtering, and runs the WPL dilution correction. The returned DataFrame is the direct input for calculate_flux_cycles() and calculate_h2o_flux_cycles().

Parameters#

datapd.DataFrame

Full QC-flagged dataset. Expected columns (where {s} = suffix):

  • TIMESTAMP — datetime column.

  • CO2_{s} — raw (wet) CO₂ in ppm.

  • H2O_{s} or H2O_{s}_corrected — water vapour in mmol mol⁻¹.

  • Temp_1_{s} — air temperature inside the chamber in °C.

  • CO2_{s}_qc_flag — integer QC flag for CO₂ (0 = good).

  • H2O_{s}_qc_flag — integer QC flag for H₂O (0 = good, 1 = minor).

Missing columns are silently skipped; only TIMESTAMP and CO2 are required in the output.

chamber_suffixstr

Chamber identifier appended to column names. Typically 'C1' or 'C2' for the two whole-tree chambers.

accepted_co2_qc_flagslist of int or None

Keep only rows whose CO2_{suffix}_qc_flag is in this list. Pass None to skip CO₂ flag filtering entirely. Default from DEFAULT_CONFIG: [0].

accepted_h2o_qc_flagslist of int or None

Same for H₂O. Default from DEFAULT_CONFIG: [0, 1] (flag 1 is a minor sensor warning that still yields usable H₂O).

prefer_corrected_h2obool

When True (default), use H2O_{suffix}_corrected if present; fall back to H2O_{suffix} otherwise.

require_h2o_for_wplbool

When True, raise ValueError if no H₂O column is found and apply_wpl=True. When False, fall back to the uncorrected wet CO₂ value. The signature default is False, while DEFAULT_CONFIG sets this key to True.

apply_wplbool

When True, run apply_wpl_correction() and expose diagnostic columns. When False (the signature default), CO2 is set equal to CO2_raw and all WPL columns are NaN/0.

h2o_valid_rangetuple of float

(lo, hi) physical validity range for H₂O in mmol mol⁻¹. Values outside this range are set to NaN before WPL correction. Default: (0.0, 60.0).

max_abs_wpl_rel_changefloat

Rows where |wpl_delta_ppm / CO2_raw| exceeds this value get their Flag upgraded to 2 (bad). Default: 0.12 (12 %).

**kwargs

Extra keyword arguments are accepted but ignored. This allows passing **DEFAULT_CONFIG directly without unpacking individual keys.

Returns#

pd.DataFrame

One row per retained timestamp, sorted by TIMESTAMP, with a reset integer index. Columns:

  • TIMESTAMP — datetime.

  • CO2 — working CO₂ in ppm: WPL-corrected when possible, raw when WPL is disabled or H₂O is unavailable.

  • CO2_raw — original wet CO₂ measurement in ppm.

  • CO2_corrected — WPL-corrected CO₂ in ppm (NaN if WPL disabled or H₂O missing for a given row).

  • H2O — water vapour in mmol mol⁻¹ (NaN outside valid range).

  • Temp — air temperature in °C (NaN if column absent in input).

  • CO2_Flag — original CO₂ hardware QC flag (int).

  • H2O_Flag — original H₂O hardware QC flag (int).

  • Flag — combined flag: max(CO2_Flag, H2O_Flag), upgraded to 2 for rows with excessive WPL correction.

  • wpl_factor — WPL multiplication factor per row (NaN if WPL disabled or H₂O missing).

  • wpl_valid_input — 1 where a valid WPL factor was computed, 0 otherwise.

  • wpl_delta_ppm — CO2_corrected - CO2_raw in ppm.

  • wpl_rel_change — wpl_delta_ppm / CO2_raw (dimensionless).
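The row-level bookkeeping described in the column list (range masking of H₂O, the two WPL difference columns, and the combined Flag) can be sketched for a single row as follows. Both helpers are hypothetical and operate on plain floats; prepare_chamber_data() works on whole DataFrame columns.

```python
import math


def mask_h2o(h2o, valid_range=(0.0, 60.0)):
    """Return h2o if it lies inside valid_range, otherwise NaN."""
    lo, hi = valid_range
    return h2o if lo <= h2o <= hi else float("nan")


def combine_flags(co2_raw, co2_corrected, co2_flag, h2o_flag,
                  max_abs_wpl_rel_change=0.12):
    """Return (wpl_delta_ppm, wpl_rel_change, Flag) for one row."""
    delta = co2_corrected - co2_raw
    rel = delta / co2_raw
    flag = max(co2_flag, h2o_flag)
    # An excessive WPL correction raises the combined flag to 2 (bad).
    if abs(rel) > max_abs_wpl_rel_change:
        flag = max(flag, 2)
    return delta, rel, flag


print(combine_flags(400.0, 410.0, 0, 1))  # moderate correction, flag from H2O
print(combine_flags(400.0, 460.0, 0, 0))  # 15 % correction, flag raised to 2
print(mask_h2o(75.0))                     # outside the default valid range
```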

Raises#

ValueError

If apply_wpl=True and require_h2o_for_wpl=True but no H₂O column is found for the requested chamber_suffix.

See Also#

calculate_flux_cycles : Consumes the output of this function for CO₂ flux.

calculate_h2o_flux_cycles : Consumes the output for H₂O flux.

summarize_wpl_correction : Computes dataset-level WPL statistics.

Examples#

# doctest: +SKIP
# Requires a real multi-chamber DataFrame from palmwtc.io.load_chamber_data.
chamber_df = prepare_chamber_data(raw_df, "C1")
print(chamber_df.columns.tolist())

palmwtc.flux.chamber.summarize_wpl_correction(chamber_df)#

Return a dataset-level summary of WPL correction statistics.

Useful for a quick sanity check: if the median WPL factor or p95 relative change looks unusual, it may indicate sensor drift, water condensation on the optical path, or a humidity calibration issue.

Parameters#

chamber_dfpd.DataFrame

Output of prepare_chamber_data(). Must contain columns wpl_delta_ppm, wpl_rel_change, wpl_factor, and optionally CO2_corrected.

Returns#

dict

Empty dict if chamber_df is empty or missing WPL columns. Otherwise, keys are:

  • n_points — total row count.

  • valid_points — rows where CO2_corrected is not NaN.

  • median_factor — median WPL multiplication factor.

  • median_delta_ppm — median WPL additive correction (ppm).

  • p95_abs_rel_change — 95th percentile of |wpl_rel_change|.
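A dependency-free sketch of how such a summary could be assembled is shown below. The function names, the list-of-dicts input, and the linear-interpolation percentile are assumptions made for the example; the documented function takes the DataFrame returned by prepare_chamber_data().

```python
import math
import statistics


def percentile(values, q):
    """Linear-interpolated percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def summarize(rows):
    """Summarize WPL columns; rows is a list of dicts, one per data point."""
    valid = [r for r in rows if not math.isnan(r["CO2_corrected"])]
    if not valid:
        return {}
    return {
        "n_points": len(rows),
        "valid_points": len(valid),
        "median_factor": statistics.median(r["wpl_factor"] for r in valid),
        "median_delta_ppm": statistics.median(r["wpl_delta_ppm"] for r in valid),
        "p95_abs_rel_change": percentile(
            [abs(r["wpl_rel_change"]) for r in valid], 95
        ),
    }


# Four synthetic rows at 400 ppm wet CO2; the last one has missing H2O.
rows = []
for h2o in (20.0, 25.0, 30.0, float("nan")):
    factor = 1000.0 / (1000.0 - h2o)  # NaN propagates for missing H2O
    corrected = 400.0 * factor
    rows.append({
        "CO2_corrected": corrected,
        "wpl_factor": factor,
        "wpl_delta_ppm": corrected - 400.0,
        "wpl_rel_change": (corrected - 400.0) / 400.0,
    })
summary = summarize(rows)
print(summary)
```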

See Also#

build_cycle_wpl_metrics : Per-cycle version of the same diagnostics.

apply_wpl_qc_overrides : Uses per-cycle metrics to upgrade QC tiers.

palmwtc.flux.chamber.build_cycle_wpl_metrics(chamber_df, chamber_name, cycle_gap_sec=300)#

Aggregate WPL correction metrics per measurement cycle.

Produces one row per cycle with mean/max WPL factor, mean/max WPL delta, valid-data fraction, p95 relative change, and H₂O statistics. These per-cycle values are the input for apply_wpl_qc_overrides().

Parameters#

chamber_dfpd.DataFrame

Output of prepare_chamber_data().

chamber_namestr

Chamber label (e.g. 'Chamber 1'), stored in the output column Source_Chamber.

cycle_gap_secint

Gap in seconds that marks the boundary between cycles, passed to palmwtc.flux.cycles.identify_cycles().

Returns#

pd.DataFrame

One row per cycle. Columns:

  • cycle_id — integer cycle identifier.

  • Source_Chamber — chamber_name.

  • wpl_factor_mean — mean WPL factor within the cycle.

  • wpl_factor_max — maximum WPL factor within the cycle.

  • wpl_delta_ppm_mean — mean WPL additive correction (ppm).

  • wpl_delta_ppm_max — maximum WPL additive correction (ppm).

  • wpl_valid_fraction — fraction of rows with a non-NaN CO2_corrected value.

  • wpl_abs_rel_change_p95 — 95th percentile of absolute relative WPL correction within the cycle.

  • h2o_mean — mean H₂O (mmol mol⁻¹) within the cycle.

  • h2o_max — maximum H₂O (mmol mol⁻¹) within the cycle.

Returns an empty DataFrame if chamber_df is empty.

See Also#

apply_wpl_qc_overrides : Consumes the per-cycle metrics produced here.

summarize_wpl_correction : Dataset-level WPL summary.

palmwtc.flux.chamber.calculate_flux_cycles(chamber_df, chamber_name, cycle_gap_sec=300, start_cutoff_sec=50, start_search_sec=60, min_points=20, min_duration_sec=180, outlier_z=2, max_outlier_refit_frac=0.2, use_multiprocessing=True, n_jobs=None, **kwargs)#

Identify measurement cycles and compute CO₂ flux for each cycle.

This is the main CO₂ flux batch function. It calls palmwtc.flux.cycles.identify_cycles() to segment the time series into closed-chamber measurement cycles, then dispatches each cycle to palmwtc.flux.cycles.evaluate_cycle() (optionally in parallel via multiprocessing.Pool).
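Gap-based cycle segmentation can be illustrated with a small sketch. It assumes that a new cycle starts whenever the time since the previous sample is at least cycle_gap_sec; the function name split_cycles and the use of plain second offsets are hypothetical, and palmwtc.flux.cycles.identify_cycles() may handle boundaries differently.

```python
def split_cycles(timestamps_sec, cycle_gap_sec=300):
    """Assign a cycle id to each timestamp (seconds, sorted ascending)."""
    cycle_ids = []
    current = 0
    previous = None
    for ts in timestamps_sec:
        # A sufficiently long pause between samples starts a new cycle.
        if previous is not None and ts - previous >= cycle_gap_sec:
            current += 1
        cycle_ids.append(current)
        previous = ts
    return cycle_ids


# Three bursts of samples separated by pauses of about 15 minutes.
stamps = [0, 10, 20, 920, 930, 1840]
ids = split_cycles(stamps)
print(ids)
```

Each group of rows sharing a cycle id would then be handed to the per-cycle evaluation step.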

Parameters#

chamber_dfpd.DataFrame

Output of prepare_chamber_data(). Must contain TIMESTAMP, CO2, and optionally Temp and Flag.

chamber_namestr

Human-readable chamber label, stored in the output column Source_Chamber (e.g. 'Chamber 1').

cycle_gap_secint

Time gap in seconds that triggers a new cycle boundary. Default 300 (5 minutes).

start_cutoff_secint

Seconds to skip from cycle start before beginning the regression window search. Removes the initial chamber-mixing transient. Default 50.

start_search_secint

How far into the cycle (seconds) the window-start search extends. Default 60.

min_pointsint

Minimum number of valid points required for a cycle to be processed. Default 20.

min_duration_secint

Minimum regression window length in seconds. Default 180.

outlier_zfloat

Z-score threshold for iterative outlier removal. Default 2.

max_outlier_refit_fracfloat

Maximum fraction of points that may be removed as outliers; if exceeded the original fit is used. Default 0.2.

use_multiprocessingbool

When True and there are more than 50 cycles, process in parallel using multiprocessing.Pool. Falls back to serial on any multiprocessing error. Default True.

n_jobsint or None

Number of parallel workers. Defaults to min(8, cpu_count).

**kwargs

Absorbed silently so callers can pass **DEFAULT_CONFIG directly.

Returns#

pd.DataFrame

One row per successfully processed cycle. Columns (from palmwtc.flux.cycles.evaluate_cycle()):

  • Source_Chamber — chamber_name.

  • cycle_id — integer cycle identifier.

  • flux_date — start timestamp of the cycle.

  • cycle_end — end timestamp of the cycle.

  • cycle_duration_sec — total cycle duration in seconds.

  • window_start_sec, window_end_sec — regression window boundaries relative to cycle start.

  • duration_sec — regression window duration in seconds.

  • n_points_total — total points in the full cycle.

  • n_points_used — points used in the final regression.

  • flux_slope — OLS slope of CO₂ vs. time (ppm s⁻¹).

  • flux_intercept — OLS intercept (ppm).

  • r2 — R² of the OLS linear fit.

  • p_value, std_err — regression statistics.

  • rmse — root-mean-square error of the fit (ppm).

  • nrmse — RMSE normalized by the CO₂ range in the window.

  • snr — signal-to-noise ratio: |slope| × duration / rmse.

  • snr_noise — SNR using early-cycle noise estimate (NaN if not computed).

  • noise_sigma — early-cycle noise standard deviation (ppm).

  • monotonicity — fraction of consecutive CO₂ steps moving in the slope direction (noise-filtered).

  • outlier_frac — fraction of points removed as outliers.

  • aicc_linear, aicc_quadratic, delta_aicc — AICc of the linear and quadratic fits; large negative delta_aicc flags curvature.

  • slope_ts, slope_ts_low, slope_ts_high — Theil-Sen slope and 95 % confidence interval (ppm s⁻¹).

  • slope_diff_pct — relative difference between OLS and Theil-Sen slopes.

  • mean_temp — mean air temperature in the cycle (°C).

  • qc_flag — max hardware QC flag in the cycle.

  • co2_range — CO₂ concentration range in the window (ppm).

  • bimodal_flag — True if a bimodal CO₂ distribution was detected (possible closure gap).

  • bimodal_gap_ppm, bimodal_lower_mean, bimodal_upper_mean — bimodal split statistics.

  • flux_absolute — absolute flux in µmol m⁻² s⁻¹ computed by palmwtc.flux.absolute.calculate_absolute_flux().

Returns an empty DataFrame if chamber_df is empty or contains no valid cycles.
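The delta_aicc column compares a quadratic fit against a linear one. The sketch below shows one common way to compute such a difference in pure Python, assuming the least-squares form AICc = n ln(RSS / n) + 2k + 2k(k + 1) / (n - k - 1) with k counting only the fitted polynomial coefficients. All function names are hypothetical, and the package may use a different parameter count or fitting routine.

```python
import math


def _solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def poly_rss(t, y, degree):
    """Residual sum of squares of a least-squares polynomial fit."""
    t_mean = sum(t) / len(t)
    tc = [ti - t_mean for ti in t]  # centre time for numerical stability
    powers = range(degree + 1)
    a = [[sum(x ** (i + j) for x in tc) for j in powers] for i in powers]
    b = [sum(yi * x ** i for x, yi in zip(tc, y)) for i in powers]
    coef = _solve(a, b)
    return sum(
        (yi - sum(c * x ** k for k, c in enumerate(coef))) ** 2
        for x, yi in zip(tc, y)
    )


def aicc(rss, n, k):
    """AICc of a least-squares fit with k fitted coefficients (assumed form)."""
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def delta_aicc(t, y):
    """AICc of the quadratic fit minus AICc of the linear fit."""
    n = len(t)
    return aicc(poly_rss(t, y, 2), n, 3) - aicc(poly_rss(t, y, 1), n, 2)


t = list(range(0, 200, 10))
noise = [0.1 if i % 2 else -0.1 for i in range(len(t))]
linear = [400 + 0.1 * ti + e for ti, e in zip(t, noise)]
curved = [400 + 0.1 * ti - 0.0003 * ti ** 2 + e for ti, e in zip(t, noise)]
d_linear = delta_aicc(t, linear)
d_curved = delta_aicc(t, curved)
print(d_linear, d_curved)
```

For the straight synthetic trace the quadratic term buys nothing and the difference is positive, while the curved trace yields a strongly negative difference.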

See Also#

prepare_chamber_data : Produces the required chamber_df input.

calculate_h2o_flux_cycles : H₂O analogue.

palmwtc.flux.cycles.evaluate_cycle : Called for each individual cycle.

palmwtc.flux.cycles.score_cycle : QC scoring applied after this step.

Examples#

# doctest: +SKIP
# Requires prepared chamber data from prepare_chamber_data().
flux_df = calculate_flux_cycles(chamber_df, "Chamber 1")
print(flux_df[["flux_date", "flux_slope", "r2", "flux_absolute"]].head())

palmwtc.flux.chamber.calculate_h2o_flux_for_cycle(cycle_data, gas_col='H2O', min_points=20, min_duration_sec=180)#

Compute H₂O slope and fit statistics for a single measurement cycle.

Uses Theil-Sen regression to estimate the slope (robust to outliers) and OLS for R², RMSE, and residual statistics. SNR is computed as |slope_ts × duration| / residual_std, analogous to the CO₂ definition, which uses the OLS slope and RMSE. Monotonicity is computed only on H₂O steps larger than 0.05 mmol mol⁻¹ (approximately 5× LI-COR H₂O RMS noise) to avoid deflation by sensor jitter.
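The two robust ingredients mentioned here, a Theil-Sen slope and a MAD-based outlier fraction, can be sketched without external libraries. The helper names are hypothetical, and measuring the MAD around the median residual is an assumption about how the 2.5× MAD rule is applied.

```python
import statistics
from itertools import combinations


def theil_sen_slope(t, y):
    """Median of all pairwise slopes (robust to isolated spikes)."""
    slopes = [
        (y[j] - y[i]) / (t[j] - t[i])
        for i, j in combinations(range(len(t)), 2)
        if t[j] != t[i]
    ]
    return statistics.median(slopes)


def mad_outlier_fraction(residuals, k=2.5):
    """Fraction of residuals whose magnitude exceeds k times the MAD."""
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals)
    if mad == 0:
        return 0.0
    return sum(1 for r in residuals if abs(r) > k * mad) / len(residuals)


# A clean 0.01 mmol/mol/s rise with one droplet-like spike at the sixth sample.
t = list(range(0, 100, 10))
h2o = [20.0 + 0.01 * ti for ti in t]
h2o[5] += 2.0
slope = theil_sen_slope(t, h2o)

# Residuals from some fitted line: small jitter plus one large spike.
residuals = [0.02, -0.01, 0.01, -0.02, 0.0, 2.0, 0.01, -0.01, 0.02, -0.02]
frac = mad_outlier_fraction(residuals)
print(slope, frac)
```

The single spike barely moves the Theil-Sen slope, which is why it is preferred over the OLS slope for the flux estimate.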

Parameters#

cycle_datapd.DataFrame

Single-cycle data slice. Must contain TIMESTAMP and gas_col.

gas_colstr

Name of the H₂O column (default 'H2O').

min_pointsint

Minimum number of non-NaN H₂O values required (default 20).

min_duration_secfloat

Minimum span of the cycle in seconds (default 180).

Returns#

dict or None

None if the cycle has fewer than min_points valid rows or is shorter than min_duration_sec. Otherwise a dict with keys:

  • h2o_slope — Theil-Sen slope (mmol mol⁻¹ s⁻¹).

  • h2o_intercept — Theil-Sen intercept (mmol mol⁻¹).

  • h2o_r2 — OLS R² (dimensionless, 0–1).

  • h2o_nrmse — NRMSE: OLS RMSE divided by H₂O range; NaN if range is zero.

  • h2o_snr — signal-to-noise ratio.

  • h2o_outlier_frac — fraction of points more than 2.5× MAD from the OLS fit.

  • h2o_monotonic_frac — fraction of noise-filtered consecutive steps in the slope direction; NaN if all steps are below the noise floor.

  • h2o_n_points — number of non-NaN points used.

  • h2o_duration — cycle duration in seconds.

  • h2o_conc_mean — mean H₂O concentration (mmol mol⁻¹).

  • h2o_conc_range — H₂O concentration range in the cycle (mmol mol⁻¹).

See Also#

calculate_h2o_flux_cycles : Calls this function for every cycle.

score_h2o_flux_qc : Uses the returned dict to assign a QC grade.

palmwtc.flux.chamber.score_h2o_flux_qc(h2o_metrics, h2o_qc_thresholds=None, is_nighttime=False)#

Assign a QC grade to a single H₂O flux cycle.

Applies a two-tier threshold system: a cycle that passes all _A tests is Grade A (tier 0); failing any _A test but passing all _B tests gives Grade B (tier 1); failing any _B test gives Grade C (tier 2).

A signal-size guard relaxes the nrmse_B and monotonic_B thresholds proportionally for cycles where the H₂O range is smaller than signal_mmol_guard — preventing mass rejection of valid but low-transpiration cycles.
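The text does not give the exact scaling rule, so the following is only one plausible reading of "relaxes proportionally": the ratio of the observed H₂O range to the guard value shrinks the minimum monotonic_B threshold and enlarges the maximum nrmse_B threshold. The function relax_for_small_signal is hypothetical.

```python
def relax_for_small_signal(nrmse_b, monotonic_b, h2o_range,
                           signal_mmol_guard=0.3):
    """Return relaxed (nrmse_B, monotonic_B) for small signals (assumed rule)."""
    # At or above the guard nothing changes; a zero range is left untouched
    # because NRMSE is undefined there anyway.
    if h2o_range >= signal_mmol_guard or h2o_range <= 0:
        return nrmse_b, monotonic_b
    scale = h2o_range / signal_mmol_guard  # 0 < scale < 1
    # A maximum threshold is relaxed by raising it, a minimum by lowering it.
    return nrmse_b / scale, monotonic_b * scale


print(relax_for_small_signal(0.25, 0.40, 0.50))  # range above the guard
print(relax_for_small_signal(0.25, 0.40, 0.15))  # range at half the guard
```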

Parameters#

h2o_metricsdict or None

Output of calculate_h2o_flux_for_cycle(). If None, returns tier 2 / Grade C with reason 'No valid H2O data'.

h2o_qc_thresholdsdict or None

Override the default thresholds. When None, selects NIGHTTIME_H2O_QC_THRESHOLDS if is_nighttime=True, otherwise DEFAULT_H2O_QC_THRESHOLDS.

is_nighttimebool

Switches to the nighttime threshold set when True and no explicit thresholds are supplied.

Returns#

tierint

0 for Grade A, 1 for Grade B, 2 for Grade C.

labelstr

'A', 'B', or 'C'.

reasonslist of str

Each failing test appends a human-readable string such as 'R2=0.45<0.70'. Empty when all tests pass.

See Also#

DEFAULT_H2O_QC_THRESHOLDS : Daytime threshold values and key descriptions.

NIGHTTIME_H2O_QC_THRESHOLDS : Nighttime threshold values.

calculate_h2o_flux_cycles : Calls this function for every cycle.

palmwtc.flux.chamber.calculate_h2o_flux_cycles(chamber_df, chamber_name, cycle_gap_sec=300, min_points=20, min_duration_sec=180, h2o_qc_thresholds=None, **kwargs)#

Compute H₂O flux for every cycle in chamber_df.

Mirrors calculate_flux_cycles() for water vapour. For each cycle, calls calculate_h2o_flux_for_cycle() and then score_h2o_flux_qc(), automatically switching to nighttime thresholds when Global_Radiation < 10 W m⁻² (or when the cycle starts before 06:00 or after 18:00, if radiation is not available).
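The day/night switch described above can be sketched as a small predicate. The name is_nighttime as a standalone function, the NaN handling, and treating exactly 18:00 as night are assumptions made for the example.

```python
import math
from datetime import datetime


def is_nighttime(cycle_start, global_radiation=None, threshold_w_m2=10.0):
    """Radiation-based check with a clock-time fallback when it is missing."""
    if global_radiation is not None and not math.isnan(global_radiation):
        return global_radiation < threshold_w_m2
    # Fallback: before 06:00 or from 18:00 onwards counts as night.
    return cycle_start.hour < 6 or cycle_start.hour >= 18


noon = datetime(2024, 1, 1, 12, 0)
small_hours = datetime(2024, 1, 1, 2, 0)
print(is_nighttime(noon, 5.0))           # dark at noon: radiation decides
print(is_nighttime(small_hours, 250.0))  # bright at 02:00: radiation decides
print(is_nighttime(small_hours))         # no radiation: clock fallback
```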

Parameters#

chamber_dfpd.DataFrame

Output of prepare_chamber_data(). Must contain TIMESTAMP and H2O; optionally Global_Radiation for nighttime detection.

chamber_namestr

Chamber label stored in Source_Chamber (e.g. 'Chamber 1').

cycle_gap_secint

Gap in seconds that marks cycle boundaries. Default 300.

min_pointsint

Minimum valid H₂O points required per cycle. Default 20.

min_duration_secfloat

Minimum cycle duration in seconds. Default 180.

h2o_qc_thresholdsdict or None

Override the daytime H₂O thresholds. Nighttime thresholds are always selected automatically from NIGHTTIME_H2O_QC_THRESHOLDS regardless of this parameter. Default None, which selects DEFAULT_H2O_QC_THRESHOLDS for daytime cycles.

**kwargs

Absorbed silently so callers can pass **DEFAULT_CONFIG directly.

Returns#

pd.DataFrame

One row per valid cycle. Columns:

  • cycle_id: integer cycle identifier.

  • Source_Chamber: the chamber_name value.

  • h2o_qc: QC tier (0 = A, 1 = B, 2 = C).

  • h2o_qc_label: 'A', 'B', or 'C'.

  • h2o_qc_reason: semicolon-separated failing-test strings.

  • All keys returned by calculate_h2o_flux_for_cycle(): h2o_slope, h2o_intercept, h2o_r2, h2o_nrmse, h2o_snr, h2o_outlier_frac, h2o_monotonic_frac, h2o_n_points, h2o_duration, h2o_conc_mean, h2o_conc_range.

Returns an empty DataFrame if chamber_df is empty, has no H2O column, or all H₂O values are NaN.

See Also#

calculate_flux_cycles : CO₂ analogue.

prepare_chamber_data : Produces the required chamber_df input.

score_h2o_flux_qc : H₂O QC grading function.

Examples#

# doctest: +SKIP
# Requires prepared chamber data from prepare_chamber_data().
h2o_df = calculate_h2o_flux_cycles(chamber_df, "Chamber 1")
print(h2o_df[["cycle_id", "h2o_slope", "h2o_qc_label"]].head())

palmwtc.flux.chamber.load_tree_biophysics(base_dir)#

Load palm tree biophysical parameters from the PalmStudio spreadsheet.

Reads Vigor_Index_PalmStudio.xlsx (expected at {base_dir}/), converts Indonesian column names to English, converts measurements from centimetres to metres, and extracts the clone identifier from the tree ID string.

The Vigor Index is the estimated above-ground biomass volume (cm³ in the spreadsheet, converted to m³ here). It is computed by PalmStudio from measured height and canopy radii. It is used by get_tree_volume_at_date() to time-interpolate tree volume for any given measurement date.

Parameters#

base_dir : str or Path

Directory that contains Vigor_Index_PalmStudio.xlsx.

Returns#

pd.DataFrame or None

One row per measurement visit per tree. Columns:

  • Tree ID — tree identifier string (e.g. 'EKA1-001').

  • Date — measurement date (datetime).

  • Height_m — total tree height in metres.

  • Max_Radius_m — maximum canopy radius in metres.

  • Est_Width_m — estimated canopy width (2 × mean radius) in metres.

  • Vigor_Index_m3 — estimated tree volume in m³ (converted from cm³ by dividing by 1 000 000).

  • Clone — clone name extracted from Tree ID (e.g. 'EKA 1').

Returns None (with a printed warning) if the file is not found.

Notes#

The spreadsheet uses Indonesian column headings (Tanggal, Kode pohon, Tinggi Pohon (cm)). This function handles the renaming automatically.
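The renaming and unit conversion can be sketched on a toy table as below. This is an illustration under stated assumptions, not the function's actual code: the helper name `tidy_biophysics` is hypothetical, only the three Indonesian headings named above are taken from the source, and the raw volume column name `Vigor_Index_cm3` is invented for the example.

```python
import pandas as pd

# Headings named in the Notes, mapped to the documented output columns.
COLUMN_MAP = {
    "Tanggal": "Date",
    "Kode pohon": "Tree ID",
    "Tinggi Pohon (cm)": "Height_m",
}


def tidy_biophysics(raw):
    """Hypothetical helper: rename columns and convert cm to m, cm3 to m3."""
    df = raw.rename(columns=COLUMN_MAP).copy()
    df["Date"] = pd.to_datetime(df["Date"])
    df["Height_m"] = df["Height_m"] / 100.0  # cm -> m
    # "Vigor_Index_cm3" is an assumed raw column name, for illustration only.
    df["Vigor_Index_m3"] = df.pop("Vigor_Index_cm3") / 1_000_000.0  # cm3 -> m3
    return df


raw = pd.DataFrame({
    "Tanggal": ["2023-01-10", "2023-04-12"],
    "Kode pohon": ["EKA1-001", "EKA1-001"],
    "Tinggi Pohon (cm)": [350.0, 380.0],
    "Vigor_Index_cm3": [2_500_000.0, 3_100_000.0],
})
print(tidy_biophysics(raw))
```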

See Also#

get_tree_volume_at_date : Time-interpolates the Vigor Index from the table returned by this function.

Examples#

# doctest: +SKIP
# Requires Vigor_Index_PalmStudio.xlsx in the data directory.
df_vigor = load_tree_biophysics("/path/to/data")
print(df_vigor[["Tree ID", "Date", "Vigor_Index_m3"]].head())

palmwtc.flux.chamber.get_tree_volume_at_date(df_vigor, tree_id, target_date)#

Time-interpolate the Vigor Index (m³) for a tree at a specific date.

If an exact measurement exists on target_date, that value is returned directly. Otherwise, the Vigor Index time series for the tree is linearly interpolated between the two nearest measurements. No extrapolation is performed — dates outside the measurement range return None because the time-based interpolation does not fill beyond the index boundaries.

Parameters#

df_vigor : pd.DataFrame or None

Output of load_tree_biophysics(). None returns None immediately.

tree_id : str

Tree identifier matching the Tree ID column in df_vigor (e.g. 'EKA1-001').

target_date : str or datetime-like

The date for which to estimate the tree volume. String values are parsed via pandas.to_datetime().

Returns#

float or None

Vigor Index in m³ at target_date, or None if df_vigor is None, tree_id is not found, or the date is outside the measured range.

Notes#

The interpolation method is pandas 'time', which assumes a constant growth rate between measurement visits. Palm canopy volume grows roughly monotonically over the study period, so linear interpolation is appropriate for the typical visit interval of a few months.
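A minimal sketch of time-based interpolation without extrapolation is shown below. It assumes one measurement per date for a single tree; the helper name `interpolate_volume_sketch` is hypothetical, and the explicit range check is one way to get the "no extrapolation" behaviour, not necessarily how the library does it.

```python
import pandas as pd


def interpolate_volume_sketch(dates, volumes_m3, target_date):
    """Hypothetical helper: time-interpolate a volume series at one date."""
    series = pd.Series(list(volumes_m3), index=pd.to_datetime(list(dates)))
    series = series.sort_index()
    target = pd.to_datetime(target_date)

    if target in series.index:
        return float(series.loc[target])  # exact measurement
    if target < series.index.min() or target > series.index.max():
        return None  # outside the measured range: no extrapolation

    # Insert the target date as a NaN row, then interpolate on elapsed time.
    expanded = series.reindex(series.index.union(pd.DatetimeIndex([target])))
    return float(expanded.interpolate(method="time").loc[target])


vol = interpolate_volume_sketch(
    ["2023-01-01", "2023-01-11"], [1.0, 2.0], "2023-01-06"
)
print(f"Tree volume: {vol:.4f} m3")
```

Because method="time" weights by elapsed time rather than by row position, unevenly spaced visits are handled correctly.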

See Also#

load_tree_biophysics : Loads and parses the biophysical spreadsheet.

Examples#

# doctest: +SKIP
# Requires a DataFrame from load_tree_biophysics().
vol = get_tree_volume_at_date(df_vigor, "EKA1-001", "2023-06-15")
print(f"Tree volume: {vol:.4f} m3")

palmwtc.flux.chamber.apply_wpl_qc_overrides(row, model_qc, flux_qc, reason_text, wpl_qc_thresholds=None, h2o_valid_range=(0.0, 60.0))#

Apply WPL-specific checks and upgrade QC tiers if needed.

Checks whether the WPL correction was well-conditioned for a given cycle (sufficient valid H₂O data, reasonable correction magnitude, plausible WPL factor). If any check fails, the model_qc and flux_qc tiers are upgraded, meaning raised to a higher tier number (a worse grade), never lowered, and a reason string is appended.

This function is called after build_cycle_wpl_metrics() and palmwtc.flux.cycles.score_cycle() in the post-processing pipeline, not by calculate_flux_cycles() directly.

Parameters#

row : pd.Series or dict

A single cycle row containing WPL metrics produced by build_cycle_wpl_metrics(): wpl_valid_fraction, wpl_abs_rel_change_p95, wpl_factor_max, and h2o_max.

model_qc : int

Current model QC tier (0 = A, 1 = B, 2 = C) to be potentially upgraded.

flux_qc : int

Current flux QC tier to be potentially upgraded.

reason_text : str

Semicolon-separated QC reasons accumulated so far. New reasons are appended and duplicates are removed.

wpl_qc_thresholds : dict or None

Override DEFAULT_WPL_QC_THRESHOLDS.

h2o_valid_range : tuple of float

(lo, hi) valid H₂O range in mmol mol⁻¹ (default (0.0, 60.0)). H₂O values above hi force Grade C (tier 2).

Returns#

tuple of (int, int, int, str)

(model_qc, flux_qc, wpl_qc, reason_text) where:

  • model_qc, flux_qc are the (possibly upgraded) input tiers.

  • wpl_qc is the WPL-specific tier (0, 1, or 2) that drove the upgrade.

  • reason_text is the updated semicolon-separated reason string.
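The "upgrade only" rule and the reason de-duplication can be sketched as below. This is a hedged illustration: the helper name `merge_wpl_qc`, its `new_reasons` argument, and the example reason string `wpl_factor_high` are hypothetical, and the sketch assumes the WPL tier has already been computed from the thresholds.

```python
def merge_wpl_qc(model_qc, flux_qc, wpl_qc, reason_text, new_reasons):
    """Hypothetical helper: fold a WPL tier into existing QC tiers.

    Tiers only ever move to a higher number (worse grade), and the
    reason string keeps first-seen order with duplicates removed.
    """
    model_qc = max(model_qc, wpl_qc)
    flux_qc = max(flux_qc, wpl_qc)

    reasons = [part.strip() for part in reason_text.split(";") if part.strip()]
    reasons.extend(new_reasons)
    deduped = list(dict.fromkeys(reasons))  # order-preserving de-duplication
    return model_qc, flux_qc, wpl_qc, ";".join(deduped)


result = merge_wpl_qc(
    0, 1, 2, "R2=0.45<0.70", ["wpl_factor_high", "R2=0.45<0.70"]
)
print(result)
```

Using max() makes the operation idempotent: applying the same WPL tier twice leaves the result unchanged.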

See Also#

DEFAULT_WPL_QC_THRESHOLDS : Threshold values and key descriptions. build_cycle_wpl_metrics : Produces the per-cycle WPL metrics consumed here.

palmwtc.flux.chamber.compute_closure_confidence(r2, nrmse, global_radiation, rad_max=800.0)#

Compute a chamber closure confidence score between 0 and 1.

Combines R², NRMSE, and global radiation into a single scalar that expresses how confident we are that the chamber was properly sealed during a flux cycle.

Physical reasoning: poor fit quality (low R², high NRMSE) is more likely to indicate a physical leak when photosynthetic demand is high (bright conditions). The same poor fit at night or on a cloudy day could simply reflect a small signal close to sensor noise. The score therefore penalizes low R² and high NRMSE more strongly when radiation is high.

Formula#

\[
\begin{aligned}
r2\_conf &= \operatorname{clip}\left(\frac{R^2 - 0.25}{0.94 - 0.25},\ 0,\ 1\right) \\
rad\_norm &= \operatorname{clip}\left(\frac{G}{G_{max}},\ 0,\ 1\right) \\
confidence &= \operatorname{clip}\left(r2\_conf - 0.4 \times rad\_norm \times (1 - r2\_conf) - 0.2 \times rad\_norm \times \operatorname{clip}\left(\frac{NRMSE}{0.20},\ 0,\ 1\right),\ 0,\ 1\right)
\end{aligned}
\]
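The formula can be transcribed into NumPy as a quick cross-check. This is a sketch written from the equations in this section, not the library's source; the name `closure_confidence_sketch` is hypothetical, G is taken to be global_radiation, and G_max is taken to be rad_max.

```python
import numpy as np


def closure_confidence_sketch(r2, nrmse, global_radiation, rad_max=800.0):
    """Hypothetical re-implementation of the closure confidence formula."""
    # NaN inputs are treated as 0, as described for each parameter.
    r2 = np.nan_to_num(np.asarray(r2, dtype=float), nan=0.0)
    nrmse = np.nan_to_num(np.asarray(nrmse, dtype=float), nan=0.0)
    rad = np.nan_to_num(np.asarray(global_radiation, dtype=float), nan=0.0)

    r2_conf = np.clip((r2 - 0.25) / (0.94 - 0.25), 0.0, 1.0)
    rad_norm = np.clip(rad / rad_max, 0.0, 1.0)
    nrmse_term = np.clip(nrmse / 0.20, 0.0, 1.0)

    # Both penalties scale with radiation, so they vanish in the dark.
    confidence = (
        r2_conf
        - 0.4 * rad_norm * (1.0 - r2_conf)
        - 0.2 * rad_norm * nrmse_term
    )
    return np.clip(confidence, 0.0, 1.0)


print(float(closure_confidence_sketch(0.50, 0.25, 600.0)))
```

For R² = 0.50, NRMSE = 0.25, and G = 600 W m⁻², the terms are r2_conf ≈ 0.362, rad_norm = 0.75, an R² penalty of about 0.191, and an NRMSE penalty of 0.15, which leaves a confidence of about 0.021.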

Parameters#

r2 : float or array-like

R² of the OLS linear CO₂ vs. time fit (0–1). NaN is treated as 0.

nrmse : float or array-like

Normalized RMSE (RMSE / CO₂ range). NaN is treated as 0.

global_radiation : float or array-like

Incoming solar radiation in W m⁻². NaN is treated as 0, so no radiation-dependent penalty is applied.

rad_max : float

Radiation level at which the radiation penalty is at its maximum. Default 800.0 W m⁻² (typical clear-sky midday value in the tropics).

Returns#

float or numpy.ndarray

Closure confidence score in [0, 1]. A score near 1 indicates a well-sealed chamber with a clean linear CO₂ trend. A score near 0 indicates likely leakage or strong non-linearity under high light.

Notes#

The R² bounds (0.25 to 0.94) and penalty weights (0.4, 0.2) were calibrated against manual inspection of gap-width experiment data.

See Also#

calculate_flux_cycles : Produces the R², NRMSE, and radiation values consumed here.

Examples#

>>> from palmwtc.flux.chamber import compute_closure_confidence
>>> round(float(compute_closure_confidence(0.98, 0.03, 0.0)), 3)
1.0
>>> round(float(compute_closure_confidence(0.95, 0.05, 200.0)), 3)
0.988
>>> round(float(compute_closure_confidence(0.50, 0.25, 600.0)), 3)
0.021
>>> round(float(compute_closure_confidence(0.40, 0.30, 700.0)), 3)
0.0