palmwtc.flux.chamber#
Chamber data preparation, WPL correction, and per-cycle flux batch computation.
This module is the entry point for the full CO₂ and H₂O flux calculation pipeline for automated whole-tree chambers instrumented with LI-COR LI-850. It ties together sensor-stream preparation, WPL dilution correction, cycle identification, regression-based flux extraction, QC scoring, and tree biophysical data.
Pipeline overview#
- prepare_chamber_data(): selects the correct sensor columns for one chamber (C1 or C2), applies QC flag filtering, runs WPL correction, and returns a clean DataFrame ready for cycle identification.
- calculate_flux_cycles(): identifies measurement cycles inside the prepared DataFrame and runs palmwtc.flux.cycles.evaluate_cycle() on every cycle (in parallel when the dataset is large), returning one row per cycle with slope, R², AICc, monotonicity, flux, and QC fields.
- calculate_h2o_flux_cycles(): H₂O analogue; uses Theil-Sen + OLS regression with relaxed QC thresholds appropriate for water-vapour noise levels.
- compute_closure_confidence(): converts per-cycle R², NRMSE, and global radiation into a 0–1 confidence score for chamber closure quality.
Tree biophysics helpers#
- load_tree_biophysics(): reads Vigor_Index_PalmStudio.xlsx and returns palm geometry time series (height, radius, estimated volume).
- get_tree_volume_at_date(): time-interpolates the Vigor Index (m³) for a specific tree and date from the biophysics table.
WPL diagnostic helpers#
- summarize_wpl_correction(): dataset-level WPL statistics (median factor, p95 relative change, valid-point count).
- build_cycle_wpl_metrics(): cycle-level WPL diagnostics table used to detect humidity-driven flux artefacts.
Configuration constants#
All functions accept explicit parameters. Use the constants below as starting points and override what you need:
- DEFAULT_CONFIG: cycle detection, regression window, QC flag filtering, WPL, and parallel-processing defaults.
- DEFAULT_CO2_QC_THRESHOLDS: daytime CO₂ grading thresholds (R², NRMSE, SNR, monotonicity, outlier fraction).
- NIGHTTIME_CO2_QC_THRESHOLDS: relaxed CO₂ thresholds for cycles where Global_Radiation < 10 W m⁻² (respiration-only, smaller dynamic range).
- DEFAULT_H2O_QC_THRESHOLDS: daytime H₂O grading thresholds.
- NIGHTTIME_H2O_QC_THRESHOLDS: relaxed H₂O thresholds for nighttime (near-zero transpiration means tiny, flat H₂O signals are expected).
- DEFAULT_WPL_QC_THRESHOLDS: per-cycle WPL validity thresholds (valid-data fraction, p95 relative correction, maximum WPL factor).
Quick start:
from palmwtc.flux.chamber import (
    DEFAULT_CONFIG,
    prepare_chamber_data,
    calculate_flux_cycles,
    calculate_h2o_flux_cycles,
)

# Start from the default config and override selected keys:
cfg = {**DEFAULT_CONFIG, "min_points": 10, "cycle_gap_sec": 240}
# raw_df is the full multi-chamber DataFrame loaded via palmwtc.io.
chamber_df = prepare_chamber_data(raw_df, "C1", **cfg)
flux_df = calculate_flux_cycles(chamber_df, "Chamber 1", **cfg)
h2o_df = calculate_h2o_flux_cycles(chamber_df, "Chamber 1", **cfg)
Attributes#
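DEFAULT_CONFIG | Default pipeline configuration for cycle detection, regression, and WPL. |
DEFAULT_CO2_QC_THRESHOLDS | Daytime CO₂ QC grading thresholds for palmwtc.flux.cycles.score_cycle(). |
NIGHTTIME_CO2_QC_THRESHOLDS | Relaxed CO₂ QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²). |
DEFAULT_H2O_QC_THRESHOLDS | Daytime H₂O QC grading thresholds for score_h2o_flux_qc(). |
NIGHTTIME_H2O_QC_THRESHOLDS | Relaxed H₂O QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²). |
DEFAULT_WPL_QC_THRESHOLDS | Per-cycle WPL correction validity thresholds used by apply_wpl_qc_overrides(). |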
Functions#
apply_wpl_correction() | Convert wet CO₂ (ppm) to dry CO₂ using the WPL dilution correction. |
_choose_h2o_column() | Pick the best available H2O column for chamber_suffix. |
prepare_chamber_data() | Select, filter, and WPL-correct sensor streams for one chamber. |
summarize_wpl_correction() | Return a dataset-level summary of WPL correction statistics. |
build_cycle_wpl_metrics() | Aggregate WPL correction metrics per measurement cycle. |
calculate_flux_cycles() | Identify measurement cycles and compute CO₂ flux for each cycle. |
calculate_h2o_flux_for_cycle() | Compute H₂O slope and fit statistics for a single measurement cycle. |
score_h2o_flux_qc() | Assign a QC grade to a single H₂O flux cycle. |
calculate_h2o_flux_cycles() | Compute H₂O flux for every cycle in chamber_df. |
load_tree_biophysics() | Load palm tree biophysical parameters from the PalmStudio spreadsheet. |
get_tree_volume_at_date() | Time-interpolate the Vigor Index (m³) for a tree at a specific date. |
apply_wpl_qc_overrides() | Apply WPL-specific checks and upgrade QC tiers if needed. |
compute_closure_confidence() | Compute a chamber closure confidence score between 0 and 1. |
Module Contents#
- palmwtc.flux.chamber.DEFAULT_CONFIG#
Default pipeline configuration for cycle detection, regression, and WPL.
Pass {**DEFAULT_CONFIG, "key": new_value} to any pipeline function to override individual settings without losing the other defaults.
Keys#
- cycle_gap_sec : int
Minimum gap in seconds between two successive measurements that triggers the start of a new measurement cycle (default 300).
- start_cutoff_sec : int
Number of seconds to skip from the beginning of a cycle before starting the regression window search (default 50). Skips the initial chamber-mixing transient.
- start_search_sec : int
How far into the cycle (seconds) the window-start search extends (default 60).
- min_points : int
Minimum number of valid data points required to attempt a flux regression (default 20).
- min_duration_sec : int
Minimum regression window length in seconds (default 180).
- outlier_z : float
Z-score threshold for iterative outlier removal before re-fitting (default 2).
- max_outlier_refit_frac : float
Maximum fraction of points that may be removed as outliers; if exceeded, the original fit is kept (default 0.2).
- noise_eps_ppm : float
Noise floor in ppm used when computing the monotonicity fraction (steps smaller than this are treated as noise, not signal direction; default 0.5).
- accepted_co2_qc_flags : list of int
Only rows whose CO2_{suffix}_qc_flag is in this list are kept (default [0]; None keeps all rows).
- accepted_h2o_qc_flags : list of int
Same for H2O_{suffix}_qc_flag (default [0, 1]; H₂O flag 1 is a minor sensor warning that still produces usable data).
- prefer_corrected_h2o : bool
When True, use H2O_{suffix}_corrected over raw H2O_{suffix} if the corrected column is present (default True).
- require_h2o_for_wpl : bool
When True, prepare_chamber_data() raises ValueError if no H₂O column is found and WPL correction is requested (default True). Set to False to fall back to wet CO₂.
- h2o_valid_range : tuple of float
Physical validity bounds for H₂O in mmol mol⁻¹ as (lo, hi) (default (0.0, 60.0)). Values outside this range are set to NaN before WPL correction.
- max_abs_wpl_rel_change : float
Maximum plausible absolute relative WPL correction (default 0.12, i.e. 12 %). Rows with larger corrections have their Flag raised to 2.
- use_multiprocessing : bool
Use multiprocessing.Pool for cycle batches larger than 50 cycles (default True).
- n_jobs : int
Number of parallel worker processes (default min(8, cpu_count)).
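The override pattern can be tried without palmwtc installed. The sketch below uses a stand-in dictionary, EXAMPLE_DEFAULTS (a hypothetical name), that copies a few of the documented default values; it only illustrates how the dict merge behaves.

```python
# Stand-in for a few DEFAULT_CONFIG entries (values copied from the key list;
# EXAMPLE_DEFAULTS is a hypothetical name, not part of palmwtc).
EXAMPLE_DEFAULTS = {
    "cycle_gap_sec": 300,
    "min_points": 20,
    "min_duration_sec": 180,
    "outlier_z": 2,
}

# Later entries win in a dict display, so only the named key changes.
cfg = {**EXAMPLE_DEFAULTS, "min_points": 10}

# The merge builds a new dict; the defaults themselves are left untouched.
print(cfg["min_points"], EXAMPLE_DEFAULTS["min_points"])
```

Because the merge creates a new dictionary, several differently tuned configurations can coexist in one session.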
- palmwtc.flux.chamber.DEFAULT_CO2_QC_THRESHOLDS#
Daytime CO₂ QC grading thresholds for palmwtc.flux.cycles.score_cycle().
Each threshold has an _A (Grade A boundary) and _B (Grade B boundary) variant. A cycle that passes all _A tests is Grade A (tier 0). A cycle that fails one or more _A tests but passes all _B tests is Grade B (tier 1). Failing any _B test downgrades to Grade C (tier 2).
Keys#
- r2_A, r2_B : float
Minimum R² of the OLS linear fit. Daytime photosynthesis and respiration cycles have large, clean CO₂ signals so the bar is high (0.90 / 0.70).
- nrmse_A, nrmse_B : float
Maximum normalized RMSE (RMSE divided by CO₂ concentration range). Low values (0.10 / 0.20) indicate a clean linear trend.
- snr_A, snr_B : float
Minimum signal-to-noise ratio, defined as (|slope| × duration) / RMSE. Measures whether the CO₂ trend is distinguishable from measurement noise.
- monotonic_A, monotonic_B : float
Minimum fraction of consecutive concentration steps that move in the direction of the fitted slope (steps smaller than noise_eps_ppm are ignored). Daytime CO₂ should rise or fall steadily inside a closed chamber.
- outlier_A, outlier_B : float
Maximum fraction of points removed as statistical outliers before re-fitting (0.05 / 0.15).
- curvature_aicc : float
AICc difference (quadratic minus linear) threshold. Values more negative than this indicate significant curvature, flagging possible leaks or mixing issues. Note: this key is read by palmwtc.flux.cycles.score_cycle(), not by functions in this module directly.
- slope_diff_A, slope_diff_B : float
Maximum relative difference between OLS slope and Theil-Sen slope (|slope_ols - slope_ts| / |slope_ols|). Large differences indicate leverage points or non-linearity.
- signal_ppm_guard : float
Total CO₂ change (ppm) below which the monotonic_A/B thresholds are scaled down proportionally. Prevents mass rejection of low-flux cycles where the noise-to-signal ratio is inherently higher.
See Also#
NIGHTTIME_CO2_QC_THRESHOLDS : Relaxed version for dark/respiration cycles.
palmwtc.flux.cycles.score_cycle : Function that consumes these thresholds.
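The A/B/C rule can be sketched in a few lines. This is an illustration only: grade_cycle is a hypothetical helper that checks just two of the documented metrics (R² as a minimum, NRMSE as a maximum), and treating the boundaries as inclusive is an assumption. The real palmwtc.flux.cycles.score_cycle() applies more tests.

```python
def grade_cycle(metrics, thresholds):
    """Return (tier, label) for one cycle using the two-tier A/B rule.

    Hypothetical helper: only r2 (a minimum) and nrmse (a maximum) are tested.
    """
    def passes(suffix):
        return (
            metrics["r2"] >= thresholds[f"r2_{suffix}"]
            and metrics["nrmse"] <= thresholds[f"nrmse_{suffix}"]
        )

    if passes("A"):
        return 0, "A"   # all Grade A tests pass
    if passes("B"):
        return 1, "B"   # some A test failed, all B tests pass
    return 2, "C"       # at least one B test failed


# Daytime CO2 values quoted in the key list.
thresholds = {"r2_A": 0.90, "r2_B": 0.70, "nrmse_A": 0.10, "nrmse_B": 0.20}

print(grade_cycle({"r2": 0.95, "nrmse": 0.05}, thresholds))
print(grade_cycle({"r2": 0.80, "nrmse": 0.15}, thresholds))
print(grade_cycle({"r2": 0.50, "nrmse": 0.15}, thresholds))
```

The same shape works for any threshold set that follows the _A/_B naming, which is why the daytime and nighttime dictionaries can be swapped without changing the grading logic.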
- palmwtc.flux.chamber.NIGHTTIME_CO2_QC_THRESHOLDS#
Relaxed CO₂ QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²).
This is an alias for palmwtc.flux.cycles.NIGHTTIME_QC_THRESHOLDS. It is exposed here so callers that work only with palmwtc.flux.chamber do not need to import from the lower-level palmwtc.flux.cycles module.
Why nighttime cycles need relaxed thresholds#
During the day, photosynthesis drives a strong, fast CO₂ drawdown inside the closed chamber (often 20–100 ppm over 5 minutes). This yields high R², SNR, and monotonicity, making the daytime _A thresholds easy to meet.
At night, only leaf + soil respiration remain. CO₂ rise rates are typically 3–15 ppm over 5 minutes, a much smaller signal that sits closer to instrument noise (~0.2–0.5 ppm RMS for LI-COR LI-850). Applying daytime thresholds to these cycles rejects most valid nighttime measurements.
Relaxed values (compared to DEFAULT_CO2_QC_THRESHOLDS)#
- r2_A 0.90 → 0.70, r2_B 0.70 → 0.40: lower R² is expected when the signal is small relative to noise.
- snr_A 10.0 → 5.0, snr_B 3.0 → 2.0: smaller CO₂ trends mean lower SNR even in well-sealed chambers.
- monotonic_A 0.80 → 0.50, monotonic_B 0.45 → 0.30: a 5 ppm respiration signal with 0.5 ppm noise gives ~50 % monotonicity even when the signal is real.
- signal_ppm_guard 5.0 → 3.0: the guard activates earlier for the smaller nighttime signals.
See Also#
DEFAULT_CO2_QC_THRESHOLDS : Daytime thresholds.
palmwtc.flux.cycles.NIGHTTIME_QC_THRESHOLDS : Canonical source of these values.
- palmwtc.flux.chamber.DEFAULT_H2O_QC_THRESHOLDS#
Daytime H₂O QC grading thresholds for score_h2o_flux_qc().
H₂O thresholds are systematically looser than the CO₂ counterparts in DEFAULT_CO2_QC_THRESHOLDS. Two reasons:
- The LI-COR LI-850 H₂O channel has higher absolute noise (~0.1–0.2 mmol mol⁻¹ RMS) than the CO₂ channel, reducing R² and SNR for the same physical signal size.
- Transpiration signals in humid tropical conditions are often 0.5–3 mmol mol⁻¹ over a 5-minute cycle, a smaller fractional change than CO₂ during active photosynthesis.
Keys#
- r2_A, r2_B : float
Minimum R² of the OLS linear fit (0.70 / 0.50).
- nrmse_A, nrmse_B : float
Maximum normalized RMSE (0.15 / 0.25).
- snr_A, snr_B : float
Minimum SNR, computed as (|Theil-Sen slope| × duration) / residual std (5.0 / 3.0).
- monotonic_A, monotonic_B : float
Minimum fraction of H₂O steps larger than 0.05 mmol mol⁻¹ that move in the fitted-slope direction (0.70 / 0.40). The 0.05 mmol mol⁻¹ noise floor prevents sensor jitter from deflating the fraction.
- outlier_A, outlier_B : float
Maximum fraction of outlier points allowed before downgrading (0.15 / 0.25). Looser than CO₂ because H₂O droplets on the optical path can cause isolated spikes.
- signal_mmol_guard : float
H₂O concentration range (mmol mol⁻¹) below which nrmse_B and monotonic_B are relaxed proportionally (default 0.3). Prevents mass rejection of valid but low-transpiration cycles.
See Also#
NIGHTTIME_H2O_QC_THRESHOLDS : Relaxed version for nocturnal cycles.
score_h2o_flux_qc : Function that consumes these thresholds.
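One possible reading of "relaxed proportionally" is sketched below. The exact scaling used by score_h2o_flux_qc() is not stated on this page, so everything here is an assumption: relax_b_thresholds and min_scale are hypothetical names, the scale factor is taken as range / guard, and a clamp keeps a zero-range cycle from producing an infinite limit.

```python
def relax_b_thresholds(conc_range, thresholds, min_scale=0.25):
    """Return (nrmse_B, monotonic_B) adjusted for a small-signal cycle.

    Illustrative only; the scaling rule and min_scale are assumptions.
    """
    guard = thresholds["signal_mmol_guard"]
    if conc_range >= guard:
        # Signal is large enough: use the thresholds unchanged.
        return thresholds["nrmse_B"], thresholds["monotonic_B"]
    # Fraction of the guard that the signal reaches, clamped away from zero.
    scale = max(conc_range / guard, min_scale)
    # A maximum is loosened by dividing, a minimum by multiplying.
    return thresholds["nrmse_B"] / scale, thresholds["monotonic_B"] * scale


daytime = {"nrmse_B": 0.25, "monotonic_B": 0.40, "signal_mmol_guard": 0.3}

print(relax_b_thresholds(0.60, daytime))   # above the guard: unchanged
print(relax_b_thresholds(0.15, daytime))   # half the guard: limits loosened
```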
- palmwtc.flux.chamber.NIGHTTIME_H2O_QC_THRESHOLDS#
Relaxed H₂O QC thresholds for nighttime cycles (Global_Radiation < 10 W m⁻²).
Why nighttime H₂O needs the most relaxed thresholds#
Stomata close at night, so transpiration drops to near zero. A typical nighttime H₂O slope is 0.0–0.1 mmol mol⁻¹ min⁻¹ — often indistinguishable from sensor drift. Applying daytime thresholds to these cycles would grade nearly all of them C, making nighttime water-balance closure impossible. The physical expectation at night is a flat or very slowly rising H₂O trace, not a steep linear increase.
Relaxed values (compared to DEFAULT_H2O_QC_THRESHOLDS)#
- r2_A 0.70 → 0.50, r2_B 0.50 → 0.25: a flat trace has R² ≈ 0 by definition; low R² at night is not a data-quality failure.
- nrmse_A 0.15 → 0.25, nrmse_B 0.25 → 0.45: when the H₂O range is 0.1–0.2 mmol mol⁻¹, sensor noise dominates NRMSE.
- snr_A 5.0 → 3.0, snr_B 3.0 → 1.5: near-zero signal means SNR is near the noise floor even in a well-sealed chamber.
- monotonic_A 0.70 → 0.50, monotonic_B 0.40 → 0.30: random-walk noise on a flat trace produces ~50 % monotonicity by chance.
- signal_mmol_guard 0.30 → 0.15: the guard activates at even smaller H₂O changes to protect valid low-transpiration cycles.
See Also#
DEFAULT_H2O_QC_THRESHOLDS : Daytime thresholds.
score_h2o_flux_qc : Function that applies these thresholds.
- palmwtc.flux.chamber.DEFAULT_WPL_QC_THRESHOLDS#
Per-cycle WPL correction validity thresholds used by apply_wpl_qc_overrides().
These thresholds check whether the WPL correction was well-conditioned for a given cycle, not whether the underlying CO₂ flux regression was good. A cycle can have perfect R² but still have a poor WPL correction if many H₂O readings were out-of-range or the humidity was unusually high.
Keys#
- valid_frac_A, valid_frac_B : float
Minimum fraction of points in the cycle for which a valid WPL factor could be computed (i.e. H₂O was within h2o_valid_range and non-NaN). Grade A requires 98 % coverage; Grade B requires 95 %.
- rel_change_p95_A, rel_change_p95_B : float
95th percentile of the absolute relative WPL correction (|wpl_delta_ppm / CO2_raw|) within the cycle. Values above 7 % indicate unusually large humidity-driven adjustments that can distort the flux. Values above 4 % are flagged as moderate (Grade B).
- factor_max_B : float
Maximum WPL multiplication factor (1 + χ_w / (1000 − χ_w)) seen in the cycle. A factor above 1.08 corresponds to approximately 74 mmol mol⁻¹ H₂O, which is above saturation at ~30 °C at sea level (about 42 mmol mol⁻¹). Such a value is outside the normal operating range and may indicate a wet-sensor event.
See Also#
apply_wpl_qc_overrides : Function that applies these thresholds.
DEFAULT_CONFIG : Contains h2o_valid_range and max_abs_wpl_rel_change, which are checked at the point level (before cycle aggregation) by prepare_chamber_data().
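The relation between the WPL factor and the H₂O mole fraction can be checked numerically. The sketch below inverts the factor formula given for factor_max_B; both function names are hypothetical helpers written for this illustration.

```python
def wpl_factor(h2o_mmol_mol):
    """WPL multiplication factor for an H2O mole fraction in mmol/mol."""
    return 1.0 + h2o_mmol_mol / (1000.0 - h2o_mmol_mol)


def h2o_from_wpl_factor(factor):
    """Invert the factor formula to recover H2O in mmol/mol."""
    # 1 + x / (1000 - x) simplifies to 1000 / (1000 - x),
    # so x = 1000 * (1 - 1 / factor).
    return 1000.0 * (1.0 - 1.0 / factor)


# H2O mole fraction implied by a factor of 1.08.
print(round(h2o_from_wpl_factor(1.08), 1))

# Factor at the upper end of the default h2o_valid_range (60 mmol/mol).
print(round(wpl_factor(60.0), 4))
```

Comparing the two printed values shows how factor_max_B relates to the point-level h2o_valid_range filter.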
- palmwtc.flux.chamber.apply_wpl_correction(co2_wet, h2o_mmol_mol)#
Convert wet CO₂ (ppm) to dry CO₂ using the WPL dilution correction.
The Webb-Pearman-Leuning (WPL) correction removes the apparent dilution of CO₂ caused by the simultaneous presence of water vapour in the air sample. The formula is:
\[CO_{2,dry} = CO_{2,wet} \times \left(1 + \frac{\chi_w}{1000 - \chi_w}\right)\]
where \(\chi_w\) is the H₂O mole fraction in mmol mol⁻¹.
This is a simplified single-pass WPL for closed-chamber systems where temperature and pressure are treated as constant within a cycle.
Parameters#
- co2_wet : array-like
Wet CO₂ mole fraction in ppm (µmol mol⁻¹).
- h2o_mmol_mol : array-like
Water vapour mole fraction in mmol mol⁻¹. Values that would make the denominator (1000 - χ_w) non-positive are treated as invalid.
Returns#
- co2_dry : pd.Series
Dry CO₂ in ppm. NaN where either input is NaN or H₂O ≥ 1000 mmol mol⁻¹ (physically impossible, but guarded against).
- factor : pd.Series
WPL multiplication factor 1 + χ_w / (1000 - χ_w). NaN where inputs are invalid.
- valid : pd.Series of bool
True for rows where both inputs were valid and a WPL factor could be computed.
Notes#
The WPL factor for typical tropical conditions (25 mmol mol⁻¹ H₂O, ~60 % RH at 30 °C and sea-level pressure) is approximately 1.026, adding ~2.6 % to the raw CO₂ reading. At 40 mmol mol⁻¹ (high humidity), the factor is ~1.042.
See Also#
prepare_chamber_data : Calls this function and attaches outputs as columns.
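The dilution formula and its validity rules can be sketched for a single pair of readings. This is a scalar stand-in written in plain Python, not the library function: apply_wpl_correction() works on array-like input and returns pandas Series, and wpl_correct is a hypothetical name.

```python
import math


def wpl_correct(co2_wet_ppm, h2o_mmol_mol):
    """Return (co2_dry, factor, valid) for one wet CO2 / H2O reading pair."""
    invalid = (
        math.isnan(co2_wet_ppm)
        or math.isnan(h2o_mmol_mol)
        or h2o_mmol_mol >= 1000.0   # denominator would be non-positive
    )
    if invalid:
        return math.nan, math.nan, False
    factor = 1.0 + h2o_mmol_mol / (1000.0 - h2o_mmol_mol)
    return co2_wet_ppm * factor, factor, True


dry, factor, valid = wpl_correct(400.0, 25.0)
print(round(dry, 2), round(factor, 4), valid)

# A missing H2O reading yields NaN outputs and valid=False.
print(wpl_correct(400.0, float("nan")))
```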
- palmwtc.flux.chamber._choose_h2o_column(data, chamber_suffix, prefer_corrected=True)#
Pick the best available H2O column for chamber_suffix.
- palmwtc.flux.chamber.prepare_chamber_data(data, chamber_suffix, accepted_co2_qc_flags=(0,), accepted_h2o_qc_flags=(0, 1), prefer_corrected_h2o=True, require_h2o_for_wpl=False, apply_wpl=False, h2o_valid_range=(0.0, 60.0), max_abs_wpl_rel_change=0.12, **kwargs)#
Select, filter, and WPL-correct sensor streams for one chamber.
This is the first step in the flux pipeline. It takes the full multi-chamber dataset (as loaded by palmwtc.io), extracts the columns for a single chamber, applies QC flag row-filtering, and runs the WPL dilution correction when apply_wpl=True. The returned DataFrame is the direct input for calculate_flux_cycles() and calculate_h2o_flux_cycles().
Parameters#
- data : pd.DataFrame
Full QC-flagged dataset. Expected columns (where {s} = suffix):
  - TIMESTAMP: datetime column.
  - CO2_{s}: raw (wet) CO₂ in ppm.
  - H2O_{s} or H2O_{s}_corrected: water vapour in mmol mol⁻¹.
  - Temp_1_{s}: air temperature inside the chamber in °C.
  - CO2_{s}_qc_flag: integer QC flag for CO₂ (0 = good).
  - H2O_{s}_qc_flag: integer QC flag for H₂O (0 = good, 1 = minor).
Missing columns are silently skipped; only TIMESTAMP and CO2 are required in the output.
- chamber_suffix : str
Chamber identifier appended to column names. Typically 'C1' or 'C2' for the two whole-tree chambers.
- accepted_co2_qc_flags : list of int or None
Keep only rows whose CO2_{suffix}_qc_flag is in this list. Pass None to skip CO₂ flag filtering entirely. Default from DEFAULT_CONFIG: [0].
- accepted_h2o_qc_flags : list of int or None
Same for H₂O. Default from DEFAULT_CONFIG: [0, 1] (flag 1 is a minor sensor warning that still yields usable H₂O).
- prefer_corrected_h2o : bool
When True (default), use H2O_{suffix}_corrected if present; fall back to H2O_{suffix} otherwise.
- require_h2o_for_wpl : bool
When True, raise ValueError if no H₂O column is found and apply_wpl=True. When False, fall back to the uncorrected wet CO₂ value. The signature default is False, while DEFAULT_CONFIG sets True.
- apply_wpl : bool
When True, run apply_wpl_correction() and expose diagnostic columns. When False, CO2 is set equal to CO2_raw and all WPL columns are NaN/0. The signature default is False.
- h2o_valid_range : tuple of float
(lo, hi) physical validity range for H₂O in mmol mol⁻¹. Values outside this range are set to NaN before WPL correction. Default: (0.0, 60.0).
- max_abs_wpl_rel_change : float
Rows where |wpl_delta_ppm / CO2_raw| exceeds this value get their Flag upgraded to 2 (bad). Default: 0.12 (12 %).
- **kwargs
Extra keyword arguments are accepted but ignored. This allows passing **DEFAULT_CONFIG directly without unpacking individual keys.
Returns#
- pd.DataFrame
One row per retained timestamp, sorted by TIMESTAMP, with a reset integer index. Columns:
  - TIMESTAMP: datetime.
  - CO2: working CO₂ in ppm; WPL-corrected when possible, raw when WPL is disabled or H₂O is unavailable.
  - CO2_raw: original wet CO₂ measurement in ppm.
  - CO2_corrected: WPL-corrected CO₂ in ppm (NaN if WPL disabled or H₂O missing for a given row).
  - H2O: water vapour in mmol mol⁻¹ (NaN outside valid range).
  - Temp: air temperature in °C (NaN if column absent in input).
  - CO2_Flag: original CO₂ hardware QC flag (int).
  - H2O_Flag: original H₂O hardware QC flag (int).
  - Flag: combined flag, max(CO2_Flag, H2O_Flag), upgraded to 2 for rows with excessive WPL correction.
  - wpl_factor: WPL multiplication factor per row (NaN if WPL disabled or H₂O missing).
  - wpl_valid_input: 1 where a valid WPL factor was computed, 0 otherwise.
  - wpl_delta_ppm: CO2_corrected - CO2_raw in ppm.
  - wpl_rel_change: wpl_delta_ppm / CO2_raw (dimensionless).
Raises#
- ValueError
If apply_wpl=True and require_h2o_for_wpl=True but no H₂O column is found for the requested chamber_suffix.
See Also#
calculate_flux_cycles : Consumes the output of this function for CO₂ flux.
calculate_h2o_flux_cycles : Consumes the output for H₂O flux.
summarize_wpl_correction : Computes dataset-level WPL statistics.
Examples#
# doctest: +SKIP
# Requires a real multi-chamber DataFrame from palmwtc.io.load_chamber_data.
chamber_df = prepare_chamber_data(raw_df, "C1")
print(chamber_df.columns.tolist())
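The combined Flag column can be illustrated for a single row. This is a minimal sketch, assuming the combined flag is the larger of the two hardware flags and is raised to at least 2 when the relative WPL change exceeds max_abs_wpl_rel_change; combined_flag is a hypothetical helper, and the library applies the equivalent logic to whole columns.

```python
def combined_flag(co2_flag, h2o_flag, co2_raw, co2_corrected,
                  max_abs_wpl_rel_change=0.12):
    """Combine hardware flags and escalate rows with an implausible WPL change."""
    flag = max(co2_flag, h2o_flag)
    rel_change = (co2_corrected - co2_raw) / co2_raw   # wpl_rel_change
    if abs(rel_change) > max_abs_wpl_rel_change:
        flag = max(flag, 2)
    return flag


# About 2.6 % correction: the flag stays at the worse hardware flag.
print(combined_flag(0, 1, 400.0, 410.3))

# 15 % correction: the flag is raised to 2 even though both hardware flags are 0.
print(combined_flag(0, 0, 400.0, 460.0))
```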
- palmwtc.flux.chamber.summarize_wpl_correction(chamber_df)#
Return a dataset-level summary of WPL correction statistics.
Useful for a quick sanity check: if the median WPL factor or p95 relative change looks unusual, it may indicate sensor drift, water condensation on the optical path, or a humidity calibration issue.
Parameters#
- chamber_df : pd.DataFrame
Output of prepare_chamber_data(). Must contain columns wpl_delta_ppm, wpl_rel_change, wpl_factor, and optionally CO2_corrected.
Returns#
- dict
Empty dict if chamber_df is empty or missing WPL columns. Otherwise, keys are:
  - n_points: total row count.
  - valid_points: rows where CO2_corrected is not NaN.
  - median_factor: median WPL multiplication factor.
  - median_delta_ppm: median WPL additive correction (ppm).
  - p95_abs_rel_change: 95th percentile of |wpl_rel_change|.
See Also#
build_cycle_wpl_metrics : Per-cycle version of the same diagnostics.
apply_wpl_qc_overrides : Uses per-cycle metrics to upgrade QC tiers.
- palmwtc.flux.chamber.build_cycle_wpl_metrics(chamber_df, chamber_name, cycle_gap_sec=300)#
Aggregate WPL correction metrics per measurement cycle.
Produces one row per cycle with mean/max WPL factor, mean/max WPL delta, valid-data fraction, p95 relative change, and H₂O statistics. These per-cycle values are the input for apply_wpl_qc_overrides().
Parameters#
- chamber_df : pd.DataFrame
Output of prepare_chamber_data().
- chamber_name : str
Chamber label (e.g. 'Chamber 1'), stored in the output column Source_Chamber.
- cycle_gap_sec : int
Gap in seconds that marks the boundary between cycles, passed to palmwtc.flux.cycles.identify_cycles().
Returns#
- pd.DataFrame
One row per cycle. Columns:
  - cycle_id: integer cycle identifier.
  - Source_Chamber: chamber_name.
  - wpl_factor_mean: mean WPL factor within the cycle.
  - wpl_factor_max: maximum WPL factor within the cycle.
  - wpl_delta_ppm_mean: mean WPL additive correction (ppm).
  - wpl_delta_ppm_max: maximum WPL additive correction (ppm).
  - wpl_valid_fraction: fraction of rows with a non-NaN CO2_corrected value.
  - wpl_abs_rel_change_p95: 95th percentile of absolute relative WPL correction within the cycle.
  - h2o_mean: mean H₂O (mmol mol⁻¹) within the cycle.
  - h2o_max: maximum H₂O (mmol mol⁻¹) within the cycle.
Returns an empty DataFrame if chamber_df is empty.
See Also#
apply_wpl_qc_overrides : Consumes the per-cycle metrics produced here.
summarize_wpl_correction : Dataset-level WPL summary.
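Gap-based cycle segmentation, the idea behind cycle_gap_sec, can be sketched with the standard library. This is an assumption-laden illustration: assign_cycle_ids is a hypothetical helper, and the real palmwtc.flux.cycles.identify_cycles() may differ in details such as strict versus non-strict comparison or how ids are numbered.

```python
from datetime import datetime, timedelta


def assign_cycle_ids(timestamps, cycle_gap_sec=300):
    """Label sorted timestamps with an id that increments at each large gap."""
    ids = []
    current = 0
    for i, ts in enumerate(timestamps):
        gap = (ts - timestamps[i - 1]).total_seconds() if i > 0 else 0.0
        if gap >= cycle_gap_sec:
            current += 1   # a long pause starts a new measurement cycle
        ids.append(current)
    return ids


t0 = datetime(2024, 1, 1, 8, 0, 0)
# Two bursts of 10-second readings separated by a pause of almost 10 minutes.
stamps = [t0 + timedelta(seconds=10 * k) for k in range(3)]
stamps += [t0 + timedelta(minutes=10, seconds=10 * k) for k in range(3)]

print(assign_cycle_ids(stamps))
```

Per-cycle metrics such as wpl_factor_mean are then ordinary group-wise aggregations over these ids.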
- palmwtc.flux.chamber.calculate_flux_cycles(chamber_df, chamber_name, cycle_gap_sec=300, start_cutoff_sec=50, start_search_sec=60, min_points=20, min_duration_sec=180, outlier_z=2, max_outlier_refit_frac=0.2, use_multiprocessing=True, n_jobs=None, **kwargs)#
Identify measurement cycles and compute CO₂ flux for each cycle.
This is the main CO₂ flux batch function. It calls palmwtc.flux.cycles.identify_cycles() to segment the time series into closed-chamber measurement cycles, then dispatches each cycle to palmwtc.flux.cycles.evaluate_cycle() (optionally in parallel via multiprocessing.Pool).
Parameters#
- chamber_df : pd.DataFrame
Output of prepare_chamber_data(). Must contain TIMESTAMP, CO2, and optionally Temp and Flag.
- chamber_name : str
Human-readable chamber label, stored in the output column Source_Chamber (e.g. 'Chamber 1').
- cycle_gap_sec : int
Time gap in seconds that triggers a new cycle boundary. Default 300 (5 minutes).
- start_cutoff_sec : int
Seconds to skip from cycle start before beginning the regression window search. Removes the initial chamber-mixing transient. Default 50.
- start_search_sec : int
How far into the cycle (seconds) the window-start search extends. Default 60.
- min_points : int
Minimum number of valid points required for a cycle to be processed. Default 20.
- min_duration_sec : int
Minimum regression window length in seconds. Default 180.
- outlier_z : float
Z-score threshold for iterative outlier removal. Default 2.
- max_outlier_refit_frac : float
Maximum fraction of points that may be removed as outliers; if exceeded, the original fit is used. Default 0.2.
- use_multiprocessing : bool
When True and there are more than 50 cycles, process in parallel using multiprocessing.Pool. Falls back to serial on any multiprocessing error. Default True.
- n_jobs : int or None
Number of parallel workers. Defaults to min(8, cpu_count).
- **kwargs
Absorbed silently so callers can pass **DEFAULT_CONFIG directly.
Returns#
- pd.DataFrame
One row per successfully processed cycle. Columns (from palmwtc.flux.cycles.evaluate_cycle()):
  - Source_Chamber: chamber_name.
  - cycle_id: integer cycle identifier.
  - flux_date: start timestamp of the cycle.
  - cycle_end: end timestamp of the cycle.
  - cycle_duration_sec: total cycle duration in seconds.
  - window_start_sec, window_end_sec: regression window boundaries relative to cycle start.
  - duration_sec: regression window duration in seconds.
  - n_points_total: total points in the full cycle.
  - n_points_used: points used in the final regression.
  - flux_slope: OLS slope of CO₂ vs. time (ppm s⁻¹).
  - flux_intercept: OLS intercept (ppm).
  - r2: R² of the OLS linear fit.
  - p_value, std_err: regression statistics.
  - rmse: root-mean-square error of the fit (ppm).
  - nrmse: RMSE normalized by the CO₂ range in the window.
  - snr: signal-to-noise ratio, |slope| × duration / rmse.
  - snr_noise: SNR using early-cycle noise estimate (NaN if not computed).
  - noise_sigma: early-cycle noise standard deviation (ppm).
  - monotonicity: fraction of consecutive CO₂ steps moving in the slope direction (noise-filtered).
  - outlier_frac: fraction of points removed as outliers.
  - aicc_linear, aicc_quadratic, delta_aicc: AICc of the linear and quadratic fits; large negative delta_aicc flags curvature.
  - slope_ts, slope_ts_low, slope_ts_high: Theil-Sen slope and 95 % confidence interval (ppm s⁻¹).
  - slope_diff_pct: relative difference between OLS and Theil-Sen slopes.
  - mean_temp: mean air temperature in the cycle (°C).
  - qc_flag: max hardware QC flag in the cycle.
  - co2_range: CO₂ concentration range in the window (ppm).
  - bimodal_flag: True if a bimodal CO₂ distribution was detected (possible closure gap).
  - bimodal_gap_ppm, bimodal_lower_mean, bimodal_upper_mean: bimodal split statistics.
  - flux_absolute: absolute flux in µmol m⁻² s⁻¹ computed by palmwtc.flux.absolute.calculate_absolute_flux().
Returns an empty DataFrame if chamber_df is empty or contains no valid cycles.
See Also#
prepare_chamber_data : Produces the required chamber_df input.
calculate_h2o_flux_cycles : H₂O analogue.
palmwtc.flux.cycles.evaluate_cycle : Called for each individual cycle.
palmwtc.flux.cycles.score_cycle : QC scoring applied after this step.
Examples#
# doctest: +SKIP
# Requires prepared chamber data from prepare_chamber_data().
flux_df = calculate_flux_cycles(chamber_df, "Chamber 1")
print(flux_df[["flux_date", "flux_slope", "r2", "flux_absolute"]].head())
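Several of the returned fit columns (flux_slope, r2, rmse, nrmse, snr) follow from an ordinary least-squares line through one regression window. The sketch below computes them by hand for synthetic data. It is not the evaluate_cycle() implementation: linear_fit_stats is a hypothetical helper, and dividing the residual sum of squares by n (rather than n − 2) for RMSE is an assumption.

```python
import math


def linear_fit_stats(t_sec, conc):
    """OLS slope plus R2, RMSE, NRMSE, and SNR for one regression window."""
    n = len(t_sec)
    t_mean = sum(t_sec) / n
    c_mean = sum(conc) / n
    sxx = sum((t - t_mean) ** 2 for t in t_sec)
    sxy = sum((t - t_mean) * (c - c_mean) for t, c in zip(t_sec, conc))
    slope = sxy / sxx
    intercept = c_mean - slope * t_mean
    ss_res = sum((c - (intercept + slope * t)) ** 2 for t, c in zip(t_sec, conc))
    ss_tot = sum((c - c_mean) ** 2 for c in conc)
    rmse = math.sqrt(ss_res / n)
    duration = t_sec[-1] - t_sec[0]
    return {
        "slope": slope,                              # ppm per second
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "nrmse": rmse / (max(conc) - min(conc)),     # RMSE / concentration range
        "snr": abs(slope) * duration / rmse,         # |slope| x duration / RMSE
    }


# Synthetic drawdown of -0.1 ppm/s with a small alternating disturbance.
t = [10.0 * k for k in range(19)]                    # 0 to 180 s
c = [420.0 - 0.1 * ti + (0.3 if k % 2 else -0.3) for k, ti in enumerate(t)]

stats = linear_fit_stats(t, c)
print({name: round(value, 4) for name, value in stats.items()})
```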
- palmwtc.flux.chamber.calculate_h2o_flux_for_cycle(cycle_data, gas_col='H2O', min_points=20, min_duration_sec=180)#
Compute H₂O slope and fit statistics for a single measurement cycle.
Uses Theil-Sen regression to estimate the slope (robust to outliers) and OLS for R², RMSE, and residual statistics. SNR is computed as |slope_ts × duration| / residual_std, matching the CO₂ definition. Monotonicity is computed only on H₂O steps larger than 0.05 mmol mol⁻¹ (approximately 5× LI-COR H₂O RMS noise) to avoid deflation by sensor jitter.
Parameters#
- cycle_data : pd.DataFrame
Single-cycle data slice. Must contain TIMESTAMP and gas_col.
- gas_col : str
Name of the H₂O column (default 'H2O').
- min_points : int
Minimum number of non-NaN H₂O values required (default 20).
- min_duration_sec : float
Minimum span of the cycle in seconds (default 180).
Returns#
- dict or None
None if the cycle has fewer than min_points valid rows or is shorter than min_duration_sec. Otherwise a dict with keys:
  - h2o_slope: Theil-Sen slope (mmol mol⁻¹ s⁻¹).
  - h2o_intercept: Theil-Sen intercept (mmol mol⁻¹).
  - h2o_r2: OLS R² (dimensionless, 0–1).
  - h2o_nrmse: NRMSE, i.e. OLS RMSE divided by H₂O range; NaN if range is zero.
  - h2o_snr: signal-to-noise ratio.
  - h2o_outlier_frac: fraction of points more than 2.5× MAD from the OLS fit.
  - h2o_monotonic_frac: fraction of noise-filtered consecutive steps in the slope direction; NaN if all steps are below the noise floor.
  - h2o_n_points: number of non-NaN points used.
  - h2o_duration: cycle duration in seconds.
  - h2o_conc_mean: mean H₂O concentration (mmol mol⁻¹).
  - h2o_conc_range: H₂O concentration range in the cycle (mmol mol⁻¹).
See Also#
calculate_h2o_flux_cycles : Calls this function for every cycle.
score_h2o_flux_qc : Uses the returned dict to assign a QC grade.
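The two robust ingredients described for this function, the Theil-Sen slope and the noise-filtered monotonicity fraction, can be sketched with the standard library. Both helpers are hypothetical and simplified: the textbook Theil-Sen estimator (median of all pairwise slopes) is used here, and the library may rely on a different implementation.

```python
from statistics import median


def theil_sen_slope(t_sec, conc):
    """Median of the slopes between every pair of points."""
    slopes = [
        (conc[j] - conc[i]) / (t_sec[j] - t_sec[i])
        for i in range(len(t_sec))
        for j in range(i + 1, len(t_sec))
        if t_sec[j] != t_sec[i]
    ]
    return median(slopes)


def monotonic_fraction(conc, slope, noise_floor=0.05):
    """Share of above-noise steps whose sign matches the sign of the slope."""
    steps = [b - a for a, b in zip(conc, conc[1:])]
    large = [s for s in steps if abs(s) > noise_floor]
    if not large:
        return float("nan")   # every step is below the noise floor
    direction = 1.0 if slope >= 0 else -1.0
    return sum(1 for s in large if s * direction > 0) / len(large)


t = [0, 10, 20, 30, 40, 50]
h2o = [20.0, 20.1, 20.2, 25.0, 20.4, 20.5]   # one spike at t = 30 s

slope = theil_sen_slope(t, h2o)
print(round(slope, 4), monotonic_fraction(h2o, slope))
```

The single spike leaves the median slope at the underlying trend, which is the robustness property that motivates using Theil-Sen for the H₂O channel.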
- palmwtc.flux.chamber.score_h2o_flux_qc(h2o_metrics, h2o_qc_thresholds=None, is_nighttime=False)#
Assign a QC grade to a single H₂O flux cycle.
Applies a two-tier threshold system: a cycle that passes all _A tests is Grade A (tier 0); failing any _A test but passing all _B tests gives Grade B (tier 1); failing any _B test gives Grade C (tier 2).
A signal-size guard relaxes the nrmse_B and monotonic_B thresholds proportionally for cycles where the H₂O range is smaller than signal_mmol_guard, preventing mass rejection of valid but low-transpiration cycles.
Parameters#
- h2o_metrics : dict or None
Output of calculate_h2o_flux_for_cycle(). If None, returns tier 2 / Grade C with reason 'No valid H2O data'.
- h2o_qc_thresholds : dict or None
Override the default thresholds. When None, selects NIGHTTIME_H2O_QC_THRESHOLDS if is_nighttime=True, otherwise DEFAULT_H2O_QC_THRESHOLDS.
- is_nighttime : bool
Switches to the nighttime threshold set when True and no explicit thresholds are supplied.
Returns#
- tier : int
0 for Grade A, 1 for Grade B, 2 for Grade C.
- label : str
'A', 'B', or 'C'.
- reasons : list of str
Each failing test appends a human-readable string such as 'R2=0.45<0.70'. Empty when all tests pass.
See Also#
DEFAULT_H2O_QC_THRESHOLDS : Daytime threshold values and key descriptions.
NIGHTTIME_H2O_QC_THRESHOLDS : Nighttime threshold values.
calculate_h2o_flux_cycles : Calls this function for every cycle.
- palmwtc.flux.chamber.calculate_h2o_flux_cycles(chamber_df, chamber_name, cycle_gap_sec=300, min_points=20, min_duration_sec=180, h2o_qc_thresholds=None, **kwargs)#
Compute H₂O flux for every cycle in chamber_df.
Mirrors calculate_flux_cycles() for water vapour. For each cycle, calls calculate_h2o_flux_for_cycle() and then score_h2o_flux_qc(), automatically switching to nighttime thresholds when Global_Radiation < 10 W m⁻² (or when the cycle starts before 06:00 or after 18:00, if radiation is not available).
Parameters#
- chamber_df : pd.DataFrame
Output of prepare_chamber_data(). Must contain TIMESTAMP and H2O; optionally Global_Radiation for nighttime detection.
- chamber_name : str
Chamber label stored in Source_Chamber (e.g. 'Chamber 1').
- cycle_gap_sec : int
Gap in seconds that marks cycle boundaries. Default 300.
- min_points : int
Minimum valid H₂O points required per cycle. Default 20.
- min_duration_sec : float
Minimum cycle duration in seconds. Default 180.
- h2o_qc_thresholds : dict or None
Override the daytime H₂O thresholds. Nighttime thresholds are always selected automatically from NIGHTTIME_H2O_QC_THRESHOLDS regardless of this parameter. Default: DEFAULT_H2O_QC_THRESHOLDS.
- **kwargs
Absorbed silently so callers can pass **DEFAULT_CONFIG directly.
Returns#
- pd.DataFrame
One row per valid cycle. Columns:
  - cycle_id: integer cycle identifier.
  - Source_Chamber: chamber_name.
  - h2o_qc: QC tier, 0 = A, 1 = B, 2 = C.
  - h2o_qc_label: 'A', 'B', or 'C'.
  - h2o_qc_reason: semicolon-separated failing-test strings.
  - All keys returned by calculate_h2o_flux_for_cycle(): h2o_slope, h2o_intercept, h2o_r2, h2o_nrmse, h2o_snr, h2o_outlier_frac, h2o_monotonic_frac, h2o_n_points, h2o_duration, h2o_conc_mean, h2o_conc_range.
Returns an empty DataFrame if chamber_df is empty, has no H2O column, or all H₂O values are NaN.
See Also#
calculate_flux_cycles : CO₂ analogue.
prepare_chamber_data : Produces the required chamber_df input.
score_h2o_flux_qc : H₂O QC grading function.
Examples#
# doctest: +SKIP
# Requires prepared chamber data from prepare_chamber_data().
h2o_df = calculate_h2o_flux_cycles(chamber_df, "Chamber 1")
print(h2o_df[["cycle_id", "h2o_slope", "h2o_qc_label"]].head())
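The day/night switch described for this function can be sketched as a small predicate. This is a minimal illustration under stated assumptions: is_nighttime is a hypothetical helper, a single radiation value stands in for however the library summarises Global_Radiation over a cycle, and a start at exactly 18:00 is treated as night.

```python
from datetime import datetime


def is_nighttime(cycle_start, global_radiation=None, threshold_w_m2=10.0):
    """Radiation-based night test with a clock-based fallback."""
    if global_radiation is not None:
        return global_radiation < threshold_w_m2
    # No radiation reading: fall back to the 06:00 / 18:00 clock rule.
    return cycle_start.hour < 6 or cycle_start.hour >= 18


# Radiation is available and low, so the clock time is ignored.
print(is_nighttime(datetime(2024, 1, 1, 12, 0), global_radiation=4.0))

# No radiation column: the start hour decides.
print(is_nighttime(datetime(2024, 1, 1, 20, 30)))
print(is_nighttime(datetime(2024, 1, 1, 9, 0)))
```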
- palmwtc.flux.chamber.load_tree_biophysics(base_dir)#
Load palm tree biophysical parameters from the PalmStudio spreadsheet.
Reads Vigor_Index_PalmStudio.xlsx (expected at {base_dir}/), converts Indonesian column names to English, converts measurements from centimetres to metres, and extracts the clone identifier from the tree ID string.
The Vigor Index is the estimated above-ground biomass volume (cm³ in the spreadsheet, converted to m³ here). It is computed by PalmStudio from measured height and canopy radii, and is used by get_tree_volume_at_date() to time-interpolate tree volume for any given measurement date.
Parameters#
- base_dir : str or Path
  Directory that contains Vigor_Index_PalmStudio.xlsx.
Returns#
- pd.DataFrame or None
  One row per measurement visit per tree. Columns:
  - Tree ID: tree identifier string (e.g. 'EKA1-001').
  - Date: measurement date (datetime).
  - Height_m: total tree height in metres.
  - Max_Radius_m: maximum canopy radius in metres.
  - Est_Width_m: estimated canopy width (2 × mean radius) in metres.
  - Vigor_Index_m3: estimated tree volume in m³ (converted from cm³ by dividing by 1 000 000).
  - Clone: clone name extracted from Tree ID (e.g. 'EKA 1').
  Returns None (with a printed warning) if the file is not found.
Notes#
The spreadsheet uses Indonesian column headings (Tanggal, Kode pohon, Tinggi Pohon (cm)). This function handles the renaming automatically.
See Also#
- get_tree_volume_at_date : Time-interpolates Vigor Index from the table returned by this function.
Examples#
    # doctest: +SKIP
    # Requires Vigor_Index_PalmStudio.xlsx in the data directory.
    df_vigor = load_tree_biophysics("/path/to/data")
    print(df_vigor[["Tree ID", "Date", "Vigor_Index_m3"]].head())
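The renaming and unit conversions that load_tree_biophysics() is described as performing can be sketched on an in-memory table. The headings Tanggal, Kode pohon, and Tinggi Pohon (cm) come from the Notes. The column name "Vigor Index (cm3)", the sample values, and the regular expression for the clone are assumptions made for illustration and may not match the actual spreadsheet or implementation.

```python
import pandas as pd

# Stand-in for the spreadsheet contents. "Vigor Index (cm3)" is a hypothetical
# column name; the sample values are invented for the demonstration.
raw = pd.DataFrame(
    {
        "Tanggal": ["2023-01-10", "2023-04-10"],
        "Kode pohon": ["EKA1-001", "EKA1-001"],
        "Tinggi Pohon (cm)": [250.0, 280.0],
        "Vigor Index (cm3)": [4_000_000.0, 5_500_000.0],
    }
)

df = raw.rename(columns={"Tanggal": "Date", "Kode pohon": "Tree ID"})
df["Date"] = pd.to_datetime(df["Date"])
df["Height_m"] = df["Tinggi Pohon (cm)"] / 100.0            # cm -> m
df["Vigor_Index_m3"] = df["Vigor Index (cm3)"] / 1_000_000  # cm^3 -> m^3

# Assumed ID pattern: letters, clone number, dash, tree number ("EKA1-001" -> "EKA 1").
parts = df["Tree ID"].str.extract(r"^([A-Za-z]+)(\d+)-")
df["Clone"] = parts[0] + " " + parts[1]

df = df[["Tree ID", "Date", "Height_m", "Vigor_Index_m3", "Clone"]]
```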
- palmwtc.flux.chamber.get_tree_volume_at_date(df_vigor, tree_id, target_date)#
Time-interpolate the Vigor Index (m³) for a tree at a specific date.
If an exact measurement exists on target_date, that value is returned directly. Otherwise, the Vigor Index time series for the tree is linearly interpolated between the two nearest measurements. No extrapolation is performed: dates outside the measurement range return None because the time-based interpolation does not fill beyond the index boundaries.
Parameters#
- df_vigor : pd.DataFrame or None
  Output of load_tree_biophysics(). None returns None immediately.
- tree_id : str
  Tree identifier matching the Tree ID column in df_vigor (e.g. 'EKA1-001').
- target_date : str or datetime-like
  The date for which to estimate the tree volume. String values are parsed via pandas.to_datetime().
Returns#
- float or None
  Vigor Index in m³ at target_date, or None if df_vigor is None, tree_id is not found, or the date is outside the measured range.
Notes#
The interpolation method is pandas 'time', which assumes a constant growth rate between measurement visits. Palm canopy volume grows roughly monotonically over the study period, so linear interpolation is appropriate for the typical visit interval of a few months.
See Also#
load_tree_biophysics : Loads and parses the biophysical spreadsheet.
Examples#
    # doctest: +SKIP
    # Requires a DataFrame from load_tree_biophysics().
    vol = get_tree_volume_at_date(df_vigor, "EKA1-001", "2023-06-15")
    print(f"Tree volume: {vol:.4f} m3")
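The lookup described for get_tree_volume_at_date() can be approximated with pandas time interpolation. The sketch below is a hypothetical re-implementation (interpolate_volume is not a palmwtc function) on invented data. It passes limit_area="inside" explicitly as a guard against extrapolation, and whether the library does the same is an assumption.

```python
import pandas as pd


def interpolate_volume(df_vigor, tree_id, target_date):
    """Hypothetical stand-in for the lookup described above."""
    if df_vigor is None:
        return None
    target = pd.to_datetime(target_date)
    series = (
        df_vigor.loc[df_vigor["Tree ID"] == tree_id]
        .set_index("Date")["Vigor_Index_m3"]
        .sort_index()
    )
    if series.empty:
        return None
    if target in series.index:
        return float(series.loc[target])
    # Insert the target date as a NaN point, then fill it from its neighbours.
    series = series.reindex(series.index.union(pd.DatetimeIndex([target])))
    # limit_area="inside" is an explicit guard so that nothing is extrapolated.
    value = series.interpolate(method="time", limit_area="inside").loc[target]
    return None if pd.isna(value) else float(value)


demo = pd.DataFrame(
    {
        "Tree ID": ["EKA1-001", "EKA1-001"],
        "Date": pd.to_datetime(["2023-01-01", "2023-01-11"]),
        "Vigor_Index_m3": [4.0, 6.0],
    }
)
inside = interpolate_volume(demo, "EKA1-001", "2023-01-06")   # halfway between visits
outside = interpolate_volume(demo, "EKA1-001", "2023-02-01")  # after the last visit
```

Here inside lands midway between the two invented measurements, while outside is None because the date falls after the last visit.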
- palmwtc.flux.chamber.apply_wpl_qc_overrides(row, model_qc, flux_qc, reason_text, wpl_qc_thresholds=None, h2o_valid_range=(0.0, 60.0))#
Apply WPL-specific checks and raise QC tiers toward a worse grade if needed.
Checks whether the WPL correction was well-conditioned for a given cycle (sufficient valid H₂O data, reasonable correction magnitude, plausible WPL factor). If any check fails, the model_qc and flux_qc tiers are raised toward a worse grade (a tier is never improved) and a reason string is appended.
This function is called after build_cycle_wpl_metrics() and palmwtc.flux.cycles.score_cycle() in the post-processing pipeline, not by calculate_flux_cycles() directly.
Parameters#
- row : pd.Series or dict
  A single cycle row containing WPL metrics produced by build_cycle_wpl_metrics(): wpl_valid_fraction, wpl_abs_rel_change_p95, wpl_factor_max, and h2o_max.
- model_qc : int
  Current model QC tier (0 = A, 1 = B, 2 = C), which may be raised.
- flux_qc : int
  Current flux QC tier, which may be raised.
- reason_text : str
  Semicolon-separated QC reasons accumulated so far. New reasons are appended and duplicates are removed.
- wpl_qc_thresholds : dict or None
  Override DEFAULT_WPL_QC_THRESHOLDS.
- h2o_valid_range : tuple of float
  (lo, hi) valid H₂O range in mmol mol⁻¹ (default (0.0, 60.0)). H₂O values above hi trigger a downgrade to Grade C.
Returns#
- tuple of (int, int, int, str)
  (model_qc, flux_qc, wpl_qc, reason_text) where:
  - model_qc, flux_qc are the input tiers, possibly raised toward a worse grade.
  - wpl_qc is the WPL-specific tier (0, 1, or 2) that drove the change.
  - reason_text is the updated semicolon-separated reason string.
See Also#
DEFAULT_WPL_QC_THRESHOLDS : Threshold values and key descriptions.
build_cycle_wpl_metrics : Produces the per-cycle WPL metrics consumed here.
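The one-way nature of the override in apply_wpl_qc_overrides() (a tier can move toward C but never back toward A) and the de-duplication of reasons can be sketched as below. Everything specific here is hypothetical: the threshold keys and values, the reason strings, and the choice that ordinary WPL failures map to tier 1. Only the row keys and the Grade C rule for H₂O above the valid range come from the description.

```python
# Hypothetical thresholds; the real keys and values live in DEFAULT_WPL_QC_THRESHOLDS.
EXAMPLE_THRESHOLDS = {
    "min_valid_fraction": 0.9,
    "max_abs_rel_change_p95": 0.05,
    "max_factor": 1.1,
}


def wpl_override_sketch(row, model_qc, flux_qc, reason_text,
                        thresholds=EXAMPLE_THRESHOLDS, h2o_valid_range=(0.0, 60.0)):
    """Hypothetical illustration of a monotonic QC override."""
    reasons = [r for r in reason_text.split(";") if r]
    wpl_qc = 0
    if row["wpl_valid_fraction"] < thresholds["min_valid_fraction"]:
        wpl_qc = max(wpl_qc, 1)
        reasons.append("wpl_low_valid_fraction")
    if row["wpl_abs_rel_change_p95"] > thresholds["max_abs_rel_change_p95"]:
        wpl_qc = max(wpl_qc, 1)
        reasons.append("wpl_large_correction")
    if row["wpl_factor_max"] > thresholds["max_factor"]:
        wpl_qc = max(wpl_qc, 1)
        reasons.append("wpl_factor_too_high")
    if row["h2o_max"] > h2o_valid_range[1]:
        wpl_qc = 2  # H2O above the valid range forces Grade C
        reasons.append("h2o_above_valid_range")

    # max() guarantees a tier can only move toward C, never back toward A.
    model_qc = max(model_qc, wpl_qc)
    flux_qc = max(flux_qc, wpl_qc)
    # dict.fromkeys() drops duplicate reasons while keeping first-seen order.
    return model_qc, flux_qc, wpl_qc, ";".join(dict.fromkeys(reasons))


good_row = {"wpl_valid_fraction": 1.0, "wpl_abs_rel_change_p95": 0.01,
            "wpl_factor_max": 1.02, "h2o_max": 30.0}
wet_row = dict(good_row, h2o_max=65.0)

unchanged = wpl_override_sketch(good_row, 0, 1, "low_r2")
forced_c = wpl_override_sketch(wet_row, 0, 1, "low_r2;h2o_above_valid_range")
```

Using max() rather than assignment is what keeps an already poor flux_qc from being improved by a clean WPL result.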
- palmwtc.flux.chamber.compute_closure_confidence(r2, nrmse, global_radiation, rad_max=800.0)#
Compute a chamber closure confidence score between 0 and 1.
Combines R², NRMSE, and global radiation into a single scalar that expresses how confident we are that the chamber was properly sealed during a flux cycle.
Physical reasoning: poor fit quality (low R², high NRMSE) is more likely to indicate a physical leak when photosynthetic demand is high (bright conditions). The same poor fit at night or on a cloudy day could simply reflect a small signal close to sensor noise. The score therefore penalizes low R² and high NRMSE more strongly when radiation is high.
Formula#
\[
\begin{aligned}
\mathrm{r2\_conf} &= \operatorname{clip}\left(\frac{R^2 - 0.25}{0.94 - 0.25},\ 0,\ 1\right) \\
\mathrm{rad\_norm} &= \operatorname{clip}\left(\frac{G}{G_{\max}},\ 0,\ 1\right) \\
\mathrm{confidence} &= \operatorname{clip}\Bigl(\mathrm{r2\_conf} - 0.4 \times \mathrm{rad\_norm} \times (1 - \mathrm{r2\_conf}) - 0.2 \times \mathrm{rad\_norm} \times \operatorname{clip}\left(\mathrm{NRMSE} / 0.20,\ 0,\ 1\right),\ 0,\ 1\Bigr)
\end{aligned}
\]
Parameters#
- r2 : float or array-like
  R² of the OLS linear CO₂ vs. time fit (0–1). NaN is treated as 0.
- nrmse : float or array-like
  Normalized RMSE (RMSE / CO₂ range). NaN is treated as 0.
- global_radiation : float or array-like
  Incoming solar radiation in W m⁻². NaN is treated as 0, so no radiation penalty is applied.
- rad_max : float
  Radiation level at which the radiation penalty is at its maximum. Default 800.0 W m⁻² (typical clear-sky midday value in the tropics).
Returns#
- float or numpy.ndarray
Closure confidence score in [0, 1]. A score near 1 indicates a well-sealed chamber with a clean linear CO₂ trend. A score near 0 indicates likely leakage or strong non-linearity under high light.
Notes#
The R² bounds (0.25 to 0.94) and penalty weights (0.4, 0.2) were calibrated against manual inspection of gap-width experiment data.
See Also#
- calculate_flux_cycles : Produces the R², NRMSE, and radiation values consumed here.
Examples#
>>> from palmwtc.flux.chamber import compute_closure_confidence
>>> round(float(compute_closure_confidence(0.98, 0.03, 0.0)), 3)
1.0
>>> round(float(compute_closure_confidence(0.95, 0.05, 200.0)), 3)
0.988
>>> round(float(compute_closure_confidence(0.50, 0.25, 600.0)), 3)
0.021
>>> round(float(compute_closure_confidence(0.40, 0.30, 700.0)), 3)
0.0
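For readers who want to trace the closure-confidence formula by hand, the following is a minimal NumPy transcription of it. It is a sketch written from the documented formula and NaN rules, not the library source, and the name closure_confidence_sketch is hypothetical.

```python
import numpy as np


def closure_confidence_sketch(r2, nrmse, global_radiation, rad_max=800.0):
    """Hypothetical transcription of the documented closure-confidence formula."""
    # NaN inputs are replaced by 0, as stated in the parameter descriptions.
    r2 = np.nan_to_num(np.asarray(r2, dtype=float), nan=0.0)
    nrmse = np.nan_to_num(np.asarray(nrmse, dtype=float), nan=0.0)
    rad = np.nan_to_num(np.asarray(global_radiation, dtype=float), nan=0.0)

    r2_conf = np.clip((r2 - 0.25) / (0.94 - 0.25), 0.0, 1.0)
    rad_norm = np.clip(rad / rad_max, 0.0, 1.0)
    nrmse_term = np.clip(nrmse / 0.20, 0.0, 1.0)

    # Both penalty terms scale with rad_norm, so they vanish in darkness.
    confidence = r2_conf - 0.4 * rad_norm * (1.0 - r2_conf) - 0.2 * rad_norm * nrmse_term
    return np.clip(confidence, 0.0, 1.0)


# Third documented example: r2_conf ≈ 0.362 and rad_norm = 0.75, so the two
# penalties (≈ 0.191 and 0.150) leave a score of about 0.021.
score = float(closure_confidence_sketch(0.50, 0.25, 600.0))

# Array inputs are evaluated element-wise.
scores = closure_confidence_sketch([0.98, 0.40], [0.03, 0.30], [0.0, 700.0])
```

Because both penalties are multiplied by rad_norm, a poor fit at night is scored on R² alone, which matches the physical reasoning given for the function.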