Estimating growth rates#
Outline approach used in the app to estimate the growth rate of biological growth curves with lag phase using growthcurves package
Here we compare different packages, explain some differences between parameters using a simulated logistic growth curve with noise.
parametric model based fitting of noisy data
linear regression on non log-transformed data
linear regression on log-transformed data
spline fitting and derivatives
croissance exponential model fitting
growthcurves package fitting, comparing all methods
The \(\mu_{max}\) growth is the maximum of the second derivative or the maximum of the log transformed data slope (if estimated with a sliding window).
Setup#
Imports
Model equations
use non-ODE formulation in comparison to the growthcurves package
S-Shaped growth curve (Logistic)#
Generate time points:
2,880 would we every 30seconds for a day
17,280 would be every 5seconds for a day
Plot the model with the parameters specified above
fig, ax = plot_simulated_growth_curve(time_in_h, model_curve, pop_noisy)
Curve fitting on noisy data#
Fitted parameters with noisy data
params_fitted, pcov = curve_fit(
logistic_model, time_in_h, pop_noisy, p0=[1.0, 0.5, 3.0, 0.0]
)
GrowthParams(*params_fitted)
GrowthParams(a=1.502, r=0.598, t0=9.978, shift_y=0.199)
Fitted parameters with smoothed data (removing some noise)
difference is here minimal with low and homoscedastic noise
params_fitted, pcov = curve_fit(
logistic_model, time_in_h, smooth(pop_noisy, window=31), p0=[1.0, 0.5, 3.0, 0.0]
)
GrowthParams(*params_fitted)
GrowthParams(a=1.502, r=0.598, t0=9.978, shift_y=0.199)
Compare different growth models#
Fitting the noisy data from a logistic model with three models
Logistic fitted parameters: 1.502, 0.598, 9.978, 0.199
Gompertz fitted parameters: 1.792, 0.273, 7.936
Richards fitted parameters: 1.689, 0.943, 11.015, 4.100
/tmp/ipykernel_768/2318378229.py:30: RuntimeWarning: invalid value encountered in power
return a / ((1 + nu * np.exp(-r * (t - lag))) ** power)
A huge condition number would indicate a problem with parameter identifiability
huge >>> 1e12
np.linalg.cond(pcov)
np.float64(8794.927476730772)
Prepare data for PioReactor-like analysis#
create datetime index
apply rolling median smoothing (could be Savitzky-Golay filtering)
Convert hours (range of 24h) to timedelta and move to start data
start_datetime = pd.Timestamp("2025-11-21 08:00")
time_series = pd.Series(pd.to_timedelta(time_in_h, unit="h")) + start_datetime
time_series = pd.Series(time_series, dtype="datetime64[s]")
time_series
0 2025-11-21 08:00:00
1 2025-11-21 08:00:30
2 2025-11-21 08:01:00
3 2025-11-21 08:01:30
4 2025-11-21 08:02:00
...
2875 2025-11-22 07:57:59
2876 2025-11-22 07:58:29
2877 2025-11-22 07:58:59
2878 2025-11-22 07:59:29
2879 2025-11-22 08:00:00
Length: 2880, dtype: datetime64[s]
# View data in format imported from PioReactor software output
# - use the rounded timestamp as index which is created as a preprocessing step
df = pd.DataFrame(
{
"timestamp_rounded": time_series,
"Reactor": pop_noisy,
}
).set_index("timestamp_rounded")
df
| Reactor | |
|---|---|
| timestamp_rounded | |
| 2025-11-21 08:00:00 | 0.226061 |
| 2025-11-21 08:00:30 | 0.197506 |
| 2025-11-21 08:01:00 | 0.232892 |
| 2025-11-21 08:01:30 | 0.272301 |
| 2025-11-21 08:02:00 | 0.193247 |
| ... | ... |
| 2025-11-22 07:57:59 | 1.658720 |
| 2025-11-22 07:58:29 | 1.657515 |
| 2025-11-22 07:58:59 | 1.669609 |
| 2025-11-22 07:59:29 | 1.712810 |
| 2025-11-22 08:00:00 | 1.691233 |
2880 rows × 1 columns
Apply rolling median to smooth the data
31 consecutives timepoints are used, setting the 16th value as the smoothed value
rolling_window = 31 # in number of samples
min_periods = 15
df_rolling = df.rolling(
rolling_window,
min_periods=min_periods,
center=True,
).median()
df_rolling
| Reactor | |
|---|---|
| timestamp_rounded | |
| 2025-11-21 08:00:00 | 0.195386 |
| 2025-11-21 08:00:30 | 0.193266 |
| 2025-11-21 08:01:00 | 0.195386 |
| 2025-11-21 08:01:30 | 0.193266 |
| 2025-11-21 08:02:00 | 0.193257 |
| ... | ... |
| 2025-11-22 07:57:59 | 1.680756 |
| 2025-11-22 07:58:29 | 1.685787 |
| 2025-11-22 07:58:59 | 1.679566 |
| 2025-11-22 07:59:29 | 1.685787 |
| 2025-11-22 08:00:00 | 1.679566 |
2880 rows × 1 columns
Plot median smoothed data against Savitzky-Golay smoothed data
what happens in the case of outliers? (to be tested)
are results comparable?
For now we will continue with the rolling median smoothed data
ax = df_rolling.plot(figsize=(7.4, 4), legend=False)
_ = ax.legend(["Smoothed data: rolling median"])
pd.Series(smooth(df["Reactor"], window=31, passes=1), index=df.index).plot(ax=ax)
_ = ax.legend(["Smoothed data: rolling median", "Smoothed data: Savitzky-Golay"])
Fit a linear model on the data#
rolling window
maximum overall growth: maximum of first derivative
\(\mu_{max}\) growth is the maximum of the second derivative, or the maximum of the log transformed data slope (estimated with a sliding window)
series_window = df_rolling.squeeze().loc["2025-11-21 15:00":"2025-11-21 22:00"]
(Non-) transformed data#
/tmp/ipykernel_768/3664703078.py:19: UserWarning: This axis already has a converter set and is updating to a potentially incompatible converter
ax.plot(df_rolling.index, tangent, "--", label="Tangent at max slope (±1h) (log2)")
maximum timepoint estimation with sliding window is not perfect. small errors can lead to deviations.
print(f"shift in x of logistic funtion: {lag_duration}")
max_idx - df_rolling.index[0]
shift in x of logistic funtion: 10
Timedelta('0 days 07:09:38')
center_time = df_rolling.index[0] + pd.Timedelta(hours=lag_duration)
print(f"center of logistic curves should be: {center_time}")
slopes_log2.squeeze().nlargest(5)
center of logistic curves should be: 2025-11-21 18:00:00
timestamp_rounded
2025-11-21 15:09:38 0.462154
2025-11-21 15:09:08 0.461847
2025-11-21 15:10:08 0.461843
2025-11-21 15:10:38 0.461357
2025-11-21 15:08:38 0.461355
Name: Reactor, dtype: float64
slopes_normal = s_normal.rolling(
rolling_window + 100,
min_periods=min_periods,
center=True,
).apply(get_slope)
slopes_normal.nlargest(5)
timestamp_rounded
2025-11-21 18:21:42 0.233420
2025-11-21 18:21:12 0.233351
2025-11-21 18:22:12 0.233351
2025-11-21 18:20:42 0.233131
2025-11-21 18:22:42 0.233128
Name: Reactor, dtype: float64
<Figure size 800x500 with 1 Axes>
So, using the original (non-transformed) data, one can recover \(r\) from the slope if the carrying capacity \(K\) and the current value \(N(t)\) (or at least that the point is at the inflection, \(N = K/2\)) are known.
Growth rate in model: 0.600
Non-transformed slope: 0.233
Value of curve at max slope: 0.844
Non-transformed est. r with est. K: 0.553
Non-transformed est. r with true K: 0.622
Log2-transformed slope: 0.462
Log2-transformed est. r: 0.641
Median slope around center point (plus-minus 10mins)
non-transformed slope at center: 0.219
log-transformed slope at center: 0.328
Log2-transformed est. r: 0.455
Use spline to find the maxium growth#
just to highlight this approach
Fit spline to the smoothed data and calculate derivatives
use smoothing factor based on data length
spline_smoothing_value = get_smoothing_range(df_rolling.shape[0])
spline_smoothing_value
SmoothingRange(s_min=2804, s=2880, s_max=2955)
Identify time points where growth rate is in the top 90%
could be used to estimat the growth rate during maximum growth phase
inflection_points = df_rolling.iloc[[der2.argmax(), der2.argmin()]]
max_min_growth_change = [der2.argmax(), der2.argmin()]
der1.iloc[max_min_growth_change]
timestamp_rounded
2025-11-21 14:00:07 0.113465
2025-11-21 20:00:15 0.139471
dtype: float64
inflection_points.index - df_rolling.index.min()
TimedeltaIndex(['0 days 06:00:07', '0 days 12:00:15'], dtype='timedelta64[s]', name='timestamp_rounded', freq=None)
| Reactor | |
|---|---|
| count | 2880.000000 |
| mean | 0.053146 |
| std | 0.080244 |
| min | -0.169281 |
| 25% | 0.002053 |
| 50% | 0.077216 |
| 75% | 0.122327 |
| max | 0.137387 |
t_to_max_in_h = (derivatives.idxmax() - derivatives.index.min()).dt.seconds / 3_600
t_to_max_in_h
Reactor 10.453611
dtype: float64
lag-phase duration should be estimated (maybe from first derivative plot?)
max population can be measured using OD
min population can be measured using OD
use od_max from df_rolling
Reactor 0.851437
dtype: float64
Find time range of maximum derivative and plot the region
def add_region_high_growth(ax, time_range, use_elapsed_time=False):
"""Add shaded region to indicate high growth phase."""
start, end = time_range["start"], time_range["end"]
if pd.isna(start) or pd.isna(end):
return
if use_elapsed_time:
start = (start - df_rolling.index[0]).total_seconds() / 3600
end = (end - df_rolling.index[0]).total_seconds() / 3600
ax.axvspan(
start,
end,
color="gray",
alpha=0.3,
label="High growth phase",
)
return ax
in_high_growth = derivatives.ge(cutoffs, axis=1)
max_range = find_max_range(in_high_growth["Reactor"])
fig, ax = plot_simulated_growth_curve(time_in_h, model_curve, pop_noisy)
_ = add_region_high_growth(ax, max_range, use_elapsed_time=True)
Without lag phase#
Now calculate exponential growth in the maximum growth area:
view = df_rolling.loc[max_range.start : max_range.end]
np.log(view).plot()
<Axes: xlabel='timestamp_rounded'>
Growth rate (slope): 0.2055 per hour
Intercept: -0.4008
<matplotlib.legend.Legend at 0x70ca13978410>
Croissance estimate#
Using croissance package to estimate growth parameters from biosustain: croissance
s_normal_in_h = (
pd.Series(pop_noisy, time_in_h).rolling(31, min_periods=15, center=True).median()
)
s_normal_in_h
0.000000 0.195386
0.008336 0.193266
0.016672 0.195386
0.025009 0.193266
0.033345 0.193257
...
23.966655 1.680756
23.974991 1.685787
23.983328 1.679566
23.991664 1.685787
24.000000 1.679566
Length: 2880, dtype: float64
ret = croissance.process_curve(s_normal_in_h)
gp = ret.growth_phases[0]
gp.slope, gp.intercept
/home/docs/checkouts/readthedocs.org/user_builds/piogrowth/envs/stable/lib/python3.13/site-packages/croissance/estimation/smoothing/segments.py:20: FutureWarning: The behavior of obj[i:j] with a float-dtype index is deprecated. In a future version, this will be treated as positional instead of label-based. For label-based slicing, use obj.loc[i:j] instead
window = series[i : i + size * increment]
/home/docs/checkouts/readthedocs.org/user_builds/piogrowth/envs/stable/lib/python3.13/site-packages/croissance/estimation/smoothing/segments.py:61: FutureWarning: The behavior of obj[i:j] with a float-dtype index is deprecated. In a future version, this will be treated as positional instead of label-based. For label-based slicing, use obj.loc[i:j] instead
window = series[start:end]
(np.float64(0.38623235823792945), np.float64(10.485852595456521))
fig, axes = croissance.figures.plot.plot_processed_curve(ret, yscale="both")
Growthcurves package#
Use the [growthcurves](https://pypi.org/project/growthcurves/ package to fit the logistic model and estimate parameters
_ = s_normal_in_h.plot(
figsize=(8, 5), title="Smoothed growth curve for growthcurves pkg"
)
| max_od | N0 | mu_max | intrinsic_growth_rate | doubling_time | exp_phase_start | exp_phase_end | time_at_umax | od_at_umax | fit_t_min | fit_t_max | fit_method | model_rmse | model_name | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mech_logistic | 1.700895 | 0.19915 | 0.29292 | 0.596981 | 2.366335 | 2.39456 | 13.977475 | 8.176353 | 0.580439 | 0.0 | 24.0 | model_fitting_mech_logistic | 0.009675 | mech_logistic |
| mech_gompertz | 1.803414 | 0.158867 | 0.321205 | 0.273271 | 2.157957 | 2.128178 | 14.497329 | 5.723447 | 0.346614 | 0.0 | 24.0 | model_fitting_mech_gompertz | 0.067032 | mech_gompertz |
| mech_richards | 1.701699 | 0.192303 | 0.285704 | 0.54994 | 2.426102 | 1.996298 | 14.03032 | 8.176353 | 0.583992 | 0.0 | 24.0 | model_fitting_mech_richards | 0.01031 | mech_richards |
| mech_baranyi | 1.700947 | 0.192342 | 0.292886 | 0.596198 | 2.366612 | 0.995168 | 13.97873 | 8.176353 | 0.580576 | 0.0 | 24.0 | model_fitting_mech_baranyi | 0.00966 | mech_baranyi |
| phenom_logistic | 1.703093 | 0.200115 | 0.299711 | None | 2.312721 | 4.635667 | 11.780248 | 8.224449 | 0.586686 | 0.0 | 24.0 | model_fitting_phenom_logistic | 0.010118 | phenom_logistic |
| phenom_gompertz | 1.71383 | 0.229502 | 0.345445 | None | 2.006532 | 5.458385 | 11.278627 | 7.599198 | 0.480797 | 0.0 | 24.0 | model_fitting_phenom_gompertz | 0.019375 | phenom_gompertz |
| phenom_gompertz_modified | 1.706141 | 0.22735 | 0.334868 | None | 2.06991 | 5.171202 | 11.320117 | 7.503006 | 0.475212 | 0.0 | 24.0 | model_fitting_phenom_gompertz_modified | 0.016237 | phenom_gompertz_modified |
| phenom_richards | 1.701812 | 0.193621 | 0.295234 | None | 2.347786 | 4.471266 | 11.833378 | 8.272545 | 0.594766 | 0.0 | 24.0 | model_fitting_phenom_richards | 0.009952 | phenom_richards |
| spline | 1.721717 | 0.193828 | 0.453677 | None | 1.527844 | 5.252174 | 10.196329 | 7.236181 | 0.449496 | 0.0 | 24.0 | model_fitting_spline | 0.012405 | spline |
| sliding_window | 1.721717 | 0.193828 | 0.303536 | None | 2.283572 | 4.31411 | 11.703825 | 7.365057 | 0.461325 | 6.744008 | 7.986106 | model_fitting_sliding_window | 0.022418 | sliding_window |
growth_stats[
[
"time_at_umax",
"od_at_umax",
"mu_max",
"intrinsic_growth_rate",
"doubling_time",
"exp_phase_start",
"exp_phase_end",
"model_name",
"max_od",
"N0",
]
]
| time_at_umax | od_at_umax | mu_max | intrinsic_growth_rate | doubling_time | exp_phase_start | exp_phase_end | model_name | max_od | N0 | |
|---|---|---|---|---|---|---|---|---|---|---|
| mech_logistic | 8.176353 | 0.580439 | 0.29292 | 0.596981 | 2.366335 | 2.39456 | 13.977475 | mech_logistic | 1.700895 | 0.19915 |
| mech_gompertz | 5.723447 | 0.346614 | 0.321205 | 0.273271 | 2.157957 | 2.128178 | 14.497329 | mech_gompertz | 1.803414 | 0.158867 |
| mech_richards | 8.176353 | 0.583992 | 0.285704 | 0.54994 | 2.426102 | 1.996298 | 14.03032 | mech_richards | 1.701699 | 0.192303 |
| mech_baranyi | 8.176353 | 0.580576 | 0.292886 | 0.596198 | 2.366612 | 0.995168 | 13.97873 | mech_baranyi | 1.700947 | 0.192342 |
| phenom_logistic | 8.224449 | 0.586686 | 0.299711 | None | 2.312721 | 4.635667 | 11.780248 | phenom_logistic | 1.703093 | 0.200115 |
| phenom_gompertz | 7.599198 | 0.480797 | 0.345445 | None | 2.006532 | 5.458385 | 11.278627 | phenom_gompertz | 1.71383 | 0.229502 |
| phenom_gompertz_modified | 7.503006 | 0.475212 | 0.334868 | None | 2.06991 | 5.171202 | 11.320117 | phenom_gompertz_modified | 1.706141 | 0.22735 |
| phenom_richards | 8.272545 | 0.594766 | 0.295234 | None | 2.347786 | 4.471266 | 11.833378 | phenom_richards | 1.701812 | 0.193621 |
| spline | 7.236181 | 0.449496 | 0.453677 | None | 1.527844 | 5.252174 | 10.196329 | spline | 1.721717 | 0.193828 |
| sliding_window | 7.365057 | 0.461325 | 0.303536 | None | 2.283572 | 4.31411 | 11.703825 | sliding_window | 1.721717 | 0.193828 |