Probabilistic Forecasting of Nocturnal Hypoglycemia

University of Waterloo   &   WAT.ai   &   Gluroo   &   skTime

Fear of nocturnal hypoglycemic events remains one of the most significant contributors to diabetes distress. Better forecasting of nocturnal hypoglycemia would likely reduce that distress (D. Ehrmann et al., 2024), but even short 2-hour blood glucose level (BGL) forecasting remains challenging (H. Nemat et al., 2024). Forecasting BGL over an 8-hour window does not seem feasible with state-of-the-art techniques, likely because most BGL time series datasets are small. The most popular open-source T1D dataset, OhioT1DM, contains only 12 patients over eight weeks (C. Marling, 2020). Our student volunteer-based WAT.ai Blood Glucose Control Design Team, in partnership with Gluroo, aims to reframe the task as a big-data probabilistic forecasting problem.

Probabilistic forecasting is about producing low and high scenarios, quantifying their uncertainty, and delivering expected ranges of variation (T. Gneiting et al., 2014). Our view is that this framing is more technically feasible yet will be just as valuable in alleviating diabetes distress related to fears of nocturnal hypoglycemic events.
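To make the idea of low and high scenarios concrete, the following is a minimal sketch that wraps a persistence (last value) forecast in an empirical error band. It is an illustration only, not the method used in this work: the function name `naive_interval`, the 5% and 95% levels, and the example readings are all assumptions.

```python
import statistics


def naive_interval(history, horizon, lower_q=0.05, upper_q=0.95):
    """Persistence forecast with an empirical error band (illustrative only).

    history: past glucose readings at a fixed sampling interval.
    horizon: number of steps ahead to forecast.
    Returns (low, point, high) for the reading `horizon` steps ahead.
    """
    if len(history) <= horizon + 1:
        raise ValueError("history too short for this horizon")
    # Errors a persistence forecast would have made in the past at this horizon.
    errors = [history[t + horizon] - history[t]
              for t in range(len(history) - horizon)]
    # With n=100, statistics.quantiles returns the 99 percentile cut points.
    cuts = statistics.quantiles(errors, n=100, method="inclusive")
    point = history[-1]
    low = point + cuts[round(lower_q * 100) - 1]
    high = point + cuts[round(upper_q * 100) - 1]
    return low, point, high


# Hypothetical readings, made up for this example.
readings = [6.0, 6.5, 7.0, 6.5, 6.0, 5.5, 5.0, 5.5, 6.0, 6.5, 7.0, 6.5, 6.0]
low, point, high = naive_interval(readings, horizon=2)
print(f"low={low:.2f} point={point:.2f} high={high:.2f}")
```

The band reports a range the reading is expected to fall in, which is the kind of output that matters when the question is whether glucose might drop too low overnight.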


Our Methods

We are currently working with the open-source Kaggle BrisT1D dataset and are in the process of scaling to thousands of de-identified patients (dataset provided by Gluroo). We evaluated over 20,000 model configurations (e.g., ARIMA, ARCH, Exponential Smoothing, and more) using the sktime library. We also engineered features that simulate insulin-on-board (IOB) and carbohydrates-on-board (COB) using ordinary differential equations (ODEs).
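As an illustration of how an on-board quantity can be simulated from an ODE, the sketch below integrates a single-compartment, first-order decay model with explicit Euler steps. The exact equations used in this work are not given here, so the model form, the function name `simulate_on_board`, the 5-minute step, and the 60-minute half-life are all assumptions chosen for the example.

```python
import math


def simulate_on_board(doses, n_steps, step_min=5.0, half_life_min=60.0):
    """Euler-integrate dX/dt = -k * X, adding each dose as an instantaneous jump.

    doses: dict mapping step index -> amount delivered at that step
           (insulin units for IOB, grams of carbohydrate for COB).
    Returns a list with the amount still "on board" at each step.
    The decay model and its parameters are assumptions for illustration.
    """
    k = math.log(2) / half_life_min      # first-order decay rate per minute
    if k * step_min >= 1.0:
        raise ValueError("step too large for stable Euler integration")
    on_board, level = [], 0.0
    for step in range(n_steps):
        level += doses.get(step, 0.0)    # dose enters the compartment
        on_board.append(level)
        level += -k * level * step_min   # one explicit Euler step of the ODE
    return on_board


# Hypothetical 4-unit bolus at step 0, followed for two hours at 5-minute steps.
iob = simulate_on_board({0: 4.0}, n_steps=25)
print([round(x, 2) for x in iob[:4]])
```

The same function can stand in for COB by passing carbohydrate amounts and a different half-life. Each resulting series can then be joined to the glucose readings as an additional feature.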

The source code is available at the following URL: Source code


Results on Kaggle's BrisT1D Dataset

We evaluate various models on Kaggle's BrisT1D dataset. Based on preliminary results, foundation models (e.g., TTM, ChronosForecaster) outperform classical models (ARIMA, ARCH) on point forecasts. Classical models show undesirable mean reversion, highlighting the pitfalls of RMSE optimization. The more advanced models accurately capture spikes and drops in glucose levels.

[Figure: Ant benchmark (three panels)]

The table below summarizes model performance using RMSE. However, optimizing for RMSE has well-known pitfalls (e.g., constant predictions can outperform regressive models on RMSE). We report model performance separately for each patient, as various individual factors influence blood glucose levels.

Model families, in the order their models appear: arch, arma, exponential-smooth, foundation, naive.

time_delta                       5min                                      15min
model_type                       p02    p03    p04    p10    p11    p12    p01    p06    p08
ARCH                             2.195  2.321  1.568  1.392  2.335  2.021  3.470  2.095  2.887
AutoARIMA                        1.941  2.528  1.861  1.350  3.171  1.949  3.631  2.600  2.728
VarReduce                        2.012  2.550  1.679  1.319  2.345  2.128  3.546  2.395  2.935
AutoST3                          3.621  3.480  2.314  1.974  2.953  2.551  4.460  2.426  4.507
StateForecastAutoCES             2.934  3.647  2.177  1.783  2.876  2.633  4.687  2.321  3.391
StateForecastAutoETS             3.043  3.448  2.239  1.734  2.801  2.655  4.196  2.313  3.281
StateForecastAutoTheta           3.042  3.448  2.239  1.734  2.803  2.658  4.193  2.314  3.286
ChronosForecaster                2.211  2.944  1.839  1.435  2.672  2.444  3.728  2.281  2.758
HFTransformerForecaster          2.866  2.807  2.503  1.554  3.079  3.664  4.576  2.997  4.197
ZeroShotTinyTimeMixerForecaster  2.122  3.213  1.840  1.479  2.543  2.544  3.375  2.329  3.115
NaiveForecaster                  2.356  2.797  1.801  1.383  2.507  2.551  3.824  2.313  3.281
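The RMSE pitfall mentioned before the table can be reproduced with a toy series. In the sketch below, a forecast that predicts a glucose spike with the right shape but one step late scores worse on RMSE than a flat line at the series mean. The numbers are made up for illustration and are not drawn from BrisT1D.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy series: a flat baseline with a single spike (made-up values).
actual = [6.0, 6.0, 6.0, 10.0, 6.0, 6.0, 6.0, 6.0]

# Forecast A: correct spike shape, but predicted one step too late.
late_spike = [6.0, 6.0, 6.0, 6.0, 10.0, 6.0, 6.0, 6.0]

# Forecast B: a constant at the series mean, with no spike at all.
constant = [sum(actual) / len(actual)] * len(actual)

rmse_late = rmse(actual, late_spike)
rmse_constant = rmse(actual, constant)
print(f"late spike RMSE: {rmse_late:.3f}")
print(f"constant RMSE:   {rmse_constant:.3f}")
```

The late forecast is penalized twice, once for missing the spike and once for predicting it where none occurred, so the uninformative constant forecast wins. This is one reason mean-reverting models can look competitive under RMSE.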

Ongoing Research

We continue to develop our modelling capabilities to improve nocturnal forecasting. We plan to extend the current models from point forecasts to probabilistic forecasts (e.g., interval, quantile, and distributional). Further, we will fine-tune foundation models on our data and compare them against zero-shot forecasting performance. Data normalization / standardization techniques will also be developed to reduce the amount of "noise" in the data. All of this will be performed and validated on Gluroo's expanded dataset.
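Moving from point to quantile forecasts also calls for a different score than RMSE. One common choice is the pinball (quantile) loss, sketched below under the assumption that a model outputs one predicted value per time step for a given quantile level. The function name `pinball_loss` and the example numbers are hypothetical, and this is not necessarily the metric this project will adopt.

```python
def pinball_loss(actual, predicted_quantile, q):
    """Average pinball (quantile) loss for forecasts of the q-th quantile.

    Under-prediction is weighted by q and over-prediction by (1 - q), so a
    forecast of a low quantile is penalized more for sitting above the truth.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    total = 0.0
    for y, y_hat in zip(actual, predicted_quantile):
        diff = y - y_hat
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(actual)


# Hypothetical overnight readings and a forecast of their 10th percentile.
observed = [5.0, 4.0, 3.0]
forecast_q10 = [4.0, 4.0, 4.0]
loss_q10 = pinball_loss(observed, forecast_q10, q=0.1)
print(round(loss_q10, 4))
```

For a lower quantile such as 0.1, most of the loss in this example comes from the final step, where the observed value fell below the predicted lower bound. That is the case a hypoglycemia forecast most needs to avoid.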