







From Admission to Discharge

Dynamically Updating Length of Stay Forecasts with Causal Diagnostic and Operational Penalties




Mustafa Aslan, Cardiff University, UK
Lead supervisor: Prof. Bahman Rostami-Tabar
Co-supervisor: Dr. Jeremy Dixon
Data Lab for Social Good, Cardiff University, UK

11 June 2026



Outline

  • The Operational Problem
  • Solution
  • The Framework
  • The Causal Location Shift Example
  • Counterfactual Simulation & Impact

The “Operational Problem”

Current hospital operations rely on aggregate comorbidity scores (e.g., Charlson), which compress complex medical histories, masking the true operational drivers of discharge delays.


  • The Problem: We cannot manage what we do not isolate.
  • The Gap: Traditional models suffer from selection bias; diagnostics are endogenous.
  • Our Approach: Move from “Correlative Scores” to “Causal Penalties.”

Solution

Dynamic Thresholding: Real-Time Updating

We update the survival curve \(P(\text{LoS} > \text{threshold})\) dynamically as clinical data arrives (daily at 11 am).


1. Admission Day

  • Baseline prediction based on initial existing data
  • Initial \(P(LoS > \text{threshold})\)

2. Mid-Stay Update

  • Diagnosis confirmed
  • Causal Shift: Apply \(\tau_D\) penalty to the distribution

3. Action Trigger

  • Condition: \(P(LoS > \text{threshold}) \ge 55\%\)
  • Action: Refer to MDT / Community pathways
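
The three steps above can be sketched with predictive samples of LoS. This is a minimal illustration, not the project's implementation: the gamma baseline, the 14-day threshold, the point-valued penalty `tau_d`, and the function names are all assumptions made for the example. Only the 55% trigger comes from the slide.

```python
import numpy as np

THRESHOLD_DAYS = 14   # assumed threshold (the slides only say "threshold" here)
TRIGGER_PROB = 0.55   # action trigger from step 3


def exceedance_prob(los_samples, threshold=THRESHOLD_DAYS):
    """Empirical P(LoS > threshold) from predictive LoS samples."""
    return float(np.mean(np.asarray(los_samples) > threshold))


def should_refer(los_samples, threshold=THRESHOLD_DAYS, trigger=TRIGGER_PROB):
    """True when the exceedance probability reaches the trigger level."""
    return exceedance_prob(los_samples, threshold) >= trigger


rng = np.random.default_rng(0)

# 1. Admission day: hypothetical baseline predictive distribution (mean 10 days).
baseline = rng.gamma(shape=4.0, scale=2.5, size=20_000)
p_admission = exceedance_prob(baseline)

# 2. Mid-stay update: diagnosis confirmed, shift by a hypothetical penalty in days.
tau_d = 6.0
updated = baseline + tau_d
p_update = exceedance_prob(updated)

# 3. Action trigger: refer only once the updated probability crosses 55%.
refer_at_admission = should_refer(baseline)
refer_after_update = should_refer(updated)
print(f"admission: {p_admission:.2f}, after update: {p_update:.2f}")
```

In this toy setting the patient sits below the trigger at admission and crosses it only after the diagnostic penalty is applied.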

The Framework

1. Purging Confounders


We utilize Double Machine Learning (DML) to isolate the causal impact of diagnostics (\(\tau_D\)) and ward transfers (\(\tau_W\)) from the “clean-path” LoS.

\[ Y_{\text{purged}} = \text{Total LoS} - \sum_{i=1}^{n} \tau_{D_i} - \tau_{W_j} \]

  • Identification: DML partials out high-dimensional confounders (age, gender, ward busyness) to estimate the true “bed-day burden” of specific diagnoses.
  • Purging: We subtract realized shocks from the observed outcome to create an idealized survival curve \(S_0(t)\)—the “clean-path” prediction in the absence of operational friction.
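
A minimal sketch of the partialling-out idea behind DML, on synthetic data with a single binary diagnostic flag. Everything here is an assumption for illustration: the data-generating process, the two confounders, the helper names, and the use of ordinary least squares as the nuisance learner (in practice any cross-fitted ML model could take its place).

```python
import numpy as np


def ols_fit_predict(X_train, y_train, X_test):
    """Least-squares fit with intercept; a stand-in for any ML nuisance learner."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def dml_partial_out(y, d, X, n_folds=5, seed=0):
    """Cross-fitted residual-on-residual estimate of the effect of d on y."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_res = np.empty(len(y))
    d_res = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Residualise outcome and treatment on confounders, out of fold.
        y_res[test_idx] = y[test_idx] - ols_fit_predict(X[train], y[train], X[test_idx])
        d_res[test_idx] = d[test_idx] - ols_fit_predict(X[train], d[train], X[test_idx])
    return float(d_res @ y_res / (d_res @ d_res))


# Synthetic cohort: two confounders drive both the diagnosis and the LoS.
rng = np.random.default_rng(42)
n = 4000
X = rng.normal(size=(n, 2))  # e.g. standardised age and ward busyness
d = (0.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(float)
TRUE_TAU = 3.0               # bed-day burden built into the simulation
y = 8.0 + 1.5 * X[:, 0] + 1.0 * X[:, 1] + TRUE_TAU * d + rng.normal(size=n)

naive = y[d == 1].mean() - y[d == 0].mean()  # confounded comparison
tau_hat = dml_partial_out(y, d, X)           # partialled-out estimate
y_purged = y - tau_hat * d                   # "clean-path" outcome
print(f"naive: {naive:.2f}, DML: {tau_hat:.2f}, truth: {TRUE_TAU}")
```

The naive difference in means overstates the penalty because sicker, busier-ward patients are more likely to receive the diagnosis; the residual-on-residual estimate recovers a value close to the simulated truth.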

The Framework

2. Modeling Uncertainty of the Diagnostic Penalty for Updating \(\widehat{\text{LoS}}\)

We treat diagnostic impact not as a fixed constant, but as a distribution, accounting for clinical uncertainty.


The Convolution

To generate robust “Tail Risk” LoS forecasts, we convolve the baseline LoS distribution (\(f_L\)) with the distribution of the diagnostic penalty (\(\tau_D \sim \mathcal{N}(\mu, \sigma^2)\)):

\[ f_{new}(z) = \int_{-\infty}^{\infty} f_L(z - \tau) \cdot f_\tau(\tau) \, d\tau \]

  • \(f_L(z - \tau)\): The baseline LoS density, evaluated at \(z - \tau\).
  • \(f_\tau(\tau)\): The density of the diagnostic penalty, capturing its clinical uncertainty.
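
The convolution integral can be approximated numerically on a grid. The sketch below is illustrative only: the gamma baseline for \(f_L\), the values of \(\mu\) and \(\sigma\), the grid spacing, and the 14-day cut-off are assumed for the example.

```python
import math
import numpy as np

dt = 0.05                               # grid spacing in days
los_grid = np.arange(0.0, 80.0, dt)     # support of the baseline LoS

# Hypothetical baseline density f_L: gamma with mean 10 days, variance 25.
k, theta = 4.0, 2.5
f_L = los_grid ** (k - 1) * np.exp(-los_grid / theta) / (math.gamma(k) * theta ** k)

# Diagnostic penalty tau_D ~ N(mu, sigma^2); values are assumptions.
mu, sigma = 6.0, 2.0
tau_grid = np.arange(mu - 5 * sigma, mu + 5 * sigma + dt, dt)
f_tau = np.exp(-0.5 * ((tau_grid - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Discrete approximation of f_new(z) = integral of f_L(z - tau) * f_tau(tau) d tau.
f_new = np.convolve(f_L, f_tau) * dt
# The output grid starts at the sum of the two input grid origins.
z_grid = los_grid[0] + tau_grid[0] + dt * np.arange(len(f_new))

total_mass = f_new.sum() * dt
mean_new = (z_grid * f_new).sum() * dt
var_new = ((z_grid - mean_new) ** 2 * f_new).sum() * dt
p_over_14 = f_new[z_grid > 14.0].sum() * dt
print(f"mean {mean_new:.1f} d, variance {var_new:.1f}, P(LoS > 14) = {p_over_14:.2f}")
```

The mean of the updated distribution is the baseline mean plus \(\mu\), and its variance is the baseline variance plus \(\sigma^2\). How much the added variance changes a given exceedance probability depends on where the threshold sits relative to the bulk of the distribution. Note that a normal penalty also places some mass on negative shifts.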

Why this matters for the MDT:

  • Beyond the Average: It captures the “Tail Risk”—the patients who face both a complex diagnosis and a slow recovery.
  • Operational Buffer: By modeling the variance of the effect (\(\sigma^2\)), we create a “safety margin” for discharge planning.
  • Robustness: This prevents underestimating the probability of a patient becoming “Super-Stranded” (\(\ge 14\) days).

The Causal Location Shift Example

The DML framework treats the diagnostic confirmation as a “shock” that shifts the distribution’s location along the time axis.

🟢 Day 0: Baseline

The “clean-path” prediction based on demographics and only initial information. The shaded area represents the initial risk of becoming a stranded patient.

🔴 Day \(k\): Causal Update

The diagnostic penalty \(\tau_D\) shifts the entire distribution to the right. The shaded area shows an increase in the probability of crossing the 14-day threshold.
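
As a worked illustration, consider a pure location shift with a point-valued penalty (ignoring, for the moment, the uncertainty in \(\tau_D\)). Writing \(S_k(t)\) for the updated survival curve on day \(k\) (a notation assumed here, not taken from the slides), the update in terms of the clean-path curve \(S_0(t)\) is:

\[ S_k(t) = P(\text{LoS} > t \mid \text{diagnosis confirmed}) = S_0(t - \tau_D) \]

\[ P(\text{LoS} > 14 \mid \text{diagnosis confirmed}) = S_0(14 - \tau_D) \ge S_0(14) \quad \text{for } \tau_D \ge 0 \]

The inequality holds because a survival curve is non-increasing, so any non-negative penalty can only raise the probability of crossing the threshold.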

Counterfactual Simulation & Impact


To prove the framework’s worth, we replay the historical baseline as a simulation:

  • Step 1: Run a day-by-day replay of historical patients.
  • Step 2: Apply the policy trigger (the “Model Intervention”) the moment a patient hits the risk threshold.
  • Step 3: Compute the System Delta (\(\Delta \text{Beds}\)):

\[ \Delta \text{Beds} = \sum \text{Actual LoS}_i - \sum \text{Counterfactual LoS}_i \]

This allows us to present decision-makers with a concrete metric: “If we had used this model, we would have reclaimed X bed-days.”
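
A toy version of the replay, under stated assumptions: the `Stay` record, the daily risk values, and especially `ASSUMED_DAYS_SAVED` (the bed-days recovered when a referral is triggered) are hypothetical. The slides do not specify how the counterfactual LoS is derived, so the fixed saving below is only a placeholder for that step.

```python
from dataclasses import dataclass
from typing import List

TRIGGER_PROB = 0.55       # policy trigger from the slides
ASSUMED_DAYS_SAVED = 2    # hypothetical effect of an earlier referral


@dataclass
class Stay:
    actual_los: int           # observed length of stay in days
    daily_risk: List[float]   # P(LoS > threshold) at each daily update, index 0 = admission


def counterfactual_los(stay, trigger=TRIGGER_PROB, days_saved=ASSUMED_DAYS_SAVED):
    """Replay one stay day by day and apply the policy at the first trigger."""
    for day, risk in enumerate(stay.daily_risk):
        if risk >= trigger:
            # The stay cannot end before the trigger day or exceed the actual LoS.
            return min(stay.actual_los, max(day, stay.actual_los - days_saved))
    return stay.actual_los    # never triggered: outcome unchanged


# Hypothetical historical cohort.
cohort = [
    Stay(actual_los=20, daily_risk=[0.20, 0.30, 0.60, 0.70]),  # triggers on day 2
    Stay(actual_los=5, daily_risk=[0.10, 0.10, 0.20]),         # never triggers
    Stay(actual_los=15, daily_risk=[0.50, 0.56]),              # triggers on day 1
]

actual_total = sum(s.actual_los for s in cohort)
counterfactual_total = sum(counterfactual_los(s) for s in cohort)
delta_beds = actual_total - counterfactual_total
print(f"Reclaimed bed-days: {delta_beds}")
```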

Any questions or thoughts? 💬