Journal of The Royal Society Interface
You have accessResearch articles

Digital twinning of cardiac electrophysiology for congenital heart disease

Matteo Salvador

Matteo Salvador

Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA

Cardiovascular Institute, Stanford University, Stanford, CA, USA

Pediatric Cardiology, Stanford University, Stanford, CA, USA

[email protected]

Contribution: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

,
Fanwei Kong

Fanwei Kong

Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA

Cardiovascular Institute, Stanford University, Stanford, CA, USA

Pediatric Cardiology, Stanford University, Stanford, CA, USA

Contribution: Data curation, Visualization, Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

,
Mathias Peirlinck

Mathias Peirlinck

Department of Biomechanical Engineering, Delft University of Technology, Delft, The Netherlands

Contribution: Investigation, Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

,
David W. Parker

David W. Parker

Stanford Research Computing Center, Stanford University, Stanford, CA, USA

Contribution: Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

,
Henry Chubb

Henry Chubb

Pediatric Cardiology, Stanford University, Stanford, CA, USA

Contribution: Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

,
Anne M. Dubin

Anne M. Dubin

Pediatric Cardiology, Stanford University, Stanford, CA, USA

Contribution: Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

and
Alison L. Marsden

Alison L. Marsden

Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA

Cardiovascular Institute, Stanford University, Stanford, CA, USA

Pediatric Cardiology, Stanford University, Stanford, CA, USA

Department of Bioengineering, Stanford University, Stanford, CA, USA

Contribution: Funding acquisition, Project administration, Resources, Supervision, Writing – original draft, Writing – review and editing

Google Scholar

Find this author on PubMed

    Abstract

    In recent years, blending mechanistic knowledge with machine learning has had a major impact in digital healthcare. In this work, we introduce a computational pipeline to build certified digital replicas of cardiac electrophysiology in paediatric patients with congenital heart disease. We construct the patient-specific geometry by means of semi-automatic segmentation and meshing tools. We generate a dataset of electrophysiology simulations covering cell-to-organ level model parameters and using rigorous mathematical models based on differential equations. We previously proposed Branched Latent Neural Maps (BLNMs) as an accurate and efficient means to recapitulate complex physical processes in a neural network. Here, we employ BLNMs to encode the parametrized temporal dynamics of in silico 12-lead electrocardiograms (ECGs). BLNMs act as a geometry-specific surrogate model of cardiac function for fast and robust parameter estimation to match clinical ECGs in paediatric patients. Identifiability and trustworthiness of calibrated model parameters are assessed by sensitivity analysis and uncertainty quantification.

    1. Introduction

    The combination of physics-based and statistical modelling in cardiovascular medicine has the potential to shape the future of cardiology [1]. In this framework, a synergistic use of multi-physics and multi-scale mathematical models for cardiac function [28] and machine learning-based methods, such as Gaussian processes (GPs) emulators [911] and neural networks (NNs) [12,13], enables the design of efficient computational tools that are compatible with the computer resources and time frames required in clinical applications. In the foreseeable future, a continuous, bi-directional interaction between patient-specific data and artificial intelligence-enriched computer models incorporating biophysically detailed and anatomically accurate knowledge would enable the vision of precision medicine [1416]. Personalized treatment and surgical planning may be delivered by leveraging different mathematical methods, such as sensitivity analysis, parameter inference and uncertainty quantification [1720].

    Several mathematical tools have been proposed to better understand and treat different groups of adult cardiac pathologies [14]. Electrophysiology simulations play an important role in the assessment of rhythm disorders. They are used for cardiac resynchronization therapy [21], arrhythmia risk stratification [22,23] and definition of optimal ablation strategies [24]. Nevertheless, in silico numerical simulations and treatment modalities in paediatrics and congenital heart disease are less common or even not established [2527].

    Congenital heart defects (CHDs) are the most common birth defects and are characterized by cardiac anatomical abnormalities that can severely impact cardiac function [28]. Patients with CHDs often have a unique and peculiar combination of cardiac defects that warrant personalized treatment planning in clinically relevant time frames. Digital twinning of cardiac function thus holds particular promise for these patients [27].

    In this work we introduce a novel digital twin of a paediatric patient with hypoplastic left heart syndrome (HLHS), a complex form of CHD where the left ventricle of the patient is severely underdeveloped, leading to a number of morbidities and elevated mortality risk. Furthermore, HLHS is a type of single ventricle physiology that typically requires three palliative open-heart surgeries to create a functioning cardiovascular system.

    Our pipeline seamlessly combines:

    Semi-automatic segmentation and mesh generation tools suited for paediatric patients with CHD [29].

    A physiologically based mathematical formulation of cardiac electrophysiology deriving from the monodomain equation [16,30] coupled with the ten Tusscher–Panfilov [31] ionic model.

    A recently proposed scientific machine learning method, namely Branched Latent Neural Maps (BLNMs) [32], to build an accurate and efficient dynamic surrogate model of cardiac function.

    Shapley effects [33] and Hamiltonian Monte Carlo (HMC) [34,35] to perform patient-specific sensitivity analysis, fast and robust parameter estimation with uncertainty quantification while matching clinical 12-lead electrocardiograms (ECGs).

    This digital twin can be employed to simulate different scenarios of clinical interest in silico, as HLHS patients may experience different forms of electrophysiological comorbidities [36], including ventricular arrhythmias and dyssynchrony [27]. Therefore, personalized electrophysiology simulations may provide virtual pre- and post-operative guidance in this understudied patient cohort [37].

    2. Results

    In figure 1, we depict our computational pipeline to build digital twins of cardiac electrophysiology for congenital heart disease patients. This pipeline covers all the relevant aspects of digital twinning: image segmentation and mesh generation, mathematical and numerical physics-based modelling, surrogate model training, sensitivity analysis and robust parameter calibration with uncertainty quantification.

    Sketch of the computational pipeline. We reconstruct the patient-specific geometry with HLHS from imaging. We generate a dataset of electrophysiology simulations encompassing cell-to-organ variability in model parameters

    Figure 1. Sketch of the computational pipeline. We reconstruct the patient-specific geometry with HLHS from imaging. We generate a dataset of electrophysiology simulations encompassing cell-to-organ variability in model parameters. We train a BLNM that effectively reproduces 12-lead ECGs while covering model variability. We employ the BLNM for digital twinning.

    2.1. Pre-processing

    Figure 1 (first row) shows the heart–torso model of a 7-year-old female paediatric patient with HLHS constructed from the computerized tomography (CT) scan of the patient using our semi-automatic model construction pipeline [29]. This patient has a severely underdeveloped (hypoplastic) left ventricle (LV) and a dominant right ventricle (RV), which is connected to the aorta. Figure 2 shows the CHD captured in this HLHS patient, compared with the anatomy of a healthy patient.

    Comparison of the segmentations and the reconstructed anatomic models and meshes between the HLHS patient and a healthy patient with normal cardiac anatomy.

    Figure 2. Comparison of the segmentations and the reconstructed anatomic models and meshes between the HLHS patient and a healthy patient with normal cardiac anatomy.

    2.2. Cardiac electrophysiology modelling

    We run 200 numerical simulations on the patient-specific heart–torso geometry (see figure 1, second row), spanning seven relevant electrophysiology parameters of the physics-based model at the microscopic scale and organ level. We collect the corresponding in silico 12-lead ECGs.

    In table 1, we report descriptions, ranges, units and references for the seven model parameters that we explore via Latin hypercube sampling for the dataset generation. For ionic conductances and myocardial conductivities, we consider a healthy baseline value from the literature and we define an interval by multiplying this value by 0.5 (lower bound) and 2.0 (upper bound). In this way, we cover a wide range that includes both healthy and pathological conditions. Similarly, for the Purkinje conductivities, we use an interval spanning both healthy cases and different cardiovascular diseases [38]. For the stimulation time of the left Purkinje network, we employ a broad range for possible ventricular dyssynchrony in patients with single ventricle physiology [27].

    Table 1. Parameter space for cardiac electrophysiology sampled via Latin hypercube for the numerical simulations performed with the physics-based mathematical model.

    parameterdescriptionrangeunitsreferences
    GCaLmaximal Ca2+ current conductance[1.99 × 10−5, 7.96 × 10−5]cm ms−1 μF−1[31]
    GNamaximal Na+ current conductance[7.42, 29.68]ns pF−1[31]
    GKrmaximal rapid delayed rectifier current conductance[0.08, 0.31]ns pF−1[31]
    Danianisotropic conductivity[0.008298, 0.033192]mm2 ms−1[27,38,39]
    Disoisotropic conductivity[0.002766, 0.011064]mm2 ms−1[27,38,39]
    DpurkPurkinje conductivity[1.0, 3.5]mm2 ms−1[27,38]
    tLVstimPurkinje left bundle stimulation time[0, 100]ms[27,40]

    In figure 3, we depict the ensemble of the resulting in silico 12-lead ECGs together with the clinical recordings. We point out that the patient-specific 12-lead ECGs are contained within the pseudopotential variability spanned by the electrophysiology simulations, manifesting various morphologies in the QRS complex, that is ventricular depolarization, and T wave, that is ventricular repolarization. The patient diagnosis reports rhythm disorders, atrial enlargement, left and right ventricular hypertrophy, along with severe abnormalities in the ECGs. Specifically, there are signs of prolonged PR interval, ST segment depression and T wave inversion.

    Physics-based electrophysiological modelling dataset generation. Full dataset containing 200 in silico precordial and limb leads recordings (blue, solid) and patient-specific 12-lead ECGs (black, dashed)

    Figure 3. Physics-based electrophysiological modelling dataset generation. Full dataset containing 200 in silico precordial and limb leads recordings (blue, solid) and patient-specific 12-lead ECGs (black, dashed). ECG, electrocardiogram.

    In figure 4, we show the simulated spatio-temporal transmembrane potential evolution on the patient-specific paediatric model for a single instance of model parameters. Specifically, we always simulate the sinus rhythm behavior over a cardiac cycle. Figure 4 focuses on the ventricular depolarization phase, where the electric signal propagates from the one-dimensional Purkinje network at the two endocardia towards the myocardium, as well as the ventricular repolarization phase, when the transmembrane potential comes back to its resting state (i.e. approximately −90 mV).

    Physics-based electrophysiological modelling. Spatio-temporal membrane action potential evolution for one electrophysiology simulation in the dataset performed on the HLHS paediatric patient

    Figure 4. Physics-based electrophysiological modelling. Spatio-temporal membrane action potential evolution for one electrophysiology simulation in the dataset performed on the HLHS paediatric patient. HLHS, hypoplastic left heart syndrome.

    2.3. Branched latent neural maps

    We train BLNMs, which are represented by feed-forward partially connected NNs, to encode the temporal dynamics of the 12-lead pseudo-ECGs computed with the physics-based mathematical model while also covering model variability from the cellular to the tissue level (see figure 1, third row). Once trained, BLNMs act as a surrogate model for cardiac electrophysiology function that can be queried on new parameter instances (within the training range) to provide faster than real-time in silico 12-lead ECGs.

    In order to identify the optimal set of BLNM hyperparameters, which are the number of layers, number of neurons, number of states and disentanglement level in the NN structure, we employ a K-fold (K=5) cross-validation over 150 multi-scale physics-based electrophysiology simulations. The hyperparameter search space is given by a four-dimensional hypercube, where we run 50 instances of Latin hypercube sampling and we pick the BLNM configuration providing the lowest generalization error. For each configuration of hyperparameters, we sample the dataset with a fixed time step of Δt=5 ms, and we perform 10 000 iterations of the second-order Broyden–Fletcher–Goldfarb–Shanno (BFGS) optimizer. In table 2, we report the initial hyperparameter ranges for tuning and the final optimized values.

    Table 2. BLNM hyperparameter tuning. Original hyperparameter ranges and optimized hyperparameter values for the final training stage.

    BLNMlayersneuronshyperparameters number of statesdisentanglement leveltrainable parameters # parameters
    tuning final{1 ... 8}{10 ... 30}{9 ... 12}{1 ... Nlayers}
    7191022398

    Then, we train a final optimized BLNM encompassing the whole dataset of 150 multi-scale physics-based electrophysiology simulations using 50 000 BFGS iterations. The non-dimensional MSE on a testing set comprised of 50 additional numerical simulations unseen during the training stage, and Latin hypercube sampled from the same parameter space in table 1, is equal to 5×104.

    2.4. Parameter estimation

    We employ the optimized BLNM to find an initial guess for the seven model parameters that results in a computational pseudo-ECG that best matches the clinically observed 12-lead ECG dynamics of the CHD patient. The estimated model parameters are reported in table 3.

    Table 3. Parameter estimation. Calibration with the optimized BLNM for cell-to-organ level model parameters of the physics-based mathematical model. The MSE between the BLNM predictions and the clinical recordings, in non-dimensional form, is 0.16.

    parametervalueunits
    GCaL2.94 × 10−5cm ms−1 μF−1
    GNa15.58ns pF−1
    GKr0.15ns pF−1
    Dani0.03mm2 ms−1
    Diso0.01mm2 ms−1
    Dpurk1.96mm2 ms−1
    tLVstim43.3ms

    2.5. Sensitivity analysis

    Starting from the parameter calibration shown in table 3, we compute Shapley effects for each model parameter for cardiac electrophysiology, assuming independence among them as they act in different terms and equations of the physics-based mathematical model (see figure 1, third row). In figure 5, we show how each parameter contributes to matching electrophysiology simulations with the clinical 12-lead ECGs, that is, in the minimization of the MSE between BLNM outputs and our patient-specific observations. The sodium current conductance GNa plays a dominant role, followed by the L-type calcium ion channel conductance GCaL and the different conductivities Dani, Diso and Dpurk. Noteworthy, the interventricular activation dyssynchrony tLVstim plays a minor role in the calibration process. This is motivated by the dimension of the right ventricle, which mostly dictates the activation sequence with respect to the small (underdeveloped) left ventricle.

    Sensitivity analysis for the seven model parameters encoded in the BLNM via Shapley effects.

    Figure 5. Sensitivity analysis for the seven model parameters encoded in the BLNM via Shapley effects.

    2.6. Uncertainty quantification

    In figures 6 and 7, we show the results of our inverse uncertainty quantification, where we quantify how uncertainty in matching 12-lead ECGs propagates to uncertainty in the estimated model parameters. We account for both BLNM surrogate modelling error, via GPs, and measurement error in the clinical recordings.

    Inverse uncertainty quantification: parameter uncertainty. One-dimensional views of the posterior distribution. Different colours represent different HMC chains.

    Figure 6. Inverse uncertainty quantification: parameter uncertainty. One-dimensional views of the posterior distribution. Different colours represent different HMC chains.

    Inverse uncertainty quantification: matching clinical data. Clinical recordings (dashed, black) and mean estimation (red, solid) of 12-lead ECGs for the HLHS paediatric patient via HMC

    Figure 7. Inverse uncertainty quantification: matching clinical data. Clinical recordings (dashed, black) and mean estimation (red, solid) of 12-lead ECGs for the HLHS paediatric patient via HMC. Light red encompasses the variability between mean ± 5 s.d.

    From figure 6, we see that the posterior distributions of all model parameters θEP, along with the correlation length lGP and amplitude σGP, converged well towards highly similar unimodal distributions across all chains. The average value of σGP2 is approximately equal to 0.08, which is two orders of magnitude higher than the BLNM testing error (5×104), as the GP encodes the maximum BLNM uncertainty from each single lead individually and by also considering all possible correlations among the 12 leads, given the full covariance matrix in the multi-variate normal distribution (see §3.6).

    In figure 7, we depict the clinical versus in silico 12-lead ECGs, generated with the BLNM over the posterior distribution of model parameters. We see that the numerical simulations are in good agreement with the patient-specific recordings and show small variability between the 5 s.d. from the average value.

    2.7. In silico scenarios

    We consider different in silico scenarios of clinical interest using our personalized computational model. Specifically, we run 3 one-dimensional–three-dimensional electrophysiology simulations under three different conditions: first, we use the calibrated model parameters θEP, that is, the peak of the posterior distribution in figure 6, then we employ the same computational model under the assumption of either left bundle branch block or right bundle branch block, that is, by removing the left (respectively, right) Purkinje fascicles. In figure 8, we show the results of these three numerical simulations by depicting the activation and repolarization maps for this HLHS paediatric case. We notice that, for this patient-specific geometry, the role of the Purkinje network in the LV is very limited and the activation sequence is highly similar with and without a full left bundle branch block. This means that the left Purkinje fascicle in this patient is either absent or present but fully inhibited or severely delayed, with respect to the right Purkinje fascicle.

    Running in silico disease scenarios. Activation (top) and repolarization (bottom) maps with personalized model parameters (left), left bundle branch block (centre) and right bundle branch block (right).

    Figure 8. Running in silico disease scenarios. Activation (top) and repolarization (bottom) maps with personalized model parameters (left), left bundle branch block (centre) and right bundle branch block (right).

    2.8. Computational costs

    In table 4, we detail the computational costs and resources required by each step of the digital twinning process. The most expensive part resides in the physics-based computational electrophysiology modelling dataset generation, which makes use of high-performance computing given the stiffness and complexity of the underlying mathematical model. On the other hand, training a BLNM and employing it for robust Bayesian parameter estimation and sensitivity analysis are more tractable tasks that can be carried out within a few hours or minutes on a local machine. Using the physics-based mathematical model throughout parameter calibration with uncertainty quantification and sensitivity analysis would be computationally intractable and unaffordable given the extensive number of queries that Shapley effects and HMC require to show robustness and convergence in the provided results (see §3).

    Table 4. Computational resources. Summary of the computational times and resources to generate the electrophysiology simulations with the physics-based model, to train the BLNM, to compute Shapley values for sensitivity analysis and to perform Bayesian parameter estimation with uncertainty quantification on 12-lead ECGs.

    taskcomputational resourcesexecution time
    segmentation and mesh generation (one patient)1 core10 min
    200 electrophysiology simulations336 cores1 day
    BLNM hyperparameter tuning (50 confs, 10 000 iterations)5 cores20 h
    BLNM final training (50 000 iterations)1 core2 h and 30 min
    parameter estimation (100 trials)1 core2 min
    sensitivity analysis (Shapley values)1 core20 min
    uncertainty quantification (HMC, 4 chains)4 threads5 min
    total2 days

    3. Methods

    3.1. Pre-processing

    We reconstruct the heart–torso model from the CT scan of a 7-year-old female paediatric patient with HLHS in a semi-automatic manner [29]. Namely, we train a NN based on the classic UNet architecture [41] to automatically segment the myocardium from CT images of patients affected by congenital heart disease. The UNet is trained using a publicly available dataset [42] that provided CT images and ground-truth segmentation for 110 patients with age between 1 month and 40 years, combined with our private HLHS dataset containing the images and segmentation of five patients. Given the intrinsic segmentation challenges of cardiac structures in both young and CHD patients [43], we subsequently examine and improve the UNet-produced segmentations to more closely match with the CT scan. We automatically extract the surface meshes from the segmentations using the marching cube algorithm [44] and truncate the base myocardium above a manually identified plane to create a biventricular surface mesh. We choose this plane so that it is normal to the long axis of the ventricles and located below the aortic and atrioventricular annulus. We manually adjust the location of the plane to completely remove the irregular annulus geometries while keeping most of the ventricles (see figure 2). We note that this choice is common to other electrophysiological studies [22,45] and facilitates the generation of the fibre field by rule-based algorithms [46]. We subsequently use TetGen [47] to create the tetrahedral volumetric mesh with a maximum edge size of 1 mm [27,32]. The torso model is created semi-automatically from the images using threshold- and region-growing-based segmentation methods.

    3.2. Cardiac electrophysiology modelling

    We detail the physics-based mathematical model, along with its numerical discretization, that is employed to perform electrophysiology simulations on the HLHS paediatric patient.

    3.2.1. Mathematical model

    We consider the monodomain equation [16] coupled with the ten Tusscher–Panfilov ionic model [31] to describe the electric behaviour in the heart–Purkinje system. This system of differential equations reads

    {Φt+Iion(Φ,w,z)(DMΦ)=Iapp(x,t) in Ω×(0,T],(DMΦ)n=0 on Ω×(0,T],dwdt=H(Φ,w,z) in Ω×(0,T],dzdt=G(Φ,w,z) in Ω×(0,T],Φ(x,0)=Φ0(x), w(x,0)=w0(x), z(x,0)=z0(x) in Ω.(3.1)

    We always simulate a single heartbeat by fixing a final time T=THB=600 ms. The computational domain Ω=ΩpurkΩmyo is represented by the one-way coupled one-dimensional Purkinje network and three-dimensional bi-ventricular patient-specific geometry.

    Transmembrane potential Φ defines the electric signal at the Purkinje and myocardial level. The ten Tusscher–Panfilov ionic model is endowed with 18 variables, which are split in two different subsets. First, there is a vector w=(w1,,wM) (M=12) of ion channel gating variables, which are probability density functions representing the fraction of open channels across the membrane of a single cardiac cell. Then, there is a vector z=(z1,,zP) (P=6) of concentration variables representing relevant ionic species, such as sodium Na+, intracellular calcium Ca2+ and potassium K+, which all play a major role in the metabolic processes [48], dictating heart rhythmicity or sarcomere contractility, and are generally targeted by pharmaceutical therapies. The specific mathematical formulation of the ten Tusscher–Panfilov ionic model defines the ordinary differential equations for H(Φ,w,z) and G(Φ,w,z), which describe the dynamics of gating variables and ionic concentrations respectively, along with the ionic current Iion(Φ,w,z) [31]. An external applied current Iapp(x,t) fires the electric signal in the Purkinje fibres. Specifically, we trigger the action potential at the tip of the right ventricular fascicle at t=0 ms. A current is then applied to the left bundle branch at t=tLVstim. The transmembrane potential Φ evolves in time and propagates in the 1D Purkinje network by solving equation (3.1).

    The diffusion tensor is expressed as DM=DisoI+Danif0f0 in Ωmyo and DM=DpurkI in Ωpurk, where f0 introduces the biventricular fibre field [30,49]. Dani,Diso,Dpurk+ dictate the anisotropic, isotropic and Purkinje conductivities, respectively.

    The homogeneous Neumann boundary conditions prescribed at Ω define the condition of an electrically isolated domain, where n is the outward unit normal vector to the boundary.

    We bring the 0D version of equation (3.1) to the limit cycle by running 500 heartbeats with a periodic pacing dictated by THB=600 ms, and we initialize the transmembrane potential, gating and ionic variables for the one-dimensional–three-dimensional electrophysiology simulations by taking their values during the last heartbeat [50,51].

    Following [30], the extracellular potential Φe defining the ECGs is computed in each lead location xe as:

    Φe(xe)=ΩΦ1||xxe||dV,(3.2)

    where e={V1,V2,V3,V4,V5,V6} and e={LA,RA,F} define six precordial leads and three limb leads located on the paediatric patient-specific torso model, shown in coloured and black dots in figure 1 (second row), respectively. From these lead locations, we computationally reconstruct three bipolar limb leads as

    I=LARAII=FRAIII=FLA,(3.3)

    and three augmented limb leads as

    aVL=(I-III)/2aVR=-(I+II)/2aVF=(II+III)/2.(3.4)

    The resulting set ECG={V1,V2,V3,V4,V5,V6,I,II,III,aVL,aVR,aVF} of computational pseudo-potentials defines a comprehensive 12-lead ECG representation of the electrical activity in the patient-specific heart.

    3.2.2. Numerical discretization

    We employ linear finite elements to discretize the spatial domain Ω in equation (3.1). The tetrahedral tessellation defining the bi-ventricular mesh has 933 916 cells and 158 277 d.f., with a maximum mesh size of h=1 mm. The left and right Purkinje bundles within the ventricular endocardia are generated by employing the fractal tree and projection algorithm proposed in [52], starting from the atrioventricular node. These left and right bundles are endowed with 14 820 elements (14 821 d.f.) and 67 456 elements (67 457 d.f.), respectively. Given the coarse space resolution of the bi-ventricular mesh, we apply suitable non-Gaussian quadrature rules to recover appropriate conduction velocities. We refer to ten Tusscher & Panfilov [27] for a convergence analysis study showing the interplay among different quadrature point locations, mesh sizes and conduction velocities, for both Purkinje network and myocardium, in a single ventricle physiology paediatric patient. We consider a transmural variation of ionic conductances to differentiate epicardial, mid-myocardial and endocardial properties [31]. To solve equation (3.1), we leverage an implicit–explicit time discretization scheme, where we first update the variables of the ionic model and then the transmembrane potential [2]. Specifically, in the monodomain equation, the diffusion term is treated implicitly and the ionic term is treated explicitly. The latter is discretized by means of the ionic current interpolation scheme [53]. We prescribe the fibre distribution according to a Laplace–Dirichlet rule-based algorithm with αepi = -60, αendo = 60, βepi = 20and βendo = -20 [49].

    3.3. Branched latent neural maps

    We construct a geometry-specific surrogate model of cardiac function by building a feed-forward partially connected NN that explores the variability of our physics-based electrophysiology model detailed in §3.2 while structurally separating the role of temporal t and functional θEP parameters. This recent scientific machine-learning tool, proposed by Salvador & Marsden [32], allows for different levels of disentanglement between inputs and outputs. The surrogate model reads

    z(t)=BLNM(t,θEP;w) for t[0,T].(3.5)

    Weights and biases wRNw encode the algebraic structures of a feed-forward partially connected NN, which represents a map BLNM:R1+NPRNz from time t and NP=7 cell-to-organ scale electrophysiology parameters θEP=[GCaL,GNa,GKr,Dani,Diso,Dpurk,tLVstim]TΘRNP to an output vector z(t)=[zleads(t),zlatent(t)]TRNz. This vector contains in silico precordial and limb leads recordings zleads(t)=[V1(t),V2(t),V3(t),V4(t),V5(t),V6(t),LA(t),RA(t),F(t)]TR9, where we use the original LA(t), RA(t) and F(t) limb leads in place of the bipolar and augmented limb leads in order to reduce the dimensionality of the output. Indeed, we reconstruct I(t), II(t), III(t), aVL(t), aVR(t) and aVF(t) a posteriori by means of equations (3.3) and (3.4). Furthermore, vector z(t) leverages some zlatent(t) latent variables that enhance the learned temporal dynamics by acting in regions with steep gradients [32].

    We perform nonlinear optimization with the BFGS algorithm to tune NN parameters. In particular, we monitor the MSE of surrogate versus physics-based ECG pseudopotentials to find an optimal set of weights and biases w, that is,

    L(z~leads(t),z~numerical(t);w^)=argminw^[||z~leads(t)z~numerical(t)||L2(0,T)],(3.6)

    where z~leads(t)[1,1]9 represents BLNM outputs and z~numerical(t)[1,1]9 defines the physics-based numerical simulations, both in non-dimensional form. Time t~[0,1] and model parameters θ~EP[1,1]NP are also normalized during the training and testing phases. We refer to Salvador & Marsden [32] for a detailed description of all the properties related to BLNMs that enable them to effectively learn complex physical processes.

    3.4. Parameter estimation

    We employ our trained BLNM to find a set of model parameters θ~EP that matches zECG(t)R12 with zclinical(t)R12. Here, zECG(t) is the vector of BLNM physical outputs zleads(t) manipulated according to equations (3.3) and (3.4) to generate the full 12-lead ECGs, and zclinical(t) is the clinically measured ECG data vector. In particular, we perform derivative-free optimization by employing the Nelder–Mead method [54], where we specify a loss function given by the MSE of the mismatch between the trained surrogate versus clinical ECG potentials, that is,

    L(zECG(t),zclinical(t))=||zECG(t)zclinical(t)||L2(0,T)2,(3.7)

    which leads to a set of calibrated model parameters θ~EPNM and corresponding 12-lead ECGs zECGNM(t). We initialize our optimization algorithm with a random set of model parameters θ~init[1,1]NP. We repeat the optimization process 100 times and we average the model parameters obtained during the different trials in order to get θ~EPNM.

    3.5. Sensitivity analysis

    We perform a variance-based sensitivity analysis using Shapley effects [33] in order to quantify the importance of each model parameter in fitting patient-specific 12-lead ECGs during the inference process.

    Specifically, we employ Sklar’s theorem [55] to define the input multi-variate distribution, which is given by a Gaussian copula and a series of NP marginals defined by standard normal distributions centred in θ~EPNM, that is, N(θ~EPNM,i,0.2)fori=1,...,NP.

    Due to the high computational costs associated with testing all the different combinations of the features, we consider the random (rather than the exact) version of the algorithm to compute Shapley values. We monitor the expected marginal contribution of each model parameter to the BLNM prediction with respect to observations, that is, the MSE of equation (3.7). We fix 2000 permutations, 500 bootstrapped samples and 50 samples to estimate conditional variance for three times.

    3.6. Uncertainty quantification

    We employ a BLNM within HMC [34] to calibrate model parameters and to perform inverse uncertainty by matching observed 12-lead ECGs from patient-specific recordings. HMC is a Markov chain Monte Carlo (MCMC) method that aims at finding an approximation of the posterior distribution P(θ~EP|x), given a certain prior probability distribution P(θ~EP) with respect to the model parameters in non-dimensional form θ~EP[1,1]NP. Specifically, we employ the No-U-Turn Sampler (NUTS) extension of HMC, which automatically adapts the number of steps to estimate the posterior distribution [35]. This algorithm, which shares and enhances some of the features of sequential [56] and differential evolution [57] MCMC, works well with high-dimensional target distributions, possibly presenting correlated dimensions. Moreover, HMC reaches convergence using a reduced amount of samples with respect to vanilla MCMC [35]. For further details about the mathematical derivation of HMC and its application to cardiac simulations, we refer to Salvador et al. [20].

    We run four chains with 1000 adaptation samples in the warm-up phase and 1000 effective samples to estimate the posterior distribution, with a fixed 90% acceptance rate. For all model parameters, we consider prior distributions

    P(θ~EPi) U(θ~EPNM,iι,θ~EPNM,i+ι)fori=1,...,NP,(3.8)

    where θ~EPNM[1,1]NP is the initial guess obtained with the Nelder–Mead method. We always make sure that model parameters reside within the [-1,1] range. We set ι=0.2. Even though NUTS allows for many different initialization protocols, such as maximum a posteriori (MAP) or maximum likelihood estimation (MLE), we consider an initial random seed for each chain. This is motivated by the sensitivity of MAP and MLE over multiple runs, especially when several model parameters are calibrated with respect to noisy or highly varying time-dependent QoIs, which is the case for ECG recordings.

    Several sources of uncertainty can be considered. These include model uncertainties (e.g. the discrepancy between the actual physical phenomenon and the high-fidelity model), the discretization error introduced when solving the differential equations, the surrogate modelling error of reduced-order models and the measurement errors that intrinsically affect clinical ECG recordings (i.e. the sensitivity of the instrument used during the clinical test, variations in lead placement position by clinicians and patient-specific factors such as breathing and motion). However, in our inverse uncertainty quantification process, we only include the measurement error and the approximation error introduced by the transition from the high-fidelity model to the BLNM-based surrogate model. In particular, we consider a multi-variate normal distribution centred in the BLNM predictions zECG(t) for the given patient-specific observations zclinical(t), which reads

    zclinical(t) N(zECG(t),σmeas2I+k(t~,t~;σGP,lGP)).(3.9)

    σmeas=0.1 is the a priori fixed standard deviation dictating the measurement error [58,59], whereas k(t~,t~;σGP,lGP)=σGP2exp(||t~t~||22lGP2) is the exponentiated quadratic kernel of a zero-mean GP GP(0,k(t~,t~;σGP,lGP)) [60]. Amplitude σGP(0.01,1.0) and correlation length lGP(0.01,1.0) are additional hyperpriors tuned during HMC to quantify the surrogate modelling error, which may change according to the specific observation. Vectors t~ and t~ represent discrete time points in the [0,1] interval.

    A full covariance matrix in the multi-variate normal distribution allows us to model the correlation among different leads. We evaluate the convergence of the HMC chains by checking that the Gelman–Rubin diagnostic provides a value less than 1.1 for all the model parameters θ~EP, lGP and σGP [6163].

    3.7. Software and hardware

    All electrophysiology simulations are performed at the Stanford Research Computing Center using svFSIplus [64], a C++ high-performance computing multi-physics and multi-scale finite element solver for cardiac and cardiovascular modelling. This solver is part of the SimVascular software suite for patient-specific cardiovascular modelling [65].

    We train the NNs by using BLNM.jl [32], Julia library for scientific machine learning which is publicly available under MIT License at https://github.com/StanfordCBCL/BLNM.jl. This library leverages Hyperopt.jl [66] for parallel hyperparameter optimization by combining the Message Passing Interface with Open Multi-Processing (OpenMP) on physical and virtual cores, respectively.

    We perform sensitivity analysis and parameter estimation with uncertainty quantification using GlobalSensitivity.jl [67] and Turing.jl [68], respectively, which both exploit OpenMP and vectorized operations to speed-up computations. The code for sensitivity analysis and Bayesian parameter estimation is available within BLNM.jl as a test case.

    Furthermore, this public repository contains the dataset encompassing all the electrophysiology simulations used for the training and testing phases, along with the patient-specific 12-lead ECGs.

    4. Discussion

    We present a complete computational pipeline to build digital twins of cardiac electrophysiology for congenital heart disease in paediatrics. This cohort of patients is understudied in cardiology [27,37], as multi-physics and multi-scale numerical simulations are mostly focused on adults with certain sets of pathologies, such as dilated, ischaemic and hypertrophic cardiomyopathy, arrhythmias or bundle branch block [15,21,69,70].

    In this pipeline, we leverage biophysically detailed and anatomically accurate computational electrophysiology models, a recently proposed scientific machine learning tool for surrogate modelling, and robust Bayesian inference methods for personalized calibration of model parameters to match clinical 12-lead ECGs of an HLHS paediatric patient. We certify the impact and reliability of our estimation against clinical recordings by integrating fast and effective sensitivity analysis and uncertainty quantification. We run electrophysiology simulations with the estimated model parameters in order to investigate different scenarios of clinical interest in silico. We conclude that this paediatric patient presents activation and repolarization patterns similar to a left bundle branch block, where the interventricular dyssynchrony and the geometrical personalization of the Purkinje network play a minor role with respect to conductances and conductivities, even for QRS complex calibration.

    Image processing allows us to get all the anatomy-specific features of this paediatric patient and our calibration of cell-to-organ level model parameters enables patient-specific electrophysiology simulations. Nevertheless, given the non-convexity of the optimization problem, it is important to stress that the final set of model parameters might not be unique and there could be other choices that lead to similar approximation errors against clinical recordings. Indeed, we notice that changing random seeds or trying different optimizers, such as second-order local BFGS or even global adaptive differential evolution [71], may have an influence on the initial parameter estimation. These options are available within the BLNM.jl library. However, these effects are accounted for and mitigated by averaging many different trials and by running uncertainty quantification.

    Performing ad hoc sensitivity analysis for a specific parameter calibration provides individualized information, as these assessments may change on a patient-to-patient basis. Furthermore, we underline that sensitivity and practical identifiability (or trustworthiness) of model parameters are generally correlated. For instance, the maximum rapid delayed rectifier current conductance GKr and the level of interventricular dyssynchrony tLVstim have the lowest relative impact on this 12-lead ECG personalization (see figure 5) and present the highest degree of uncertainty, that is, a wider posterior distribution, among all physics-based model parameters (see figure 6).

    Although the actual heart geometry, relative heart orientation and lead placements significantly influence ECGs [72], and as such may require additional parameters to calibrate, the detailed spatial information retrieved from the CT scan combined with patient diagnosis allow us to determine these quantities a priori with a very small degree of uncertainty. While we consider all organ level parameters for the monodomain equation, we determine the relevant cell scale parameters for the ten Tusscher–Panfilov ionic model starting from the comprehensive sensitivity analyses performed by Dixit et al. and Sánchez et al. [73,74]. Specifically, the authors showed that ionic parameters related to calcium, sodium and potassium have a dominant effect in shaping the morphology of 12-lead ECGs and action potential. Furthermore, these ion channels are generally targeted by different medicines used to treat cardiac diseases [48].

    To the best of our knowledge, the work of Tikenogullari et al. [27] is the only prior study proposing computational electrophysiology models for HLHS paediatric patients. The authors analysed the effect of cardiac growth on electrical dyssynchrony. They matched some features of the QRS complex for V2, V6 and AVF leads via one-dimensional–three-dimensional numerical simulations directly, while estimating pointwise values of tissue conductivities and level of dyssynchrony.

    While our computational pipeline encompasses several rigorous steps, the physics-based model still requires high-performance computing and longer computational times compared to other approaches for digital twinning on adults [17,75], which rely on more phenomenologically based models but do not include robust methods for sensitivity analysis and uncertainty quantification. However, the monodomain equation, coupled with the ten Tusscher–Panfilov ionic model, provides an accurate mathematical model, where relevant model parameters with a direct physiological interpretation can be properly tuned. Moreover, its higher computational costs could be mitigated in future work by novel numerical methods in the framework of matrix-free [76] and isogeometric analysis [77]. We also remark that the presented approach is based on pseudo-ECGs [30]. On the other hand, reconstructing 12-lead ECGs by solving the electric potential within the torso, hence coupling the conductive torso domain with the monodomain model, would potentially provide some advantages, such as better matching of the times and peaks for the QRS complex and T wave in the different leads, while reducing the uncertainty of the results [7880]. Fibre orientation also plays an important role in cardiac function and its impact is particularly significant in the framework of one-way or two-way coupled electromechanical models [13,81].

    A limitation of the presented approach lies in the lack of experimental validation of the parameter calibration process. Indeed, mathematical modelling of congenital heart disease requires several assumptions due to the current lack of information in paediatric populations regarding fibre orientation, Purkinje structure, ionic current conductances and conduction velocities. Future studies should incorporate these data as they become available. Nevertheless, our estimations are robust, account for uncertainty quantification and are widely contained within the range explored by the electrophysiology simulations (see tables 1 and 3, along with figure 6).

    In future developments, we aim to encode anatomical variability and different CHDs, such as Tetralogy of Fallot, transposition of great arteries or atrial and ventricular septal defects within BLNMs. In this manner, the computationally expensive offline phase dictated by accurate numerical simulations and the training of the NN can be performed only once before being applied to new patients. Robust parameter estimation and uncertainty quantification will be then feasible for those CHDs within minutes, compatible with the time frame required by the clinical practice.

    Finally, we note that the proposed computational pipeline can be readily applied to other patient cohorts other than the challenging case of congenital heart disease presented in this article.

    Ethics

    Images and associated clinical data were obtained under the IRB-approved protocol 39377 at Stanford University. IRB-39377 defines a Waiver of Consent for patient-specific information.

    Data accessibility

    The code and data necessary to reproduce the results presented in this paper are publicly available on Zenodo [82].

    Declaration of AI use

    We have not used AI-assisted technologies in creating this article.

    Authors’ contributions

    M.S.: conceptualization, data curation, formal analysis, investigation, methodology, software, validation, visualization, writing—original draft, writing—review and editing. F.K.: data curation, visualization, writing—original draft, writing—review and editing. M.P.: investigation, writing—original draft, writing—review and editing. D.W.P.: writing—original draft, writing—review and editing. H.C.: writing—original draft, writing—review and editing. A.M.D.: writing—original draft, writing—review and editing. A.L.M.: funding acquisition, project administration, resources, supervision, writing—original draft, writing—review and editing.

    All authors gave final approval for publication and agreed to be held accountable for the work performed therein.

    Conflict of interest declaration

    We declare we have no competing interests.

    Funding

    We acknowledge the NSF grants 1663671 and 2105345, the NIH grants R01LM013120 and R01EB029362 and the Additional Ventures Foundation AVCC-200217.

    Acknowledgements

    We thank Professor Francisco Sahli Costabal for the insightful discussions about Purkinje network personalization.

    Footnotes

    Published by the Royal Society. All rights reserved.