Multi-fractal characterization of bacterial swimming dynamics: a case study on real and simulated Serratia marcescens

To add to the current state of knowledge about bacterial swimming dynamics, in this paper, we study the fractal swimming dynamics of populations of Serratia marcescens bacteria both in vitro and in silico, while accounting for realistic conditions like volume exclusion, chemical interactions, obstacles and distribution of chemoattractant in the environment. While previous research has shown that bacterial motion is non-ergodic, we demonstrate that, besides the non-ergodicity, the bacterial swimming dynamics is multi-fractal in nature. Finally, we demonstrate that the multi-fractal characteristic of bacterial dynamics is strongly affected by bacterial density and chemoattractant concentration.


Note 1. Tests nonlinearity of motion based on Mardia's test
To check the nonlinearity of a bacterium motion we performed Mardia's test on the data set for bacteria trajectory. This method is based on multivariate extension of skewness measure. The bacterium motion is nonlinear if the skewness of trajectory of its motion is non-zero. This also demonstrates bacterium motion does not have a normal distribution [1,2,3].

Summary of Mardia's test:
Based on Mardia's approach, the skewness of the data for a sample of { " , $ , . . . & } of k dimensional vector relates to the third moment of the data set. The first step to find the third moment of data set is calculating the covariance matrix from equation (1).
In equation (1) and (2), is the mean of the data set. In the next step we calculated skewness from equation (2 (2) Note 2. Henze-Zirkler's Multivariate Normality Test Henze-Zirkler's Multivariate Normality Test is another approach to test the nonlinearity of data set. This test is based on a nonnegative functional distance that measures the distance between two distribution functions. We calculate this distance from equation (3) P(t) is the characteristic function of a multivariable normal distribution and Q(t) is the empirical characteristic function of the data which we want to test, Q(t) should be centered and scaled to have the identity as the covariance matrix before comparison with normal distribution. F b (t) is the weight or kernel function calculated from equation (4). X is data matrix, MX is sample mean vector and S is the covariance matrix normalized by n.
For a multivariate normal data, the test Henze-Zirkler statistic T n,b has a log-normal distribution. This test calculates the mean, variance and smoothness parameter. It will log-normalize the mean and variance and estimates the p-value [7,8,9].
This test has the following properties according to [8] • Affine invariance • Consistency against each fixed non-normal alternative distribution • Asymptotic power against contiguous alternative of order n -1/2 • Feasibility for any dimension and any sample size Note 3. Royston's Multivariate Normality Test To investigate the nonlinearity of a bacterium motion we performed Royston's marginal method which tests all the p variates for univariate normality with a Shapiro-Wilk statistic, then combines the p dependent tests into one omnibus test statistic for multivariate normality. Royston transforms the p-Shapiro-Wilk statistics into an approximate Chi-squared random variable, with e (1<e <p) degree of freedom. In this method, we estimate the degrees of freedom by taking into account possible correlation structures between the original p test statistics. If the data is multivariate normal, H function in equation (7) is approximately Chi-squared distributed.
Where  [4,5,6]. We used the correlation matrix and the diagonal matrix of reciprocals of the p standard deviations and transformed a multivariate normal into independent standard normal by equation (8).
In this equation Z is the standard normalized data matrix, V is the eigenvector matrix and L is the eigenvalues diagonal matrix. In this method skewness is transformed using the D'Agostino procedure and kurtosis is transformed from a gamma distribution to a chi-square. We calculated Doornik-Hansen statistic from equation (9), which approximates to a chi-square with 2p degrees of freedom. These result are based on different tests on the bacterium trajectory from simulated bacteria with density of 8×10 2 (Bacteria/cm3) in a cubic environment with size 5mm×5mm×5mm without chemoattractant in the environment. Number of variables in all the tests are 3 as the bacterium trajectory is in 3D environment, and the sample size is 1001. All the tests results are in agreement that the bacteria motion does not have normal distribution; hence, bacterium has nonlinear motion. Figure S1. Skewness plots for some of the simulation bacteria trajectories. a) 8×10 2 (bacteria/cm 3 ) in the cubic environment, with/out linear gradient of chemoattractant in y direction, with/out obstacles on the bacteria way. b) 8×10 3 (bacteria/cm 3 ) in the cubic environment, with/out linear gradient of chemoattractant in y direction, with/out obstacles on the bacteria way. c) 8×10 4 (bacteria/cm 3 ) in the cubic environment, with/out linear gradient of chemoattractant in y direction, with/out obstacles on the bacteria way. In all the cases skewness is unequal to zero meaning bacterium displacement does not have normal distribution; therefore, bacterium motion is nonlinear. We performed MFDFA to investigate the higher order moments of a bacterium trajectory of motion to shed light on the underlying structure in bacteria motion [54]. In general, the dynamics of a time series from a stochastic process is fractal if it displays self-similarity of specific features on different scales. The fractal dimension represents a measure for these unique features (e.g., irregularities or patterns) in the stochastic process. Two major types of fractals are known as mono-fractal and multi-fractal. mono-fractal dynamics can be characterized by a single fractal dimension. In other words, if the probability distribution function of the stochastic process at two different scale is self-similar and related to each other via a power law relationship, the stochastic process is mono-fractal. In contrast, a multi-fractal stochastic process is characterized by a series of fractal dimensions. The MFDFA analysis helps to verify whether the bacterium motion is multi-fractal or monofractal by computing the generalized Hurst exponent as a function of the higher order moment and the multifractal spectrum. We explain the MFDFA analysis of the time series in the following steps.
The first step in MFDFA is checking that the time series has a random walk like structure. If the time series (x) has a noise like structure, it should be converted to a random walk type time series using equation (10).
In equation (10), <x> is the mean of time series.
Next, based on scaling size s, we divided the time series Y into N s non-overlapping subintervals using equation (11). This equation returns the integer part of the division.
Since the length of the time series is not always an exact multiple of the scaling size s, a short part of the time series at its end will remain after dividing the time series Y into N s non-overlapping subintervals. To consider the remaining part of the time series from its end and avoid disregarding it, we repeated the same division procedure starting from the opposite end of the time series. In other words, we are dividing the time series into N s non-overlapping subintervals from the end point of the time series, and this time a short part of the time series in the beginning of the time series will remain. Therefore, we obtain 2 z segments overall [55].
The next step is computing the local Root-Mean-Square variation of the time series in each segment using equation (12) and (13). This quantifies the local fluctuations in the time series.
In equation (12) and (13), • ( ) is the polynomial (e.g. linear, quadratic, cubic or higher order polynomial) fitted to each segment . In our analysis we considered a linear fit.
In the next step we averaged over all segments to quantify the qth order fluctuation function. In other words, we calculated the mean of q-order RMS for corresponding scaling size s and order q from equation (14).
We repeated all the steps for several values of the time scales s. Then we determine the scaling behavior of the fluctuation functions by analyzing log-log plots • ( ) versus for each value of . If the time series has a longrange power-law correlation, equation (15) will be valid. In other words, mean of q-order RMS ( • ( )) increases as a power-law for large values of .
For simplicity we considered that the length N of the time series is an integer multiple of the scale , meaning z = / : differently, there will be significant dependence of ( ) on , which means the time series is multi-fractal.
Therefore, we can use generalized Hurst exponent to investigate whether the time series is mono-fractal or multi-fractal. Moreover, the generalized Hurst exponent allows us to compute the Hurst parameter and discriminate between short-range and long-range memory effects. For instance, if the Hurst exponent equals 0.5 then we can state that the dynamic is governed by short-range memory or Markovian models. In contrast, if the Hurst exponent ranges between 0.5 and 1, then the dynamics has long-range memory characteristics and non-Markovian dynamical models are needed.
Another way to distinguish between mono-fractal vs. multi-fractal time series is to first convert ( ) to the scaling exponent ( ) (defined by equation (17)). And then convert ( ) to Holder exponent (i.e. singularity strength) h(q) and singularity spectrum • (defined by equation (18) and (19)) via a Legendre transform [55].
In the case of mono-fractal time series, the scaling exponent ( ) has a linear dependency with . The linear qdependency of ( ) leads to a constant ℎ of these time series because ℎ is the tangent slope of ( ). The constant ℎ reduces the multi-fractal spectrum to a small arc for the mono-fractal time series. In contrast, the multifractal time series has scaling exponents ( ) with a curved q-dependency and, consequently, a decreasing singularity exponent ℎ . The resulting multi-fractal spectrum is a large arc where the difference between the maximum and minimum ℎ are called the multi-fractal spectrum width. Thus, the width and shape of the multi-fractal spectrum is able to classify a wide range of different scale invariant structures of time series [54].  The results show increasing the bacteria density make them less directional motile and they tend to oscillate more. Adding chemoattractant to the environment makes the bacteria more directional motile towards the increasing gradient of chemoattractant and oscillating less. Obstacles in the environment make bacteria to oscillate more to prevent hitting them.  Figure S5. Multifractal spectrum plots for the simulated bacteria trajectories. a) 8x10 2 bacteria/cm 3 in the cubic environment, with/out linear gradient of chemoattractant in y direction, with/out obstacles on the bacteria way. b) 8x10 3 bacteria/cm 3 in the cubic environment, with/out linear gradient of chemoattractant in y direction, with/out obstacles on the bacteria way. c) 8x10 4 bacteria/cm 3 in the cubic environment, with/out linear gradient of chemoattractant in y direction, with/out obstacles on the bacteria way. The results show increasing the bacteria density make them less directional motile and they tend to oscillate more. Adding chemoattractant to the environment makes the bacteria more directional motile towards the increasing gradient of chemoattractant and oscillating less. Obstacles in the environment make bacteria to oscillate more to prevent hitting them.   [19], stochastic differential equations [19], and a new hybrid simulation algorithm based on the hierarchy and dynamics of biochemical systems) with genetic circuits and chemotaxis pathway models in a complex 3D environment.
Specifically, to simulate the chemoreceptors, we use the Monod-Wyman-Changeux model in which the receptor homo-dimers assemble into fully cooperative signaling teams that switch rapidly between active and inactive states. The methylation kinetics is based on the well-known Barkai and Leibler model [20] for a near perfect adaptation system. For the signal transduction from chemoreceptor to the flagella motor regulator Yp, the concentration of phosphorylated CheYp is assumed to be proportional to the kinase activity without considering the nonlinear dependence [21]. Finally, we use a two-state model to describe the motor behavior of bacteria, which sets the clockwise (CW) and counterclockwise (CCW) states in two potential wells. The transition rates have been fitted to experimental data obtained from [22].
To better clarify the validity of our model, in what follows, we also provide a more detailed description of the full chemical pathway model of bacteria used in our BNSim simulation environment. In general, the chemotaxis pathway of S. marcescens has three components: the cooperative chemoreceptor, the phosphorylation pathway, and the flagellar motor [24,[25][26][27]37]. In what follows, we describe the three components of the chemotaxis pathway. Tar receptors in a complex; the average activity of the receptor can be expressed as [23,28]: According to the MWC model, the free energy difference can be written as: In equation (21) In equation (22) = 1.7 and P = 1.
The methylation kinetics is based on the Barkai and Leibler model for a near perfect adaptation system [39].
More precisely, the methylation kinetics is assumed to have a linear form: In equation (23) B = Ÿ are the methylation and demethylation rates, respectively.

b. Phosphorylation Relay and Flagellar Motor
An active receptor enhances the autophosphorylation of the receptor-associated kinase CheA, which transmits the signal to the flagellar motors by the phosphorylation of a diffusive response regulator CheY. In most chemotaxis full pathway models, the concentration of phosphorylated CheYp is assumed to be proportional to the kinase activity, = ( ), without considering the nonlinear dependence [28]. and H is the Hill coefficient of the motor response function. However, under some gradient conditions, the bacteria flagellar motor may stay in the tumble state for a time significantly shorter or longer than the average value, which biases the swimming dynamics considerably [40]. Therefore, we adopt a two state potential well model to describe the motor behavior of bacteria, which sets their two states in two potential wells.
The energy barriers of CCW to CW and CW to CCW transitions are P ( ) and -P ( ), with transition rates : and S , respectively [22]: Where parameters P = 1. In an environment with a one-dimensional (along x-direction) chemoattractant gradient, the transport kinetics of bacteria density can be described with the following equation [33]: In equation (27) ¯ is the density flux of bacteria through a slice that is perpendicular to the x axis, is the motility coefficient, B(x,t) is the bacteria density, andis the chemotactic velocity as previously discussed.
The motility coefficient, , similar to the diffusion coefficient, is a proportionality measure between the bacterial density flux and the gradient of the bacterial density. Bacteria transport in the channel is subject to two effects: the diffusivity motility and the chemotactic drift, which are described by the first and second terms on the right hand side of Equation 27, respectively. The diffusivity motility keeps the transport bacteria down the gradient of bacterial population density, while the chemotactic velocity pumps bacteria up the chemoattractant gradient. From the probabilistic modeling of individual bacteria, the motility coefficient and chemotactic velocity can be expressed as follows [34,37]: In equation (28) and (29), S and : are the average tumble rates when bacteria travel up and down the chemical gradient, respectively, is the average 3D swimming speed of bacteria, and < > is the average tumble angle, which is defined as the swimming direction change during a tumble. Instead of a simple model based on average rate processes, models based on random walk theories that consider distributions of running and tumbling durations are also available [35,36]. These advanced models could provide a more accurate characterization of single bacteria motility and population transport if both the run and tumble durations of bacteria can be measured experimentally in the future.
We have performed the following tests to confirm that simulated bacteria in BNSim exhibit similar characteristics of real bacteria: 1. In BNSim we have calibrated the model of a bacterium using experimental data reported for real bacteria [21]. Figure S7 shows the calibration results and the validity of our model. For instance, the experimental results for normalized initial receptor activity [51] are shown in the upper Fig. S7.b, while the BNSim simulation results for the wild type bacteria is shown in the figure S7.b below. Exact receptor activity curve ensures that bacteria respond to external stimuli correctly. Figure   2. Figure S8 shows the simulation results obtain with BNSim for synchronization process on bacteria swarm ring that expand radially. The simulation result matches experimental observation reported in Brenner and Berg [21,41]. The differences between volume exclusion model and point-like model in BNSim have been shown in previous research [53]. In essence, the volume exclusion is a major source of bacteria non-linear behavior at high celldensity. In our simulations we consider each bacterium occupy a specific space based on its volume, and we check for non-overlapping cells based on the space not only center of mass. Note 8. Obstacles considered in our model In realistic scenarios of using bacteria, the swarm dynamics will be affected by collisions with existing obstacles. Consequently, we consider the effect of obstacles in the mathematical modeling of the swarm of bacteria. In our considered simulation setup, we divide the environment into 10 6 smaller cubes, each of size 50×50×50 µm 3 . We consider 0.01 percent of the total number of cubes obtained after tessellation of the environment to represent obstacles, and choose the location of obstacles to follow a uniform distribution within the environment (Fig. 1a). The way we implement the obstacles in our simulation is by making these small cubes to be impenetrable, thus when a bacterium approach one of these cubes it cannot continue swimming and so it chooses a new direction for runs.
Another reason to use small cubes is to speed up the calculation in simulations. Precisely, one data element associated with each small cube stores all references of living and non-living objects in the local environment of that cube; these data elements can therefore be accessed very efficiently according to their indexes as a function of their absolute position in the global environment. This way, we can ensure that interactions between bacterium and environment, also interactions among bacteria themselves can be efficiently simulated. Of note, to ensure the correctness of the result, access to each cube needs to happen in a mutually exclusive manner so each bacterium needs to obtain the corresponding mutex beforehand. Therefore, the small cubes help to reduce the computational complexity, so we only need to check particle pairs for collision condition inside the same cube and neighboring cubes.  [42,43]; the average swimming speeds for both bacterial strains range from 20 to 50 µm/s [42,44].
In addition to the resemblance in morphology and motility, the two species show nearly the same response to canonical chemoattractant, such as L-aspartate, as shown in [42] and [45]; and their chemotactic signaling pathways can be modeled in the same way to predict their behavior [45]. It has been shown that Tsr and Tar chemical receptors on the cell membrane are responsible for sensing chemical gradients, as well as pH and temperature gradients, and initiating signaling pathways that lead to a change in the rotation of the flagellar motors. This, in turn, drives the chemotactic behavior and responses to pH and temperature gradients. Given the nearly identical response to temperature and chemical gradients between S. marcescens and E. coli, it has been hypothesized that they share similar chemical sensors and signaling pathways [42]. Note 10. Autochemotaxis considered in our model We investigate bacteria dynamics motion under realistic conditions such as chemical interactions among bacteria. Autochemotaxis is an example of chemical interaction between bacteria. Autochemotaxis happens when bacteria convert substrate succinate molecules in the environment into chemoattractant aspartate molecules and excrete them in the environment. Once the succinate gets depleted, bacteria start consuming the chemoattractant aspartate excreted by themselves; this way, they may get some information about the population of bacteria around them [41,47,48,49].
In our BNSim simulations, we consider that the diffusion coefficient of succinate is the same as aspartate which is equal to 890. The initial concentration of Succinate is 5 , the production rate of Aspartate is 2 :< (1/ ). The concentration of Aspartate follows equation (30).
The consumption of Aspartate is modeled as presented in equation (31).

•[-zR]
•ž = − -zR , -zR = 1.5 :6 (1/ ) we assume that bacteria start to consume attractant aspartate 2 minutes after being in a low succinate environment [46]. Figure S8 shows the simulated synchronization process of a bacteria swarm ring that expands radially that we performed to validate autochemotaxis model in BNSim [21]. We reproduced this experimental phenomenon with BNSim by using the theoretical model proposed by Brenner et al (see also the experimental results in Fig. S8(e) by Brenner and Berg [41]). When bacteria are initially located in the center of a 3D space, they start excreting chemoattractant aspartate by converting the substrate succinate molecules in the environment into chemoattractant molecules. Once the succinate gets depleted, bacteria start consuming the chemoattractant aspartate excreted by themselves (T = 300s in Fig. S8(a)). Therefore, the aspartate concentration in the inner region of the ring becomes lower than the concentration outside the ring. This way, bacteria can create a chemoattractant gradient spontaneously, and consequently move outwards. In Fig. S8(d), we can observe that at time T = 2240s, the succinate inside the swarm ring is nearly exhausted, and the bacteria swarm ring carrying information covers all the network nodes in the 3D space. Note that bacteria division affects the formation of the swarm ring only weakly; this is because the division process has a much larger time scale (in the order of hours).  Table 4. Goodness-of-fit for Gaussian and stable type distributions for distance a bacterium travels in lag time Note 11. Dickey-Fuller test results to check bacterium motion is nonstationary We performed a Dickey-Fuller test on the time trajectory of simulated bacteria motion in the case of 8x10 2 bacteria/cm 3 swarm density without chemoattractant (the same case analyzed and presented in Fig. 3a in the main manuscript). We also performed the same test on white noise as an example of a stationary process and on a random walk as an example of a nonstationary process to be able to compare the results. Table 5 presents the results of the Dickey-Fuller test, which checks for the existence of a unit root in time series. The time series has a unit root if 1 is a root of the process's characteristic equation. If the time series has a unit root, we can conclude that it is nonstationary. In this table, h = 0 rejects that the time series is nonstationary, and ADFpval presents the probability that the time series has a unit root. Table 5 shows that based on our expectation, the test rejects that the white noise time series is nonstationary, but it does not reject it in the rest of the cases.  To connect bacteria positions in individual frames and form the time trajectories of bacteria up to tens of seconds in length, we used a nearest neighbor algorithm. In other words, we connected the disjoint path segments by using a weighting function, which determined the likelihood that two path segments were produced serially by the same bacteria. Equation (32) explains this weighting function in more detail. This algorithm helps to connect positions of bacteria in disjoint path segments which were produced serially by the same bacteria in different frames.
= $ $ + < ∆ $ + T ∆ + Q ∆ In equation (32), ∆ is the change in speed between the end of the first trajectory (previous path segment) and the start of the second trajectory (next path segment). The speed was calculated from trajectories smoothed with a five frame moving average. ∆ is the change in heading during a tumble. The heading was defined as the direction of the instantaneous velocity, which was calculated from the smoothed trajectories. ∆ is the change in time. d is the difference in position between the start of the second trajectory and where as object moving at the final speed and heading of the first trajectory would be at a time ∆ after the end of that trajectory. The coefficients c i are empirically fitted constants. Further details about the tracking algorithm are described in [41].