Multiscale entropy rate analysis of complex mobile agents

Accurate prediction of the motion of objects is a central scientific goal. For deterministic or stochastic processes, models exist which characterize motion with a high degree of reliability. For complex systems, or those where objects have a degree of agency, characterizing motion is far more challenging. The information entropy rate of motion through a discrete space can place a limit on the predictability of even the most complex or history-dependent actor, but the variability in measured encountered locations is inextricably tied to the spatial and temporal resolutions of those measurements. This relation depends on the path of the actor in ways that can be used to derive a general law in closed form relating the mobility entropy rate to different spatial and temporal resolutions, and the path properties within each cell along the path. Correcting for spatial and temporal effects through regression yields the path properties and a measure of mobility entropy rate robust to changes in dimension, allowing comparison of mobility entropy rates between datasets. Employing this measure on empirical datasets yields novel findings, from the similarity of taxicabs to drifters, to the predictable motions of undergraduates, to the browsing habits of Canadian moose.


Theory
For this model, we estimate the entropy rate of the mobility string with the LZ-based estimator given in (1), as employed by other researchers [11,10,9]. Our implementation of (1) is available at https://github.com/tuhinpaul/lz_entropy_rate. In (1), $L$ is the length of the sequence and $\Lambda_i$ is the length of the shortest substring that begins at the zero-based index $i$ and was not encountered in positions $0$ to $(i - 1)$.
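The estimator described above can be sketched in a few lines. This is a minimal illustration, not the implementation in the linked repository. It assumes the commonly used form $H \approx L \log_2 L / \sum_i \Lambda_i$, assumes that a match must lie entirely within positions $0$ to $(i - 1)$, and, when every remaining suffix has already been seen, counts one symbol past the end of the string. The function names are hypothetical.

```python
import math


def _occurs_in(prefix, pattern):
    """Return True if `pattern` appears as a contiguous run inside `prefix`."""
    k = len(pattern)
    return any(prefix[j:j + k] == pattern for j in range(len(prefix) - k + 1))


def lz_entropy_rate(symbols):
    """Estimate the entropy rate (bits per symbol) of a location string.

    Assumed form: H = L * log2(L) / sum(Lambda_i), where Lambda_i is the
    length of the shortest substring starting at i that does not occur
    in symbols[0:i].
    """
    seq = tuple(symbols)
    L = len(seq)
    if L < 2:
        return 0.0
    total = 0
    for i in range(L):
        prefix = seq[:i]
        k = 1
        # Grow the candidate substring until it is new or runs off the end.
        while i + k <= L and _occurs_in(prefix, seq[i:i + k]):
            k += 1
        total += k
    return L * math.log2(L) / total
```

The naive search is cubic in the worst case, which is acceptable for illustration but not for long traces at small sampling intervals.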
The theoretical model of the scaling of entropy rate with spatio-temporal resolution, proposed by Osgood et al. [6], estimates the entropy rate according to (2),
where $d$ is the spatial scale, $T$ is the sampling interval, $x$ is the total travel distance, and $\bar{u}$ is the average velocity [6].
In this work, we extend the theoretical model and apply it to empirical datasets from a wide variety of sources. The extended model considers the velocity and dwell time of the agents in formulating the entropy rate. Dwell times follow a power-law distribution and constitute a major part of real-life mobility traces [11,7].
Let the spatio-temporal resolution be represented as a tuple (T, d), where T is the sampling interval and d is the side length of a square cell in the spatial grid. We assume square cells for simplicity in spatial quantization, and define d as the length of an edge, or the characteristic length of a cell.
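A location string at resolution $(T, d)$ could be produced along the following lines. This is only a sketch: it assumes the trace has already been projected to planar coordinates in metres, keeps the first record at or after each sampling tick, and labels each cell by its integer grid indices. The function name and input layout are hypothetical and are not the quantization procedure of the study.

```python
from math import floor


def quantize(trace, T, d):
    """Turn a time-sorted trace into a string of grid-cell symbols.

    trace: list of (t_seconds, x_metres, y_metres) tuples, sorted by time.
    T:     sampling interval in seconds.
    d:     side length of a square cell in metres.
    """
    symbols = []
    if not trace:
        return symbols
    next_tick = trace[0][0]  # sampling grid starts at the first record
    for t, x, y in trace:
        if t >= next_tick:
            symbols.append((floor(x / d), floor(y / d)))
            # Advance the grid past this record so gaps do not cause bursts.
            while next_tick <= t:
                next_tick += T
    return symbols
```

Coarsening either $T$ or $d$ shortens the string or merges neighbouring symbols, which is the effect the scaling law is meant to capture.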
The time spent in motion in the $i$th cell is denoted $t_{mi}$. Without considering the dwell time, the apparent average velocity in the $i$th cell along the path of an agent is shown in (3).
The agent may traverse a distance of $k_i d$ in the $i$th cell, where $k_i \in \mathbb{R}^+$. The actual velocity might vary considerably, but the average velocity as observed by the experimenter will appear as $v^*_i$. The total time spent in the $i$th cell is the sum of the time spent in motion and the dwell time. The time spent in motion inside the $i$th cell is $d / v^*_i$ (from (3)). The total time spent in the $i$th cell is therefore expressed as (4).
The total dwell time in the $i$th cell, which is the sum of the individual dwells in that cell, is shown in (5).
Therefore, the observable average velocity considering dwell time, $\tilde{v}_i$, while passing through the $i$th cell can be expressed as (6).
Let the agent travel through $n$ cells on its entire path, treating repetitions of a cell separately. The number of blocks of repeated symbols along the path is then also $n$. The total time spent in motion on the entire path is denoted $t_m$, which is the sum of each $t_{mi}$, as shown in (7).
Considering only motion, and taking $nd$ as the total traversed length, the apparent average velocity $v^*$ is shown in (8).
The total dwell time along the entire path is the summation of dwell times in each cell, as shown in (9).
The observable average velocity for the entire path, considering dwell times as well, is represented as $\tilde{v}$, as shown in (10).
From (7), we find that $\sum_{i=1}^{n} t_{mi} = t_m$. From (8), we find that $t_m = nd / v^*$. Substituting them into (10), we find $\tilde{v}$ as shown in (11).
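The velocity definitions above can be checked numerically. The sketch below assumes the forms implied by the surrounding text: $v^*_i = d / t_{mi}$, $\tilde{v}_i = d / (t_{mi} + t_{di})$, $v^* = nd / t_m$, and $\tilde{v} = nd / (nd / v^* + t_d)$. These are reconstructions from the prose, not a transcription of (3)-(11), and the function name is hypothetical.

```python
def path_velocities(d, motion_times, dwell_times):
    """Compute apparent and observable average velocities along a path.

    d:            cell side length.
    motion_times: time in motion per visited cell (t_mi).
    dwell_times:  total dwell time per visited cell (t_di).
    """
    n = len(motion_times)
    t_m = sum(motion_times)  # total time in motion
    t_d = sum(dwell_times)   # total dwell time
    per_cell = [d / (tm + td) for tm, td in zip(motion_times, dwell_times)]
    v_star = n * d / t_m                        # motion only
    v_tilde_direct = n * d / (t_m + t_d)        # total time in the denominator
    v_tilde_closed = n * d / (n * d / v_star + t_d)  # after substituting t_m
    return {
        "per_cell": per_cell,
        "v_star": v_star,
        "v_tilde_direct": v_tilde_direct,
        "v_tilde_closed": v_tilde_closed,
    }
```

With two 100 m cells, motion times of 10 s and 20 s, and a 70 s dwell in the second cell, both expressions for $\tilde{v}$ give 2 m/s, while $v^*$ alone is about 6.7 m/s.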
There are $n$ blocks along the entire path. Let $L_i$ represent the length of the $i$th block, and $L$ be the length of the entire string. For simplicity, we assume that $L_i$ is an even integer, and is approximated as (12).
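The block structure can be read directly off a location string by run-length encoding. The sketch below is illustrative only; the function name is hypothetical. Under regular sampling, one would expect each block length to be roughly the time spent in the cell divided by $T$, which is an assumption about the form of (12) rather than a quotation of it.

```python
from itertools import groupby


def blocks(symbols):
    """Run-length encode a location string into (cell, L_i) pairs."""
    return [(cell, sum(1 for _ in run)) for cell, run in groupby(symbols)]


# Example: a revisit to cell "A" counts as a separate block.
runs = blocks("AAABBA")
n = len(runs)                              # number of blocks
L = sum(length for _, length in runs)      # length of the whole string
```

Here `runs` holds three blocks of lengths 3, 2, and 1, so `n` is 3 and `L` is 6.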
Similarly, $L$ can be expressed as (13). Assuming a unique terminating symbol at the end of the string representing the path, and using the same decomposition as the theoretical model [6], we obtain (14). Substituting (12) into (14), we obtain (15). The entropy rate $H(d, T)$ from (1) can therefore be expressed as (16). Let the summations in (16) be quantified as shown in (17)-(21). Substituting (17)-(21) into (16), we can express $H(d, T)$ as shown in (22).
Although $t_i$ values can be approximated from GPS data, $t_{mi}$, $t_{di}$, or $v^*_i$ values cannot be reliably extracted without additional speed data. We assume that the sums of the terms involving these quantities approximate the true sums over the distances and sampling rates of interest, or that the values of $d$ and $T$ describe a single regime of the model. At large sampling intervals and kilometer-level spatial quantization, the deviations of the sums become significant, and the model is expected to deviate at those coarse spatio-temporal resolutions.

Variable Coefficient Analysis
In the main body of this chapter, we argued that $C_1$-$C_5$ could be regarded as independent of $d$ and $T$ if the sums were stable, which was demonstrated to hold true for all datasets for all but the largest values of $d$ and $T$. Here we extend the analysis to capture how $C_1$-$C_5$ might vary with $d$ and $T$.
From (21) and (13), we obtain (23). Therefore, the term $C_5 / (TL)$ corresponds to the fraction of dwell time in the total travel time.
Similarly, $d\,C_4 / (TL)$ in (22) can be expressed as (24).
The quantity $d\,C_4 / (TL)$ corresponds to the fraction of non-dwell time along the entire path. Accordingly, (22) can be expressed as (25).
In (25), $t_i / t$ is the fraction of the time spent in the $i$th cell. Let this ratio be expressed as $f_i$. Then, we can rewrite (25) as (26).
If we know the distribution of $f_i$, we can approximate $\sum_{i=1}^{n} f_i^2$ when $n$ changes due to a change in $(T, d)$. By expressing the sum as $f(d, T)$, we can then rewrite $H(d, T)$ from (27) as (28).
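To make the sum of squared time fractions concrete, the sketch below computes $\sum_i f_i^2$ from per-cell times. The function name is hypothetical. The sum equals $1/n$ when time is spread evenly over $n$ cells and approaches 1 when a single cell dominates, so it behaves as a concentration measure of where the agent spends its time.

```python
def squared_fraction_sum(cell_times):
    """Return sum(f_i ** 2), where f_i = t_i / t is the time share of cell i."""
    total = sum(cell_times)
    if total <= 0:
        raise ValueError("total time must be positive")
    return sum((t / total) ** 2 for t in cell_times)


even = squared_fraction_sum([1, 1, 1, 1])    # time spread evenly over 4 cells
skewed = squared_fraction_sum([97, 1, 1, 1])  # one long dwell dominates
```

A trace dominated by long dwells therefore yields a larger value than one with evenly spread visits, and the value shifts as $(T, d)$ changes the number of blocks.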
We can rely on empirical methods to estimate $f(d, T)$. Using Eureqa [3], we empirically found a solution for $f(d, T)$, shown in (29), which works in general form across datasets. By substituting (29) into (28), we can therefore express $H(d, T)$ as (30).

Scaling Law Behavior
Knowledge of the maxima and minima of the entropy rate for a particular $d$ or $T$ may be useful in the design and evaluation of a mobility study to assess extreme values of the entropy rate. Similarly, the behavior of the model at the limits of $d$ and $T$ may indicate whether the model conforms to the desirable behaviors governed by the structure of the location string at those limits.
To check whether $H(d, T)$ has a maximum or minimum at any $d$ or $T$, we need to differentiate (22) or (31) with respect to $d$ and $T$. We find the relation in (32) by differentiating (22) with respect to $d$. However, because $C_1, \ldots, C_5$ are positive numbers, no practical $d$ or $T$ can be found from (32), as all roots with respect to $d$ and $T$ are negative.
For convenience, we use (31) to differentiate $H(d, T)$ with respect to $T$, and find the relation in (33) dictating the presence of maxima or minima. Given $C_1$-$C_5$ and a given $d$, (33) can be solved numerically to find the $T$ corresponding to a maximum or minimum of $H(d, T)$.
Behavior of $H(d, T)$ at Limits: We examine whether the model has desirable behavior at the limits of $T$ and $d$. Behavior at the limits would explain whether the model conforms to theoretical constraints. For a constant $T$, dictionary size grows as $d$ approaches 0. In the limiting case, all repetitions will be the result of dwelling, where $C_2$ and $C_5$, from (18) and (21) respectively, pertain to the dwelling of the agent. The repetition from dwelling scales the maximum entropy, $\log L$, of $L$ symbols as follows:
When the cell size is very large ($d \to \infty$), all samples fall into the same cell, resulting in zero entropy rate, which is independent of the temporal resolution. From (22), this is mathematically presented as (34).
For convenience, we use (31) to find the entropy rates at the limits of $T$. When $d$ is a constant and $T \to 0$, the entropy rate goes to zero, as shown in (36), because we end up with longer and longer strings of repeating locations with high compressibility.
When $d$ is a constant and $T \to \infty$, the entropy rate is undefined, as shown in (37).
This is sensible because when $T \to \infty$, we do not have enough samples to evaluate $H(d, T)$ at that $T$ for varying $d$.
We can also consider how the model behaves at extreme values of speed and dwell times. If $t_{di} \to 0$ for all $i$, then the entropy rate should depend on the apparent average velocities in each cell, and dwell times do not affect the entropy rate. The model conforms to this case of motion without dwelling, as shown in (38).
However, if the dwell time at the $j$th cell approaches $\infty$, all samples after reaching that cell will be the same. The apparent average speed within the cell would approach 0. Because the agent would get stuck in the $j$th cell, the time in that cell and the length of the substring emanating from that cell will be determined by the dwell time in that cell, considering the total observation time. The dwell time in the $j$th block, in this case, is $t_{dj}$ as shown in (39), and the length of the $j$th block is $L_j$ as shown in (40).
Considering (14) for the $j$th block, the entropy rate can be expressed as (41).
If the agent is observed for infinite time, the entropy rate according to (41) would approach 0.
If the apparent average velocities in the cells approach $\infty$, then the $v^*_i$ values would not affect the entropy rate in practice, and the entropy rate would depend on dwell times, as shown in (42).

Data Collection and Features
We used six empirical datasets encompassing mobility of humans [12], taxi cabs [2], animals [5], [13], and ocean drifters [8] to evaluate the performance of the theoretical model. Human mobility patterns were taken from the Saskatchewan Human Ethology Datasets (specifically, SHED7 and SHED8) [12], which are linked to the ongoing development of iEpi [4]. The SHED datasets contain detailed mobility, activity, and contact traces from university students and staff. We considered the mobility traces from the taxi cab mobility study conducted by Bracciale et al. in Rome, Italy [2] as a contrasting human mobility dataset. Understanding the taxi trace patterns can play an important role as a contrast to the patterns of undergraduates. Taxi cab traces incorporate mobility traces of random people, and are expected to encounter popular routes and important urban locations. We also consider the mobility patterns of wild animals, which are less constrained by urban environments. We assessed the model with the GPS traces that were collected from collars mounted on moose [5] and Antarctic petrels [13].
Because the scaling law does not require movements with agency, we also use the mobility tracks of ocean surface drifters [8] to validate that the model applies in general to complex physical phenomena as well. The drifter data also include time spent on land before deployment and time spent aground if the drifter runs aground prior to retrieval. The durations of the studies behind the datasets vary widely, as shown in Table 1. The durations of the SHED7, SHED8, taxi cab, and ocean drifter studies given in Table 1 are based on all the available date values in the datasets. The traces in the taxi cab study were sampled at much smaller intervals than in other datasets. Therefore, we limited the location traces to the first fifteen days of the study to make entropy rate calculation feasible within a reasonable amount of time for small $T$ values. For the moose dataset, records between Jan 2012 and Feb 2015 are considered, and this is reflected in Table 1. The records in the petrel dataset span from Dec 2011 to Jan 2014.
The agents/participants of each dataset were passed through a filtering process to ensure that they had a minimum number of records at large sampling intervals.
The base interval in Table 1 refers to the approximate interval of data collection, as observed in the data or as reported by the corresponding study.

Dispersion Maps
Fig. 1-Fig. 6 show the dispersion of the agents/participants of the datasets, over three days, as heat maps of visited locations. All locations visited by all participants on the selected days are considered. For a given spatial quantization, all locations within a spatial bin were grouped together. The quantization process is described in Data Mediation (Section 2.2). All the locations in a group were represented by the location (latitude, longitude) = $(\mathrm{lat}_{group}, \mathrm{lon}_{group})$ as follows:
$$\mathrm{lat}_{group} = \frac{\min(\text{latitude in group}) + \max(\text{latitude in group})}{2} \tag{43}$$
$$\mathrm{lon}_{group} = \frac{\min(\text{longitude in group}) + \max(\text{longitude in group})}{2}$$
The center and zoom level were set manually to keep the presentation legible, because otherwise the maps covered larger areas and the locations of interest were zoomed out. The manual selection of the map center and zoom level dropped some visited locations, a trade-off made to keep the maps comprehensible. Based on the frequency of visits, the visited locations are colored using a gradient from red (most visited) to green (least visited). The scale bars in the bottom-left corners of the maps indicate different distances because of the variation in the speed and span of movement of the corresponding agents. In the SHED (Fig. 1 and Fig. 2) and taxi (Fig. 3) studies, all participants take part in the study at or around the same time. Locations of agents in other datasets may not overlap. We find that SHED participants (Fig. 1 and Fig. 2) visit similar locations from day to day and show different behaviors over the weekend. Similar trends are found in the taxi dataset (Fig. 3), but their dispersion varies significantly based on spatio-temporal resolution. Moose (Fig. 4) visit locations within a larger home range with a few hotspots. Ocean drifters (Fig. 6) change their location significantly from day to day. Similar behaviors are observed in petrels (Fig. 5). Changing the spatio-temporal resolution redistributes hotspots in the maps because of movement of agents to new places. The change in hotspots is less pronounced in humans and moose than in other datasets, indicating their relatively slower movement. The change of hotspots in the taxi dataset is well pronounced. The changes of location of petrels and ocean drifters are slightly obscured by the relatively large span of locations visited by these agents.
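The representative location of a spatial bin, taken as the midpoint of the extreme latitudes and longitudes in the group, can be sketched as follows. The function name and the input layout, a list of (latitude, longitude) pairs, are assumptions made for illustration.

```python
def group_location(points):
    """Return the midpoint of the bounding box of a group of locations.

    points: non-empty list of (latitude, longitude) pairs in one spatial bin.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    lat_group = (min(lats) + max(lats)) / 2
    lon_group = (min(lons) + max(lons)) / 2
    return lat_group, lon_group


center = group_location([(52.1, -106.6), (52.3, -106.8), (52.2, -106.7)])
```

Because the midpoint depends only on the extremes, it is insensitive to how many records fall in the bin, which suits a map where visit frequency is already encoded by color.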

Data Mediation
For the SHED7, SHED8, and taxi datasets, we accepted participants having at least fifteen location records (an arbitrarily chosen threshold) at the largest sampling interval used for the corresponding dataset. For other datasets, we only discarded erroneous location records as described above.
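The acceptance rule could be expressed roughly as below. This is a sketch under stated assumptions: timestamps are in seconds, subsampling keeps the first record at or after each tick of a regular grid, and the names `count_at_interval` and `accept_participants` are hypothetical rather than taken from the study's code.

```python
def count_at_interval(timestamps, T):
    """Count the records that survive subsampling at interval T (seconds)."""
    ts = sorted(timestamps)
    if not ts:
        return 0
    kept = 0
    next_tick = ts[0]
    for t in ts:
        if t >= next_tick:
            kept += 1
            while next_tick <= t:
                next_tick += T
    return kept


def accept_participants(traces, T_max, min_records=15):
    """Return the ids whose traces keep at least `min_records` at T_max.

    traces: dict mapping participant id to a list of timestamps.
    """
    return {
        pid for pid, ts in traces.items()
        if count_at_interval(ts, T_max) >= min_records
    }


# Example: one record per minute for 20 hours versus 5 hours, T_max = 1 hour.
traces = {
    "a": list(range(0, 3600 * 20, 60)),
    "b": list(range(0, 3600 * 5, 60)),
}
accepted = accept_participants(traces, T_max=3600)
```

In this example only participant "a" retains at least fifteen records at the hourly interval.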

Fitting Protocols
We used the Eureqa software [1,3] to derive the constant terms in our model, shown in (22), from the location sequences after spatio-temporal quantization. Eureqa [3] is an artificial intelligence-based nonlinear regression tool, which estimated the parameters in (22) via global-optimization-based nonlinear regression. The input to Eureqa for regression is a set of (dset, T, d, L, lzH) tuples, where lzH is the aggregate entropy rate for the dataset dset at spatio-temporal quantization (T, d) and L is the corresponding average sequence length. We used R²-based goodness of fit and mean squared error as the error metrics to evaluate fit performance.
Dispersion in the weekend is slightly different than the weekdays. Weekdays exhibit visually similar dispersion. The data exhibit visually less dispersion changes than the summer data in Fig. 1.
Figure 3: Heatmap of the dispersion of taxi cabs tracked in Rome over three consecutive days. The map area is much smaller than the maps shown for undergraduate students in Fig. 1 and Fig. 2. Taxicabs demonstrate aggregate human movement behaviors. In the weekday, the locations are more concentrated to a hotspot but two hotspots are visible in the weekend.
Figure 4: Heatmap of the dispersion of the tracked moose over three consecutive days. The hotspots appear visually stable, which indicates steady grazing behavior.
Figure 5: Heatmap of the dispersion of Antarctic petrels over three consecutive days. Their locations change largely from day to day because petrels fly over wide areas and do not stick to specific locations for long.
Figure 6: Heatmap of the dispersion of ocean drifters over three consecutive days. Similar to petrels, ocean drifters move to different places due to sea currents as time passes, and show no ties to specific locations.