Electronic supplementary material for: Collective navigation can facilitate passage through human-made barriers by homeward migrating Pacific salmon



Statistical Analysis
We fit two proportional hazards (PH) models and one logistic model: these models are termed "finding," "ladder," and "committing" respectively. Each model was fit for Chinook and sockeye separately, since we expected inter-species differences in most or all coefficients. The PH models were fit in R [7] using the phreg function from the eha package [1]. All models included daily counts of conspecifics. Since counts are available for each ladder, we used ladder-specific counts for the "ladder" and "commit" processes; however, since a fish might approach either ladder upon entering the tailrace, we used dam-level counts for the "finding" process.
All models controlled for temperature [3], dam identity, and a binary diel effect [4]. For the finding and committing processes we included a spill effect, but since fish cannot detect spill from within the fishway, no spill effect was included in the ladder models.
For all models we included a binary covariate (middle) indicating whether a fish's tailrace entry was in the middle (2nd and 3rd) quartiles. This variable controlled for the effect of run timing on fish condition and passage speed, since timing is highly correlated with density (very high densities near the modal date) and might otherwise confound our analysis.
For Chinook we included a binary variable (spring) indicating whether a fish was labeled spring- or summer-run. For sockeye we included a three-level factor variable indicating whether a fish was sourced to the Wenatchee, the Okanogan, or any other site. The Wenatchee and Okanogan tributaries each accounted for about a third of the sockeye in our sample.
Two of our predictors, temperature and day, had strong prior evidence from the literature [3, 4]. To reduce the number of variables being selected, and the risks associated with model dredging, we included these variables a priori. The remaining predictors were selected using AICc. For consistency, however, we wished to fit structurally identical models to each set of species/process submodels. We examined the AICc for each set of models and subjectively chose what seemed to be the best-fitting overall model. Due to the extremely long tails of the count predictor, we then considered adding either an untransformed or a log-transformed count variable to the best-fitting null model. In most cases there was no discernible difference in AICc between the two, but in several cases the untransformed variable was selected. For consistency, despite the long tails, we used the untransformed version for the remainder of our analysis.
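For reference, the small-sample corrected AIC used in this selection has a standard closed form; a minimal sketch (generic formula, not the authors' code):

```python
def aicc(log_lik: float, k: int, n: int) -> float:
    """Small-sample corrected AIC: ordinary AIC plus a penalty that
    grows as the parameter count k approaches the sample size n."""
    aic = -2.0 * log_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

# Lower AICc indicates a better fit at comparable complexity;
# as n grows large, AICc converges to ordinary AIC.
score = aicc(log_lik=-100.0, k=3, n=50)
```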
The classical form of the PH model uses a partial likelihood function and does not specify the baseline hazard function, but in order to simulate from our bootstrapped distributions we assumed a constant baseline hazard λ₀(t) = λ₀. This was visually consistent with the empirical hazard functions. However, since there is a small but non-trivial minimum time to complete each task, the empirical hazard functions are zero for a short initial period. To account for this discrepancy, we truncated each fish's timeline by a constant amount (per model) such that the minimum passage time was less than 1 minute, effectively assuming a piecewise constant baseline hazard function with the first section fixed at zero hazard. In several models there were outlier individuals detected as passing through the fishway in as little as 2 minutes. Since such passage times are implausible (fast passage times are ≈ 1 hour), we attribute these records to data error and removed all records showing passage in < 10 minutes. This threshold conservatively retains several extraordinarily fast passage times of ≈ 30 minutes.
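Under this piecewise-constant assumption, a simulated passage time is an exponential draw shifted by the zero-hazard offset. A minimal Python sketch with hypothetical parameter values (the analysis itself was done in R; this is only an illustrative analogue):

```python
import math
import random

def simulate_passage_time(lambda0: float, linear_predictor: float,
                          offset_min: float, rng: random.Random) -> float:
    """Draw one passage time (minutes) under a piecewise-constant baseline
    hazard: zero hazard for the first offset_min minutes, then a constant
    hazard lambda0 * exp(linear_predictor), as in a PH model."""
    rate = lambda0 * math.exp(linear_predictor)
    return offset_min + rng.expovariate(rate)

rng = random.Random(1)  # hypothetical seed
times = [simulate_passage_time(0.02, 0.3, 1.0, rng) for _ in range(1000)]
# No simulated time can fall inside the zero-hazard window.
assert min(times) > 1.0
```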
For each model, we dropped any fish with censored measurements. Censoring is a standard concern for time-to-event models, and occurs when an event is not observed to occur because an individual leaves observation. This could occur when a fish passed but was not detected by a telemetry antenna, or because a fish died in a tailrace. Incorrect assumptions regarding censoring are known to bias inference in PH models [5]. We did not account for censoring because, for each system, we used essentially only two measurements for each fish: "entry" and "exit." A censoring analysis would require follow-up measurements between entry and exit to verify that an individual is still being detected but has not yet exited. Although the aerial tailrace antenna misses a large number of measurements, the underwater fishway antennas miss far fewer. Only a very small portion (≈ 1%) of fish are observed to enter but not exit the fishway. Since a very similar portion of fish are observed to exit but not enter, we attribute this censoring almost entirely to randomly missed telemetry detections rather than mortality. Since individuals were therefore dropped from the dataset independently of hazard we lose little information and incur very little bias in this manner.
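The entry/exit bookkeeping described above amounts to comparing two detection sets per model; a schematic sketch with made-up fish IDs (not the authors' data or code):

```python
# Fish IDs detected at the fishway entry vs. exit antennas (hypothetical).
entered = {"F01", "F02", "F03", "F04", "F05"}
exited = {"F01", "F02", "F04", "F05", "F06"}

# Entered but never detected exiting: dropped from the dataset rather
# than modeled as censored observations.
entry_only = entered - exited
# Exited without a recorded entry: a similar-sized set would suggest
# random antenna misses rather than mortality.
exit_only = exited - entered

# Fish with both detections are retained for the time-to-event models.
complete = entered & exited
```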
Finally, we conducted a parametric bootstrap test on the count coefficient to detect density effects, using our best-fitting null models. Since these models do not contain reverse causality, they provide valid, unbiased fits under the null hypothesis. To each simulation we fit a model with an effect of count. Assuming our null model is correctly specified, comparing our fitted count coefficients to these null distributions provides valid p-values for testing density effects. We do not Bonferroni-correct these p-values because, by the nature of our split models, a single false-positive result is not statistically convincing, as we note when discussing our two marginally significant results. Instead we demand consistency between matching models, and indeed we found that several models were consistently highly significant (and would remain so even with a Bonferroni correction).
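The p-value computation from such a null distribution can be sketched as a generic two-sided Monte Carlo p-value (illustrative only, not the authors' code):

```python
def bootstrap_p_value(observed, null_coefs):
    """Two-sided Monte Carlo p-value: the share of null-simulated
    coefficients at least as extreme (in absolute value) as the observed
    coefficient, with the +1 correction so the p-value is never zero."""
    n_extreme = sum(1 for b in null_coefs if abs(b) >= abs(observed))
    return (n_extreme + 1) / (len(null_coefs) + 1)

# Stand-in null distribution spread over [-1, 1]; an observed coefficient
# far outside this spread yields a small p-value.
null_draws = [0.01 * k for k in range(-100, 101)]
p = bootstrap_p_value(2.5, null_draws)
```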
To ensure that our null models were well-specified, we conducted standard model diagnostics. For our PH models we tested the proportional hazards assumption using the cox.zph function in the survival package [8], applied to the partial-likelihood model fits. Since the cox.zph function tests the proportional hazards assumption for each covariate as well as globally for each model, we made 36 tests. Using a Bonferroni correction, this test identified strong differences (corrected p < 10⁻⁵) in baseline hazard functions between dams, likely caused by differences in the layout of the monitoring arrays. We thus split our models by dam. Retesting these sub-models, we made an additional 59 tests. These tests caused us to split our Chinook ladder models by run (p < 10⁻⁵) and to split our sockeye.ladder.TD models by site (p < 0.005; East (TDE) vs. North (TDN)). Since all of these were individual-level variables, we could not assign interactions with time, since this could (and did) result in predictions of effectively infinite survival time. We also added interaction effects between temperature and time (p < 0.001) in all of the Chinook ladder models. Finally, we made an additional 41 tests after conducting this second level of model-splitting, with no significant results. After splitting, the Chinook ladder models each contained between 330 and 440 fish. All other models contained over 500 fish, with the exception of sockeye.ladder.TDN, which had only 59. Our bootstrap simulations frequently produced ill-conditioned fits for this model, and we subsequently removed it from our analysis. A diagram of our analysis can be found in Figure 1.
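The Bonferroni correction applied across these diagnostic tests divides the family-wise level by the number of tests; a minimal sketch with hypothetical p-values:

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test whose p-value stays below alpha after dividing
    the family-wise significance level by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# With 36 diagnostic tests, the per-test threshold is 0.05 / 36 ≈ 0.0014,
# so only very small p-values (e.g. p < 1e-5) survive the correction.
flags = bonferroni_significant([1e-6, 0.01, 0.2] + [0.5] * 33)
```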
For the commit models we fit one logistic regression model per species, without splitting by dam. We plotted binned residuals from the null model, which motivated the addition of a threshold variable (count ≤ 150) to the Chinook model, due to substantially higher-than-expected commit rates for these low-density fish. Of these 89 individuals, 88 were detected at The Dalles's east ladder. Elevated commit probability was not observed at TDE at higher densities.
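Binned residuals group observations by fitted probability and average the raw residuals within each bin, so a run of bins with systematically nonzero means flags misfit at those probabilities. A generic sketch (not the authors' code):

```python
def binned_residuals(fitted, observed, n_bins=4):
    """Average raw residuals (observed - fitted) within equal-size bins
    ordered by fitted probability; large bin means suggest misfit."""
    pairs = sorted(zip(fitted, observed))
    size = len(pairs) // n_bins
    means = []
    for i in range(n_bins):
        chunk = pairs[i * size:(i + 1) * size]
        means.append(sum(y - p for p, y in chunk) / len(chunk))
    return means

# Hypothetical example: the lowest-probability bin is badly under-predicted,
# analogous to the low-density Chinook pattern described above.
fitted = [0.1] * 4 + [0.3] * 4 + [0.6] * 4 + [0.9] * 4
observed = [1, 1, 0, 1] + [0, 1, 0, 0] + [1, 1, 0, 1] + [1, 1, 1, 1]
res = binned_residuals(fitted, observed)
```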
Although we hypothesized positive density effects, we conducted two-sided tests in all cases to allow for the possibility of negative effects (e.g. overcrowding in the fishway).
Because of the reverse causality in our system, the null modeling approach we took was the most reliable way we knew to detect a density effect. However, this approach provides only p-values, not confidence intervals for the true magnitude. To quantify magnitude, we conducted another parametric bootstrap to produce 95% confidence intervals for the count coefficient. To produce these intervals we sampled from the full model (i.e. the model with a count coefficient) and fit new count coefficients to these simulations. We calculated the confidence intervals using the hybrid method, recommended in [2] for estimating regression coefficients in Cox models. In all cases, the bootstrapped distribution was symmetric and centered on the fitted coefficient. This secondary bootstrap also roughly reproduced the p-values of our primary bootstrap. These two checks lend credibility to this bootstrap, despite possible biases that might enter due to reverse causality. We therefore also report approximate 95% confidence intervals for the magnitude of the density coefficient, calculated from these bootstrapped distributions. All bootstraps included > 2000 simulations.
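The hybrid (also called basic) bootstrap interval reflects the bootstrap quantiles around the original point estimate, rather than reporting the quantiles directly; a minimal sketch (generic method, not the authors' code):

```python
def hybrid_ci(theta_hat, boot_coefs, alpha=0.05):
    """Hybrid (basic) bootstrap CI: reflect the bootstrap quantiles of
    the coefficient around the original point estimate theta_hat."""
    s = sorted(boot_coefs)
    k_lo = int(round((alpha / 2) * (len(s) - 1)))
    k_hi = int(round((1 - alpha / 2) * (len(s) - 1)))
    return (2 * theta_hat - s[k_hi], 2 * theta_hat - s[k_lo])

# Hypothetical bootstrap draws, symmetric about the point estimate 1.0;
# for such symmetric draws the hybrid and percentile intervals coincide.
boot = [1.0 + 0.001 * k for k in range(-1000, 1000)]
ci = hybrid_ci(1.0, boot)
```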
To speed up calculation we did bootstrap simulations in parallel using 24 cores of a 48-core Windows compute server. We parallelized using the R package parallel [7] and set our random seeds using [6]'s method.

Figure 1: A diagram of the models we fit. Originally we planned to fit the 6 models shown in blue on the left. However, due to model diagnostic violations (specifically violations detected by cox.zph) we were forced to split further to allow baseline hazard functions to differ between factors. These more specific final models are shown in red. Our final set of models which passed all model diagnostics are shown in green.
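The reproducible-parallelism pattern used here (one independent random stream per replicate, derived from a master seed) can be sketched in Python; the analysis itself used the R parallel package, so this is only an illustrative analogue with a hypothetical seed:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def one_bootstrap(seed: int) -> float:
    """A single bootstrap replicate with its own private RNG, so results
    are reproducible regardless of worker scheduling order."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(100)) / 100

# Derive one seed per replicate from a master seed, then run in parallel.
master = random.Random(2024)  # hypothetical master seed
seeds = [master.randrange(2**32) for _ in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(one_bootstrap, seeds))

# Rerunning with the same seeds reproduces the same replicates exactly.
assert results == [one_bootstrap(s) for s in seeds]
```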

Model Summaries
Below, we have included the model output for our final selected models. For the three models in which a highly significant density effect was detected, we include that effect; otherwise we include the best-fitting null model.