Monday, November 1, 2010

Oceanographic preferences of Atlantic bluefin tuna, Thunnus thynnus, on their Gulf of Mexico breeding grounds

Authors: S. Teo, A. Boustany, and B. Block, 2007.

This paper is a statistical follow-up to the paper Annual migrations.... (reviewed below). The methodology is rather clever. The raw observables are the same as before, i.e. temperature, depth, and location from the pop-up archival tags of 28 tuna. To analyze to what extent the inferred habitat preferences represent real preferences, as opposed to stochastic fluctuations, the authors plumb a number of databases containing oceanographic information such as wind speed and temperature for the entire Gulf. At any given point $(x,y,z)$ within the Gulf proper, they "sample" the published data by convolving it with a Gaussian kernel, thus mitigating observation and sensing errors. Monte Carlo sample paths are generated on a fish-by-fish basis by a quasi-Brownian process in which the total daily movement matches the recorded total daily movement of the fish exactly, but the direction of the movement is random. The environmental profiles along these paths describe the set of all possible habitats.
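To make the path-generation idea concrete, here is a minimal sketch in Python. It is my own illustration, not the authors' code: it assumes a flat 2D plane (no depth, no coastline constraints), and the function names, distances, observations, and kernel bandwidth are all hypothetical.

```python
import math
import random

def random_direction_path(start, daily_distances, rng):
    """Simulate one track: keep each day's distance, randomize its heading."""
    x, y = start
    path = [(x, y)]
    for d in daily_distances:
        theta = rng.uniform(0.0, 2.0 * math.pi)  # uniformly random heading
        x += d * math.cos(theta)
        y += d * math.sin(theta)
        path.append((x, y))
    return path

def gaussian_kernel_sample(point, observations, bandwidth):
    """Gaussian-weighted average of (x, y, value) observations around point."""
    px, py = point
    num = 0.0
    den = 0.0
    for ox, oy, value in observations:
        dist2 = (ox - px) ** 2 + (oy - py) ** 2
        w = math.exp(-dist2 / (2.0 * bandwidth ** 2))
        num += w * value
        den += w
    return num / den

rng = random.Random(0)
daily_distances = [30.0, 12.5, 48.0]  # hypothetical km moved per day
path = random_direction_path((0.0, 0.0), daily_distances, rng)

# Hypothetical sea surface temperature observations: (x, y, degrees C)
observations = [(0.0, 0.0, 24.0), (50.0, 0.0, 26.0), (0.0, 50.0, 25.0)]
profile = [gaussian_kernel_sample(p, observations, bandwidth=40.0) for p in path]
```

Each simulated step has exactly the recorded length, so only the heading differs between the real track and the simulated ones. Repeating this many times per fish gives the "available" environmental profiles.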

The authors use two models to assess habitat preferences. The first involves the Chesson Preference Index, which for a single variable (e.g. temperature) is calculated as

$\alpha_i = \frac{o_i/\pi_i}{\sum_{j=1}^{n} (o_j/\pi_j)}$,

where $o_i$ is the observed sample proportion of used units in the $i$th of the $n$ habitat types, and $\pi_i$ is the sample proportion of available units (with the latter calculated via the Monte Carlo simulations described above). Note that the precise value of the index depends on the Monte Carlo simulation--the authors generated 10,000 sets of 10,000 paths and calculated the CPI for each, thus generating a histogram of CPIs for each environmental variable.
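As a quick worked example of the index, here is a small Python sketch. The bin counts are made up for illustration, and the function name is my own; it also assumes every bin has at least one available unit, since otherwise the ratio is undefined.

```python
def chesson_index(used_counts, available_counts):
    """Chesson preference index per bin, from raw used/available counts."""
    total_used = sum(used_counts)
    total_available = sum(available_counts)
    ratios = []
    for used, available in zip(used_counts, available_counts):
        o = used / total_used             # observed proportion of used units
        pi = available / total_available  # proportion of available units
        ratios.append(o / pi)
    normalizer = sum(ratios)
    return [r / normalizer for r in ratios]

# Hypothetical counts in three temperature bins (cold, medium, warm)
used = [2, 10, 8]
available = [40, 40, 20]
alpha = chesson_index(used, available)
```

The indices sum to 1, so with $n$ bins a value above $1/n$ suggests preference for that bin and a value below $1/n$ suggests avoidance. In this toy case the warm bin scores highest even though the medium bin has more used units, because warm water is rarer in the available set.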

The problem with the CPI approach is that a) it uses bins, which create artifacts, and b) it leverages Monte Carlo techniques, which may result in false positives. As a consequence, the authors invoke a discrete choice model to calculate the resource selection function, defined as the numerator of the expression

$p(i) = \frac{\prod_{j=1}^p e^{\beta_j x_{ij}}}{\sum_{k \in U'\cup A} \prod_{j=1}^p e^{\beta_j x_{kj}}}$,

where the $\beta_j$ are coefficients to be estimated, the $x_{ij}$ represent the value of oceanographic parameter $j$ in area $i$, $U'$ is the set of used areas and $A$ is the set of total areas in the Monte Carlo sample. The $\beta$s are calculated by maximizing the likelihood function

$L(\beta_1, \cdots, \beta_p) = \prod_{i=n_1}^{n_u} p(i)$,

a task the authors tackled with the help of the Cox proportional hazards function.
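For readers who want to see the estimation step spelled out, here is a rough Python sketch. It takes the formula above at face value with a single pooled choice set $U' \cup A$, and it uses plain gradient ascent on the log-likelihood as a stand-in for the Cox proportional hazards routine the authors used. The data, function names, step size, and iteration count are all hypothetical.

```python
import math

def choice_probabilities(beta, units):
    """p(i) for every unit: exp(beta . x_i) normalized over the pooled set."""
    scores = [sum(b * x for b, x in zip(beta, unit)) for unit in units]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(beta, used, available):
    """Log of the product of p(i) over the used units."""
    probs = choice_probabilities(beta, used + available)
    return sum(math.log(p) for p in probs[:len(used)])

def gradient(beta, used, available):
    """Observed covariate totals minus their expectation under the model."""
    units = used + available
    probs = choice_probabilities(beta, units)
    grad = []
    for j in range(len(beta)):
        observed = sum(unit[j] for unit in used)
        expected = len(used) * sum(p * unit[j] for p, unit in zip(probs, units))
        grad.append(observed - expected)
    return grad

def fit(used, available, step=0.005, iterations=5000):
    """Maximize the log-likelihood by fixed-step gradient ascent."""
    beta = [0.0] * len(used[0])
    for _ in range(iterations):
        g = gradient(beta, used, available)
        beta = [b + step * gj for b, gj in zip(beta, g)]
    return beta

# Hypothetical data: one covariate (temperature, degrees C) per area
used = [[26.0], [27.0], [28.0]]
available = [[20.0], [22.0], [24.0], [26.0], [30.0]]
beta_hat = fit(used, available)
```

In this toy data set the used areas are warmer on average than the pooled set, so the fitted coefficient comes out positive. The fixed step size works here only because the problem is tiny and well scaled; real data would call for a proper optimizer.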

The upshot of the analysis is that the histograms generated by merely binning the observables were pretty close to those generated by this convoluted statistical analysis, with one or two exceptions.
