Common field data limitations can substantially bias sexual selection metrics

Enviar para

Título:
Common field data limitations can substantially bias sexual selection metrics
Autor: Cramer, Emily ; Kaiser, Sara Ann ; Webster, Mike S ; Ryder, T Brandt
Assuntos: Bateman gradient ; Behavior: evolution ; Behavior: reproductive ; Evolution: social ; extra-pair paternity (EPP) ; Jones index ; Methods: computer simulations ; Methods: experimental design ; opportunity for selection ; Opportunity for sexual selection ; parentage analysis ; Reproduction: sperm competition ; s'max
Notas: 10.5061/dryad.8kprr4xjk
RelationTypeNote: IsSupplementedBy -- 10.5061/dryad.8kprr4xjk
Empirical data were gathered from the published literature and by collecting additional unpublished information from published datasets.
10.1086/709547
This data supplement contains 7 items, with dual goals of providing all the code needed to replicate the results in the manuscript and providing code to researchers who want to explore bias in their own study systems. Note that results for individual study systems should be viewed with caution, as assumptions about the exact mechanisms of extra-pair copulations and fertilizations may not be appropriate for a particular study system. The utility for other researchers to explore their own study systems was a secondary goal. Thus the code is not streamlined for this application. Suggestions for how to approach this are at the end of this document. We provide: 1. Simulation code used to generate populations from multiple different species, with flexibility in the number of data limitations and their degree (MethodsMSSimCode_20203027.R). Within the simulation, each population's Bateman gradients are calculated and saved, as are data needed to calculate the other metrics. Other information about the population is not stored. The simulation code outputs four dataframes that are intended to be written as .csv files, to be re-opened and analysed separately. "GrandvarianceTerms" includes some descriptive information about the study population (number of extra-pair offspring), as well as true and detected variance in reproductive and copulation success. It is one row per simulation iteration. "GrandGradientSummary" contains the model output for eight Bateman gradients per simulation iteration (true and detected gradients for each of two sexes, with both a raw and a relativized form). The different types of Bateman gradient are identified in a column named "Model" with levels: FemaleBatemanApparent, MaleBatemanApparent, FemaleBatemanTrue, MaleBatemanTrue, MaleBatemanDetRel_Terr, MaleBatemanTrue_Both, FemaleBatemanApparentRel, and FemaleBatemanTrueRel. The latter four are the relativized forms of the gradient. "GrandPopSUmmaryTot" summarizes, for each replicate population, the number of extra-pair males that sired offspring in each brood. Length of the output will vary. "GrandPopSummaryNEPTot" summarizes, for each replicate population, the number of extra-pair offspring for each brood size. Length of the output will vary. 2. Though they are not used in the main study, we include two additions relevant for researchers wanting to model their own systems. First, you can add random noise to your estimates of m and s by un-commenting the code at lines 90-91 and 153-154 (consider commenting out lines 102 and 103). This incorporates up to 10% variation in m and s (and can easily be adjusted to add a different amount of noise). Second, you can incorporate spatial limitations. Because this was a longer piece of code, we have supplied it as a separate file (ModificationsToIncludeSpatialImpacts.R), which can be used to replace the code at lines 149-192 in the main simulation code. 3. Code used to analyze the output from the simulations (ExaminingSims_Methods_20200327_commented.R). The code for statistical analysis and visualization starts by re-shaping the data and combining content from the GrandVarianceTerms and GrandGradientSummary dataframes in order to make a dataframe for s'max (which is then written as a file and re-read from the file for convenience). The processing includes a number of steps that may not be useful for examining one's own study population, but that were needed for our analyses. We have erred on the side of including more than necessary, and noting places where we expect most readers would prefer to comment out the code (below in specific instructions for simulating individual populations). For the analyses presented in the paper, there are three main dataframes for statistics (ie., that contain the mean values across replicate iterations of the simulation) and three main dataframes with information on the individual iterations, which were used in preparing figures. Dataframes with the word Variance in the name are optimized for opportunity for selection. Dataframes with Gradients in the name are for the Bateman gradient. Dataframes with JonesIndex in the name are for s'max, the relativized Bateman gradient, and opportunity for sexual selection. For each separate analytical question, the main dataframe is filtered to contain only one source of data limitation at a time. Starting at line 390, this code begins the analysis presented in the main paper. Analysis follows the overall structure in the paper: first looking at basic simulation conditions, then at infertility, then predation, etc. Each of these major subdivisions has a header in the code, to facilitate quick access. Within each major heading, analysis is first from the variance dataframe (i.e., opportunity for selection), then from the gradients dataframe (Bateman gradients), then from the JonesIndex dataframe (first s'max then opportunity for sexual selection). Within each response variable, first the dataframe is filtered for graphing and analysis. Then there is the main statistical analysis, then calculation of Cohen's d, then graphing. Note that the structure of the data was different in the variance data frame (where a subset = statement is needed for the analysis to include only opportunity for selection) compared to the others (where each column represents a separate response variable and no subsetting is required) The overall approach was to save output from all the statistical tests into dataframes (one for ANOVA and one for parameter estimats), consolidate them, and construct output tables (starting line 2321). Finally, example code for visualizing how extra-pair offspring and extra-pair sires are distributed across broods begins at line 2371. This section is not exhaustive, as it is only supplementary to the main paper. 4 and 5. Our raw data input files, named "BrommerEtAlRawDataFile.csv" and "BrommerSpeciesOverViews.csv". The former contains three columns with no headers, and one row per nest. The first column gives a population identifying number. The second is the number of extra-pair offspring in the nest. The third is the brood size. The latter file contains a summary, one row per population (including all the populations from Brommer et al. 2010, of which we used only 4). It has column headers. The columns are: Population (a population identifying number, to match with the BrommerSpeciesOverViews.csv file), species (an abbreviated form of the common name), m (described in the paper; mean of the Poisson distribution for number of extra-pair copulations), s (described in the paper; the sperm competition paramter), N (the number of nests used in estimating m and s), PctEPY (the empirical estimate of the percent of offspring sired by extra-pair males), PctEPNests (the percent of nests containing at least one extra-pair offspring), and ApproxAveClutchSize (the average brood size, estimated from the same dataset used in estimating m and s), and Latin (the scientific name of the species, with genus and specific epithet joined by an underscore; note that this uses older nomenclature for compatibility with BirdTree.org). For most users, only Population, species, m, and s are necessary; other columns were used in down-stream analysis and are not needed in the simulation. 6 and 7. For researchers wanting to evaluate their own study system, we include our modification of Brommer et al.'s code for estimating m and s. This consists of two files; one is R code (FindingMandS.R) and the other is a .txt file specifying their Bayesian model (BrommerEtAlModel_Mod.txt.; simplfied to handle one species at a time). These files should be in the same directory when you run the code. The R code works by calling the Bayesian model. The input file is two columns with headers "epy.obs" and "brood.size", and one row per nest. For each nest, the number of extra pair offspring and the brood size is given. Advice for researchers wanting to examine their own study system: First estimate m and s. (FindingMandS.R) Next, for easiest use in our code, slightly modify your input file for estimating m and s to create one of the required input files (equivalent to BrommerEtAlRawDataFile.csv). It should be one row per brood, with no column headers. Column 1 is the species or population identification number. Column 2 is the number of extra-pair offspring. Column 3 is the brood size. These are named early in the R code. The second data file (BrommerSpeciesOverViews.csv) must include the same species or population identification number, m, and s. Follow the column names given above, or use our file as a template. Next, update the simulation code (MethodsMSSimCode_20203027.R) to your desired settings. You may also update your input file names (L 9 and 12) Update varibles: NSims (L72) for the number of replicates per simulation condition. PopsOfInterest (L73), a vector of the individual species to be considered (here labelled as populations; terminology inherited from Brommer et al. 2010, which examined several separate populations per species; species and population are redundant in this code). TotalInfertilityList (L79): a COUNT of the number of males infertile with fertile sperm precedence, assuming a population of 400 males PartialInfertilityList (L80): a COUNT of the number of males infertile without fertile sperm precedence, assuming a population of 400 males PredationCountList (L81): a COUNT of the number of nests depredated (whole-brood depredation), assuming 400 nests initially UnbandedMaleList (L82): a COUNT of the number of territorial males not sampled, assuming 400 males in the population UnfoundNestList (L83): a COUNT of the number of nests not sampled, assuming 400 nests in the population FloaterMalesList (L84): a COUNT of the nu
Descrição: Sexual selection studies widely estimate several metrics, but values may be inaccurate because standard field methods for studying wild populations produce limited data (e.g., incomplete sampling, inability to observe copulations directly). We compared four selection metrics (Bateman gradient, opportunity for sexual selection, opportunity for selection, and s’max) estimated with simulated complete and simulated limited data for 15 socially monogamous songbird species with extra-pair paternity (4-54% extra-pair offspring). Inferring copulation success from offspring parentage creates non-independence between these variables and systematically underestimates copulation success. We found that this introduces substantial bias for the Bateman gradient, opportunity for sexual selection, and s’max. Notably, 47.5% of detected Bateman gradients were significantly positive for females, suggesting selection on females to copulate with multiple males, though the true Bateman gradient was zero. Bias generally increased with the extent of other sources of data limitations tested (nest predation, male infertility, and unsampled floater males). Incomplete offspring sampling introduced bias for all metrics except the Bateman gradient, while incomplete sampling of extra-pair sires did not introduce additional bias when sires were a random subset of breeding males. Overall, our findings demonstrate how biases due to field data limitations can strongly impact the study of sexual selection.
Editor: Dryad
Data de criação/publicação: 2020
Idioma: Inglês

Links

org

Common field data limitations can substantially bias sexual selection metrics

Cramer, Emily ; Kaiser, Sara Ann ; Webster, Mike S ; Ryder, T Brandt

Dryad 2020

Buscando em bases de dados remotas. Favor aguardar.