RADseq data from Atlantic silversides used for linkage and QTL mapping from 2017-05-01 to 2018-05-09 (NCEI Accession 0292182)
This dataset contains biological data collected from 2017-05-01 to 2018-05-09. These data include taxon information. The data were collected with an automated DNA sequencer. These data were collected by Nina Overgaard Therkildsen of Cornell University and Hannes Baumann of University of Connecticut as part of the "Collaborative research: The genomic underpinnings of local adaptation despite gene flow along a coastal environmental cline (GenomAdapt)" project. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) submitted these data to NCEI on 2024-04-24.
The following is the text of the dataset description provided by BCO-DMO:
Dataset Description:
* Raw data from the RADseq libraries are available under NCBI BioProject accession number PRJNA771889 (see related dataset section).
* SNP genotype call files (VCF format) are available at https://doi.org/10.6084/m9.figshare.19521955.v1 (see related dataset section) and as supplemental files to this dataset.
Methods and Sampling:
We generated three crosses for linkage mapping, including two F1 families resulting from reciprocal crossing of wild-caught silversides from two adaptively divergent parts of the distribution range (Georgia and New York), and one F2 family from intercrossing laboratory-reared progeny from one of the F1 families. Because linkage mapping measures recombination during gamete production in the parents, the F1 families give us separate information about the wild-caught male and female founder fish from each separate population (the F0 progenitors), and the F2 map reflects recombination in the hybrid F1 progeny.
In the spring of 2017, spawning-ripe founders were caught by beach seine from Jekyll Island, Georgia (31°03’N, 81°26’W) and Patchogue, New York (40°45’N, 73°00’W) and transported live to the Rankin Seawater Facility at University of Connecticut's Avery Point campus. For each family, we strip-spawned a single male and a single female onto mesh screens submerged in seawater-filled plastic dishes, then transferred the fertilized embryos to rearing containers (20 L) placed in large temperature-controlled water baths with salinity (30 psu) and photoperiod held constant (15 L:9 D). Water baths were kept at 20°C for the New York-mother family and at 26°C for the Georgia-mother family, which increased hatching success by mimicking the ambient spawning temperatures at the two different latitudes. Post hatch, larvae were provided ad libitum rations of newly hatched brine shrimp nauplii (Artemia salina, brineshrimpdirect.com). At 22 days post hatch (dph), we sampled 138 full-sib progeny from each of the two F1 families to be genotyped. The remaining offspring from the Georgia-mother F1 family were reared to maturity in groups of equal density (40–50 individuals) in 24°C water baths. In spring 2018, one pair of adult F1 siblings from the Georgia-mother family was intercrossed to generate the F2 mapping population. At 70 dph, we sampled 221 full-sib F2 progeny for genotyping. In total, we analyzed 503 individuals: the two founders (male and female) and 138 offspring from each of the two F1 families, plus two additional F1 siblings from the Georgia-mother F1 family and their 221 F2 offspring. All animal care and euthanasia protocols were carried out in accordance with the University of Connecticut's Institutional Animal Care and Use Committee (A17-043).
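The sample total stated above can be checked with a short tally. This is only an arithmetic sketch; the dictionary keys are illustrative labels, not identifiers used in the dataset files.

```python
# Tally of genotyped individuals as described in the sampling text.
# Keys are illustrative labels, not names taken from the data files.
f1_families = {
    "Georgia-mother F1": {"founders": 2, "offspring": 138},
    "New York-mother F1": {"founders": 2, "offspring": 138},
}
f2_family = {"F1 parents": 2, "F2 offspring": 221}

f1_total = sum(sum(group.values()) for group in f1_families.values())
f2_total = sum(f2_family.values())
total_individuals = f1_total + f2_total

print(f"F1 families: {f1_total}, F2 family: {f2_total}, total: {total_individuals}")
```

The two F1 families contribute 280 individuals and the F2 family 223, which matches the 503 individuals reported.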
We extracted DNA from each individual with a Qiagen DNeasy tissue kit following the manufacturer's instructions and used double-digest restriction-site associated DNA (ddRAD) sequencing (Peterson et al., 2012) to identify and genotype single nucleotide polymorphisms (SNPs) for linkage map construction. We created two ddRAD libraries, each with a random subset of ~250 barcoded individuals, using restriction enzymes MspI and PstI (New England BioLabs cat. R0106S and R3140S, respectively), following library construction steps as in Peterson et al. (2012). We size-selected libraries for 400–650 bp fragments with a Pippin Prep instrument (Sage Science) and sequenced the libraries across six Illumina NextSeq500 lanes (75 bp single-end reads) at the Cornell Biotechnology Resource Centre. Raw reads were processed in Stacks v2.53 (Catchen et al., 2013) with the module process_radtags to discard low-quality reads and reads with ambiguous barcodes or RAD cut sites. The reads that passed the quality filters were demultiplexed to individual fastq files. To capture genomic regions potentially not included in the current reference genome assembly, we ran the ustacks module to assemble RAD loci de novo (rather than mapping to the reference genome). We required a minimum of three raw reads to form a stack (i.e., minimum read depth, default -m option) and allowed a maximum of four mismatches between stacks to merge them into a putative locus (-M option).
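As a rough illustration of the de novo assembly settings described above, the following sketch builds (but does not run) one ustacks command line per demultiplexed sample with -m 3 and -M 4. The file names, directory names, and sample IDs are hypothetical placeholders; the exact invocation used by the authors is not given in this record.

```python
# Build ustacks command lines with the parameters stated in the text:
# -m 3 (minimum reads to form a stack) and -M 4 (max mismatches between stacks).
# Paths and sample names below are hypothetical placeholders.

def ustacks_command(fastq_path, sample_id, out_dir, min_depth=3, max_mismatches=4):
    """Return the argument list for one ustacks run (nothing is executed)."""
    return [
        "ustacks",
        "-f", fastq_path,          # input reads for one individual
        "-i", str(sample_id),      # unique integer ID for the sample
        "-o", out_dir,             # output directory
        "-m", str(min_depth),
        "-M", str(max_mismatches),
    ]

sample_files = ["sample_001.fq.gz", "sample_002.fq.gz"]  # hypothetical names
commands = [
    ustacks_command(f"demultiplexed/{name}", sample_id, "stacks_out")
    for sample_id, name in enumerate(sample_files, start=1)
]

for cmd in commands:
    print(" ".join(cmd))
```

Keeping the commands as argument lists makes them straightforward to pass to a job scheduler or to `subprocess.run` once the paths point at real files.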
Because the founders contain all the possible alleles that can occur in the progeny (except from any new mutations), we assembled a catalogue of loci with cstacks using only the four wild-caught F0 progenitors. We built the catalogue with both sets of founders to allow cross-referencing of common loci across the resulting F1 maps and we allowed for a maximum of four mismatches between loci (-n option). We matched loci from all progeny against the catalogue with sstacks, transposed the data with tsv2bam to be organized by sample rather than locus, called variable sites across all individuals, and genotyped each individual at those sites with gstacks using the default SNP model (marukilow) with a genotype likelihood ratio test critical value (α) of 0.05. Finally, we ran the populations module three times to generate a genotype output file for each mapping cross. For each run of populations, we specified the type of test cross (--map-type option cp or F2), pruned unshared SNPs to reduce haplotype-wise missing data (-H option), and exported loci present in at least 80% of individuals in that cross (-r option) to a VCF file, without restricting the number of SNPs retained per locus.
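The 80% presence threshold can be illustrated on VCF-style records. The sketch below is a simplified per-site approximation written for this description: it is not the Stacks populations implementation, which applies the -r filter per locus, and the toy records are invented for the example.

```python
# Simplified illustration of an 80% presence filter on VCF-style lines.
# Not the Stacks implementation; the toy records below are invented.

def is_called(sample_field):
    """True if the GT subfield has no missing allele ('.')."""
    genotype = sample_field.split(":")[0]
    alleles = genotype.replace("|", "/").split("/")
    return all(allele != "." for allele in alleles)

def filter_sites(vcf_lines, min_fraction=0.8):
    """Keep (chrom, pos, call_rate) for sites genotyped in enough samples."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        samples = fields[9:]  # sample columns follow the 9 fixed VCF columns
        if not samples:
            continue
        rate = sum(is_called(s) for s in samples) / len(samples)
        if rate >= min_fraction:
            kept.append((fields[0], int(fields[1]), rate))
    return kept

toy_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3\ts4\ts5",
    "locus1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0\t1/1\t0/1\t./.",
    "locus2\t25\t.\tC\tT\t.\tPASS\t.\tGT\t0/1\t./.\t./.\t0/0\t1/1",
]

kept_sites = filter_sites(toy_vcf)
print(kept_sites)
```

In the toy input, the first site is called in four of five samples and is retained, while the second is called in three of five and is dropped.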
Dataset Citation
- Cite as: Therkildsen, Nina Overgaard; Baumann, Hannes (2024). RADseq data from Atlantic silversides used for linkage and QTL mapping from 2017-05-01 to 2018-05-09 (NCEI Accession 0292182). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://www.ncei.noaa.gov/archive/accession/0292182. Accessed [date].
Dataset Identifiers
ISO 19115-2 Metadata
gov.noaa.nodc:0292182
Ordering Instructions | Contact NCEI for other distribution options and instructions. |
Distributor | NOAA National Centers for Environmental Information, +1-301-713-3277, NCEI.Info@noaa.gov |
Dataset Point of Contact | NOAA National Centers for Environmental Information, ncei.info@noaa.gov |
Time Period | 2017-05-01 to 2018-05-09 |
Spatial Bounding Box Coordinates | West: -81.43, East: -73, South: 31.02, North: 40.75 |
Data Presentation Form | Digital table - digital representation of facts or figures systematically displayed, especially in columns |
Dataset Progress Status | Complete - production of the data has been completed; Historical archive - data has been stored in an offline storage facility |
Data Update Frequency | As needed |
Purpose | This dataset is available to the public for a wide variety of uses including scientific research and analysis. |
Last Modified: 2024-05-31T15:15:28Z
For questions about the information on this page, please email: ncei.info@noaa.gov