Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) location information on samples obtained on Gould (LMG1411) in the Western Antarctica Peninsula during 2014 (Polar Transcriptomes project) (NCEI Accession 0278260)

This dataset contains biological and survey - biological data collected aboard the ARSV Laurence M. Gould during cruise LMG1401 from 2014-01-01 to 2014-12-31. The data include genus, phylum, species, and taxon. The instruments used to collect the data include a Bioanalyzer and an inverted microscope. The data were collected by Dr. Adrian Marchetti of the University of North Carolina at Chapel Hill as part of the "Iron and Light Limitation in Ecologically Important Polar Diatoms: Comparative Transcriptomics and Development of Molecular Indicators (Polar_Transcriptomes)" project. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) submitted these data to NCEI on 2019-04-17.

The following is the text of the dataset description provided by BCO-DMO:

MMETSP location information on samples obtained on LMG1411.

Diatom isolates were obtained from the Western Antarctic Peninsula surface waters.

Dataset Citation

Cite as: Marchetti, Adrian (2023). Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) location information on samples obtained on Gould (LMG1411) in the Western Antarctica Peninsula during 2014 (Polar Transcriptomes project) (NCEI Accession 0278260). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://www.ncei.noaa.gov/archive/accession/0278260. Accessed [date].

Dataset Identifiers

ISO 19115-2 Metadata

gov.noaa.nodc:0278260

Download Data	HTTPS (download): Navigate directly to the URL for data access and direct download. FTP (download): These data are available through the File Transfer Protocol (FTP). FTP is no longer supported by most internet browsers; copy and paste the FTP link to the data into an FTP client (e.g., FileZilla or WinSCP).
Distribution Formats	TSV
Ordering Instructions	Contact NCEI for other distribution options and instructions.
Distributor	NOAA National Centers for Environmental Information +1-301-713-3277 NCEI.Info@noaa.gov
Dataset Point of Contact	NOAA National Centers for Environmental Information ncei.info@noaa.gov

Time Period	2014-01-01 to 2014-12-31
Spatial Bounding Box Coordinates	West: 58.853 East: 132.4 South: -77.8333 North: 77.8333

General Documentation

NCEI Dataset Landing Page
Navigate directly to the URL for a descriptive web page with download links.
Descriptive Information
Navigate directly to the URL for a descriptive web page with download links.

Associated Resources

Biological, chemical, physical, biogeochemical, ecological, environmental and other data collected from around the world during historical and contemporary periods of biological and chemical oceanographic exploration and research managed and submitted by the Biological and Chemical Oceanography Data Management Office (BCO-DMO)
- NCEI Collection
  Navigate directly to the URL for data access and direct download.
Marchetti, A. (2016) Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) location information on samples obtained on Gould (LMG1411) in the Western Antarctica Peninsula during 2014 (Polar Transcriptomes project). Biological and Chemical Oceanography Data Management Office (BCO-DMO). Dataset version 2016-11-21. https://doi.org/10.1575/1912/bco-dmo.665427.1
- https://doi.org/10.1575/1912/bco-dmo.665427.1 (download)
  originator dataset

gov.noaa.nodc:BCO-DMO

Publication Dates	publication: 2023-05-15
Data Presentation Form	Digital table - digital representation of facts or figures systematically displayed, especially in columns
Dataset Progress Status	Complete - production of the data has been completed; Historical archive - data has been stored in an offline storage facility
Data Update Frequency	As needed
Supplemental Information	Acquisition Description: Nine species of diatoms were isolated from the Western Antarctic Peninsula along the Palmer LTER sampling grid in 2013 and 2014. Isolations were performed by single-cell isolation with a micropipette under an Olympus CKX41 inverted microscope (Anderson 2005). Diatom species were identified by morphological characterization and 18S rRNA gene (rDNA) sequencing. DNA was extracted with the DNeasy Plant Mini Kit (Qiagen) according to the manufacturer’s protocols. The nuclear 18S rDNA region was amplified with standard PCR protocols using eukaryote-specific, universal 18S forward and reverse primers. Primer sequences were obtained from Medlin et al. (1982). The amplified region is approximately 1800 base pairs (bp) long. Pseudo-nitzschia species are often difficult to identify by their 18S rDNA sequence; therefore, the taxonomic identification of P. subcurvata was further supported by sequencing the 18S-ITS1-5.8S regions. This region was amplified with the 18SF-euk and 5.8SR_euk primers of Hubbard et al. (2008). PCR products were purified using either the QIAquick PCR Purification Kit (Qiagen) or ExoSAP-IT (Affymetrix) and sequenced by Sanger DNA sequencing (Genewiz). Sequences were edited using Geneious Pro software (http://www.geneious.com, Kearse et al., 2012), and BLASTn sequence homology searches were performed against the NCBI nucleotide non-redundant (nr) database to determine species, with an identity cutoff of 98%. Diatom phylogenetic analysis was performed with Geneious Pro and included 71 additional diatom 18S rDNA sequences from publicly available genomes and transcriptomes, including those in the MMETSP database. Diatom sequences were trimmed to the same length and aligned with MUSCLE (Edgar 2004).
A phylogenetic tree was created in MEGA using the maximum-likelihood method of tree reconstruction, the Jukes-Cantor genetic distance model (Jukes and Cantor 1969), and 100 bootstrap replicates. Illumina TruSeq adapters and poly-A tails were trimmed from raw reads using the Fastx_toolkit clipper function. Fastq_quality_filter was used to remove poor-quality sequences, such that each remaining sequence had a quality score of at least 20 in at least 80% of its bases. Any remaining sequences shorter than 50 bp were also removed. Merged files were assembled de novo using Trinity (Grabherr et al. 2011). The resulting assembly was filtered to remove contigs shorter than 200 bp. Trinity-assembled contigs that exhibited sequence overlap were grouped into isogroups, which were then used for sequence homology searches (BLASTx E-value ≤ 10^-4) against the Kyoto Encyclopedia of Genes and Genomes (KEGG) databases (Kanehisa 2006). BUSCO (Benchmarking Universal Single-Copy Orthologs) was used to assess the completeness of genomes and transcriptomes based on sets of single-copy orthologous groups derived from OrthoDB that are highly conserved within multiple lineages (Felipe et al. 2015). Complete, duplicated, and fragmented orthologs were determined by meeting an ‘expected score’ and having aligned sequences within two standard deviations of the BUSCO gene’s length. A second metric of completeness was obtained by evaluating conserved pathways, such as the ribosome and spliceosome, using the single-directional best-hit method in the KEGG Automatic Annotation Server (KAAS) (Moriya et al. 2007). Finally, contiguity was calculated at the 0.75 level according to Martin and Wang (2011) with custom scripts. For each transcriptome, unassembled sequence reads were aligned to the final Trinity assembly using Bowtie 2 (Langmead 2012).
Mapped reads were normalized using the Reads Per Kilobase per Million reads (RPKM) method (Mortazavi et al. 2008). Gene biogeographical distributions: Twenty genes of interest were selected to investigate the molecular basis of iron and light limitation in polar diatoms. Reference sequences for each of these genes were obtained from the F. cylindrus and P. tricornutum JGI genome portals and from the T. pseudonana and T. oceanica NCBI and GenBank repositories. Reference sequences were identified in the transcriptomes by translated nucleotide homology searches (tBLASTn) with an e-value cutoff of <10^-5. A reciprocal tBLASTn homology search was performed for each transcriptome against the KEGG GENES database, using the single-directional best-hit method in the KAAS online tool to ensure consistent gene annotations (Moriya et al. 2007). Subsequently, reference sequences were identified in the MMETSP protein database by BLASTp (e-value <10^-5) homology searches among the diatom transcriptomes. The transcriptomes and their associated latitudes and longitudes were obtained from iMicrobe Data Commons (Project Code CAM_P_0001000) and the National Center for Marine Algae and Microbiota (NCMA). Custom MATLAB scripts were used to map the global biogeographical distribution of key genes of interest.
Purpose	This dataset is available to the public for a wide variety of uses including scientific research and analysis.
Use Limitations	accessLevel: Public Distribution liability: NOAA and NCEI make no warranty, expressed or implied, regarding these data, nor does the fact of distribution constitute such a warranty. NOAA and NCEI cannot assume liability for any damages caused by any errors or omissions in these data. If appropriate, NCEI can only certify that the data it distributes are an authentic copy of the records that were accepted for inclusion in the NCEI archives.
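
The read filtering and RPKM normalization described under Supplemental Information can be illustrated with a short Python sketch. This is not the project's pipeline (which used Fastx_toolkit, Bowtie 2, and custom scripts); the function names, argument names, and the use of total mapped reads as the RPKM denominator are assumptions made for illustration.

```python
def passes_read_filter(quals, min_q=20, min_percent=80, min_len=50):
    """Hypothetical helper: apply the thresholds quoted in the description.

    quals is a list of per-base Phred quality scores for one trimmed read.
    A read is kept if it is at least min_len bases long and at least
    min_percent of its bases have quality >= min_q.
    """
    if len(quals) < min_len:
        return False
    good = sum(1 for q in quals if q >= min_q)
    # Integer comparison avoids floating-point rounding at the 80% boundary.
    return good * 100 >= min_percent * len(quals)


def rpkm(read_count, contig_length_bp, total_mapped_reads):
    """Reads per kilobase of contig per million mapped reads (assumed denominator)."""
    return read_count * 1e9 / (contig_length_bp * total_mapped_reads)
```

For example, under these assumptions a 2000 bp contig with 500 mapped reads in a library of 10 million mapped reads has an RPKM of 25.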

Cited Authors	Marchetti, Adrian University of North Carolina at Chapel Hill
Principal Investigators	Adrian Marchetti University of North Carolina - Chapel Hill (UNC)
Contributors	University of North Carolina - Chapel Hill (UNC)
Resource Providers	Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Points of Contact	Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Publishers	NOAA National Centers for Environmental Information
Acknowledgments	Funding provided by NSF Division of Ocean Sciences (NSF OCE) Award Number: PLR-1341479 Award URL: http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1341479

Theme keywords	NODC DATA TYPES THESAURUS: SPECIES IDENTIFICATION, TAXONOMIC CODE; NODC OBSERVATION TYPES THESAURUS: biological, survey - biological; WMO_CategoryCode: oceanography; BCO-DMO Standard Parameters: genus, latitude, longitude, longitude from 0 to 360 degrees, no standard parameter, phylum, species, taxon; Originator Parameter Names: MMETSP_ID_paste, NCGR_PEP_ID_paste, TAXON_ID_mean, evalue_mean, evalue_paste, genus_max, lat, lat2, lon, lon_360, phylum_max, species_max, strain_max
Data Center keywords	NODC COLLECTING INSTITUTION NAMES THESAURUS: University of North Carolina - Chapel Hill; NODC SUBMITTING INSTITUTION NAMES THESAURUS: Biological and Chemical Oceanography Data Management Office; Global Change Master Directory (GCMD) Data Center Keywords: BCO-DMO > Biological and Chemical Oceanography Data Management Office
Platform keywords	NODC PLATFORM NAMES THESAURUS: R/V Laurence M. Gould; BCO-DMO Platform Names: ARSV Laurence M. Gould; Global Change Master Directory (GCMD) Platform Keywords: Ships; ICES/SeaDataNet Ship Codes: LAURENCE M. GOULD (call sign: WCX7445, ICES code: 33LG, 1998)
Instrument keywords	NODC INSTRUMENT TYPES THESAURUS: microscope; BCO-DMO Standard Instruments: Bioanalyzer, Inverted Microscope; Global Change Master Directory (GCMD) Instrument Keywords: MICROSCOPES > MICROSCOPES; Originator Instrument Names: Agilent Bioanalyzer 2100, Olympus CKX41
Place keywords	Provider Place Names: Antarctica
Project keywords	BCO-DMO Standard Projects: Iron and Light Limitation in Ecologically Important Polar Diatoms: Comparative Transcriptomics and Development of Molecular Indicators (Polar_Transcriptomes); Provider Cruise IDs: LMG1401; Provider Funding Award Information: Funding provided by NSF Division of Ocean Sciences (NSF OCE) Award Number: PLR-1341479
Keywords	NCEI ACCESSION NUMBER: 0278260
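
The parameter list above includes both lon and lon_360 ("longitude from 0 to 360 degrees"). The Python sketch below shows one plausible way to derive the 0 to 360 form from a signed longitude when reading the TSV distribution. The sample rows are made up for illustration and are not taken from the dataset, and the helper names are hypothetical; only the column names lat, lon, lon_360, and species_max come from the keyword list.

```python
import csv
import io


def lon_to_360(lon_deg):
    """Map a signed longitude (degrees east) onto the 0 to 360 convention."""
    return lon_deg % 360.0


# Illustrative rows only: these coordinates are invented, not dataset values.
SAMPLE_TSV = (
    "species_max\tlat\tlon\n"
    "example_species_a\t-64.9\t-64.05\n"
    "example_species_b\t48.5\t132.4\n"
)


def add_lon_360(tsv_text):
    """Parse TSV text and return row dicts with a computed lon_360 value."""
    rows = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        row["lon_360"] = lon_to_360(float(row["lon"]))
        rows.append(row)
    return rows
```

Western longitudes such as -64.05 map to values above 180, while eastern longitudes are unchanged.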

Use Constraints	Cite as: Marchetti, Adrian (2023). Marine Microbial Eukaryote Transcriptome Sequencing Project (MMETSP) location information on samples obtained on Gould (LMG1411) in the Western Antarctica Peninsula during 2014 (Polar Transcriptomes project) (NCEI Accession 0278260). [indicate subset used]. NOAA National Centers for Environmental Information. Dataset. https://www.ncei.noaa.gov/archive/accession/0278260. Accessed [date].
Data License	This dataset is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. SPDX License: Creative Commons Attribution 4.0 International (CC-BY-4.0)
Access Constraints	Use liability: NOAA and NCEI cannot provide any warranty as to the accuracy, reliability, or completeness of furnished data. Users assume responsibility to determine the usability of these data. The user is responsible for the results of any application of this data for other than its intended purpose.
Fees	In most cases, electronic downloads of the data are free. However, fees may apply for custom orders, data certifications, copies of analog materials, and data distribution on physical media.

Lineage information for: dataset
Processing Steps	2023-05-15T06:38:55Z - NCEI Accession 0278260 v1.1 was published.
Output Datasets	NCEI Accession 0278260 v1.1 NCEI Accession 0278260 v1.1 (download) published 2023-05-15T06:38:55Z

Acquisition Information (collection)
Instrument	microscope
Platform	R/V Laurence M. Gould

Last Modified: 2024-05-31T15:15:28Z
For questions about the information on this page, please email: ncei.info@noaa.gov
