1073 lines (1073 loc) · 370 KB
HTAN.model.csv
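The Validation Rules column of this model attaches regular expressions to several attributes (Filename, HTAN Participant ID, HTAN Biospecimen ID, HTAN Data File ID). As a minimal sketch of how such rules could be checked, the Python snippet below copies those patterns verbatim and applies them with the standard `re` module. The `RULES` mapping and the `is_valid` helper are hypothetical names introduced for illustration, and reading `regex match` as `re.match` and `regex search` as `re.search` is an assumption, not a description of any particular validator.

```python
import re

# Patterns copied verbatim from the Validation Rules column of the model.
PARTICIPANT_ID = r"^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))$"
SPECIMEN_OR_FILE_ID = (
    r"^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$"
)
FILENAME = r"^.+\/\S*$"

# Hypothetical lookup table: attribute name -> (rule mode, pattern).
# "match" anchors at the start of the value, "search" scans the whole value.
RULES = {
    "HTAN Participant ID": ("match", PARTICIPANT_ID),
    "HTAN Biospecimen ID": ("match", SPECIMEN_OR_FILE_ID),
    "HTAN Data File ID": ("match", SPECIMEN_OR_FILE_ID),
    "Filename": ("search", FILENAME),
}


def is_valid(attribute: str, value: str) -> bool:
    """Return True if value satisfies the regex rule listed for attribute."""
    mode, pattern = RULES[attribute]
    matcher = re.match if mode == "match" else re.search
    return matcher(pattern, value) is not None


if __name__ == "__main__":
    # Illustrative values only; they are not taken from real HTAN data.
    for attribute, value in [
        ("HTAN Participant ID", "HTA1_1001"),
        ("HTAN Participant ID", "HTA17_1"),
        ("HTAN Biospecimen ID", "HTA1_1001_2"),
        ("Filename", "folder/file.txt"),
        ("Filename", "file.txt"),
    ]:
        print(attribute, value, is_valid(attribute, value))
```

Note that the Filename pattern requires at least one `/` in the value, so a bare file name with no directory part does not pass under this reading.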
1 | Attribute | Description | Valid Values | DependsOn | Properties | Required | Parent | DependsOn Component | Source | Validation Rules |
---|---|---|---|---|---|---|---|---|---|---|
2 | Assay | A planned process with the objective to produce information about the material entity that is the evaluant, by physically examining it or its proxies.[OBI_0000070] | FALSE | http://purl.obolibrary.org/obo/OBI_0000070 | ||||||
3 | Device | A thing made or adapted for a particular purpose, especially a piece of mechanical or electronic equipment | FALSE | Assay | https://w3id.org/biolink/vocab/Device | |||||
4 | Sequencing | Module for next generation sequencing assays | FALSE | Assay | ||||||
5 | Component | Category of metadata (e.g. Diagnosis, Biospecimen, scRNA-seq Level 1, etc.); provide the same one for all items/rows. | TRUE | https://w3id.org/biolink/vocab/category | ||||||
6 | Patient | HTAN patient | Component, HTAN Participant ID | FALSE | Individual Organism | Demographics, Family History, Exposure, Follow Up, Diagnosis, Therapy, Molecular Test | ||||
7 | File | A type of Information Content Entity specific to OS | FALSE | Information Content Entity | https://w3id.org/biolink/vocab/DataFile | |||||
8 | Filename | Name of a file | TRUE | regex search ^.+\/\S*$ | ||||||
9 | File Format | Format of a file (e.g. txt, csv, fastq, bam, etc.) | hdf5, bedgraph, idx, idat, bam, bai, excel, powerpoint, tif, tiff, OME-TIFF, png, doc, pdf, fasta, fastq, sam, vcf, bcf, maf, bed, chp, cel, sif, tsv, csv, txt, plink, bigwig, wiggle, gct, bgzip, zip, seg, html, mov, hyperlink, svs, md, flagstat, gtf, raw, msf, rmd, bed narrowPeak, bed broadPeak, bed gappedPeak, avi, pzfx, fig, xml, tar, R script, abf, bpm, dat, jpg, locs, Sentrix descriptor file, Python script, sav, gzip, sdf, RData, hic, ab1, 7z, gff3, json, sqlite, svg, sra, recal, tranches, mtx, tagAlign, dup, DICOM, czi, mex, cloupe, am, cell am, mpg, m, mzML, scn, dcc, rcc, pkc, sf, bedpe | TRUE | ||||||
10 | CDS Sequencing Template | CDS compatible template file, includes attributes for Genomic Reference, Library Layout, Data Type, Sequencing Platform, Library Selection Method | Component, Filename, File Format, HTAN Data File ID, HTAN Parent Biospecimen ID, CDS library_id, CDS library_strategy, CDS library_source, CDS library_selection, CDS library_layout, CDS platform, CDS instrument_model, CDS design_description, CDS reference_genome_assembly, CDS custom_assembly_fasta_file_for_alignment, CDS bases, CDS number_of_reads, CDS coverage, CDS avg_read_length, CDS sequence_alignment_software | TRUE | ||||||
11 | CDS library_id | Short unique identifier for the sequencing library. | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | ||||
12 | CDS library_strategy | Library strategy | AMPLICON, ATAC-seq, Bisulfite-Seq, ChIA-PET, ChIP-Seq, CLONE, CLONEEND, CTS, DNase-Hypersensitivity, EST, FAIRE-seq, FINISHING, FL-cDNA, Hi-C, MBD-Seq, MeDIP-Seq, miRNA-Seq, MNase-Seq, MRE-Seq, ncRNA-Seq, OTHER, POOLCLONE, RAD-Seq, RIP-Seq, RNA-Seq, SELEX, ssRNA-seq, Synthetic-Long-Read, Targeted-Capture, Tethered Chromatin Conformation Capture, Tn-Seq, WCS, WGA, WGS, WXS | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | |||
13 | CDS library_source | The Library Source specifies the type of source material that is being sequenced | GENOMIC, GENOMIC SINGLE CELL, METAGENOMIC, METATRANSCRIPTOMIC, OTHER, SYNTHETIC, TRANSCRIPTOMIC, TRANSCRIPTOMIC SINGLE CELL, VIRAL RNA | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | |||
14 | CDS library_selection | Library Selection Method | 5-methylcytidine antibody, CAGE, cDNA, cDNA_oligo_dT, cDNA_randomPriming, CF-H, CF-M, CF-S, CF-T, ChIP, DNAse, HMPR, Hybrid Selection, Inverse rRNA, MBD2 protein methyl-CpG binding domain, MDA, MF, MNase, MSLL, Oligo-dT, other, Padlock probes capture method, PCR, PolyA, RACE, RANDOM, RANDOM PCR, Reduced Representation, repeat fractionation, Restriction Digest, RT-PCR, size fractionation, unspecified | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | |||
15 | CDS library_layout | Paired-end or Single | Paired-end, Single-end | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | |||
16 | CDS platform | Sequencing Platform used for Sequencing | LS454, ABI_SOLID, BGISEQ, CAPILLARY, COMPLETE_GENOMICS, HELICOS, ILLUMINA, ION_TORRENT, OXFORD_NANOPORE, PACBIO_SMRT | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | |||
17 | CDS instrument_model | Instrument model used for sequencing | 454 GS, 454 GS 20, 454 GS FLX, 454 GS FLX+, 454 GS FLX Titanium, 454 GS Junior, HiSeq X Five, HiSeq X Ten, Illumina Genome Analyzer, Illumina Genome Analyzer II, Illumina Genome Analyzer IIx, Illumina HiScanSQ, Illumina HiSeq 1000, Illumina HiSeq 1500, Illumina HiSeq 2000, Illumina HiSeq 2500, Illumina HiSeq 3000, Illumina HiSeq 4000, Illumina iSeq 100, Illumina NovaSeq 6000, Illumina MiniSeq, Illumina MiSeq, NextSeq 500, NextSeq 550, Helicos HeliScope, AB 5500 Genetic Analyzer, AB 5500xl Genetic Analyzer, AB 5500x-Wl Genetic Analyzer, AB SOLiD 3 Plus System, AB SOLiD 4 System, AB SOLiD 4hq System, AB SOLiD PI System, AB SOLiD System, AB SOLiD System 2.0, AB SOLiD System 3.0, Complete Genomics, PacBio RS, PacBio RS II, PacBio Sequel, PacBio Sequel II, Ion Torrent PGM, Ion Torrent Proton, Ion Torrent S5 XL, Ion Torrent S5, AB 310 Genetic Analyzer, AB 3130 Genetic Analyzer, AB 3130xL Genetic Analyzer, AB 3500 Genetic Analyzer, AB 3500xL Genetic Analyzer, AB 3730 Genetic Analyzer, AB 3730xL Genetic Analyzer, GridION, MinION, PromethION, BGISEQ-500, DNBSEQ-G400, DNBSEQ-T7, DNBSEQ-G50, MGISEQ-2000RS | TRUE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | |||
18 | CDS design_description | Free-form description of the methods used to create the sequencing library; a brief 'materials and methods' section. | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | ||||
19 | CDS reference_genome_assembly | Provide this only if you are submitting a bam file aligned against an NCBI assembly. | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | ||||
20 | CDS custom_assembly_fasta_file_for_alignment | Please provide the name of the custom assembly fasta file used during alignment | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | ||||
21 | CDS bases | Count of unique basecalls present in the data. Please count each base only once if using secondary alignments. | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | int | ||||
22 | CDS number_of_reads | Count of the number of reads in the data. Please count each read only once if using secondary alignments. | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | int | ||||
23 | CDS coverage | Depth of coverage on assembly used. Found by (Unique Aligned Basecalls)/(Reference Length) | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | int | ||||
24 | CDS avg_read_length | Found by (Bases)/(Reads) | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | int | ||||
25 | CDS sequence_alignment_software | The name of the software program used to align nucleotide sequencing data. | FALSE | Sequencing | https://dataservice.datacommons.cancer.gov/#/resources | str | ||||
26 | Checksum | MD5 checksum of the BAM file | TRUE | Information Content Entity | ||||||
27 | HTAN Data File ID | Self-identifier for this data file: the HTAN ID of this file, based on the HTAN ID SOP (e.g. HTANx_yyy_zzz) | TRUE | File | https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing | regex match ^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$ warning | ||||
28 | HTAN Participant ID | HTAN ID associated with a patient based on HTAN ID SOP (eg HTANx_yyy ) | TRUE | Patient | https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing | regex match ^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))$ warning | ||||
29 | HTAN Biospecimen ID | HTAN ID associated with a biosample based on HTAN ID SOP (eg HTANx_yyy_zzz) | TRUE | Biospecimen | https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing | regex match ^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$ warning | ||||
30 | HTAN Parent ID | HTAN ID of parent from which the biospecimen was obtained. Parent could be another biospecimen or a research participant. | TRUE | Biospecimen | https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing | |||||
31 | HTAN Parent Biospecimen ID | HTAN Biospecimen Identifier (e.g. HTANx_yyy_zzz) indicating the biospecimen(s) from which these files were derived; multiple parent biospecimens should be comma-separated | TRUE | Biospecimen | https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing | |||||
32 | HTAN Parent Data File ID | HTAN Data File Identifier indicating the file(s) from which these files were derived | TRUE | File | ||||||
33 | Clinical Data Tier 2 | Tier 2 Cancer Data | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Sentinel Lymph Node Count, Sentinel Node Positive Assessment Count, Tumor Extranodal Extension Indicator, Satellite Metastasis Present Indicator, Other Biopsy Resection Site, Extent of Tumor Resection, Prior Sites of Radiation, Immunosuppression, Concomitant Medication Received Type, Family Member Vital Status Indicator, COVID19 Occurrence Indicator, COVID19 Current Status, COVID19 Positive Lab Test Indicator, COVID19 Antibody Testing, COVID19 Complications Severity, COVID19 Cancer Treatment Followup, Ecig vape use, Ecig vape 30 day use num, Ecig vape times per day, Type of smoke exposure cumulative years, Chewing tobacco daily use count, Second hand smoke exposure years, Known Genetic Predisposition Mutation, Hereditary Cancer Predisposition Syndrome, Cancer Associated Gene Mutations, Mutational Signatures, Mismatch Repair System Status, Lab Tests for MMR Status, Mode of Cancer Detection, Education Level, Country of Birth, Medically Underserved Area, Rural vs Urban, Cancer Incidence, Cancer Incidence Location | FALSE | Patient | |||||
34 | SRRS Clinical Data Tier 2 | Cancer related clinical data specific to SRRS | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Education Level, Country of Birth, Medically Underserved Area, Rural vs Urban, Cancer Incidence, Cancer Incidence Location, Ethnicity, Gender, Race, Vital Status, Age at Diagnosis, Days to Last Follow up, Days to Last Known Disease Status, Days to Recurrence, Last Known Disease Status, Morphology, Primary Diagnosis, Progression or Recurrence, Site of Resection or Biopsy, Tissue or Organ of Origin, NCI Atlas Cancer Site, Tumor Grade, Pack Years Smoked, Years Smoked, Days to Follow Up, Gene Symbol, Molecular Analysis Method, Test Result, Treatment Type, Tumor Largest Dimension Diameter | FALSE | Patient | |||||
35 | Lung Cancer Tier 3 | Lung cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Lung Cancer Detection Method Type, Lung Cancer Participant Procedure History, Lung Adjacent Histology Type, Lung Tumor Location Anatomic Site, Lung Tumor Lobe Bronchial Location, Current Lung Cancer Symptoms, Lung Topography, Lung Cancer Harboring Genomic Aberrations | FALSE | Patient | |||||
36 | Colorectal Cancer Tier 3 | Colorectal cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Colorectal Cancer Detection Method Type, History of Prior Colon Polyps, Family Colon Cancer History Indicator, Family Medical History Colorectal Polyp Diagnosis, Immediate Family History Endometrial Cancer, Immediate Family History Ovarian Cancer, Patient Inflammatory Bowel Disease Personal Medica History, Patient Colonoscopy Performed Indicator, Colorectal Cancer Tumor Border Configuration, MLH1 Promoter Methylation Status, Colorectal Cancer KRAS Indicator, Colon Polyp Occurence Indicator, Family History Colorectal Polyp, Colorectal Polyp New Indicator, Colorectal Polyp Shape, Size of Polyp Removed, Colorectal Polyp Count, Colorectal Polyp Type, Colorectal Polyp Adenoma Type | FALSE | Patient | |||||
37 | Breast Cancer Tier 3 | Breast cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Breast Carcinoma Detection Method Type, Breast Carcinoma Histology Category, Invasive Lobular Breast Carcinoma Histologic Category, Invasive Ductal Breast Carcinoma Histologic Category, Breast Biopsy Procedure Finding Type, Breast Quadrant Site, Breast Cancer Assessment Tests, Breast Cancer Genomic Test Performed, Mammaprint Risk Group, Oncotype Risk Group, Breast Carcinoma Estrogen Receptor Status, Breast Carcinoma Progesteroner Receptor Status, Breast Cancer Allred Estrogen Receptor Score, Prior Invasive Breast Disease, Breast Carcinoma ER Status Percentage Value, Breast Carcinoma PR Status Percentage Value, HER2 Breast Carcinoma Copy Number Total, Breast Carcinoma Centromere 17 Copy Number, Breast Carcinoma HER2 Centromere17 Copynumber Total, Breast Carcinoma HER2 Chromosome17 Ratio, Breast Carcinoma Surgical Procedure Name, Breast Carcinoma HER2 Ratio Diagnosis, Breast Carcinoma HER2 Status, Hormone Therapy Breast Cancer Prevention Indicator, Breast Carcinoma ER Staining Intensity, Breast Carcinoma PR Staining Intensity, Oncotype Score, Breast Imaging Performed Type, Multifocal Breast Carcinoma Present Indicator, Multicentric Breast Carcinoma Present Indicator, BIRADS Mammography Breast Density Category | FALSE | Patient | |||||
38 | Neuroblastoma and Glioma Tier 3 | Brain cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, CNS Tumor Primary Anatomic Site, Glioma Specific Metastasis Sites, Glioma Specific Radiation Field, Supra Tentorial Ependymoma Molecular Subgroup, Infra Tentorial Ependymoma Molecular Subgroup, Neuroblastoma MYCN Gene Amplification Status | FALSE | Patient | |||||
39 | Acute Lymphoblastic Leukemia Tier 3 | Acute Lymphoblastic Leukemia attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Specimen Blast Count Percentage Value, NCI ALL Risk Group, MRD ALL Diagnostic Sensitivity, CNS Leukemia Status | FALSE | Patient | |||||
40 | Ovarian Cancer Tier 3 | Ovarian cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Ovarian Cancer Histologic Subtype, Ovarian Cancer Surgical Outcome, Ovarian Cancer Platinum Status | FALSE | Patient | |||||
41 | Prostate Cancer Tier 3 | Prostate cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Location Extent Extraprostatic Extension, Location Nature Positive Margins, Seminal Vesicle Invasion, Prostate Carcinoma Histologic Type, Prostate Cancer Local Extent, Additonal Findings Uninvolved Prostate, Prostate Cancer Cytologic Morphologic Subtypes | FALSE | Patient | |||||
42 | Sarcoma Tier 3 | Sarcoma specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Sarcoma Subtype, Sarcoma Diagnosis Classification Category, Sarcoma Tumor Extension Type | FALSE | Patient | |||||
43 | Pancreatic Cancer Tier 3 | Pancreatic cancer specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Pancreas Precancer Histopathologic Grade, Pancreatic IPMN Pathology Epithelial Subtype, Pancreatic Duct Final Pathology Type | FALSE | Patient | |||||
44 | Melanoma Tier 3 | Melanoma specific attributes in Clinical Data Tier 3 | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Cutaneous Melanoma Tumor Infiltrating Lymphocytes, Cutaneous Melanoma Tumor Regression Range, Melanoma Specimen Clark Level Value, Cutaneous Melanoma Surgical Margins, Melanoma Lesion Size, History of Atypical Nevi, Fitzpatrick Skin Tone, History of Chronic UV Exposure, History of Blistering Sunburn, History of Tanning Bed Use, Immediate Family History Melanoma, Melanoma Biopsy Resection Sites, Cutaneous Melanoma Ulceration, Cutaneous Melanoma Additional Findings | FALSE | Patient | |||||
45 | Demographics | Demographic attributes | Component, HTAN Participant ID, Ethnicity, Gender, Race, Vital Status, Days to Birth, Country of Residence, Age Is Obfuscated, Year Of Birth, Occupation Duration Years, Premature At Birth, Weeks Gestation at Birth | FALSE | Patient | |||||
46 | Family History | Family cancer history | Component, HTAN Participant ID, Relative with Cancer History | FALSE | Patient | |||||
47 | Exposure | Exposure to carcinogens | Component, HTAN Participant ID, Start Days from Index, Smoking Exposure, Alcohol Exposure, Asbestos Exposure, Coal Dust Exposure, Environmental Tobacco Smoke Exposure, Radon Exposure, Respirable Crystalline Silica Exposure | FALSE | Patient | |||||
48 | Follow Up | Follow up clinical visits | Component, HTAN Participant ID, Days to Follow Up, Adverse Event, Progression or Recurrence, Barretts Esophagus Goblet Cells Present, BMI, Cause of Response, Comorbidity, Comorbidity Method of Diagnosis, Days to Adverse Event, Days to Comorbidity, Diabetes Treatment Type, Disease Response, DLCO Ref Predictive Percent, ECOG Performance Status, FEV1 FVC Post Bronch Percent, FEV 1 FVC Pre Bronch Percent, FEV1 Ref Post Bronch Percent, FEV1 Ref Pre Bronch Percent, Height, Hepatitis Sustained Virological Response, HPV Positive Type, Karnofsky Performance Status, Menopause Status, Pancreatitis Onset Year, Reflux Treatment Type, Risk Factor, Risk Factor Treatment, Viral Hepatitis Serologies, Weight, Adverse Event Grade, AIDS Risk Factors, Body Surface Area, CD4 Count, CDC HIV Risk Factors, Days to Imaging, Evidence of Recurrence Type, HAART Treatment Indicator, HIV Viral Load, Hormonal Contraceptive Use, Hysterectomy Margins Involved, Hysterectomy Type, Imaging Result, Imaging Type, Immunosuppressive Treatment Type, Nadir CD4 Count, Pregnancy Outcome, Recist Targeted Regions Number, Recist Targeted Regions Sum, Scan Tracer Used | FALSE | Patient | |||||
49 | Therapy | Clinical therapy or treatment | Component, HTAN Participant ID, Treatment or Therapy, Treatment Type, Treatment Effect, Treatment Outcome, Days to Treatment End, Treatment Anatomic Site, Days to Treatment Start, Initial Disease Status, Regimen or Line of Therapy, Therapeutic Agents, Treatment Intent Type, Chemo Concurrent to Radiation, Number of Cycles, Reason Treatment Ended, Treatment Arm, Treatment Dose, Treatment Dose Units, Treatment Effect Indicator, Treatment Frequency | FALSE | Patient | |||||
50 | Diagnosis | Disease diagnosis | Component, HTAN Participant ID, Age at Diagnosis, Year of Diagnosis, Primary Diagnosis, Precancerous Condition Type, Site of Resection or Biopsy, Tissue or Organ of Origin, Morphology, Tumor Grade, Progression or Recurrence, Last Known Disease Status, Days to Last Follow up, Days to Last Known Disease Status, Method of Diagnosis, Prior Malignancy, Prior Treatment, Metastasis at Diagnosis, Metastasis at Diagnosis Site, First Symptom Prior to Diagnosis, Days to Diagnosis, Percent Tumor Invasion, Residual Disease, Synchronous Malignancy, Tumor Confined to Organ of Origin, Tumor Focality, Tumor Largest Dimension Diameter, Gross Tumor Weight, Breslow Thickness, Vascular Invasion Present, Vascular Invasion Type, Anaplasia Present, Anaplasia Present Type, Laterality, Perineural Invasion Present, Lymphatic Invasion Present, Lymph Nodes Positive, Lymph Nodes Tested, Peritoneal Fluid Cytological Status, Classification of Tumor, Best Overall Response, Mitotic Count, AJCC Clinical M, AJCC Clinical N, AJCC Clinical Stage, AJCC Clinical T, AJCC Pathologic M, AJCC Pathologic N, AJCC Pathologic Stage, AJCC Pathologic T, AJCC Staging System Edition, Cog Neuroblastoma Risk Group, Cog Rhabdomyosarcoma Risk Group, Gleason Grade Group, Gleason Grade Tertiary, Gleason Patterns Percent, Greatest Tumor Dimension, IGCCCG Stage, INPC Grade, INPC Histologic Group, INRG Stage, INSS Stage, International Prognostic Index, IRS Group, IRS Stage, ISS Stage, Lymph Node Involved Site, Margin Distance, Margins Involved Site, Medulloblastoma Molecular Classification, Micropapillary Features, Mitosis Karyorrhexis Index, Non Nodal Regional Disease, Non Nodal Tumor Deposits, Ovarian Specimen Status, Ovarian Surface Involvement, Pregnant at Diagnosis, Primary Gleason Grade, Secondary Gleason Grade, Supratentorial Localization, Tumor Depth, WHO CNS Grade, WHO NTE Grade | FALSE | Patient | |||||
51 | Molecular Test | Clinical molecular test data | Component, HTAN Participant ID, Timepoint Label, Start Days from Index, Stop Days from Index, Gene Symbol, Molecular Analysis Method, Test Result, AA Change, Antigen, Clinical Biospecimen Type, Blood Test Normal Range Upper, Blood Test Normal Range Lower, Cell Count, Chromosome, Clonality, Copy Number, Cytoband, Exon, Histone Family, Histone Variant, Intron, Laboratory Test, Loci Abnormal Count, Loci Count, Locus, Mismatch Repair Mutation, Molecular Consequence, Pathogenicity, Ploidy, Second Exon, Second Gene Symbol, Specialized Molecular Test, Test Analyte Type, Test Units, Test Value, Transcript, Variant Origin, Variant Type, Zygosity | FALSE | Patient | |||||
52 | Biospecimen | HTAN biological entity; this can be tissue, blood, analyte and subsamples of those | Component, HTAN Biospecimen ID, Source HTAN Biospecimen ID, HTAN Parent ID, Timepoint Label, Collection Days from Index, Adjacent Biospecimen IDs, Biospecimen Type, Acquisition Method Type, Fixative Type, Storage Method, Processing Days from Index, Protocol Link, Site Data Source, Collection Media, Mounting Medium, Processing Location, Histology Assessment By, Histology Assessment Medium, Preinvasive Morphology, Tumor Infiltrating Lymphocytes, Degree of Dysplasia, Dysplasia Fraction, Number Proliferating Cells, Percent Eosinophil Infiltration, Percent Granulocyte Infiltration, Percent Inflam Infiltration, Percent Lymphocyte Infiltration, Percent Monocyte Infiltration, Percent Necrosis, Percent Neutrophil Infiltration, Percent Normal Cells, Percent Stromal Cells, Percent Tumor Cells, Percent Tumor Nuclei, Fiducial Marker, Slicing Method, Lysis Buffer, Method of Nucleic Acid Isolation | FALSE | Biosample | Patient | ||||
53 | SRRS Biospecimen | SRRS-specific HTAN biological entity; this can be tissue, blood, analyte and subsamples of those, however it can be described via fewer attributes than a standard HTAN specimen | Component, HTAN Biospecimen ID, Source HTAN Biospecimen ID, HTAN Parent ID, Adjacent Biospecimen IDs, Biospecimen Type, Timepoint Label, Collection Days from Index, Acquisition Method Type, Ischemic Time, Ischemic Temperature, Collection Media, Topography Code, Additional Topography, Fixative Type, Storage Method, Preinvasive Morphology, Histologic Morphology Code, Preservation Method, Processing Days from Index, Protocol Link | FALSE | Biosample | Patient | ||||
54 | Source HTAN Biospecimen ID | This is the HTAN ID that may have been assigned to the biospecimen at the site of biospecimen origin (e.g. BU). | FALSE | Biosample | ||||||
55 | Other Assay | Metadata applying to any assay without standard descriptors. Can be used as a placeholder for minimal amount of metadata until the assay descriptors are standardized | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Assay Type | FALSE | Assay | Biospecimen | ||||
56 | ExSeq Minimal | Minimal metadata for the ExSeq assay | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Assay Type | FALSE | Assay | Biospecimen | ||||
57 | Assay Type | The type and level of assay this metadata applies to (e.g. RPPA, NanoString DSP, etc.) | TRUE | Assay | ||||||
58 | scRNA-seq Level 1 | Single-cell RNA-seq [EFO_0008913] | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Cryopreserved Cells in Sample, Single Cell Isolation Method, Dissociation Method, Library Construction Method, Read Indicator, Read1, Read2, End Bias, Reverse Transcription Primer, Spike In, Sequencing Platform, Total Number of Input Cells, Input Cells and Nuclei, Library Preparation Days from Index, Single Cell Dissociation Days from Index, Sequencing Library Construction Days from Index, Nucleic Acid Capture Days from Index, Protocol Link, Technical Replicate Group | FALSE | Sequencing | Biospecimen | http://www.ebi.ac.uk/efo/EFO_0008913 | |||
59 | scRNA-seq Level 2 | Alignment workflows downstream of scRNA-seq Level 1 | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, scRNAseq Workflow Type, Workflow Version, scRNAseq Workflow Parameters Description, Workflow Link, Genomic Reference, Genomic Reference URL, Genome Annotation URL, Checksum, Whitelist Cell Barcode File Link, Cell Barcode Tag, UMI Tag, Applied Hard Trimming | FALSE | Sequencing | scRNA-seq Level 1 | ||||
60 | scRNA-seq Level 3 | Gene and Isoform expression files | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Data Category, Matrix Type, Linked Matrices, Cell Median Number Reads, Cell Median Number Genes, Cell Total, scRNAseq Workflow Type, scRNAseq Workflow Parameters Description, Workflow Link, Workflow Version | FALSE | Sequencing | scRNA-seq Level 2 | ||||
61 | scRNA-seq Level 4 | Data represents the relationships between cells derived from Level 3 expression data and shown as tSNE or UMAP coordinates per cell, plus all other cell-specific meta information (e.g., cell type) | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, scRNAseq Workflow Type, scRNAseq Workflow Parameters Description, Workflow Version, Workflow Link | FALSE | Sequencing | scRNA-seq Level 3 | ||||
62 | Slide-seq Level 1 | Raw sequencing files for the Slide-seq assay. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Read Indicator, Spatial Read1, Spatial Read2, End Bias, Reverse Transcription Primer, Spatial Barcode Offset, Spatial Barcode and UMI, Spike In, Sequencing Platform, Technical Replicate Group, Protocol Link, Spatial Library Construction Method, Library Preparation Days from Index, Sequencing Library Construction Days from Index, Nucleic Acid Capture Days from Index | FALSE | Spatial Transcriptomics | Biospecimen | ||||
63 | Slide-seq Level 2 | Aligned sequencing files and QC for the Slide-seq assay. | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Slide-seq Workflow Type, Workflow Version, Slide-seq Workflow Parameter Description, Workflow Link, Genomic Reference, Genomic Reference URL, Genome Annotation URL, Checksum, Spatial Barcode Tag, Matched Spatial Barcode Tag, UMI Tag, Applied Hard Trimming | FALSE | Spatial Transcriptomics | Slide-seq Level 1 | ||||
64 | Slide-seq Level 3 | Gene matrices with features and barcodes for Slide-seq as well as spatial information (bead location files). | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Run ID, Sequencing Batch ID, Data Category, Matrix Type, Slide-seq Workflow Type, Workflow Version, Slide-seq Workflow Parameter Description, Workflow Link, Beads Total, Median UMI Counts per Spot, Median Number Genes per Spatial Spot, Slide-seq Bead File Type, Slide-seq Fragment Size | FALSE | Spatial Transcriptomics | Slide-seq Level 2 | ||||
65 | Slide-seq Fragment Size | Average cDNA length associated with the experiment. Integer | FALSE | Spatial Transcriptomics | ||||||
66 | Matched Spatial Barcode Tag | SAM tag for matched spot barcode field; please provide a valid spot barcode tag (e.g. CB:Z) (Slide-seq specific) | TRUE | Spatial Transcriptomics | ||||||
67 | Beads Total | Number of sequenced beads. Applies to raw counts matrix only. Integer | FALSE | Spatial Transcriptomics | ||||||
68 | Slide-seq Workflow Type | Generic name for the workflow used to analyze the Slide-seq data set. String | TRUE | Spatial Transcriptomics | ||||||
69 | Slide-seq Workflow Parameter Description | Parameters used to run the Slide-seq workflow. String | TRUE | Spatial Transcriptomics | ||||||
70 | Slide-seq Bead File Type | The type of Level 3 file submitted as part of the Slide-seq workflow. | Matrix Features, Matrix Barcodes, All Bead Locations, All Bead Barcodes, Matched Bead Barcodes, Matched Bead Locations, Not Applicable | TRUE | Spatial Transcriptomics | |||||
71 | Bulk RNA-seq Level 1 | Bulk RNA-seq [EFO_0003738] | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Library Layout, Read Indicator, Nucleic Acid Source, Micro-region Seq Platform, ROI Tag, Sequencing Platform, Sequencing Batch ID, Read Length, Library Selection Method, Library Preparation Kit Name, Library Preparation Kit Vendor, Library Preparation Kit Version, Library Preparation Days from Index, Spike In, Adapter Name, Adapter Sequence, Base Caller Name, Base Caller Version, Flow Cell Barcode, Fragment Maximum Length, Fragment Mean Length, Fragment Minimum Length, Fragment Standard Deviation Length, Lane Number, Library Strand, Multiplex Barcode, Size Selection Range, Target Depth, To Trim Adapter Sequence, Transcript Integrity Number, RIN, DV200, Adapter Content, Basic Statistics, Encoding, Kmer Content, Overrepresented Sequences, Per Base N Content, Per Base Sequence Content, Per Base Sequence Quality, Per Sequence GC Content, Per Sequence Quality Score, Per Tile Sequence Quality, Percent GC Content, Sequence Duplication Levels, Sequence Length Distribution, Total Reads, QC Workflow Type, QC Workflow Version, QC Workflow Link | FALSE | Sequencing | Biospecimen | http://www.ebi.ac.uk/efo/EFO_0003738 | |||
72 | Bulk RNA-seq Level 2 | Bulk RNA-seq alignment protocol description | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Contamination, Contamination Error, Mean Coverage, MSI Workflow Link, MSI Score, MSI Status, Pairs On Diff CHR, Total Reads, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Proportion Reads Mapped, Proportion Targets No Coverage, Proportion Base Mismatch, Short Reads, Is lowest level | FALSE | Sequencing | Bulk RNA-seq Level 1 | ||||
73 | Bulk RNA-seq Level 3 | Bulk RNA-seq gene expression matrices | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Pseudo Alignment Used, Data Category, Expression Units, Matrix Type, Fusion Gene Detected, Fusion Gene Identity | FALSE | Sequencing | Bulk RNA-seq Level 2 | ||||
74 | Bulk WES Level 1 | Bulk Whole Exome Sequencing raw files | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Sequencing Batch ID, Library Layout, Read Indicator, Library Selection Method, Read Length, Target Capture Kit, Library Preparation Kit Name, Library Preparation Kit Vendor, Library Preparation Kit Version, Sequencing Platform, Adapter Name, Adapter Sequence, Base Caller Name, Base Caller Version, Flow Cell Barcode, Fragment Maximum Length, Fragment Mean Length, Fragment Minimum Length, Fragment Standard Deviation Length, Lane Number, Multiplex Barcode, Library Preparation Days from Index, Size Selection Range, Target Depth, To Trim Adapter Sequence | FALSE | Sequencing | Biospecimen | ||||
75 | Bulk WES Level 2 | Bulk Whole Exome Sequencing aligned files and QC | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Contamination, Contamination Error, Mean Coverage, Adapter Content, Basic Statistics, Encoding, Overrepresented Sequences, Per Base N Content, Per Base Sequence Content, Per Base Sequence Quality, Per Sequence GC Content, Per Sequence Quality Score, Per Tile Sequence Quality, Percent GC Content, Sequence Duplication Levels, Sequence Length Distribution, QC Workflow Type, QC Workflow Version, QC Workflow Link, MSI Workflow Link, MSI Score, MSI Status, Pairs On Diff CHR, Total Reads, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Proportion Reads Mapped, Proportion Targets No Coverage, Proportion Base Mismatch, Short Reads, Proportion Coverage 10x, Proportion Coverage 30X, Is lowest level | FALSE | Sequencing | Bulk WES Level 1 | ||||
76 | Bulk WES Level 3 | Bulk Whole Exome Sequencing called variants | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Genomic Reference, Genomic Reference URL, Germline Variants Workflow URL, Germline Variants Workflow Type, Somatic Variants Workflow URL, Somatic Variants Workflow Type, Somatic Variants Sample Type, Structural Variant Workflow URL, Structural Variant Workflow Type | FALSE | Sequencing | Bulk WES Level 2 | ||||
77 | Microarray Level 1 | Microarray Level 1 refers to the raw text table of probe level intensities | Component, Filename, File Format, HTAN Data File ID, HTAN Participant ID, HTAN Parent Biospecimen ID, Nucleic Acid Source, Microarray Platform ID, Microarray Molecule, Microarray Label, Microarray Value Definition, Microarray Protocol Auxiliary File | FALSE | Assay | Biospecimen | ||||
78 | Microarray Level 2 | Microarray Level 2 provides a normalized matrix of values. | Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Microarray Platform ID, Normalization Method | FALSE | Assay | Microarray Level 1 | ||||
79 | scATAC-seq Level 1 | scATAC-seq files containing sequence read information, with or without alignment, as FASTQ or BAM files | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Dissociation Method, Single Nucleus Buffer, Single Cell Isolation Method, Transposition Reaction, scATACseq Library Layout, Nucleus Identifier, Nuclei Barcode Length, Nuclei Barcode Read, scATACseq Read1, scATACseq Read2, scATACseq Read3, Library Construction Method, Sequencing Platform, Threshold for Minimum Passing Reads, Total Number of Passing Nuclei, Median Fraction of Reads in Peaks, Median Fraction of Reads in Annotated cis DNA Elements, Median Passing Read Percentage, Median Percentage of Mitochondrial Reads per Nucleus, Technical Replicate Group, Total Reads, Protocol Link | FALSE | Sequencing | Biospecimen | ||||
80 | scATAC-seq Level 2 | scATAC-seq files containing aligned sequence data, as a BAM file | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Mean Coverage, Pairs On Diff CHR, Total Reads, Proportion Reads Mapped, MapQ30, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Short Reads, Proportion Coverage 10x, Proportion Coverage 30X, Proportion Targets No Coverage, Proportion Base Mismatch, Median Percentage of Mitochondrial Reads per Nucleus, Contamination, Contamination Error | FALSE | Sequencing | scATAC-seq Level 1 | ||||
81 | scATAC-seq Level 3 | Processed data files containing peak information for cells | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, scATAC-seq Object ID, nCount Peaks, nFeature Peaks, Total Read-Pairs, Duplicate Read-Pairs, Chimeric Read-Pairs, Unmapped Read-Pairs, LowMapQ, Mitochondrial Read-Pairs, Passed Filters, TSS Fragments, DNase Sensitive Region Fragments, Enhancer Region Fragments, Promoter Region Fragments, On Target Fragments, Blacklist Region Fragments, Peak Region Fragments, Peak Region Cutsites, Nucleosome Signal, Nucleosome Percentile, TSS Enrichment, TSS Percentile, Pct Reads in Peaks, Blacklist Ratio, Seurat Clusters, nCount RNA, nFeature RNA, MACS2 Seqnames, MACS2 Start, MACS2 End, MACS2 Width, MACS2 Strand, MACS2 Name, MACS2 Score, MACS2 Fold Change, MACS2 Neg Log10 pvalue Summit, MACS2 Neg Log10 qvalue Summit, MACS2 Relative Summit Position | FALSE | Sequencing | scATAC-seq Level 2 | ||||
82 | scmC-seq Level 1 | Files contain raw scmC-seq data. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, scmCseq Read1, scmCseq Read2, scmCseq Read3, Single Cell Isolation Method, Single Nucleus Buffer, Single Nucleus Capture, Bisulfite Conversion, Library Layout, Nucleus Identifier, Sequencing Platform, Technical Replicate Group, Median Fraction of Reads in Peaks, Median Passing Read Percentage, Peaks Calling Software, Median Percentage of Mitochondrial Reads per Nucleus, Threshold for Minimum Passing Reads, Total Number of Passing Nuclei, Total Reads | FALSE | Sequencing | Biospecimen | ||||
83 | scmC-seq Level 2 | Files contain aligned scmC-seq sequence data, as a BAM file. | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Contamination, Contamination Error, Mean Coverage, Pairs On Diff CHR, Total Reads, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Proportion Reads Mapped, Proportion Targets No Coverage, Proportion Base Mismatch, Short Reads | FALSE | Sequencing | scmC-seq Level 1 | ||||
84 | scATAC-seq Level 4 | Data represents the relationships between cells derived from Level 3 expression data and shown as tSNE or UMAP coordinates per cell, plus all other cell-specific meta information (e.g., cell type) | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, scATACseq Workflow Type, scATACseq Workflow Parameters Description, Workflow Version, Workflow Link | FALSE | Sequencing | scATAC-seq Level 3 | ||||
85 | scDNA-seq Level 1 | Single-cell DNA-seq | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Sequencing Batch ID, Library Layout, Nucleic Acid Source, Library Selection Method, Read Length, Library Preparation Kit Name, Library Preparation Kit Vendor, Library Preparation Kit Version, Adapter Name, Adapter Sequence, Base Caller Name, Base Caller Version, Flow Cell Barcode, Fragment Maximum Length, Fragment Mean Length, Fragment Minimum Length, Fragment Standard Deviation Length, Lane Number, Library Strand, Multiplex Barcode, Size Selection Range, Target Depth, To Trim Adapter Sequence, Adapter Content, Basic Statistics, Encoding, Kmer Content, Overrepresented Sequences, Per Base N Content, Per Base Sequence Content, Per Base Sequence Quality, Per Sequence GC Content, Per Sequence Quality Score, Per Tile Sequence Quality, Percent GC Content, Sequence Duplication Levels, Sequence Length Distribution, Total Reads, QC Workflow Type, QC Workflow Version, QC Workflow Link | FALSE | Sequencing | Biospecimen | ||||
86 | scDNA-seq Level 2 | Alignment workflows downstream of scDNA-seq Level 1 | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Mean Coverage, Pairs On Diff CHR, Total Reads, Proportion Reads Mapped, MapQ30, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Short Reads, Proportion Coverage 10x, Proportion Coverage 30X, Proportion Targets No Coverage, Proportion Base Mismatch, Proportion Mitochondrial Reads, Contamination, Contamination Error | FALSE | Sequencing | scDNA-seq Level 1 | ||||
87 | Multiplexed CITE-seq Level 1 | Raw sequencing files for the multiplexed CITE-seq assay | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Cryopreserved Cells in Sample, Single Cell Isolation Method, Dissociation Method, Library Construction Method, Read Indicator, Read1, Read2, End Bias, Reverse Transcription Primer, Spike In, Spike In Concentration, Sequencing Platform, Total Number of Input Cells, Input Cells and Nuclei, Library Preparation Days from Index, Single Cell Dissociation Days from Index, Sequencing Library Construction Days from Index, Nucleic Acid Capture Days from Index, Protocol Link, Technical Replicate Group, Empty Well Barcode, Well Index, Feature Reference Id, Associated mRNA Library Data File ID, Single Cell Barcode Method Applied, Feature Barcode Library Type, Barcode Folder Synapse ID, Barcode Folder File List | FALSE | Sequencing | Biospecimen | ||||
88 | Multiplexed CITE-seq Level 2 | Alignment workflows downstream of Multiplexed CITE-seq Level 1 | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Associated mRNA Library Data File ID, scRNAseq Workflow Type, Workflow Version, scRNAseq Workflow Parameters Description, Workflow Link, Genomic Reference, Genomic Reference URL, Genome Annotation URL, Checksum, Whitelist Cell Barcode File Link, Cell Barcode Tag, UMI Tag, Applied Hard Trimming | FALSE | Sequencing | Multiplexed CITE-seq Level 1 | ||||
89 | Multiplexed CITE-seq Level 3 | Gene and Isoform expression files | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Parent Biospecimen ID, HTAN Data File ID, Associated mRNA Library Data File ID, Data Category, Matrix Type, Linked Matrices, Cell Median Number Reads, Cell Median Number Genes, Cell Total, scRNAseq Workflow Type, scRNAseq Workflow Parameters Description, Workflow Link, Workflow Version | FALSE | Sequencing | scRNA-seq Level 2 | ||||
90 | Multiplexed CITE-seq Level 4 | Data represents the relationships between cells derived from Level 3 expression data and shown as tSNE or UMAP coordinates per cell, plus all other cell-specific meta information (e.g., cell type) | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Parent Biospecimen ID, HTAN Data File ID, Associated mRNA Library Data File ID, scRNAseq Workflow Type, scRNAseq Workflow Parameters Description, Workflow Version, Workflow Link | FALSE | Sequencing | Multiplexed CITE-seq Level 3 | ||||
91 | Bulk Methylation-seq Level 1 | Raw data for bulk methylation sequencing, such as FASTQs and unaligned BAMs | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Bisulfite Conversion, Sequencing Platform, Replicate Type, Bulk Methylation Assay Type, Total DNA Input | FALSE | Sequencing | Biospecimen | ||||
92 | Bulk Methylation-seq Level 2 | Aligned primary data for bulk methylation sequencing, such as gene expression matrix files, VCFs, etc. | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Trimmer, Bulk Methylation Genomic Reference, Genomic Reference URL, Index File Name, Alignment Workflow Type, Duplicate Removal Software, Mean Coverage, Library Layout, Average Base Quality, Average Insert Size, Average Read Length, Contamination, Contamination Error, Pairs On Diff CHR, Total Reads, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Proportion Reads Mapped, Proportion Targets No Coverage, Proportion Base Mismatch, Short Reads, Proportion of Minimum CpG Coverage 10X, Proportion Coverage 30X | FALSE | Sequencing | Bulk Methylation-seq Level 1, Biospecimen | ||||
93 | Bulk Methylation-seq Level 3 | Sample level summary data for bulk methylation sequencing, such as t-SNE plot coordinates, etc. | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, DMC Calling Tool, DMC Calling Workflow URL, DMR Calling Tool, DMR Calling Workflow URL, pUC19 methylation ratio, Lambda methylation ratio, DMC data file format, DMR data file Format | FALSE | Sequencing | Bulk Methylation-seq Level 2, Biospecimen | ||||
94 | Imaging Level 1 | Raw imaging data | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Imaging Assay Type, Protocol Link, Software and Version, Commit SHA, Pre-processing Completed, Pre-processing Required, Comment | FALSE | Assay | Biospecimen | ||||
95 | Imaging Level 2 | Raw and pre-processed image data | Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Data File ID, Channel Metadata Filename, Imaging Assay Type, Protocol Link, Software and Version, Microscope, Objective, NominalMagnification, LensNA, WorkingDistance, WorkingDistanceUnit, Immersion, Pyramid, Zstack, Tseries, Passed QC, Comment, FOV number, FOVX, FOVXUnit, FOVY, FOVYUnit, Frame Averaging, Image ID, DimensionOrder, PhysicalSizeX, PhysicalSizeXUnit, PhysicalSizeY, PhysicalSizeYUnit, PhysicalSizeZ, PhysicalSizeZUnit, Pixels BigEndian, PlaneCount, SizeC, SizeT, SizeX, SizeY, SizeZ, PixelType, MERFISH Positions File, MERFISH Codebook File | FALSE | Assay | Imaging Level 1 | ||||
96 | MERFISH Positions File | The positions file is an auxiliary MERFISH file that describes the location of bead positions in the assay. | FALSE | Assay | ||||||
97 | MERFISH Codebook File | The codebook is an auxiliary MERFISH file that describes how each grouping of bits is converted to a gene name. | FALSE | Assay | ||||||
98 | Imaging Level 3 Segmentation | Object segmentations | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Imaging Segmentation Data Type, Parameter file, Software and Version, Commit SHA, Imaging Object Class, Number of Objects | FALSE | Assay | Imaging Level 2 | ||||
99 | Imaging Level 3 Image | Quality controlled imaging data | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Parent Channel Metadata ID, HTAN Data File ID, Imaging Assay Type, Protocol Link, Software and Version, Microscope, Objective, NominalMagnification, LensNA, WorkingDistance, Immersion, Pyramid, Zstack, Tseries, Passed QC, Comment, FOV number, FOVX, FOVY, Frame Averaging | FALSE | Assay | Imaging Level 3 Channels, Imaging Level 2 | ||||
100 | 10x Visium Spatial Transcriptomics - RNA-seq Level 1 | Files contain raw RNA-seq data associated with spot/slide data. | Component, Filename, Run ID, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Read Indicator, Spatial Read1, Spatial Read2, Spatial Library Construction Method, Library Preparation Days from Index, Sequencing Library Construction Days from Index, End Bias, Reverse Transcription Primer, Sequencing Platform, Capture Area, Protocol Link, Slide Version, Slide ID, Image Re-orientation, Permeabilization Time, RIN, DV200 | FALSE | Spatial Transcriptomics | Biospecimen | ||||
101 | 10x Visium Spatial Transcriptomics - RNA-seq Level 2 | Alignment workflows downstream of Spatial Transcriptomics RNA-seq Level 1. | Component, Filename, File Format, Checksum, HTAN Parent Data File ID, HTAN Data File ID, UMI Tag, Whitelist Spatial Barcode File Link, Spatial Barcode Tag, Applied Hard Trimming, Workflow Version, Workflow Link, Genomic Reference, Genomic Reference URL, Genome Annotation URL, HTAN Parent Biospecimen ID, Run ID, Capture Area | FALSE | Spatial Transcriptomics | 10x Visium Spatial Transcriptomics - RNA-seq Level 1 | ||||
102 | 10x Visium Spatial Transcriptomics - Auxiliary Files | Auxiliary data associated with spot/slide analysis (aligned images, quality control files, etc.) from Spatial Transcriptomics. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Run ID, Visium File Type, Slide ID, Capture Area, Workflow Version, Workflow Link | FALSE | Spatial Transcriptomics | 10x Visium Spatial Transcriptomics - RNA-seq Level 1, 10x Visium Spatial Transcriptomics - RNA-seq Level 2 | ||||
103 | 10x Visium Spatial Transcriptomics - RNA-seq Level 3 | Processed data files based on Spatial Transcriptomics RNA-seq Level 2 and Spatial Transcriptomics Auxiliary files. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Run ID, Visium File Type, Workflow Version, Workflow Link, Capture Area, Spots under tissue, Mean Reads per Spatial Spot, Median Number Genes per Spatial Spot, Sequencing Saturation, Proportion Reads Mapped, Proportion Reads Mapped to Transcriptome, Median UMI Counts per Spot | FALSE | Spatial Transcriptomics | 10x Visium Spatial Transcriptomics - RNA-seq Level 2, 10x Visium Spatial Transcriptomics - Auxiliary Files | ||||
104 | 10x Visium Spatial Transcriptomics - RNA-seq Level 4 | Processed data files based on Spatial Transcriptomics RNA-seq Level 3. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Run ID, Workflow Version, Workflow Link, Visium Workflow Type, Visium Workflow Parameters Description | FALSE | Spatial Transcriptomics | 10x Visium Spatial Transcriptomics - RNA-seq Level 3 | ||||
105 | Visium File Type | The file type generated for the Visium experiment. | reference png, reference jpg, json scale factors, probe dataset csv, qc result html, filtered mex, unfiltered mex, tissue_positions, barcodes, features, fiducial image png, fiducial image jpg, detected image png, detected jpg, high res image, low res image | TRUE | Spatial Transcriptomics | |||||
106 | Run ID | A unique identifier for this individual run (typically associated with a single slide) of the spatial transcriptomic processing workflow. | TRUE | Spatial Transcriptomics | ||||||
107 | Capture Area | Area (or Capture Area): one of the four or two active regions where tissue can be placed on a Visium slide. Each area is intended to contain only one tissue sample. Slide areas are named consecutively from top to bottom: A1, B1, C1, D1 for Visium slides with 6.5 mm Capture Area and A, B for CytAssist slides with 11 mm Capture Area. Both CytAssist slides with 6.5 mm Capture Area and Gateway Slides contain only two slide areas, A1 and D1. | A, B, C, D, A1, B1, C1, D1 | FALSE | Spatial Transcriptomics | |||||
108 | Slide Version | Version of imaging slide used. Slide version is critical for the analysis of the sequencing data as different slides have different capture area layouts. | V1, V2, V3, V4 | FALSE | Spatial Transcriptomics | |||||
109 | Slide ID | For Visium, it is the unique identifier printed on the label of each Visium slide. The serial number starts with V followed by a number from one through five, and ends with a dash and a three-digit number, such as 123. For CosMx, this refers to the loaded Flow Cell ID. For Xenium, this ID indicates the slide orientation, as it matches the relative location of the ID on the physical Xenium slide. | FALSE | Spatial Transcriptomics | ||||||
110 | Image Re-orientation | Whether the image was re-oriented. To ensure good fiducial alignment and tissue spot detection, it is important to correct for any shift in image orientation. | TRUE, FALSE | FALSE | Spatial Transcriptomics | |||||
111 | Permeabilization Time | Length of time, in minutes, for which the fixed and stained tissue section was permeabilized; sections are permeabilized for different times. Each Capture Area captures polyadenylated mRNA from the attached tissue section. | FALSE | Spatial Transcriptomics | ||||||
112 | Whitelist Spatial Barcode File Link | Link to file listing all possible spatial barcodes. URL | TRUE | Spatial Transcriptomics | ||||||
113 | Spatial Barcode Tag | SAM tag for spot barcode field; please provide a valid spot barcode tag (e.g. CB:Z) | TRUE | Spatial Transcriptomics | ||||||
114 | Spatial Barcode Offset | Offset in sequence for spot barcode read (in bp): number | TRUE | Spatial Transcriptomics | ||||||
115 | Spatial Barcode Length | Length of spot barcode read (in bp): number | TRUE | Spatial Transcriptomics | ||||||
116 | Spatial Read1 | Read 1 content description | cDNA, Spatial Barcode and UMI | TRUE | Spatial Transcriptomics | |||||
117 | Spatial Read2 | Read 2 content description | cDNA, Spatial Barcode and UMI | TRUE | Spatial Transcriptomics | |||||
118 | Spatial Library Construction Method | Process which results in the creation of a library from fragments of DNA using cloning vectors or oligonucleotides with the role of adaptors [OBI_0000711] | Smart-seq2, Smart-SeqV4, 10xV1.0, 10xV1.1, 10xV2, 10xV3, 10xV3.1, Drop-seq, inDropsV2, inDropsV3, TruDrop, Nextera XT | TRUE | Spatial Transcriptomics | |||||
119 | Spatial Barcode and UMI | Spot and transcript identifiers | Spatial Barcode Offset, Spatial Barcode Length, UMI Barcode Offset, UMI Barcode Length | TRUE | Spatial Transcriptomics | num | ||||
120 | Mean Reads per Spatial Spot | The number of reads, both under and outside of tissue, divided by the number of barcodes associated with a spot under tissue. | TRUE | Spatial Transcriptomics | num | |||||
121 | Visium Workflow Type | Generic name for the workflow used to analyze the Visium data set. | TRUE | Spatial Transcriptomics | ||||||
122 | Visium Workflow Parameters Description | Parameters used to run the workflow. | TRUE | Spatial Transcriptomics | ||||||
123 | Spots under tissue | The number of barcodes associated with a spot under tissue. | TRUE | Spatial Transcriptomics | num | |||||
124 | Median UMI Counts per Spot | The median number of UMI counts per tissue covered spot. | TRUE | Spatial Transcriptomics | num | |||||
125 | Sequencing Saturation | The fraction of reads originating from an already-observed UMI. This is a function of library complexity and sequencing depth. More specifically, this is the fraction of confidently mapped, valid spot-barcode, valid UMI reads that had a non-unique (spot-barcode, UMI, gene). | TRUE | Spatial Transcriptomics | ||||||
126 | Proportion Reads Mapped to Transcriptome | Fraction of reads that mapped to a unique gene in the transcriptome. The read must be consistent with annotated splice junctions. These reads are considered for UMI counting. | TRUE | Spatial Transcriptomics | ||||||
127 | Median Number Genes per Spatial Spot | The median number of genes detected per spot under tissue-associated barcode. Detection is defined as the presence of at least 1 UMI count. | TRUE | Spatial Transcriptomics | num | |||||
128 | NanoString GeoMx DSP Spatial Transcriptomics Level 1 | Files contain raw data output from the NanoString GeoMx DSP Pipeline. These can include RCC or DCC Files. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Synapse ID of GeoMx DSP PKC File, GeoMx DSP NGS Sequencing Platform, GeoMx DSP NGS Library Selection Method, GeoMx DSP NGS Library Preparation Kit Name, GeoMx DSP Library Preparation Kit Vendor, GeoMx DSP Library Preparation Kit Version, Synapse ID of GeoMx Lab Worksheet File, Software and Version | FALSE | Spatial Transcriptomics | Biospecimen, NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||
129 | GeoMx DSP Assay Type | The assay type which was used for the GeoMx DSP pipeline. | RNA nCounter, Protein nCounter, Protein NGS, RNA NGS | TRUE | Spatial Transcriptomics | |||||
130 | Synapse ID of GeoMx DSP PKC File | The Synapse ID(s) associated with the PKC mapping file for the assay. Multiple files are listed as comma separated values. | TRUE | Spatial Transcriptomics | list::regex match syn\d+ | |||||
131 | GeoMx DSP NGS Sequencing Platform | A platform is an object aggregate that is the set of instruments and software needed to perform a process [OBI_0000050]. Specific model of the sequencing instrument. | FALSE | Spatial Transcriptomics | ||||||
132 | GeoMx DSP NGS Library Selection Method | How RNA molecules are isolated. | FALSE | Spatial Transcriptomics | ||||||
133 | GeoMx DSP NGS Library Preparation Kit Name | Name of Library Preparation Kit. String | FALSE | Spatial Transcriptomics | ||||||
134 | GeoMx DSP Library Preparation Kit Vendor | Vendor of Library Preparation Kit. String | FALSE | Spatial Transcriptomics | ||||||
135 | GeoMx DSP Library Preparation Kit Version | Version of Library Preparation Kit. String | FALSE | Spatial Transcriptomics | ||||||
136 | Synapse ID of GeoMx Lab Worksheet File | Synapse ID(s) of Lab Worksheet Files output from the GeoMx DSP workflow. Multiple files are listed as comma separated values. | FALSE | Spatial Transcriptomics | list::regex match syn\d+ | |||||
137 | NanoString GeoMx DSP Spatial Transcriptomics Level 3 | Files contain processed data from the NanoString GeoMx DSP Pipeline. This level depends on GeoMx Level 1 and Imaging Level 2. | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, GeoMx DSP Assay Type, Synapse ID of GeoMx DSP ROI Segment Annotation File, GeoMx DSP Unique Probe Count, GeoMx DSP Unique Target Count, GeoMx DSP Genomic Reference, Matrix Type, GeoMx DSP Workflow Type, GeoMx DSP Workflow Parameter Description, GeoMx DSP Workflow Link | FALSE | Spatial Transcriptomics | NanoString GeoMx DSP Spatial Transcriptomics Level 1, Imaging Level 2, NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||
138 | Synapse ID of GeoMx DSP ROI Segment Annotation File | Synapse ID(s) for ROI/Segmentation annotations in the GeoMx DSP experiment. | TRUE | Spatial Transcriptomics | list::regex match syn\d+ | |||||
139 | GeoMx DSP Genomic Reference | Exact version of the human genome reference used in the alignment of reads (e.g. https://www.gencodegenes.org/human/). Only applicable to some applications in GeoMx | FALSE | Spatial Transcriptomics | num | |||||
140 | GeoMx DSP Unique Probe Count | Total number of unique probes reported. | FALSE | Spatial Transcriptomics | num | |||||
141 | GeoMx DSP Unique Target Count | Total number of unique genes reported. | FALSE | Spatial Transcriptomics | num | |||||
142 | GeoMx DSP Workflow Type | Generic name for the workflow used to analyze the GeoMx DSP data set. | FALSE | Spatial Transcriptomics | ||||||
143 | GeoMx DSP Workflow Parameter Description | Parameters used to run the GeoMx DSP workflow. | FALSE | Spatial Transcriptomics | ||||||
144 | GeoMx DSP Workflow Link | Link to workflow or command. Dockstore.org recommended. URL | FALSE | Spatial Transcriptomics | ||||||
145 | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | GeoMx ROI and Segment Metadata Attributes. The assayed biospecimen should be reported one per row with the associated ROI coordinates. | HTAN Parent Biospecimen ID, Scan name, ROI name, Segment name, ROI X Coordinate, ROI Y Coordinate, Tags, QC status, Scan Height, Scan Width, Scan Offset X, Scan Offset Y, Binding Density, Positive norm factor, Surface area, Nuclei count, Tissue Stain | FALSE | Assay | |||||
146 | Scan name | GeoMx Scan name (as appears in Segment Summary) | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
147 | ROI name | ROI name (application generated). For Xenium this is referred to as the “region name” | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
148 | Segment name | Name given to segment at time of generation | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
149 | Tags | Unique descriptor of a variable group (e.g. MAPK+) | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
150 | ROI X Coordinate | X location within the image | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
151 | ROI Y Coordinate | Y location within the image | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
152 | QC status | ROI quality control flag as reported by the application | FALSE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
153 | Scan Height | Height of the scan for GeoMx Analysis | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
154 | Scan Width | Width of the scan for GeoMx Analysis | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
155 | Scan Offset X | Offset X of the scan for GeoMx Analysis | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
156 | Scan Offset Y | Offset Y of the scan for GeoMx Analysis | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
157 | Binding Density | The binding density as reported by the application | FALSE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
158 | Positive norm factor | The Positive Control Normalization factor calculated using pos-hyb controls | FALSE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
159 | Surface area | Surface area of the ROI in square microns (µm^2). In CosMx, this is referred to as the Scan Area. In Xenium, this is referred to as the Region Area | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
160 | Nuclei count | Number of nuclei detected in the segment (if applicable) | TRUE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
161 | Tissue Stain | e.g. CD45 or PanCK (if masking was performed) | FALSE | NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | ||||||
162 | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | GeoMx ROI and Segment Metadata Attributes. The assayed biospecimen should be reported one per row with the associated ROI coordinates. | HTAN Parent Biospecimen ID, Scan name, Slide name, ROI name, Segment name, ROI X Coordinate, ROI Y Coordinate, Tags, Scan Height, Scan Width, Scan Offset X, Scan Offset Y, Surface area, Nuclei count, Sequencing Saturation, MapQ30, Raw reads, Stitched reads, Aligned reads, Deduplicated reads, In Situ Negative median, Biological probe median | FALSE | Assay | |||||
163 | Slide name | Similar to a Run ID, the slide name indicates the slide a given ROI is linked to (as reported in Segment Summary). | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
164 | Raw reads | Reads that have not yet been processed for data analysis: the number of reads that pass filter from the flow cell, as represented in the FASTQ file. | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
165 | Stitched reads | Reads representing the consensus from the overlapping sequence of reads 1 and 2. This is the percentage of the aligned reads that were overlapped and consensus-confirmed, usually upward of 80%, but lower than aligned reads in terms of number of reads. | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
166 | Aligned reads | Reads that have been aligned to a gene/probe. Typically these reads can number from the hundreds of thousands to tens of millions. In GeoMx, alignment is performed by mapping the RTS ID to a whitelist of sequences that represent targets. | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
167 | Deduplicated reads | Aligned reads remaining after duplicate reads (PCR duplicates, identified by a shared UMI) have been removed. | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
168 | In Situ Negative median | The median count of all negative control probes for a given segment. A measure of signal to background for each segment. | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
169 | Biological probe median | The median count of all probes except the negative control probes. A measure of signal to background for each segment. | FALSE | NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | ||||||
170 | HI-C-seq Level 1 | Unaligned sequence data | Component, HTAN Parent Biospecimen ID, HTAN Data File ID, Filename, File Format, Genomic Reference, Sequencing Platform, Nucleic Acid Source, Technical Replicate Group, Transposition Reaction, Crosslinking Condtion, DNA Digestion Condition, Nuclei Permeabilization Method, Ligation Condition, Biotin Enrichment, DNA Input Amount, Total Reads, Protocol Link | FALSE | Sequencing | Biospecimen | ||||
171 | HI-C-seq Level 2 | Aligned read pairs, contact matrix | Component, HTAN Data File ID, HTAN Parent Data File ID, Filename, File Format, Genomic Reference, Aligned Read Length, Tool, Resolution, Normalization Method | FALSE | Sequencing | HI-C-seq Level 1 | ||||
172 | HI-C-seq Level 3 | Summary data for the HI-C-seq assay. | Component, HTAN Parent Data File ID, HTAN Data File ID, Filename, File Format, Genomic Reference, Stripe Calling, Loop Window, Stripe Window, Loop Calling | FALSE | Sequencing | HI-C-seq Level 2 | ||||
173 | Crosslinking Condtion | Detailed condition for DNA crosslinking | TRUE | Sequencing | ||||||
174 | DNA Digestion Condition | Enzymes and treatment length/temperature for genome digestion | TRUE | Sequencing | ||||||
175 | Nuclei Permeabilization Method | Detergent and treatment condition for nuclei permeabilization and crosslinking softening | TRUE | Sequencing | ||||||
176 | Ligation Condition | Name of ligase and condition for proximity ligation | TRUE | Sequencing | ||||||
177 | Biotin Enrichment | Whether biotin is used for enriching ligation product | Yes, No | TRUE | Sequencing | |||||
178 | DNA Input Amount | Amount of DNA for library construction, in nanograms. | TRUE | Sequencing | int | |||||
179 | Resolution | Bin size used for generating the contact matrix, in base pairs. | TRUE | Sequencing | ||||||
180 | Stripe Calling | Tool used for identifying architectural stripe-forming interaction hotspots. | MACS2, Other | TRUE | Sequencing | |||||
181 | Loop Window | Binning size used for calling significant dot interactions (loops) | TRUE | Sequencing | list like :: regex search -?\d+ | |||||
182 | Stripe Window | Binning size used for calling significant architectural stripes. Can be an integer or comma-separated list of integers indicating bin size and sliding window size if different. | TRUE | Sequencing | list like :: regex search -?\d+ | |||||
183 | Loop Calling | Tool used for identifying loop interactions | HiCCUPS, Cooltools, Other | TRUE | Sequencing | |||||
184 | Imaging Level 4 | Derived imaging data: Object-by-feature array | Component, Filename, File Format, HTAN Parent Data File ID, HTAN Parent Channel Metadata ID, HTAN Data File ID, Parameter file, Software and Version, Commit SHA, Number of Objects, Number of Features, Imaging Object Class, Imaging Summary Statistic | FALSE | Assay | Imaging Level 3 Channels | ||||
185 | SRRS Imaging Level 2 | SRRS-specific HTAN raw and pre-processed image data | Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Data File ID, Channel Metadata Filename, Imaging Assay Type, Protocol Link, Software and Version, Microscope, Objective, NominalMagnification, Pyramid, Zstack, Tseries, Passed QC, Frame Averaging, Image ID, DimensionOrder, PhysicalSizeX, PhysicalSizeXUnit, PhysicalSizeY, PhysicalSizeYUnit, Pixels BigEndian, PlaneCount, SizeC, SizeT, SizeX, SizeY, SizeZ, PixelType | FALSE | Assay | Biospecimen | ||||
186 | 10X Genomics Xenium ISS Experiment | All data pertaining to the 10X Genomics Xenium In-Situ Hybridization experiment | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Xenium Bundle Contents, Slide ID, ROI name, Panel Name, Protocol Link, Software and Version, Total Number of Cells, Total Number of Targets, Surface area, Experiment IF Channels, Transcripts per Cell, Percent of Transcripts within Cells, Decoded Transcripts, Xenium IF image HTAN File ID, Xenium HE image HTAN File ID | FALSE | Spatial Transcriptomics | Biospecimen | ||||
187 | Xenium Bundle Contents | A comma-separated list of filenames within the Xenium bundle zip file | TRUE | Spatial Transcriptomics | ||||||
188 | Panel Name | The human-readable panel name. This could be the Gene Panel name or Protein Panel name. In Xenium, this refers to the string entered as the name in panel specification (e.g. Xenium Human Immuno-Oncology Add-on B Gene Expression). In CosMx, this refers to the panel name as it appears in the CosMx catalog (e.g. CosMx Human Universal Cell Characterization Panel (1000-plex)) | TRUE | Spatial Transcriptomics | ||||||
189 | Total Number of Cells | The total number of cells analyzed on the flow cell | TRUE | Spatial Transcriptomics | ||||||
190 | Total Number of Targets | The total number of targets of the assay. Targets can be genes/transcripts or probes. | TRUE | Spatial Transcriptomics | ||||||
191 | Experiment IF Channels | A comma-separated list with any number of channels the user deems appropriate (Example: PanCK, CD45, CD3, DAPI) | TRUE | Spatial Transcriptomics | ||||||
192 | Transcripts per Cell | Mean or median transcript count per cell analyzed on the flow cell or slide | TRUE | Spatial Transcriptomics | ||||||
193 | Percent of Transcripts within Cells | The percentage of transcripts assigned to assayed cells | TRUE | Spatial Transcriptomics | ||||||
194 | Decoded Transcripts | In Xenium, this is the number of high-quality, decoded-to-gene nuclear transcripts divided by the total segmented nuclear area, giving a transcript density (reported per 100 um^2). | TRUE | Spatial Transcriptomics | ||||||
195 | Xenium IF image HTAN File ID | The HTAN Data File ID of an Imaging Level 2 file | FALSE | Spatial Transcriptomics | ||||||
196 | Xenium HE image HTAN File ID | The HTAN Data File ID of an Imaging Level 2 file | FALSE | Spatial Transcriptomics | ||||||
197 | RPPA Level 2 | Array-based proteomics. Each dilution curve of spot intensities is fitted using the monotone increasing B-spline model in the SuperCurve R package. This fits a single curve using all the samples on a slide, with the signal intensity as the response variable and the dilution steps as independent variables. The fitted curve is plotted with the signal intensities on the y-axis and the log2-concentration of proteins on the x-axis for diagnostic purposes. | Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, HTAN RPPA Antibody Table, Assay Type, Protocol Link, Software and Version | FALSE | Assay | Biospecimen | ||||
198 | HTAN RPPA Antibody Table | A table containing antibody-level metadata for RPPA | HTAN RPPA Antibody Table ID, Filename, File Format, Ab Name Reported on Dataset, GENCODE Gene Symbol Target, UNIPROT Protein ID Target, Phosphoprotein Flag, Vendor, Catalog Number, Internal Ab ID, Species, RPPA Dilution, Phospho Site, RPPA Validation Status, Clone, Clonality, Antibody Notes | TRUE | RPPA Level 2 | |||||
199 | RPPA Level 3 | Level 3 Reverse Phase Protein Array (RPPA) data contains intra-batch normalized intensities. | Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Assay Type, Software and Version, Normalization Method | FALSE | Assay | Biospecimen | ||||
200 | RPPA Level 4 | Level 4 Reverse Phase Protein Array (RPPA) data contains intra-batch corrected intensities. | Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Assay Type, Batch Correction Method | FALSE | Assay | RPPA Level 2 | ||||
201 | Nanostring CosMx SMI Experiment | RNA and Protein Panel assays applied as part of Nanostring CosMx Spatial Molecular Imager (SMI) | Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, CosMx Bundle Contents, Slide ID, CosMx Assay Type, Panel Name, Protocol Link, Software and Version, Total Number of Cells, Total Number of Targets, Number of FOVs, Surface area, Experiment IF Channels, Transcripts per Cell, Percent of Transcripts within Cells, Mean Total Transcripts per Area, Unique Genes, Total Negative Probe Counts | FALSE | Spatial Transcriptomics | Biospecimen | ||||
202 | CosMx Bundle Contents | A comma-separated list of filenames within the CosMx bundle zip file | TRUE | Spatial Transcriptomics |