Make a peppy.Project from a sample yaml file #458

Closed
donaldcampbelljr opened this issue Dec 5, 2023 · 3 comments

@donaldcampbelljr
Contributor

Example: I would like to make a peppy.Project from a sample YAML file:
- sample_name: sample 1
  file: path/to/file.tsv
- sample_name: sample 2
  file: path/to/2.tsv
prj = peppy.Project(sample_yaml="path.yaml")

or `prj = peppy.Project(sample_yaml={ ... sample dict ... })`

Originally posted by @nsheff in #457 (comment)

@khoroshevskyi
Member

As far as I know, peppy doesn't have this functionality.

But there is a workaround: you can use from_dict:

project_dict = read_yaml(...)
# Insert project_dict into this function; it expects these keys and value types:
prj = Project().from_dict({'_sample_df': dict,
                           '_config': dict,
                           '_subsample_list': list[dict],
                           'name': str,
                           'description': str})
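To make the shape of that dictionary concrete, here is a minimal sketch that arranges an already-parsed sample list into the keys listed above. The helper name `build_pep_dict` is hypothetical, and the column-oriented layout of `_sample_df` (the layout pandas' `DataFrame.to_dict()` produces by default), the empty `_config`, and the empty `_subsample_list` are all assumptions rather than confirmed peppy requirements.

```python
def build_pep_dict(samples, name, description=""):
    """Arrange parsed sample records into the dictionary layout listed above.

    Hypothetical helper for illustration; it is not part of peppy.
    """
    # Collect every attribute name in first-seen order so that samples with
    # differing attributes still line up as columns.
    columns = []
    for sample in samples:
        for key in sample:
            if key not in columns:
                columns.append(key)

    # Assumption: column-oriented mapping {column: {row_index: value}},
    # the layout pandas' DataFrame.to_dict() produces by default.
    sample_table = {
        col: {i: sample.get(col) for i, sample in enumerate(samples)}
        for col in columns
    }

    return {
        "_sample_df": sample_table,
        "_config": {},          # assumption: no project-level config
        "_subsample_list": [],  # assumption: no subsamples
        "name": name,
        "description": description,
    }


# The two samples from the original request.
samples = [
    {"sample_name": "sample 1", "file": "path/to/file.tsv"},
    {"sample_name": "sample 2", "file": "path/to/2.tsv"},
]
pep_dict = build_pep_dict(samples, name="example")

# With peppy installed, the call suggested in this thread would be:
# prj = Project().from_dict(pep_dict)
```

The peppy call is left commented out because the exact keys and layout that `from_dict` accepts may differ between peppy versions.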

@khoroshevskyi
Member

@nsheff could you provide an example YAML file that could be passed as input to this function?

@nsheff
Contributor

nsheff commented Dec 7, 2023

Here's another example:

- assembly: HG01891.alt.pat.f1_v2
  population: African Caribbean In Barbados
  assembly_accession: GCA_018467165.1
  assembly_link: https://www.ebi.ac.uk/ena/browser/view/GCA_018467165.1
  assembly_submitter: UCSC Genomics Institute
  annotation_gtf: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018467165.1-2022_07-genes.gtf.gz
  annotation_gff3: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018467165.1-2022_07-genes.gff3.gz
  proteins: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018467165.1-2022_07-pep.fa.gz
  transcripts: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018467165.1-2022_07-cdna.fa.gz
  variants_clinvar: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/variation/2022_10/vcf/Homo_sapiens-GCA_018467165.1-2022_10-clinvar.vcf.gz
  variants_gnomad: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/variation/2022_10/vcf/Homo_sapiens-GCA_018467165.1-2022_10-gnomad.vcf.gz
  ftp_dumps: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1
  rapid_link: https://rapid.ensembl.org/Homo_sapiens_gca018467165v1/Info/Index
  file_name: Homo_sapiens-GCA_018467165.1-unmasked.fa.gz
  url: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018467165.1/ensembl/genome/Homo_sapiens-GCA_018467165.1-unmasked.fa.gz
  local_file: data/HG01891.alt.pat.f1_v2.unmasked.fa.gz
  remote_md5: f80e3ab39c3a3245cc7d3edadac1adfd
  fasta: analysis/data/HG01891.alt.pat.f1_v2.unmasked.fa.gz
- assembly: HG01258.pri.mat.f1_v2
  population: Colombian In Medellin, Colombia
  assembly_accession: GCA_018469405.1
  assembly_link: https://www.ebi.ac.uk/ena/browser/view/GCA_018469405.1
  assembly_submitter: UCSC Genomics Institute
  annotation_gtf: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018469405.1-2022_07-genes.gtf.gz
  annotation_gff3: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018469405.1-2022_07-genes.gff3.gz
  proteins: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018469405.1-2022_07-pep.fa.gz
  transcripts: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/geneset/2022_07/Homo_sapiens-GCA_018469405.1-2022_07-cdna.fa.gz
  variants_clinvar: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/variation/2022_10/vcf/Homo_sapiens-GCA_018469405.1-2022_10-clinvar.vcf.gz
  variants_gnomad: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/variation/2022_10/vcf/Homo_sapiens-GCA_018469405.1-2022_10-gnomad.vcf.gz
  ftp_dumps: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1
  rapid_link: https://rapid.ensembl.org/Homo_sapiens_gca018469405v1/Info/Index
  file_name: Homo_sapiens-GCA_018469405.1-unmasked.fa.gz
  url: https://ftp.ensembl.org/pub/rapid-release/species/Homo_sapiens/GCA_018469405.1/ensembl/genome/Homo_sapiens-GCA_018469405.1-unmasked.fa.gz
  local_file: data/HG01258.pri.mat.f1_v2.unmasked.fa.gz
  remote_md5: cf7d737137c312357b409b962eea0494
  fasta: analysis/data/HG01258.pri.mat.f1_v2.unmasked.fa.gz
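Until something like `sample_yaml` exists, one possible route is to flatten such a list into a CSV sample table, the tabular form a PEP normally uses. The sketch below uses only the standard library; the helper name `records_to_csv` is hypothetical, the records are assumed to be already parsed from YAML, and only two of the attributes above are included to keep it short.

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of flat sample mappings as CSV text.

    Hypothetical helper for illustration; it is not part of peppy.
    """
    # Use the union of all keys, in first-seen order, as the header.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills attributes that a given sample does not define.
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Two attributes per sample, copied from the example above.
records = [
    {"assembly": "HG01891.alt.pat.f1_v2",
     "remote_md5": "f80e3ab39c3a3245cc7d3edadac1adfd"},
    {"assembly": "HG01258.pri.mat.f1_v2",
     "remote_md5": "cf7d737137c312357b409b962eea0494"},
]
csv_text = records_to_csv(records)
```

Note that these records are keyed by `assembly` rather than `sample_name`, so whatever loads the resulting table would need to be told which column identifies a sample.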
