bedToBigBed

This is a Python script to convert BED files to BigBed format, used by the UCSC Genome Browser.

At the moment, this script works only with the CAGE track files, because it relies on their specific format.

To run the script:

python main.py <dir_path>

where dir_path is the path of the directory containing the list of BED files.

The original BED files submitted to us (novel_CAGE and annot_CAGE) do not comply with the UCSC rules for the BED format, so several changes were made to the files before they could be converted to BigBed.

After the conversion script is run on the directory path, two sub-directories are created in dir_path: updated and bigbed.

1) updated directory

This folder contains the modified BED files, which comply with the UCSC rules for the BED format.

The changes made are:

Inclusion of the name field. "." is used because no name was provided, so the field is treated as empty.
Moving the width column to the end, because it is a non-standard user-defined column and must come after all standard BED fields.
Swapping the order of score and strand to match the BED field ordering.
Removal of headers.
Editing the chromEnd value from 16617 to 16616, because the bedToBigBed application throws an error otherwise. The chromEnd value provided by our submitter is 16617, while the size of NC_001941.1 is 16616. See the chrom.sizes file CF_002742125.1_Oar_rambouillet_v1.0.chrom.sizes.
The score value must be between 0 and 1000. The score was therefore converted to int and, where the value is greater than 1000, only the first 3 digits are kept as the score, on the assumption that the decimal point was misplaced by our submitter.
An autosql file is used to describe the fields, including the non-standard ones, so that the conversion to bigBed happens seamlessly.
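
The changes listed above can be illustrated as a single row transform. This is only a minimal sketch, not the code in this repository: the input column order (chrom, start, end, width, strand, score), the header detection rule, and the function names normalise_score, fix_row and fix_lines are all assumptions made for the example. It also clamps chromEnd for any chromosome rather than editing the single value mentioned above.

```python
def normalise_score(raw):
    """Convert a score to int; keep only the first 3 digits when it exceeds 1000."""
    score = int(float(raw))
    if score > 1000:
        # Assumption from the README: the decimal point was misplaced.
        score = int(str(score)[:3])
    return score


def fix_row(fields, chrom_sizes):
    """Reorder one row into BED6 plus a trailing width column."""
    # ASSUMED input order: chrom, start, end, width, strand, score
    chrom, start, end, width, strand, score = fields
    # Clamp chromEnd to the chromosome size when the size is known.
    end = min(int(end), chrom_sizes.get(chrom, int(end)))
    # "." fills the missing name field; width moves to the end.
    return [chrom, start, str(end), ".", str(normalise_score(score)), strand, width]


def fix_lines(lines, chrom_sizes):
    """Drop header lines and return the corrected tab-separated rows."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    # ASSUMED header rule: a row whose start column is not numeric is a header.
    return ["\t".join(fix_row(r, chrom_sizes)) for r in rows if r[1].isdigit()]
```

For example, with chrom_sizes = {"NC_001941.1": 16616}, a row ending at 16617 with score 12345.6 would be written with chromEnd 16616 and score 123.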

2) bigbed directory

This folder contains the successfully generated bigBed files, ready to be uploaded to UCSC.
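
As a rough illustration of how the files in updated could be turned into the files in bigbed, the sketch below builds one bedToBigBed command per BED file. It is not taken from main.py. The .bed and .bb extensions, the function name build_commands, and the bed6+1 type (six standard fields plus the width column) are assumptions; the -type and -as options are those of the UCSC bedToBigBed tool.

```python
from pathlib import Path


def build_commands(dir_path, chrom_sizes, autosql):
    """Return one bedToBigBed argument list per file in <dir_path>/updated."""
    base = Path(dir_path)
    updated = base / "updated"   # corrected BED files
    bigbed = base / "bigbed"     # destination for the bigBed files
    commands = []
    for bed in sorted(updated.glob("*.bed")):  # ASSUMED extension
        out = bigbed / (bed.stem + ".bb")      # ASSUMED output extension
        commands.append([
            "bedToBigBed",
            "-type=bed6+1",       # ASSUMED: BED6 plus one extra (width) column
            f"-as={autosql}",     # autosql file describing all fields
            str(bed),
            str(chrom_sizes),
            str(out),
        ])
    return commands
```

Each returned list could then be passed to subprocess.run(cmd, check=True). Note that bedToBigBed expects its input to be sorted by chromosome and then by start position.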
