The benchmark function calculates the area under the precision-recall curve (AUPRC) for a bigwig file of predictions compared against a gold standard, also in bigwig format. The user must provide the predictions in bigwig format and specify the resolution of the evaluation (e.g., 200 bp).
maxatac benchmark --prediction GM12878_CTCF_chr1.bw --gold_standard GM12878_CTCF_ENCODE_IDR.bw --chromosomes chr1 --bin_size 200
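To make the reported metric concrete, here is a minimal sketch of an AUPRC calculation over binned scores. It is an illustration using the average-precision estimate in NumPy, not the maxATAC implementation; the function name auprc and its arguments are hypothetical.

```python
import numpy as np


def auprc(scores, labels):
    """Estimate the area under the precision-recall curve (average precision).

    scores: per-bin prediction scores.
    labels: per-bin gold standard values, 1 for a binding site and 0 otherwise.
    Tied scores are ranked in input order, which is a simplification.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = labels.sum()
    if n_pos == 0:
        raise ValueError("gold standard contains no positive bins")

    # Rank bins from the highest score to the lowest.
    order = np.argsort(-scores, kind="stable")
    sorted_labels = labels[order]

    # Precision at rank k is (true positives in the top k) / k.
    true_positives = np.cumsum(sorted_labels)
    precision = true_positives / np.arange(1, len(sorted_labels) + 1)

    # Average the precision values observed at the rank of each positive bin.
    return float(precision[sorted_labels == 1].sum() / n_pos)
```

A perfect ranking, where every positive bin scores above every negative bin, gives a value of 1.0 under this estimate.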
The input bigwig file of transcription factor binding predictions. This file can also be any bigwig signal track that you want to compare against a gold standard.
The input gold standard bigwig file. This file needs to be a binary signal track that has 1 corresponding to TFBS (e.g., from ChIP-seq) and 0 in positions with no TFBS.
The output filename prefix to use. Default: maxatac_benchmark
The chromosomes to benchmark the predictions for. Default: chr1, which is the held-out test chromosome.
The size of the bin, in base pairs, to use for aggregating the single base-pair predictions. Default: 200, which is the size used by the ENCODE-DREAM in vivo TFBS Prediction Challenge.
The method to use for aggregating the single base-pair predictions into larger bins. Options include max, min, and mean. Default: max, which takes the maximum score found in the window. See the pyBigWig documentation for more details.
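The effect of the three aggregation options can be sketched with NumPy. This is a stand-in for illustration only and does not reproduce the pyBigWig code path that maxATAC uses; the function name aggregate_bins is hypothetical, and dropping a trailing partial bin is an assumption of the sketch.

```python
import numpy as np


def aggregate_bins(values, bin_size=200, method="max"):
    """Collapse single base-pair values into fixed-size bins.

    method is one of "max", "min", or "mean".
    Assumption: positions that do not fill a final whole bin are dropped.
    """
    reducers = {"max": np.max, "min": np.min, "mean": np.mean}
    if method not in reducers:
        raise ValueError(f"unknown aggregation method: {method}")

    values = np.asarray(values, dtype=float)
    n_bins = len(values) // bin_size

    # Reshape to (n_bins, bin_size) so each row holds one bin.
    binned = values[: n_bins * bin_size].reshape(n_bins, bin_size)
    return reducers[method](binned, axis=1)
```

With max, a bin is scored by its strongest single base-pair prediction, so one high-scoring position is enough to mark the whole bin as a likely binding site; mean dilutes isolated peaks across the window.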
This flag sets the precision of the prediction signal track. Provide an integer that represents the number of decimal places to keep when rounding. Currently, the predictions go from 0 - .0000000001. Default: 9, which is the limit of precision from TensorFlow.
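As a rough illustration of what a rounding precision of 9 does to prediction scores, the sketch below rounds an array with NumPy. It assumes the integer is a count of decimal places; the function name round_predictions is hypothetical and this is not the maxATAC code.

```python
import numpy as np


def round_predictions(scores, decimals=9):
    """Round prediction scores to a fixed number of decimal places.

    Assumption: `decimals` plays the role of the precision integer
    described above (9 by default).
    """
    return np.round(np.asarray(scores, dtype=float), decimals=decimals)
```

Rounding before ranking means that scores differing only beyond the chosen decimal place become ties, which can slightly change a rank-based metric such as AUPRC.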
The output directory to write the results to. Default: ./prediction_results
The path to the blacklist bigwig signal track of regions that should be excluded. Default: hg38_maxatac_blacklist.bed, which contains regions that are specific to ATAC-seq.
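One way to picture blacklist exclusion is as a per-bin mask applied before scoring. The sketch below is a hypothetical helper, not maxATAC's implementation: it assumes the excluded regions are available as half-open (start, end) base-pair intervals on a single chromosome, and that a bin is dropped if it overlaps an excluded interval at all.

```python
import numpy as np


def blacklist_mask(n_bins, bin_size, intervals):
    """Return a boolean array that is True for bins to keep.

    intervals: iterable of (start, end) pairs in base pairs, half-open.
    Assumption: any overlap with an interval excludes the whole bin.
    """
    keep = np.ones(n_bins, dtype=bool)
    for start, end in intervals:
        if end <= start:
            continue  # skip empty or malformed intervals
        first_bin = start // bin_size
        last_bin = (end - 1) // bin_size
        # Slicing clamps at the array end, so intervals may run past it.
        keep[first_bin : last_bin + 1] = False
    return keep
```

The mask can then index both the binned predictions and the binned gold standard, for example scores[keep] and labels[keep], so that excluded regions contribute to neither precision nor recall.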
This argument sets the logging level. Currently, the only working logging level is ERROR.