You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
PEPATAC produces the PEPATAC_commands.sh and PEPATAC_log.md files, which makes it very easy to follow which commands were executed with which parameters and how each output file was exactly produced. Given the PEPATAC_command.sh file, it seems possible to execute that file and get the exact same results back. However, one step is directly implement in the pepatac.py file and is thus missing from PEPATAC_commands.sh and PEPATAC_log.md. In commit 461ae32c8ddf06bad5362aea1430b5dd714a3f3f this is the lines 2362-2384, containing the following code:
df = pd.read_csv(peak_output_file, sep='\t', header=None,
names=("V1","V2","V3","V4","V5","V6",
"V7","V8","V9","V10"))
nineNine = df['V5'].quantile(q=0.99)
df.loc[df['V5'] > nineNine, 'V5'] = nineNine
# rescale score to be between 0 and 1000
df['V5'] = rescale(np.log(df['V5']), [0, 1000])
cs = pd.read_csv(res.chrom_sizes, sep='\t', header=None,
names=("V1","V2"))
df = df.merge(cs, on="V1")
df.columns = ["V1","V2","V3","V4","V5","V6",
"V7","V8","V9","V10","V11"]
# make sure 'chromEnd' positions are not greater than the
# max chrom_size
n = np.array(df['V3'].values.tolist())
df['V3'] = np.where(n > df['V11'], df['V11'], n).tolist()
df = df.drop(columns=["V11"])
# ensure score is a whole integer value
df['V5'] = pd.to_numeric(df['V5'].round(), downcast='integer')
df.to_csv(temp.name, sep='\t', header=False, index=False)
Refactoring these lines out into a tools/ script and calling it through the normal mechanism would include the call into the PEPATAC_commands.sh and PEPATAC_log.md files, making it possible to track how each result file was produced exactly.
The text was updated successfully, but these errors were encountered:
There is also a postprocess step for the fseq and fseq2 peak caller in lines 2107-2121, which does the same. That step overrides the peak file instead of creating a temporary file.
Yes. This sounds good. We will attempt to add this with our next release.
Also, please feel free to submit a PR if you would like to take a shot at it.
PEPATAC produces the
PEPATAC_commands.sh
andPEPATAC_log.md
files, which makes it very easy to follow which commands were executed with which parameters and how each output file was exactly produced. Given thePEPATAC_command.sh
file, it seems possible to execute that file and get the exact same results back. However, one step is directly implement in thepepatac.py
file and is thus missing fromPEPATAC_commands.sh
andPEPATAC_log.md
. In commit461ae32c8ddf06bad5362aea1430b5dd714a3f3f
this is the lines 2362-2384, containing the following code:Refactoring these lines out into a
tools/
script and calling it through the normal mechanism would include the call into thePEPATAC_commands.sh
andPEPATAC_log.md
files, making it possible to track how each result file was produced exactly.The text was updated successfully, but these errors were encountered: