Skip to content

Latest commit

 

History

History
229 lines (144 loc) · 9.26 KB

README.md

File metadata and controls

229 lines (144 loc) · 9.26 KB

Identifying nearby sources of ultra-high-energy cosmic rays with deep learning

by Oleg Kalashev, Maxim Pshirkov, Mikhail Zotov

Please cite the following paper

source code and supplemental materials

1. Unpack KKOS spectra files

cd src
tar xfoj data/spectra_1s.tbz2

2. Sample E,Z pairs

  1. Sample 100000 events (E,Z pairs) with energy above 56 EeV for the source located at 3.5 Mpc

    python3 psample.py --distance 3.5 --Nini 100000 --Emin 56
    • same with mass composition shift A -> A/2
    python3 psample.py --distance 3.5 --Nini 100000 --Emin 56 --shiftA 0.5
    • same with monochromatic composition, e.g. A=16
    python3 psample.py --distance 3.5 --Nini 100000 --Emin 56 --shiftA -16
  2. sorting output

cat sample_D3.5_Emin56_100000nuclei.txt | awk 'NR>6' | sort | uniq -c | awk '{print $2 " " $3 " " $1}' > data/sample_D3.5_Emin56_100000nuclei_sorted.txt

3. Deflection map preparation

Prerequisites:

Install crpropa along with python3 integration

Command

To calculate deflection map for pair E, Z run command

python3 galback.py Z E

You can edit healpix grid resolution parameter Nside and galactic magnetic fields model in galback.py For list of available galactic magnetic field models scroll to line

# Model of the Galactic Magnetic Field

in galback.py

4. From-source event map generation

Prerequisites:

  • data/sample_D3.5_Emin56_100000nuclei_sorted.txt generated in step 2.2 (here 3.5 is distance to CenA)
  • data/jf/32 should contain deflection maps for all pair E,Z (e.g. helium_141EeV.txt.xz etc.), where jf is magnetic field model

Command

python3 src_sample.py --source_id CenA --Nside 32 --Nini 100000 --GMF jf --Emin 56

Output

file with arrival directions density map data/jf/sources/src_sample_CenA_D3.5_Emin56_N100000_R1_Nside32.txt.xz

Map converter tool

We advice to generate maps for higher Nside i.e. Nside=512 and convert them to lower Nside if needed. This can be done with the following command

python3 convert_src_sample.py --source CenA --Nside_ini 512 --Nini 100000 --mf jf --Nside 32

5. Train classifier on HEALPix grid and estimate minimal from-source event fraction

Prerequisites:

  • Corresponding from-source event maps prepared in step 4. should be located in folder data/jf/sources/

Command

For single source classifier

python3 train_healpix.py --source_id CenA --Neecr 500 --log_sample --n_samples 100000 --mf jf --n_early_stop 2 --Nside 32

For multisource classifier

python3 train_healpix.py --source_id "CenA,M82,NGC253,M87,FornaxA" ...

To check the resulting test statistic performance for alternative magnetic field model use --compare_mf flag:

python3 train_healpix.py --mf jf --compare_mf pt ...

For nonuniform exposure use --exposure flag. Example of Telescope Array exposure is provided in exposure.py

Output

  • CenA_N500_Bjf_Ns32-1_F32_v0.h5 neural network classifier
  • CenA_N500_Bjf_Ns32-1_F32_v0.h5.score text file with classifier and test statistic metrics
  • CenA_N500_Bjf_Ns32-1_F32_v0_det_frac.txt log file containing the evolution of minimal from-source event fraction during neural network training

6. Apply pretrained network classifier to an arbitrary event map

Prerequisites:

  • Event map(s) must be saved in src_sample format (see step 4.)
  • To be able to load the models saved earlier to .h5 apply patch to NNhealpix library
cd (NNhealpix dir)/nnhealpix/layers
patch < (uhecr_aniso dir)/src/nnhealpix_layers.patch

Command

python3 calc_min_fractions.py data/jf/sources/src_sample_CenA_D3.5_Emin56_N10000_R1_Nside32_shift2.0.txt.xz --log_sample --Neecr 300 --n_samples 10000 --Nside 32 --model CenA_FornaxA_M82_M87_NGC253_N300_Bjf_Ns32-1_F32_v0.h5

In case several src_sample files are given, maps are generated using each of them in roughly equal amounts (only one random map is used for each sample generation)

With flag --fractions several maps in given proportions can be used for each sample generation, e.g.:

python3 calc_min_fractions.py data/jf/sources/src_sample_CenA_D3.5_Emin56_N10000_R1_Nside32.txt.xz  data/jf/sources/src_sample_FornaxA_D20.0_Emin56_N10000_R1_Nside32.txt.xz  --fractions 3 1 --log_sample --Neecr 300 --n_samples 10000 --Nside 32 --model CenA_FornaxA_M82_M87_NGC253_N300_Bjf_Ns32-1_F32_v0.h5

Output

Program outputs minimal detectable from-source event fraction on particular map(s) along with type I error alpha. The information is appended to log file CenA_FornaxA_M82_M87_NGC253_N300_Bjf_Ns32-1_F32_v0.h5_cmp.txt

Graph convolutional neural network based test statistics

1. Train classifier and estimate minimal from-source event fraction

Current implementation uses HEALPix maps of the from-source events.

Prerequisites:

  • Corresponding from-source event maps prepared in step 4. of the previous section should be located in folder data/jf/sources/

Command

For single source classifier

python3 train_gcnn.py --source_id CenA --Neecr 100 --log_sample --n_samples 10000 --n_early_stop 2 --Nside 256 --Nini 100000 --n_epochs 20

For multisource classifier

python3 train_gcnn.py --source_id "CenA,M82,NGC253,M87,FornaxA" ...

To check the resulting test statistic performance for alternative magnetic field model use --compare_mf flag:

python3 train_gcnn.py --mf jf --compare_mf pt ...

Output

  • CenA_N100_Bjf_gc3_k10_K30_20_D256_sig20_v0.h5 neural network classifier
  • CenA_N100_Bjf_gc3_k10_K30_20_D256_sig20_v0.h5.score text file with classifier and test statistic metrics
  • CenA_N100_Bjf_gc3_k10_K30_20_D256_sig20_v0.h5_det_frac.txt log file containing the evolution of minimal from-source event fraction during neural network training

2. Apply pretrained network classifier to an arbitrary event map

Prerequisites:

  • Event map(s) must be saved in src_sample format (see step 4. of the previous section)
  • Model should be prepaired

Command

python3 calc_min_fractions.py data/jf/sources/src_sample_CenA_D3.5_Emin56_N10000_R1_Nside32_shift2.0.txt.xz --log_sample --Neecr 300 --n_samples 10000 --Nside 32 --model CenA_N100_Bjf_gc3_k10_K30_20_D256_sig20_v0.h5

Angular power spectrum based test statistic

1. Generate sample data files

Two groups of sample data files should be generated - training and testing. Each group should contain at least one with mixed sample spectra and at least one with isotropic data

mixed sample datafile generation:

python3 mixed_spectrum_gen.py --Neecr 50 --Nmixed_samples 100000 --source_id CenA --log_sample --f_src_min 1e-2

Output is saved to aps_CenA_D3.5_Bjf_Emin56_Neecr50_Nsample100000_R1_Nside512_logF_src_f_min0.01_0.npz Consequent multiple executions of the above command in the same folder will produce files xxx_1.npz, xxx_2.npz, etc. with unique samples which is ensured by random seed initialization

isotropic coefficients generation:

python3 mixed_spectrum_gen.py --Neecr 50 --Nmixed_samples 100000 --source_id CenA --f_src 0

Output is saved to iso_Neecr50_Nsample100000_Nside512_0.npz, iso_Neecr50_Nsample100000_Nside512_1.npz, etc.

select file used for normalization

This could be any file created on step 1. Edit train_spec.py to set norm_file parameter

norm_file = 'aps_CenA_D3.5_Emin56_Neecr500_Nsample3000_R1_Nside512_100.npz'

2. Train classifier

python3 train_spec.py aps_CenA_D3.5_Bjf_Emin56_Neecr50_Nsample100000_R1_Nside512_logF_src_f_min0.01_0.npz iso_Neecr50_Nsample100000_Nside512_0.npz

Here use files created in previous step as parameters.

Output

spectrum_L33_th0.01_v0.h5 trained classifier

3. Calculate test statistic on test data

Prerequisites:

  • Two or more files generated in step 1 which belong to the testing group

Command:

python3 nn_f_spec.py aps_CenA_D3.5_Bjf_Emin56_Neecr50_Nsample100000_R1_Nside512_logF_src_f_min0.01_1.npz iso_Neecr50_Nsample100000_Nside512_1.npz --model spectrum_L33_th0.01_v0.h5

Output

Files containing test statistics defined by the classifier, calculated on the samples provided spectrum_L33_th0.01_v0__aps_CenA_D3.5_Bjf_Emin56_Neecr50_Nsample100000_R1_Nside512_logF_src_f_min0.01_1.npz spectrum_L33_th0.01_v0__iso_Neecr50_Nsample100000_Nside512_1.npz

4. Calculate minimal detectable fractions

Prerequisites:

  • Files containing test statistics calculated in step 3

Command

python3 calc_fractions_ps.py --mixed

spectrum_L33_th0.01_v0__aps_CenA_D3.5_Bjf_Emin56_Neecr50_Nsample100000_R1_Nside512_logF_src_f_min0.01_1.npz --iso spectrum_L33_th0.01_v0__iso_Neecr50_Nsample100000_Nside512_1.npz

The command output to the console: detectable_fraction, alpha (type I error probability)

5 Arbitrary test statistic evaluation

One could use arbitrary test statistics instead of neural network based classifier. To do this edit f_spec.py to define your custom statistic

def f(spec):
    ts = ..# define your statistic here
    return ts

and replace step 3 by the following command

python3 f_spec.py aps_CenA_D3.5_Bjf_Emin56_Neecr50_Nsample100000_R1_Nside512_logF_src_f_min0.01_1.npz iso_Neecr50_Nsample100000_Nside512_1.npz