HSA-NRL

This repo is the official implementation of our paper "Hard Sample Aware Noise Robust Learning for Histopathology Image Classification".

News

  • ⚡(2021-11-20): The Chaoyang dataset has been released HERE.

Citation

If you use this code/data for your research, please cite our paper "Hard Sample Aware Noise Robust Learning for Histopathology Image Classification".

@article{zhuhard,
  title={Hard Sample Aware Noise Robust Learning for Histopathology Image Classification},
  author={Zhu, Chuang and Chen, Wenkai and Peng, Ting and Wang, Ying and Jin, Mulan},
  journal={IEEE Transactions on Medical Imaging}
}

Data

Chaoyang

  • The Chaoyang dataset contains 1111 normal, 842 serrated, 1404 adenocarcinoma, and 664 adenoma samples for training, and 705 normal, 321 serrated, 840 adenocarcinoma, and 273 adenoma samples for testing. (Note: in our dataset files, "0" means normal, "1" means serrated, "2" means adenocarcinoma, and "3" means adenoma.)
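
The label encoding and split sizes above can be written down as a small lookup table. This is only a sketch for sanity-checking a local copy of the data: the names LABEL_NAMES, TRAIN_COUNTS, TEST_COUNTS, and class_fractions are hypothetical and do not come from the repository's code.

```python
# Label IDs and per-split sample counts, as stated in this README.
# All names below are illustrative; they are not part of the repository.
LABEL_NAMES = {0: "normal", 1: "serrated", 2: "adenocarcinoma", 3: "adenoma"}
TRAIN_COUNTS = {0: 1111, 1: 842, 2: 1404, 3: 664}
TEST_COUNTS = {0: 705, 1: 321, 2: 840, 3: 273}


def class_fractions(counts):
    """Return each class's share of a split, keyed by class name."""
    total = sum(counts.values())
    return {LABEL_NAMES[label]: n / total for label, n in counts.items()}


if __name__ == "__main__":
    print("train total:", sum(TRAIN_COUNTS.values()))
    print("test total:", sum(TEST_COUNTS.values()))
    for name, frac in class_fractions(TRAIN_COUNTS).items():
        print(f"{name}: {frac:.1%} of training samples")
```

Summing the counts gives 4021 training samples and 2139 testing samples, which can be compared against the number of entries in the downloaded dataset files.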

Using instructions

  • Notes: step1.py implements the label correction phase; NSHE.py implements the NSHE phase.

  • Getting started:

    Run step1.py first to generate the "Almost clean dataset". Then run NSHE.py to train the model on the "Almost clean dataset".

    Take Chaoyang dataset as an example:

    First, run python step1.py --dataset chaoyang to generate the "Almost clean dataset" file, named "chaoyang_15_step1.p".

    Then, run the command below to train the model on the generated "Almost clean dataset".

    python NSHE.py --dataset chaoyang --forget_rate 0.01 --pickle_path chaoyang_15_step1.p

    (Remember to modify the dataset path before using.)
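
    The two steps above can also be chained from a small Python driver. This is a minimal sketch, not part of the repository: build_commands and run_pipeline are hypothetical helpers, sys.executable stands in for the python command, and the flags and file name are taken verbatim from the instructions above.

```python
import subprocess
import sys


def build_commands(dataset="chaoyang", forget_rate=0.01,
                   pickle_path="chaoyang_15_step1.p"):
    """Return the two command lines from the instructions, in run order."""
    # Phase 1: label correction, which writes the "Almost clean dataset" file.
    step1 = [sys.executable, "step1.py", "--dataset", dataset]
    # Phase 2: NSHE training on the file produced by phase 1.
    nshe = [sys.executable, "NSHE.py", "--dataset", dataset,
            "--forget_rate", str(forget_rate),
            "--pickle_path", pickle_path]
    return [step1, nshe]


def run_pipeline(**kwargs):
    """Run both phases in sequence; check=True aborts if a phase fails."""
    for cmd in build_commands(**kwargs):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

    Calling run_pipeline() from the repository root, after the dataset path has been modified, runs step1.py and then NSHE.py with the example settings. It assumes step1.py writes its output to the current working directory under the file name given above.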

License

This project is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use the data provided that you agree to our license terms below:

  1. That you include a reference to our paper in any work that makes use of the data/code. For research papers, cite our preferred publication; for other media, cite our preferred publication or link to our GitHub project.
  2. That you do not distribute this dataset or modified versions. It is permissible to distribute derivative works insofar as they are abstract representations of this dataset (such as models trained on it or additional annotations that do not directly include any of our data).
  3. That you may not use the dataset or any derivative work for commercial purposes such as licensing or selling the data, or using the data to procure a commercial gain.
  4. That all rights not expressly granted to you are reserved by us.

Contact

Wenkai Chen

Chuang Zhu

If you have any questions, please contact us directly.

Additional Info

Some parts of our code are borrowed from the official Co-teaching implementation.

Acknowledgements

  • We thank Chaoyang Hospital for annotating the dataset.