GitHub - simonorozcoarias/ML_DL_microArrays: Here, we describe the comparison of the most used algorithms in classical ML and DL to classify carcinogenic tumors described in the 11_tumor database, obtaining accuracies between 76.97% and 100% for tumor identification. Our results bring up a more efficient and accurate classification method based on gene expression (microarray data) and ML/DL algorithms, which facilitates the prediction of the tumor type in a multiple-cancer-type scenario.

A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data

ABSTRACT Cancer classification is a topic of major interest in medicine since it allows accurate and efficient diagnosis and facilitates a successful outcome in medical treatments. Previous studies have classified human tumors using large-scale RNA profiling and supervised Machine Learning (ML) algorithms to construct a molecular-based classification of carcinoma cells from breast, bladder, adenocarcinoma, colorectal, gastroesophagus, kidney, liver, lung, ovarian, pancreas, and prostate tumors. These datasets are collectively known as the 11_tumor database. Although this database has been used in several works in the ML field, no comparative studies of different algorithms can be found in the literature. On the other hand, advances in both hardware and software technologies have fostered considerable improvements in the precision of solutions that use ML, such as Deep Learning (DL). In this study, we compare the most widely used algorithms in classical ML and DL to classify the tumors described in the 11_tumor database. We obtained tumor identification accuracies between 90.6% (Logistic Regression) and 94.43% (Convolutional Neural Networks) using k-fold cross-validation. We also show how a tuning process may or may not significantly improve algorithms’ accuracies. Our results demonstrate an efficient and accurate classification method based on gene expression (microarray data) and ML/DL algorithms, which facilitates tumor type prediction in a multi-cancer-type scenario.
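The k-fold cross-validation mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the nearest-centroid classifier stands in for the algorithms compared in the study, the data are synthetic Gaussian blobs rather than the contents of data11tumors2.csv, and all function names (`k_fold_indices`, `fit_centroids`, `predict`, `cross_val_accuracy`, `make_toy_data`) are hypothetical names chosen for this example.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Compute the mean feature vector of each class (the 'model')."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()}


def predict(centroids, row):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )


def cross_val_accuracy(X, y, k=5, seed=0):
    """Hold out each fold once, train on the rest, and return per-fold accuracy."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def make_toy_data(n_classes=3, per_class=30, n_features=20, seed=0):
    """Synthetic stand-in for expression profiles: one Gaussian blob per class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        center = [rng.uniform(-5, 5) for _ in range(n_features)]
        for _ in range(per_class):
            X.append([rng.gauss(c, 1.0) for c in center])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    scores = cross_val_accuracy(X, y, k=5)
    print("fold accuracies:", scores)
    print("mean accuracy:", mean(scores))
```

Reporting the mean over folds, rather than a single train/test split, is what makes accuracy figures comparable across algorithms. A stratified split, which keeps class proportions equal in every fold, would be a natural refinement when some tumor types have few samples.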

Citation

Tabares-Soto R, Orozco-Arias S, Romero-Cano V, Segovia Bucheli V, Rodríguez-Sotelo JL, Jiménez-Varón CF. 2020. A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data. PeerJ Comput. Sci. 6:e270 DOI 10.7717/peerj-cs.270

Folders and files

README.md
TareaDB11Tumores-9Algortimos-V4.ipynb
data11tumors2.csv
