Hybrid Feature Selection for Complex Diseases SNPs
Machine learning techniques have the potential to revolutionise medical diagnosis. Single Nucleotide Polymorphisms (SNPs) are one of the most important sources of human genome variability and have been implicated in several human diseases. Various techniques have been applied to SNP data to separate affected samples from normal ones. Achieving high classification accuracy in such a high-dimensional space is crucial for successful diagnosis and treatment. This work proposes a hybrid feature selection method for detecting the most informative SNPs and selecting an optimal SNP subset. The method fuses a filter with a wrapper: Conditional Mutual Information Maximization (CMIM) and Support Vector Machine Recursive Feature Elimination (SVM-RFE), respectively. The experimental results demonstrate the effectiveness of this approach, which achieves high classification accuracy. Our work in analysing genomic data demonstrates that SNPs from the whole genome can be efficiently employed to distinguish individuals affected by complex diseases from healthy ones.
- R. Alzubi, N. Ramzan, H. Alzoubi and A. Amira, "A Hybrid Feature Selection Method for Complex Diseases SNPs," in IEEE Access, vol. PP, no. 99, pp. 1-1. doi: 10.1109/ACCESS.2017.2778268
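
The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`mutual_info`, `cond_mutual_info`, `cmim`, `hybrid_select`), the synthetic genotype data, and the subset sizes (`n_filter=10`, `n_final=2`) are all assumptions chosen for illustration. The CMIM step uses simple plug-in estimates on discrete genotype codes (0, 1, 2), and the SVM-RFE step relies on scikit-learn's `RFE` with a linear `SVC`.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC


def mutual_info(x, y):
    """Plug-in estimate of I(X;Y) in nats for two discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi


def cond_mutual_info(x, y, z):
    """Plug-in estimate of I(X;Y|Z) = sum over z of p(z) * I(X;Y | Z=z)."""
    total = 0.0
    for zv in np.unique(z):
        mask = z == zv
        total += mask.mean() * mutual_info(x[mask], y[mask])
    return total


def cmim(X, y, k):
    """Greedy CMIM filter: repeatedly pick the feature whose information
    about y, conditioned on each already-selected feature, is largest
    in the worst case (max over i of min over selected j of I(X_i;Y|X_j))."""
    n_features = X.shape[1]
    score = np.array([mutual_info(X[:, i], y) for i in range(n_features)])
    selected = []
    for _ in range(min(k, n_features)):
        best = int(np.argmax(score))
        selected.append(best)
        score[best] = -np.inf  # never pick the same feature twice
        for i in range(n_features):
            if score[i] > -np.inf:
                cmi = cond_mutual_info(X[:, i], y, X[:, best])
                score[i] = min(score[i], cmi)
    return selected


def hybrid_select(X, y, n_filter, n_final):
    """Filter with CMIM, then refine the survivors with SVM-RFE."""
    filtered = cmim(X, y, n_filter)
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_final, step=1)
    rfe.fit(X[:, filtered], y)
    # Map the RFE mask back to column indices of the original matrix.
    return [f for f, keep in zip(filtered, rfe.support_) if keep]


# Synthetic data (an assumption, not real SNP data): 200 samples,
# 30 SNPs coded 0/1/2, with the label driven by SNPs 3 and 7 only.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 30))
y = (X[:, 3] + X[:, 7] > 2).astype(int)

selected = hybrid_select(X, y, n_filter=10, n_final=2)
print("SNP columns kept by the hybrid selector:", selected)
```

The design choice mirrors the usual motivation for hybrid selection: the filter is cheap and prunes the bulk of uninformative SNPs, so the more expensive wrapper only has to train SVMs on a small candidate set. On genome-wide data the plug-in estimators above would need a faster implementation.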