The spread of the SARS-CoV-2 virus has made COVID-19 a worldwide pandemic. X-ray imaging is a non-invasive technique for identifying whether individuals show symptoms of disease in their lungs. However, diagnosis by this method must be made by a medical specialist, which can limit mass diagnosis of the population. Image processing tools can support diagnosis by ruling out negative cases. Advanced artificial intelligence techniques such as Deep Learning have proven highly effective at identifying patterns such as those found in diseased tissue. This study analyzes the effectiveness of a VGG16-based Deep Learning model and an SVM (support vector machine) in identifying COVID-19 and pneumonia from chest radiographs. It also compares the two methods across different training set sizes. Results show that CNN accuracy increases as the training sample size increases, while SVM accuracy peaks at a training sample size of 200, with insignificant improvements when training with more samples.

We created a CNN model to classify COVID-19 from X-ray images alone. For this purpose we built a dataset with three classes (COVID-19, pneumonia, and normal) from X-ray images taken from a Kaggle dataset and collected by medical professionals.

For this classification problem we use two different models: a CNN and an SVM. Our convolutional neural network is built on the VGG16 model (VGG stands for Visual Geometry Group). VGG16 is a pretrained CNN that extracts features from RGB images. For our SVM, we use the C-Support Vector Classification model (SVC) from scikit-learn and fit the classifier to our dataset.
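
As a rough illustration of the SVM side, the sketch below fits scikit-learn's `SVC` on flattened images and reports test accuracy for several training set sizes. Everything apart from `SVC` itself is an assumption made for the example: the images are synthetic stand-ins generated with NumPy (not the Kaggle data), the RBF kernel and `C=1.0` are scikit-learn defaults rather than the settings used in the study, and the helper names `make_synthetic_xrays` and `svm_accuracy_by_size` are hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

CLASSES = ("normal", "covid", "pneumonia")


def make_synthetic_xrays(n_per_class=100, size=16, seed=0):
    """Synthetic stand-in data (assumption): noisy grayscale images whose
    mean brightness differs per class. Real X-rays are far harder."""
    rng = np.random.default_rng(seed)
    images, labels = [], []
    for label in range(len(CLASSES)):
        base = 0.3 + 0.2 * label  # class-dependent mean brightness
        imgs = rng.normal(base, 0.1, size=(n_per_class, size, size))
        images.append(np.clip(imgs, 0.0, 1.0))
        labels.append(np.full(n_per_class, label))
    return np.concatenate(images), np.concatenate(labels)


def svm_accuracy_by_size(images, labels, train_sizes, seed=0):
    """Fit an SVC on the first n shuffled training samples for each n
    and return {n: accuracy on a fixed held-out test set}."""
    X = images.reshape(len(images), -1)  # flatten each image to a vector
    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    results = {}
    for n in train_sizes:
        clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # library defaults
        clf.fit(X_train[:n], y_train[:n])
        results[n] = accuracy_score(y_test, clf.predict(X_test))
    return results


if __name__ == "__main__":
    imgs, labels = make_synthetic_xrays()
    for n, acc in svm_accuracy_by_size(imgs, labels, [30, 100, 225]).items():
        print(f"train size {n:4d}: test accuracy {acc:.3f}")
```

Holding the test set fixed while only the training subset grows keeps the accuracies comparable across sizes, which is the kind of comparison the study makes between the two classifiers.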

The graph below shows the accuracy of the CNN trained on 1000 images:

Below you can see a comparison of the CNN and SVM models:

CONCLUSION:
• All X-ray images went through a pre-processing stage, because they were sourced from different machines with different calibrations, which caused significant variation in the image histograms.
• At least ~200 samples are required to train either model (CNN or SVM) to solve this three-class problem (normal, COVID-19, pneumonia) with 0.85+ accuracy.
• The CNN takes longer to train, and its accuracy depends mostly on the dataset provided and its size. The SVM, on the other hand, reaches its accuracy with smaller datasets.
• The SVM slightly outperforms the CNN on smaller training sets, but the two are equivalent at ~500 samples.
• This approach, with either classifier, could be extended to classify more types of lung conditions and reduce the need for medical professionals for this task.
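
The pre-processing step is not described in detail here. One common way to reduce histogram differences between machines is histogram equalization, sketched below as an assumption rather than as the method actually used. The function name `equalize_histogram` is hypothetical, and the code uses only the Python standard library, with an image represented as a list of rows of integer gray levels.

```python
def equalize_histogram(image, levels=256):
    """Return a histogram-equalized copy of a grayscale image.

    `image` is a list of rows, each a list of ints in [0, levels - 1].
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        return [list(row) for row in image]

    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: number of pixels at or below each level.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        # Constant image: there is no contrast to spread out.
        return [list(row) for row in image]

    # Map each level so that the output CDF is as close to linear as possible.
    lut = [
        max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


if __name__ == "__main__":
    dark = [[10, 11], [12, 13]]        # same scene, darker calibration
    bright = [[100, 101], [102, 103]]  # same scene, brighter calibration
    print(equalize_histogram(dark))
    print(equalize_histogram(bright))
```

In this toy case the two inputs differ only by a brightness offset, and both map to the same equalized image, which is the effect wanted when images come from differently calibrated machines.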

The detailed paper or report can be found here.