COVID-19 Detector: Detecting Corona from X-Ray

A web application that uses deep learning to help medical practitioners detect signs of COVID-19 in chest X-rays.

Coronavirus disease 2019 (COVID-19) is a highly infectious disease caused by the SARS-CoV-2 virus (severe acute respiratory syndrome coronavirus 2). It was first identified in December 2019 in Wuhan, China, and was declared a global pandemic by the WHO (World Health Organization) on 11 March 2020. Since then it has spread across the world, affecting more than 200 countries.

The number of cases is increasing day by day. As of today, there are a total of 4,208,083 confirmed cases, with 2,419,099 active cases and 284,398 deaths, in more than 230 countries across the globe.

Nearly all of the countries affected by this virus are in complete lockdown. As a result, borders are closed and people are practicing social distancing. Governments are identifying and admitting those affected and quarantining the probable cases, but the number of infected individuals is still increasing exponentially in a majority of countries. The United States has more than twice as many confirmed cases as any other single country, and more than half of all cases have been in Europe, with Italy and Spain worst affected.

COVID-19 Testing

There are some precautions that everybody has to follow, such as social distancing and lockdown. One primary measure, which has already started in the majority of countries, is manual testing, so that the true situation can be understood and appropriate decisions can be taken.

But manual testing has drawbacks: testing kits are sparsely available, and blood tests are costly and inefficient, with a single blood test taking around 5–6 hours to generate a result.

Deep Learning Approach

There are already a number of research studies suggesting that AI can perform as well as or better than humans at key healthcare tasks, such as diagnosing disease. Today, algorithms are already outperforming radiologists at spotting malignant tumors, and guiding researchers in how to construct cohorts for costly clinical trials.

So the idea is to overcome the pain points of manual testing by using deep learning. Since the disease is highly contagious, the earlier we generate results, the fewer cases there will be in a city. For that, we can use a popular class of algorithms called convolutional neural networks (CNNs).

Data set

It is tough to find data for COVID-19. The data we need for this type of problem is chest X-rays of both COVID-19-affected and healthy patients.

Luckily, a few weeks ago, Dr. Cohen (a postdoctoral fellow at the University of Montreal) started collecting X-ray images of COVID-19 cases and publishing them in the following GitHub repo. Inside the repo, you’ll find examples of COVID-19 cases, as well as MERS, SARS, and ARDS.

For this project, we took the images of COVID-19 cases from that repo. For healthy patient data, we used Kaggle’s chest X-ray competition dataset and sampled 100 X-rays of healthy patients to balance the available COVID-19 images.
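The balancing step can be sketched roughly as follows. This is only an illustration, not the project's actual script: the file names, the pool size of 1000, and the helper `sample_balanced` are hypothetical, and only the sample size of 100 comes from the text above.

```python
import random

def sample_balanced(normal_files, n_samples, seed=42):
    """Randomly pick n_samples healthy X-rays without replacement."""
    if n_samples > len(normal_files):
        raise ValueError("not enough healthy images to sample from")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(normal_files, n_samples)

# Hypothetical file names standing in for the healthy X-rays from Kaggle.
normal_pool = [f"normal_{i:04d}.jpeg" for i in range(1000)]
normal_subset = sample_balanced(normal_pool, n_samples=100)
print(len(normal_subset))
```

Sampling without replacement guarantees that no healthy X-ray appears twice in the subset.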

Structure of project

So, we have collected data for healthy patients as well as for patients tagged with COVID-19. We now have a nice balanced dataset; you can see the project directory below:

Here we have split our dataset into training and validation sets.
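A split like this could be produced along the following lines. The helper `train_val_split`, the file names, and the validation fraction of 0.3 are all assumptions made for illustration; the post does not state how the split was actually done.

```python
import random

def train_val_split(files, val_fraction=0.3, seed=0):
    """Shuffle a copy of `files` and split it into (train, validation) lists."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    # The first n_val shuffled items become the validation set.
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical file names for one class; the same call would be repeated per class.
covid_files = [f"covid_{i:03d}.png" for i in range(100)]
train_files, val_files = train_val_split(covid_files)
print(len(train_files), len(val_files))
```

Splitting each class separately keeps the training and validation sets as balanced as the full dataset.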

I have already prepared the dataset; you can find it here: GitHub


To help our model generalize, we perform data augmentation: applying random rotation, shear, and zoom to the images creates more data and more varied images.
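To make the idea concrete, here is a minimal sketch of what "random rotation, shear, and zoom" means geometrically: each augmented image is the original warped by a randomly sampled affine matrix. The function `random_affine` and the ranges (15 degrees of rotation, 10 degrees of shear, 20% scaling) are assumptions chosen for illustration, not the settings used in this project. In practice, a deep learning library's image augmentation utilities would sample such transforms and apply them to the pixel arrays.

```python
import math
import random

def random_affine(rng, max_rotation_deg=15.0, max_shear_deg=10.0, zoom_range=0.2):
    """Sample a 2x2 matrix that combines a random rotation, shear, and zoom."""
    theta = math.radians(rng.uniform(-max_rotation_deg, max_rotation_deg))
    shear = math.radians(rng.uniform(-max_shear_deg, max_shear_deg))
    zoom = rng.uniform(1.0 - zoom_range, 1.0 + zoom_range)

    rotation = [[math.cos(theta), -math.sin(theta)],
                [math.sin(theta), math.cos(theta)]]
    shearing = [[1.0, math.tan(shear)],
                [0.0, 1.0]]

    # Matrix product rotation @ shearing, then uniform scaling by the zoom factor.
    return [[zoom * sum(rotation[i][k] * shearing[k][j] for k in range(2))
             for j in range(2)]
            for i in range(2)]

def determinant(m):
    """Area scaling factor of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

rng = random.Random(0)
augmentations = [random_affine(rng) for _ in range(5)]
for m in augmentations:
    print([[round(v, 3) for v in row] for row in m])
```

Rotation and shear both preserve area, so the determinant of each sampled matrix equals the square of its zoom factor. Every call yields a slightly different warp, which is how one X-ray can produce many distinct training images.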

Next, I created a deep learning model that learns the difference between normal and COVID-19-affected X-rays and can later predict which class a new X-ray belongs to.

I used the TensorFlow framework to build a CNN architecture for our model, and I added dropout layers to protect the model from overfitting.
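The post does not list the layers or dropout rates, so the following is not the model itself. It is a small plain-Python sketch of what a dropout layer does during training, using the common "inverted dropout" formulation. The function name `dropout` and the rate of 0.5 are assumptions. In TensorFlow the equivalent behaviour is provided by `tf.keras.layers.Dropout`.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time dropout is a no-op.
        return list(activations)
    keep = 1.0 - rate
    # Surviving units are divided by `keep` so the expected activation is unchanged.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]

activations = [1.0] * 8
dropped = dropout(activations, rate=0.5, rng=random.Random(0))
print(dropped)
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit. This is why dropout tends to reduce overfitting on small datasets like this one.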

After training the model, we got 97-98% accuracy, which is really good, but for commercial use of any healthcare product we need nearly 100% accuracy.

Confusion Matrix

To visualize the results in a more understandable manner, we’re going to use a confusion matrix.

According to our confusion matrix, all 30 COVID-19-affected patients are classified correctly and none wrongly, while out of 30 normal patients, 29 are classified correctly and 1 wrongly.
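The usual summary metrics follow directly from these four counts. The sketch below assumes COVID-19 is treated as the positive class, so the one misclassified normal patient counts as a false positive. The helper `binary_metrics` is a hypothetical name, not part of the project code.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard metrics from the four cells of a binary confusion matrix."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,     # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),     # share of COVID-19 patients detected
        "specificity": tn / (tn + fp),     # share of normal patients cleared
        "precision": tp / (tp + fp),       # share of COVID-19 predictions that are right
    }

# Counts from the confusion matrix above, with COVID-19 as the positive class.
metrics = binary_metrics(tp=30, fn=0, tn=29, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.983
# sensitivity: 1.000
# specificity: 0.967
# precision: 0.968
```

On this validation set the model misses no COVID-19 case (sensitivity 1.0). With only 60 images, though, each single error shifts the accuracy by more than 1.6 percentage points.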

There are several pros and cons of using Deep Learning to tackle such kinds of situations:

  • Pros: More time saving; less expensive; easy to operate
  • Cons: In practice we need ~100% accuracy, because wrongly identifying a patient might lead to further spread of the disease.

But still, this model returns good accuracy and can be further improved by training on more images.

See the web application below:

Stay tuned, happy learning 🙂

Follow MachineX Intelligence for more:

Written by 

Shubham Goyal is a Data Scientist at Knoldus Inc. He is also an artificial intelligence researcher, interested in research on different domain problems, and a regular contributor to the community through blogs and webinars on machine learning and artificial intelligence. He has written a few research papers on machine learning, and he is a conference speaker and an official author at Towards Data Science.