Open source program to assess and map COVID-19 risks – Geospatial World

Most of the COVID-19 maps I see are country-wide choropleth maps, which assume a uniform distribution within each geographic unit. Other maps use point symbology, but the points usually overlap. The approach taken here, by contrast, increases the spatial resolution and granularity of the information conveyed to people.

Most other COVID-19 maps and apps focus only on confirmed cases and deaths, without paying much attention to quantifying potential risk. For example, some of the more recent maps suggest that populous countries like India and Nigeria do not have a big problem yet, even though their large populations alone increase their risk.

These are some of the reasons why I developed a customizable open source program to assess and map COVID-19 hazard risk using up-to-date data. It uses a very simple methodology to assess the risk of the COVID-19 hazard geographically. It is by no means entirely accurate and should be used with caution. Ideally, it should be reviewed by data scientists, geographic epidemiologists, other health professionals, and policy makers, and adjusted to give a holistic picture of the pandemic.

Objective and logic

The program follows an open source approach: other users have access to all the code and data, and can customize their maps according to their own criteria. Programmers and data scientists can build on the code, enhance it with additional data and methods, and make it much more useful for informing the public and potentially policymaking. The program operates according to a simple logic based on the hazard-risk approach. In hazard research, it is generally accepted that hazard risk = magnitude of hazard x vulnerability.

The magnitude of the hazard is known to some extent and is treated as a function of current confirmed cases, deaths and recovered cases. Vulnerability, on the other hand, can be defined by the number of people vulnerable to the disease, which is a function of population.

For the hazard component, confirmed cases and deaths collected across the world are used. In addition, for the vulnerability component, a population grid of 1 km is used. The population grid is already downloaded, aggregated at a resolution of 10 km and included in the repository (

Risk assessment

Multiplying confirmed cases by population gives one measure of risk. Because testing is not uniform around the world and the number of deaths may be more reliable, the number of deaths is multiplied by population to give the second risk component. Finally, the larger the population, the greater the risk it presents, and this is assumed to grow faster than linearly even where there are no confirmed cases yet. This is why the population was squared to generate the third risk component.

In the program, each risk component is scaled between 0 and 1000, and the total risk is then calculated from the components as a + b + c / 2. This is a first attempt at quantifying risk, and it is recognized that countless other factors should ideally feed into this mapping (e.g. temperatures, human connectivity and flows, existing policies, type of medical system, economy, level of social isolation, etc.).
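The risk calculation described above can be sketched with numpy as follows. This is only an illustration, not the program's actual code: the function names are hypothetical, min-max scaling is an assumed way of mapping each component to 0 to 1000, and the formula is read with ordinary operator precedence, so that only the third (population squared) component is halved. The input values are made up.

```python
import numpy as np

def scale_0_1000(arr):
    """Linearly rescale an array so its values span 0 to 1000 (assumed min-max scaling)."""
    arr = np.asarray(arr, dtype=float)
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # A constant array carries no contrast; map it to zeros.
        return np.zeros_like(arr)
    return (arr - lo) / (hi - lo) * 1000.0

def total_risk(confirmed, deaths, population):
    """Combine the three risk components as a + b + c / 2."""
    a = scale_0_1000(confirmed * population)  # confirmed cases x population
    b = scale_0_1000(deaths * population)     # deaths x population
    c = scale_0_1000(population ** 2)         # population squared
    return a + b + c / 2

# Made-up 2 x 2 grids, purely for illustration.
confirmed = np.array([[0.0, 10.0], [5.0, 0.0]])
deaths = np.array([[0.0, 1.0], [2.0, 0.0]])
population = np.array([[100.0, 200.0], [50.0, 400.0]])

risk = total_risk(confirmed, deaths, population)
print(risk)
```

In this toy grid the bottom-right cell has no confirmed cases or deaths, yet it still receives a non-zero risk because it has the largest population, which is the behaviour the third component is meant to produce.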

Program sequence

The program reads all constants and file names from a configuration file. The output of the program can be changed simply by changing these variables (such as the size of the low pass filter). The program extracts COVID-19 data from the Johns Hopkins CSSE COVID-19 (2019-nCoV) data repository. It then creates a shapefile containing the confirmed cases and deaths with their lat / long, along with two rasters, one for confirmed cases and one for deaths. Because the rasters are created at a relatively fine spatial resolution, a low pass filter with a Gaussian kernel is applied to them to give a more meaningful spatial distribution (essentially distributing confirmed cases and deaths to neighboring pixels).
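To make the rasterizing and smoothing steps above concrete, here is a minimal numpy-only sketch. It is not the program's implementation: the function names, the one-degree global grid, the zero padding at the edges and the three-sigma kernel radius are all assumptions chosen for illustration, and the two input points are made up.

```python
import numpy as np

def rasterize_points(lats, lons, values, cell_deg=1.0):
    """Sum point values into a global lat/long grid (assumed cell size in degrees)."""
    n_rows = int(round(180 / cell_deg))
    n_cols = int(round(360 / cell_deg))
    rows = ((90.0 - np.asarray(lats, dtype=float)) / cell_deg).astype(int)
    cols = ((np.asarray(lons, dtype=float) + 180.0) / cell_deg).astype(int)
    rows = np.clip(rows, 0, n_rows - 1)
    cols = np.clip(cols, 0, n_cols - 1)
    grid = np.zeros((n_rows, n_cols))
    # np.add.at accumulates correctly when several points share a cell.
    np.add.at(grid, (rows, cols), values)
    return grid

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_low_pass(raster, sigma=1.0):
    """Separable Gaussian low pass filter with zero padding at the edges."""
    radius = max(1, int(round(3 * sigma)))  # assumed cut-off at about 3 sigma
    kernel = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(np.asarray(raster, dtype=float), radius, mode="constant")
    # Filter along rows, then along columns; "valid" trims the padding again.
    smoothed = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    smoothed = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, smoothed)
    return smoothed

# Two made-up reports that fall into the same one-degree cell.
cases = rasterize_points([0.5, 0.5], [0.5, 0.5], [3, 4])
smoothed_cases = gaussian_low_pass(cases, sigma=1.0)
```

Because the kernel is normalized, the filter spreads the counts to neighboring pixels without changing their total, except for cells close enough to the grid edge that part of the kernel falls outside it.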

During this process, the georeferencing of the rasters is lost and must therefore be reassigned. The program also adjusts the size of the population grid, because the raster calculation is performed with numpy and the arrays must therefore be the same size. Each raster is read into a numpy array and the calculations described above are performed. The result is saved as a raster named “covid-risk.tif”.
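The two housekeeping steps above, matching array sizes and restoring geographic coordinates, might look roughly like this in numpy. The helper names are hypothetical and nearest-neighbor resampling is an assumption; the program may resize the population grid differently, and for count data a method that sums cells would preserve totals better.

```python
import numpy as np

def resample_nearest(grid, target_shape):
    """Resize a 2-D array to target_shape by nearest-neighbor index lookup."""
    src_rows, src_cols = grid.shape
    dst_rows, dst_cols = target_shape
    row_idx = (np.arange(dst_rows) * src_rows / dst_rows).astype(int)
    col_idx = (np.arange(dst_cols) * src_cols / dst_cols).astype(int)
    return grid[np.ix_(row_idx, col_idx)]

def pixel_to_lonlat(row, col, origin_lon, origin_lat, cell_deg):
    """Longitude and latitude of a pixel centre for a north-up grid.

    origin_lon and origin_lat are the coordinates of the top-left corner.
    """
    lon = origin_lon + (col + 0.5) * cell_deg
    lat = origin_lat - (row + 0.5) * cell_deg
    return lon, lat

# Made-up 4 x 4 population grid resized to match a 2 x 2 case raster.
population = np.arange(16, dtype=float).reshape(4, 4)
case_raster = np.zeros((2, 2))
population_matched = resample_nearest(population, case_raster.shape)

# With equal shapes, element-wise numpy arithmetic is now possible.
product = case_raster * population_matched

# Coordinates of the first pixel of a one-degree global grid.
first_pixel = pixel_to_lonlat(0, 0, origin_lon=-180.0, origin_lat=90.0, cell_deg=1.0)
```

Keeping the origin and cell size alongside the array is what allows the result to be written back out as a georeferenced raster after the numpy calculations.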


For the screenshots, ArcGIS Pro was used, but an open source alternative such as QGIS could just as easily be used. The final solution could include a viewing capability in Jupyter Notebooks. Ideally, the user could zoom in on parts of the final map in their browser.

About Florence L. Silvia
