Making global satellite imagery cloud-free

Published Jan 25, 2022 Updated Feb 04, 2022


Our technical team has created a beautiful new cloudless image of the world. This is the story about how and why we did it.

Getting a clear view

Have you ever wondered why images of the world rarely look like this on online mapping websites?

Clouded earth

It would be pretty hard to find a location that didn’t have cloud cover, right? In reality, almost 75% of the Earth is covered by clouds at any given time, making the image a very accurate depiction. However, unless you are studying the weather, you just want to look at the land surface, so how do we clear away the clouds to see the land below?

Perhaps you are asking yourself, “It can’t be that hard; just choose the images that don’t contain any clouds?” However, the process isn’t as easy as it first seems!

If you want to choose an image that doesn’t contain any clouds for a location like the Bernese Alps in Switzerland because you love the mountains, just like we do at MapTiler, you will have to go through dozens of images to find one. There are places on Earth that are almost constantly covered by clouds, so it simply isn’t possible to do this manually.

Bernese Alps clouds

A problem like this requires an automatic solution, and our technicians went down this route to create the Satellite data you see in our services. Again though, it is not quite so simple, and to ensure you get a good result instead of a bad or average one, you need some key ingredients in your recipe:

  • An excellent data source
  • A refined and honed algorithm
  • Huge computational power

Without this last ingredient, you can run into some very tricky time issues when dealing with huge global datasets. If you only had access to a desktop PC, it would have to run for 4512 days, which is almost 12.5 years; for storage, you would need 360 computers with 500 GB SSDs. If you used HDDs, the processing time would increase to 18.5 years. All this means that by the time you have finished creating the map, it will be nearly 19 years out of date and not much use to anyone!
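The desktop figures above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted in the paragraph; the 365.25-day year used for the conversion is an assumption.

```python
# Back-of-the-envelope check of the desktop-PC estimate quoted above.
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

desktop_days = 4512     # processing time on a single desktop PC (from the text)
computer_count = 360    # computers needed for storage (from the text)
ssd_size_gb = 500       # SSD size per computer (from the text)

desktop_years = desktop_days / DAYS_PER_YEAR
total_storage_tb = computer_count * ssd_size_gb / 1000  # decimal terabytes

print(f"Processing time: {desktop_years:.1f} years")   # 12.4 years
print(f"Storage needed: {total_storage_tb:.0f} TB")    # 180 TB
```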

As this is such a difficult undertaking, why bother when others have already made their own cloudless layers? Let’s go back to the point about the quality of your results; look at the image below from Google Maps. This is a positive result in the sense that it is cloudless, but it is not what we at MapTiler would call a good result. The colors are poor and there are clear boundaries where images from different times of year, even different years, have been stitched together.

The patchwork effect often seen on Google Maps compared to the natural colors achieved by MapTiler.

At MapTiler, we wanted to do better. We set a goal to bring the most beautiful cloudless satellite map of the entire world to our customers, with the help of our cutting-edge technology, in a reasonable amount of time: just 1 year (1800% faster than it would take on a desktop!). On top of this, we aimed to make the map available to anybody for just a few dollars per month; more about that later.

Satellite imagery from Sentinel 2

To create a global satellite map, our approach was:

  1. Look for the best data economically available, for free if possible.
  2. Find data with good spatial and temporal resolutions.
    1. Spatial resolution for the right amount of detail.
    2. Temporal resolution to get as many images of the same place as possible in a short amount of time. The shorter the revisit time of the satellites, the better your chance of finding images that do not contain clouds.
  3. Find data with good spatial coverage, ideally covering the entire globe.

This is where data from the Sentinel-2 mission of the European Space Agency’s (ESA) Copernicus project fits in perfectly. The revisit time is only 5 days, so every 5 days you get a new image for the same spot on Earth.
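To see why a short revisit time matters, the sketch below counts how many acquisitions fit into a period and estimates the chance that at least one of them is cloud-free. It rests on loud assumptions: every acquisition is cloudy independently with the same probability (real weather is correlated), the 75% figure is borrowed from the global cloud cover quoted earlier, and the 120-day period and the alternative revisit times are purely illustrative.

```python
def acquisitions(period_days: int, revisit_days: int) -> int:
    """Number of acquisitions of one spot that fit into the period."""
    return period_days // revisit_days


def chance_of_clear_image(n_images: int, p_cloudy: float) -> float:
    """Probability that at least one of n independent images is cloud-free."""
    return 1.0 - p_cloudy ** n_images


P_CLOUDY = 0.75    # assumption: chance that a single acquisition is cloudy
PERIOD_DAYS = 120  # assumption: illustrative observation period

for revisit in (5, 10, 16):
    n = acquisitions(PERIOD_DAYS, revisit)
    chance = chance_of_clear_image(n, P_CLOUDY)
    print(f"revisit {revisit:>2} days: {n:>2} images, "
          f"chance of a clear one {chance:.3f}")
```

Under these assumptions, halving the revisit time doubles the number of candidate images, which is what makes an automatic selection feasible in cloudy regions.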

The resolution is also pretty cool: 10 m/px, and the data are in JPEG 2000 format.

10 metre resolution satellite imagery

Sentinel-2 provides plenty of data bands thanks to its MultiSpectral Instrument (MSI). The satellites acquire data in 13 spectral bands, from the visible to the short-wave infrared.

Sentinel-2 Multispectral Imaging

Band                             Central wavelength (µm)   Resolution (m)
Band 1 - Coastal aerosol         0.443                     60
Band 2 - Blue                    0.490                     10
Band 3 - Green                   0.560                     10
Band 4 - Red                     0.665                     10
Band 5 - Vegetation Red Edge     0.705                     20
Band 6 - Vegetation Red Edge     0.740                     20
Band 7 - Vegetation Red Edge     0.783                     20
Band 8 - NIR                     0.842                     10
Band 8A - Vegetation Red Edge    0.865                     20
Band 9 - Water vapor             0.945                     60
Band 10 - SWIR - Cirrus          1.375                     60
Band 11 - SWIR                   1.610                     20
Band 12 - SWIR                   2.190                     20
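
The band table can be captured as a small lookup structure, for example to pick out the bands delivered at the native 10 m resolution. This is only a sketch: the `Band` type and the variable names are illustrative, the values are copied from the table above, and the identifiers (`B01` to `B12`) follow the usual Sentinel-2 band naming.

```python
from typing import NamedTuple


class Band(NamedTuple):
    name: str             # Sentinel-2 band identifier
    description: str      # label from the table above
    wavelength_um: float  # central wavelength in micrometres
    resolution_m: int     # spatial resolution in metres


BANDS = [
    Band("B01", "Coastal aerosol", 0.443, 60),
    Band("B02", "Blue", 0.490, 10),
    Band("B03", "Green", 0.560, 10),
    Band("B04", "Red", 0.665, 10),
    Band("B05", "Vegetation Red Edge", 0.705, 20),
    Band("B06", "Vegetation Red Edge", 0.740, 20),
    Band("B07", "Vegetation Red Edge", 0.783, 20),
    Band("B08", "NIR", 0.842, 10),
    Band("B8A", "Vegetation Red Edge", 0.865, 20),
    Band("B09", "Water vapor", 0.945, 60),
    Band("B10", "SWIR - Cirrus", 1.375, 60),
    Band("B11", "SWIR", 1.610, 20),
    Band("B12", "SWIR", 2.190, 20),
]

# Bands available at the full 10 m resolution
ten_metre = [band.name for band in BANDS if band.resolution_m == 10]
print(ten_metre)  # ['B02', 'B03', 'B04', 'B08']
```

The four 10 m bands are blue, green, red, and near-infrared.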

Finally, the coverage is excellent: it includes all continental land surfaces, islands greater than 100 km², and coastal waters up to at least 20 km from the shore.

MapTiler Satellite Coverage

The Sentinel-2 mission details

The Sentinel-2 satellites were launched as part of the Copernicus project with the goal of acquiring high-resolution (both temporal and spatial) satellite images of the Earth’s surface to help monitor changes in land use, land cover, agriculture, forests, and water. The mission provides data for all land surfaces, large islands, and inland and coastal waters.

The mission consists of 2 spacecraft. Sentinel-2A was launched on 23 June 2015 with a repeat cycle of 10 days. On 7 March 2017, Sentinel-2B was launched with the same 10-day repeat cycle. In combination, the two satellites provide a revisit time of 5 days. The nominal mission lifetime is 7 years for each satellite.
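A small sketch of how two satellites that each return every 10 days combine into a 5-day revisit. It assumes that Sentinel-2B passes over a given spot exactly halfway through Sentinel-2A's cycle; day numbers are counted from an arbitrary first pass of Sentinel-2A.

```python
REPEAT_CYCLE_DAYS = 10  # each satellite returns to the same spot every 10 days
OFFSET_B_DAYS = 5       # assumption: Sentinel-2B passes halfway through 2A's cycle


def pass_days(first_pass: int, horizon_days: int) -> list[int]:
    """Days on which one satellite passes over the spot, up to the horizon."""
    return list(range(first_pass, horizon_days, REPEAT_CYCLE_DAYS))


passes_a = pass_days(0, 30)              # [0, 10, 20]
passes_b = pass_days(OFFSET_B_DAYS, 30)  # [5, 15, 25]

combined = sorted(passes_a + passes_b)
gaps = [later - earlier for earlier, later in zip(combined, combined[1:])]

print(combined)  # [0, 5, 10, 15, 20, 25]
print(gaps)      # [5, 5, 5, 5, 5]
```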

Sentinel-2 data portals

The European Commission has funded the deployment of five cloud-based platforms to distribute the data produced by the satellites. These platforms are known as the DIAS, or Data and Information Access Services.

Sentinel-2 Data Portals

The five DIAS online platforms allow users to discover, manipulate, process and download Copernicus data and information. All DIAS platforms provide access to Copernicus Sentinel data, as well as to the information products from the six operational services.

For browsing the catalog, we had the best experience with Sobloo. However, we did not have a good experience accessing the data for processing through these DIAS services.

Instead, we used Sentinel-Hub, which stores all the Sentinel-2 L2A data in an AWS S3 bucket. L2A means the data have been atmospherically corrected by ESA. Thanks to the Sentinel-Hub Python library and well-documented API, we could start processing the data quickly.

Removing the clouds: MapTiler does it differently

To remove clouds from a location in the data, you need to start with a time series of images for that location.

Sentinel-2 time series

Once you have a time series, there are a couple of approaches to removing the clouds from the image. The most popular one used by other companies is based on pixel compositing. In this method, you select a specific pixel from a series of pixels that come from different images, based on some statistic, such as taking the first-quartile pixel from the set of pixels. This method has the benefit of giving you a real pixel, but there are drawbacks.
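The pixel selection described above can be sketched for a single pixel location as follows. Ranking observations by the sum of their RGB values and the way the first-quartile position is computed are assumptions made for illustration; real pipelines differ in both.

```python
def first_quartile_pixel(series):
    """Pick the real observation at the first-quartile brightness rank.

    `series` is a list of (red, green, blue) tuples for one pixel location,
    one tuple per acquisition date.
    """
    ranked = sorted(series, key=sum)  # darkest first; clouds are bright
    index = (len(ranked) - 1) // 4    # position of the first quartile
    return ranked[index]


observations = [
    (240, 242, 245),  # cloud
    (60, 80, 50),     # clear
    (235, 238, 240),  # cloud
    (62, 84, 52),     # clear
    (20, 22, 25),     # cloud shadow
]

print(first_quartile_pixel(observations))  # (60, 80, 50)
```

The result is always one of the input observations, which is the "real pixel" benefit. The flip side is that the choice depends entirely on the rank: if too many observations are dark, the selected pixel can itself be a dark one.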

We didn’t use this approach at MapTiler because it didn’t work well in different locations; it created visual artifacts, e.g., you could have groups of black pixels visible on a glacier.

We devised a new compositing algorithm based on aggregating the pixel value from a set of values, rather than just choosing one; this brought a much more natural-looking output without these pixel artifacts.
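The aggregation algorithm itself is not described here, so the sketch below uses a trimmed mean purely as a stand-in to illustrate the difference between selecting one value and aggregating several. It is an assumption for illustration, not MapTiler's method: observations for one pixel location are ranked by brightness, the darkest and brightest quarter are dropped, and the rest are averaged per band.

```python
def trimmed_mean_pixel(series, trim=0.25):
    """Aggregate one pixel's time series into a single (r, g, b) value.

    Observations are ranked by brightness, the darkest and brightest `trim`
    fraction are dropped, and the remaining ones are averaged per band.
    """
    ranked = sorted(series, key=sum)
    k = int(len(ranked) * trim)
    kept = ranked[k:len(ranked) - k] or ranked  # never average an empty list
    return tuple(round(sum(band) / len(kept)) for band in zip(*kept))


observations = [
    (240, 242, 245),  # cloud
    (58, 78, 48),     # clear
    (60, 80, 50),     # clear
    (20, 22, 25),     # cloud shadow
    (62, 82, 52),     # clear
    (66, 86, 56),     # clear
    (235, 238, 240),  # cloud
    (68, 88, 58),     # clear
]

print(trimmed_mean_pixel(observations))  # (64, 84, 54)
```

Unlike a selected pixel, the aggregated value does not have to match any single observation, which tends to smooth out differences between neighbouring pixels.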

Scaling up the area

Once we had developed and tested our new strategy and had one beautiful-looking cloudless location, we scaled up the process for the entire globe.

MapTiler Satellite glaciers

However, the magic is not just in the compositing algorithm itself but also in the preselection algorithm, which selects the best images to be processed by the compositing algorithm. This part is crucial, especially if you want to process the whole world; the more image files you process, the more resources will be needed, leading to a more expensive solution.

Composite workflow

To reduce the number of input files, we created a time window algorithm that selected the set of months from the year with the fewest clouds. It was a 4-month time slot, with the starting month varying based on geographical location.
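One way to sketch such a time window selection is to scan all twelve possible starting months and keep the 4-month window with the lowest mean cloud fraction. The function name, the scoring by mean monthly cloud fraction, and the sample values are assumptions for illustration; the window is allowed to wrap around the end of the year.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def best_window(cloud_by_month, length=4):
    """Return (start_index, mean_cloud) of the least cloudy consecutive window.

    The window wraps around the end of the year, so a window starting in
    November covers Nov, Dec, Jan and Feb.
    """
    if len(cloud_by_month) != 12:
        raise ValueError("expected one cloud fraction per month")
    best_start, best_mean = 0, float("inf")
    for start in range(12):
        window = [cloud_by_month[(start + i) % 12] for i in range(length)]
        mean = sum(window) / length
        if mean < best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


# Hypothetical monthly cloud fractions for a location with a dry summer
cloud = [0.80, 0.75, 0.70, 0.60, 0.45, 0.30,
         0.20, 0.25, 0.40, 0.60, 0.75, 0.85]

start, mean_cloud = best_window(cloud)
print(f"Best window starts in {MONTHS[start]}, mean cloud {mean_cloud:.2f}")
```

For the sample values the window starts in June; for a location with clear winters the same function would return a window that wraps over the new year.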

Satellite imagery time window

With the time window in place, many images could still be filtered out based on their quality. We used the SCL (scene classification) layer from the Sentinel-2 data product to create a quality mask to filter out the bad images from the time window. In this layer, the pixel values range from 0 to 11; what they represent can be seen below:

Sentinel-2 pixel values
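
A quality filter built on the SCL layer might look like the sketch below. The class labels follow the Sentinel-2 L2A scene classification; which classes count as "bad" and the 20% threshold are assumptions, since the exact rules used for MapTiler Satellite are not given here.

```python
SCL_CLASSES = {
    0: "No data",
    1: "Saturated or defective",
    2: "Dark area pixels",
    3: "Cloud shadows",
    4: "Vegetation",
    5: "Not vegetated",
    6: "Water",
    7: "Unclassified",
    8: "Cloud medium probability",
    9: "Cloud high probability",
    10: "Thin cirrus",
    11: "Snow or ice",
}

# Assumption: classes treated as unusable for compositing
BAD_CLASSES = {0, 1, 3, 8, 9, 10}


def bad_fraction(scl):
    """Fraction of pixels in a 2-D SCL grid that fall into BAD_CLASSES."""
    pixels = [value for row in scl for value in row]
    return sum(value in BAD_CLASSES for value in pixels) / len(pixels)


def keep_image(scl, max_bad=0.2):
    """Keep an image only if at most `max_bad` of its pixels are unusable."""
    return bad_fraction(scl) <= max_bad


clear_scene = [[4, 4, 5], [6, 4, 11], [4, 5, 4]]
cloudy_scene = [[9, 9, 8], [3, 4, 9], [10, 4, 4]]

print(keep_image(clear_scene))   # True
print(keep_image(cloudy_scene))  # False
```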

With the quality mask and time window, we could ensure that only the best images went into the final compositing algorithm. However, with over 237 trillion pixels to process, it is still a huge task to undertake, so we turned to MapTiler Cluster to automate it.

MapTiler Cluster

MapTiler Cluster is a cluster computing solution that consists of a master server and worker instances and runs on any cloud computing platform like GCP, AWS, or Azure. The master server divides bigger geographical locations into smaller geographical job tasks, which are then processed by individual workers. Thanks to such a solution you are able to decrease the duration of a project from years to just a couple of weeks.
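The master/worker idea can be sketched in a few lines: split a large bounding box into a grid of smaller job tasks and hand them to a pool of workers. This is not the MapTiler Cluster API; Python's standard `concurrent.futures` thread pool stands in for the worker instances, and `split_bbox`, `process_tile`, and the grid size are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_bbox(west, south, east, north, cols, rows):
    """Divide a bounding box into cols x rows smaller boxes (the job tasks)."""
    dx = (east - west) / cols
    dy = (north - south) / rows
    return [
        (west + c * dx, south + r * dy, west + (c + 1) * dx, south + (r + 1) * dy)
        for r in range(rows)
        for c in range(cols)
    ]


def process_tile(bbox):
    """Placeholder worker: a real worker would composite imagery for bbox."""
    west, south, east, north = bbox
    return (east - west) * (north - south)  # here: just the area in square degrees


# "Master" step: cut the whole world into 8 jobs
jobs = split_bbox(-180, -90, 180, 90, cols=4, rows=2)

# "Worker" step: process the jobs in parallel
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process_tile, jobs))

print(len(jobs), sum(results))  # 8 64800.0
```

Because each job covers its own area and does not depend on the others, adding more workers shortens the total run time, which is what brings a multi-year job down to weeks.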

MapTiler Cluster

Learn more about MapTiler Cluster

The result: A beautiful satellite imagery layer

The last thing that we needed to do was launch MapTiler Cluster and wait for a couple of weeks. Then we could enjoy the most beautiful-looking world satellite map in its stunning natural colors.

Example images: MapTiler Satellite irrigation, Uluru, Abu Dhabi, Paris

Use the API for Free now

Download the Data

Find out more about satellite and aerial imagery on our website:

Learn More

If you are a fan of numbers and stats, here are the highlights from our project:

MapTiler Satellite in numbers

Final data

Bands                     R, G, B, NIR
Projection                WGS 84 Web Mercator
Date of input data        2021
Coverage                  Global
Output format             WebP or 16-bit TIFF
Total size of the layer   500 GB

Do you want to carry out research like this?

MapTiler is always looking for new talent, either to work on cutting-edge research like this or other areas of computer science and GIS. If you want a role working on big data analysis, cloud-optimized processing, or a wide range of geo-technologies, we may be the company you are looking for. Have a look at our jobs page to see who we are recruiting right now, or just send a C.V. and covering letter to [email protected] to see if we can make you part of the team!

Martin Kostal

Backend developer
