Wednesday, September 7, 2016

The formal closing and overview of this summer


Project: Integrating Sentinel-2 data into Marble

Short recap of the project

ESA Sentinel is a series of next-generation Earth observation missions that provide high-quality satellite image data of Earth. Marble Virtual Globe is an open-source globe that allows users to explore a 3D model of Earth, Mars, Venus, and the Moon, with a wide variety of maps ranging from political to topographic. One of Marble’s most important features is its flexibility: it is designed to be integrated into many different kinds of software, and has been used extensively in third-party applications in the past. By improving Marble’s maps and functionality, developers get access to a free map viewer that they may use without restriction in their applications.
This project's goal was to find a way to adapt Sentinel-2 data, which is currently available to all users at the Sentinels Scientific Data Hub, for use in Marble, so that everyone can view and access it easily.

Goals and achievements for this project

The goals for this project can be summarized in four main points:
  1. Finding a process that allows us to use the Sentinel-2 data in Marble
  2. Improving this process
  3. Using this process to gather and adapt as much data as possible
  4. Adapting different map sets

Adapting different map sets


The TopOSM map theme in Marble.

For my first task, to get acquainted with Marble and map themes, I was asked to create a map theme for TopOSM. This task introduced the concept of slippy maps, which we would also use in the creation of the Sentinel-2 map theme. It also involved uploading to the KDE servers, to which we would later upload the Sentinel data that was ready for use in Marble.

Above, area around San Francisco in Marble’s Satellite map theme. This was the best option users had for satellite imagery in Marble before this project. Below, the same area, using Sentinel-2 data.

Above, San Francisco area at a higher zoom level, showing the difference in populated areas. Marble’s original satellite data from the Blue Marble Next Generation image set is not high enough quality to adequately portray these areas. Below, the same area adapted from Sentinel-2 data.


Above, San Francisco area at the highest zoom level for the map theme.

The second task was finding software capable of taking the available image data, stored in jp2 files that cover one subtile of a dataset in different spectral bands, and converting it into a GeoTIFF file with realistic colors, suitable for creating the slippy map tiles that can be used in Marble.
After some research into libraries such as GDAL (Geospatial Data Abstraction Library), the free and open-source geographic information system QGIS was found to have all the features needed for this step. At this point a rough process was already in place for creating the tiff files; however, it required intensive user interaction and, as a result, was much too slow to be used.
At this point, I was tasked with finding a way to improve or automate some of these steps. This led me to the QGIS Developer Cookbook, which was a great help in understanding the inner workings of this software in order to automate it. QGIS has support for batch scripting, which seemed like a suitable way to minimize the user interaction needed.
This part of the project taught me a lot about how software is structured, as I had to delve into the inner workings of QGIS to find the specific rendering settings we needed. After a lot of script testing and documentation reading, the script that saved the tiff files was complete, so multiple datasets could be processed at the same time without having to supervise and interact with the program every minute.
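To give a rough idea of what "realistic colors" involves, here is a minimal sketch in plain Python. It is not the QGIS script used in the project. It assumes the usual Sentinel-2 true-color combination (band 4 as red, band 3 as green, band 2 as blue) and a simple linear contrast stretch. The function names and the default stretch range of 0 to 3000 are hypothetical values chosen for illustration, not the rendering settings we actually used.

```python
def stretch_to_byte(value, low, high):
    """Linearly map a raw band value in [low, high] to 0..255, clipping outside."""
    if high <= low:
        raise ValueError("high must be greater than low")
    scaled = (value - low) / (high - low)
    scaled = min(max(scaled, 0.0), 1.0)  # clip values outside the stretch range
    return int(round(scaled * 255))


def true_color(red_band, green_band, blue_band, low=0, high=3000):
    """Combine three equally sized 2D band arrays into rows of 8-bit RGB pixels.

    The low/high defaults are illustrative assumptions only.
    """
    image = []
    for red_row, green_row, blue_row in zip(red_band, green_band, blue_band):
        image.append([
            (stretch_to_byte(r, low, high),
             stretch_to_byte(g, low, high),
             stretch_to_byte(b, low, high))
            for r, g, b in zip(red_row, green_row, blue_row)
        ])
    return image


if __name__ == "__main__":
    # Tiny made-up 1x2 "bands" standing in for the real jp2 rasters.
    red = [[0, 3000]]
    green = [[1000, 5000]]
    blue = [[0, 1000]]
    print(true_color(red, green, blue))
```

In the real workflow this step is done by the rendering settings inside QGIS; the sketch only shows why a stretch range has to be chosen before the raw band values can be saved as an ordinary color image.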
The next step in the data processing was converting the newly created tiff files, which have realistic colors, into slippy map tiles. After researching possible solutions with GDAL, a QGIS plugin, QTiles, was found to be suitable for this step. This plugin could take the data we gathered and divide it into slippy map tiles, such as the ones used in OpenStreetMap. We can then host these files on the KDE servers, so that Marble can access them as needed. The only issue we currently face with this step is that it is very time-consuming. While rendering a few datasets can be achieved in a day, rendering the hundreds we have acquired in one go would take at least two weeks of real-time rendering, even on a computer dedicated to the task. This is because slippy map tiles use the concept of QuadTiles: every zoom level has four times as many tiles as the previous one. Since Marble needs zoom levels up to level 14, the number of files, although each one is small in size, grows exponentially with the zoom level.
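The growth described above is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch with hypothetical function names. It counts tiles for a full globe layer, so it is an upper bound: our datasets cover only part of the land surface, and the real number of files is smaller.

```python
def tiles_at_zoom(zoom):
    """Number of tiles in a full slippy map layer at one zoom level."""
    # A layer is a square grid of 2**zoom columns by 2**zoom rows.
    return 4 ** zoom


def tiles_up_to(max_zoom):
    """Total number of tiles for all zoom levels from 0 through max_zoom."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


if __name__ == "__main__":
    for zoom in (0, 1, 6, 14):
        print("zoom", zoom, "->", tiles_at_zoom(zoom), "tiles")
    print("all levels through 14:", tiles_up_to(14))
```

Level 14 alone accounts for 4^14 = 268,435,456 tiles on a full globe, about three quarters of the total across levels 0 to 14, which is why the highest zoom levels dominate the rendering time.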
A solution to this problem was to convert the tiff image data into slippy map tiles in batches, one “batch” being enough datasets to cover a level 6 tile (example in OpenStreetMap). These slippy tiles could later be combined with the help of a script that removes the white edges that surround them. The amount of data that needs to be processed can also be cut down by taking the bathymetry from a different map set. This way we can preserve the high quality of the Sentinel-2 land data without having to convert any of the ocean tiles.
The last step is simply uploading the tiles to the servers, which makes them available in Marble.
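The batching idea can be sketched as follows. The sketch uses the standard slippy map tile formula (as used by OpenStreetMap) to find which level 6 tile a point falls into, and then works out which level 14 tiles lie underneath that tile. The function names, the dataset names, and the choice of grouping each dataset by a single center coordinate are assumptions made for illustration; this is not the script used in the project.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Slippy map tile (x, y) that contains a longitude/latitude point."""
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge of the map stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def descendant_range(x, y, zoom, target_zoom):
    """Inclusive x and y index ranges at target_zoom covered by tile (x, y)."""
    scale = 2 ** (target_zoom - zoom)
    return ((x * scale, (x + 1) * scale - 1),
            (y * scale, (y + 1) * scale - 1))


def group_by_batch(datasets, batch_zoom=6):
    """Group (name, lon, lat) entries by the batch_zoom tile containing them."""
    batches = {}
    for name, lon, lat in datasets:
        batches.setdefault(lonlat_to_tile(lon, lat, batch_zoom), []).append(name)
    return batches


if __name__ == "__main__":
    # Hypothetical dataset names with approximate center coordinates.
    datasets = [
        ("san_francisco", -122.42, 37.77),
        ("san_jose", -121.90, 37.30),
        ("budapest", 19.04, 47.50),
    ]
    for tile, names in group_by_batch(datasets).items():
        print("level 6 tile", tile, "->", names)
        print("  level 14 tiles below it:", descendant_range(tile[0], tile[1], 6, 14))
```

One level 6 tile has 256 x 256 = 65,536 tiles beneath it at level 14, so a batch is a job of manageable size, and batches can be rendered independently and combined afterwards.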
At this point we have gathered over 130 datasets, and have created the tiff images that can be processed into slippy map tiles for more than 100 of them. The current process has been documented so that future efforts may build upon this work.

Future projects and how Marble benefits

The future of this project is to turn it into a community-driven effort. One of the main considerations was to find a way of creating the slippy tiles that many users could easily set up and run. The amount of data is immense; however, as a community project based on contributing to an existing pool of data, converting all of the Sentinel-2 data becomes achievable.
The end result of this project is the creation of a foundation, upon which future efforts can be made to cover all of Earth’s lands with satellite data. Future efforts such as Google Code-in can also improve upon this foundation, and help it move toward a community-centered project.
As the amount of data grows, Marble users will be the first to have access to such high-quality imagery, which means any open-source developer can use it in their applications. Preparations have already begun for turning this into a community-based project, such as an online viewer that shows how much of the Earth is currently covered by the satellite data. This gives contributing users an easy way to check which parts of the Earth still need more tiles. To address the long processing times, a server-side processing solution is also possible: contributors would only need to upload the created images, and the creation of slippy tiles could be handled by the servers.
In conclusion, the project has laid the groundwork for future efforts on Sentinel-2 data integration, which will lead to Marble Virtual Globe being the first of its kind to offer data of this quality, open for users all around the world to create and develop with.