Tutorial on "Scalable Machine Learning with High Performance and Cloud Computing"

The tutorial provides a complete overview of supercomputing and cloud computing technologies that can be used to solve remote sensing problems requiring fast and highly scalable methods.


General Information

Recent advances in remote sensors with higher spectral, spatial, and temporal resolutions have significantly increased data volumes, making it challenging to process and analyze the resulting massive data in a timely fashion to support practical applications. Meanwhile, computationally demanding Machine Learning (ML) and Deep Learning (DL) techniques (e.g., deep neural networks with massive numbers of tunable parameters) require parallel algorithms that scale well. Therefore, data-intensive computing approaches have become indispensable tools to deal with the challenges posed by applications from geoscience and Remote Sensing (RS). In recent years, high-performance and distributed computing have advanced rapidly in terms of hardware architectures and software. For instance, the popular graphics processing unit (GPU) has evolved into a highly parallel many-core processor with tremendous computing power and high memory bandwidth. Moreover, recent High Performance Computing (HPC) architectures and parallel programming have been influenced by the rapid advancement of DL and of hardware accelerators such as modern GPUs.

ML and DL have already brought crucial achievements in solving RS data classification problems. State-of-the-art results have been achieved by deep networks (e.g., Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs)), many of them with backbones based on convolutional transformations. Their hierarchical architecture, composed of stacked repetitive operations, enables the extraction of informative features from raw data and the modelling of high-level semantic content of RS data. On the one hand, DL can lead to more accurate classification of land cover classes when networks are trained over large annotated RS datasets. On the other hand, deep networks pose challenges in terms of training time: training a DL model on a large dataset takes a non-negligible amount of time.

In this scenario, approaches relying on local workstations provide only limited capabilities. Although modern commodity computers and laptops are becoming more powerful in terms of multi-core configurations and GPUs, their limited computational power and memory remain an issue when it comes to quickly training large, high-accuracy models on correspondingly large amounts of data. Therefore, highly scalable, parallel, and distributed architectures (such as HPC systems and Cloud Computing services) are needed to train DL classifiers in a reasonable amount of time, and in turn to provide users with high accuracy in recognition tasks.

The tutorial aims to provide a complete overview for an audience that is not familiar with these topics. It follows a two-fold approach: selected background lectures, followed by practical hands-on exercises that prepare participants to carry out their own research after the tutorial. The tutorial will discuss the fundamentals of what a supercomputer and a cloud consist of, and how we can take advantage of such systems to solve RS problems that require fast and highly scalable methods, such as realistic real-time scenarios.


Gabriele Cavallaro – Forschungszentrum Jülich

Shahbaz Memon – Forschungszentrum Jülich

Rocco Sedona – Forschungszentrum Jülich and University of Iceland