HDCRS Summer School 2021

Welcome to the summer school organized by the High-Performance and Disruptive Computing in Remote Sensing (HDCRS) Working Group. HDCRS is part of the IEEE Geoscience and Remote Sensing Society (GRSS), in particular of the Earth Science Informatics (ESI) Technical Committee.

This school is the perfect venue to network with students and young professionals, as well as senior researchers and professors who are world-renowned leaders in the field of remote sensing and who work on interdisciplinary research involving high-performance computing, cloud computing, quantum computing, and parallel programming models for specialized hardware technologies.

Lecture topics and instructors

Learning outcomes

  • HPC requirements and best practices in EO
  • Basic EO data systems and tools architectures
  • Introduction to quantum information
  • Physical implementation of quantum computers: basics and state of the art
  • Quantum resources: the potential for EO

Lecture content

From its very beginning, the field of Earth Observation (EO) has been a driving force for data processing, storage, and transmission. Over the last decades, satellite and airborne sensors have been collecting and transmitting several terabytes of data to receiving stations every day. With increasing spatial resolution, not only is the data volume growing, but the image information content is also exploding. This has become a challenge for data exploitation and information dissemination methods: how to make the millions of EO images acquired and stored in archives usable by a larger user community.

The lecture will give an overview of the computational requirements emerging from EO technology and applications. The basic computer architectures will be introduced, and their evolution in relation to the particularities of EO will be explained. Examples will be provided for multispectral and Synthetic Aperture Radar (SAR) data, encompassing the chain from Payload Ground Segment data preprocessing, data distribution systems, and EO data platforms to specialized AI tools. The presentation will continue with the basics of quantum information processing and the physical principles of quantum computers. Today's most popular quantum computing resources will be compared, with an analysis of their potential impact on EO data analysis. The lecture will introduce elements of quantum machine learning and their perspectives. The field of secure communications will also be introduced in relation to EO technology. Finally, the lecture will discuss the future impact of quantum imaging for space applications.

The Instructors

Prof. Mihai Datcu

Mihai Datcu received the M.S. and Ph.D. degrees in Electronics and Telecommunications from the University Politechnica Bucharest (UPB), Romania, in 1978 and 1986, respectively. In 1999 he received the title Habilitation à diriger des recherches in Computer Science from University Louis Pasteur, Strasbourg, France. Currently, he is Senior Scientist and Image Mining research group leader with the Remote Sensing Technology Institute of the German Aerospace Center (DLR), and Professor with the Department of Applied Electronics and Information Engineering, Faculty of Electronics, Telecommunications and Information Technology, UPB. From 1992 to 2002 he held a longer Invited Professor assignment with the Swiss Federal Institute of Technology, ETH Zurich. From 2005 to 2013 he was Professor and holder of the DLR-CNES Chair at ParisTech, Paris Institute of Technology, Telecom Paris. His interests are in Data Science, Machine Learning and Artificial Intelligence, and Computational Imaging for space applications. He is involved in Big Data from Space European, ESA, NASA and national research programs and projects. He is a member of the ESA Big Data from Space Working Group. In 2006, he received the Best Paper Award of the IEEE Geoscience and Remote Sensing Society. He is holder of a 2017 Blaise Pascal Chair at CEDRIC, CNAM, France.

Learning outcomes

  • Concepts of heterogeneous computing: modern heterogeneous architectures and programming models supported
  • Evaluation of scalability and speedup on GPUs
  • Bottleneck detection with profiler tools
  • Programming on GPUs/accelerators with standards such as OpenMP and OpenACC

Lecture content

The lecture starts with a review of heterogeneous parallel architectures. Heterogeneous systems are nowadays used to improve performance with a focus on energy efficiency, not by adding more processors of the same type but by including coprocessors. The well-known Graphics Processing Unit (GPU) is the most widely used coprocessor, although others exist, such as FPGAs and neural processors. GPUs were initially designed for 3D graphics rendering, but their large number of vector units allows them to speed up intensive computations. Although these devices offer more FLOPS than general-purpose CPUs, their programmability remains a challenge. The GPU manufacturer NVIDIA has promoted the CUDA programming model, which has managed to popularize these devices despite the fact that CUDA only runs on NVIDIA GPUs. Other initiatives such as OpenCL try to favor migration and portability across other manufacturers such as Intel, AMD, or ARM, but still suffer from overhead in code development and maintenance.

The lecture will continue with a review of modern accelerators and their programming models, and an overview of the main use cases and possible drawbacks. GPU programming is addressed by means of the OpenACC programming model. OpenACC makes it possible to express the code sections to be offloaded to the accelerator in a friendlier way than CUDA or OpenCL. This is done by means of directives, which express the kernels or code sections to be run on the accelerator as well as the data to be transferred between the host and the device. During a practical session, attendees will analyze and evaluate the parallel performance of OpenACC by means of profiler tools.
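As an illustration of the directive-based style described here, the following is a minimal sketch, not taken from the lecture material. The function name `saxpy_acc` and the choice of a SAXPY loop are assumptions made for this example. The `parallel loop` directive marks the loop to be offloaded, while the `copyin` and `copy` clauses state which arrays are transferred between host and device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * With an OpenACC-enabled compiler the loop is offloaded to the accelerator.
 * Without OpenACC support the pragma is ignored and the loop runs serially. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    assert(n == 0 || (x != NULL && y != NULL));

    /* copyin: x is only read on the device (host to device transfer).
     * copy:   y is read and written (transferred in both directions). */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because a compiler without OpenACC support simply ignores the pragma, the same source also serves as the serial reference version.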

The OpenMP programming standard is then addressed. The well-known OpenMP standard was proposed in the late 1990s to express parallelism in multiprocessor-based systems by means of directives. Among its successive evolutions is support for accelerators, available since version 4.0. OpenMP allows the development of parallel codes not only for NVIDIA GPUs; it is currently also supported on other types of accelerators, such as integrated and discrete GPUs from Intel or AMD, including the powerful Ponte Vecchio GPU recently announced by Intel.
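The accelerator support mentioned above can be sketched with the OpenMP `target` construct. This is an illustrative example under assumptions, not the lecture's own code: the function name `calibrate_omp` and the linear conversion from digital numbers to radiance (`gain * dn + offset`) are hypothetical, chosen only to resemble a simple per-pixel remote sensing operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: per-pixel linear calibration,
 * radiance[i] = gain * dn[i] + offset.
 * With an OpenMP compiler that supports offloading, the loop runs on the device.
 * Without OpenMP support the pragma is ignored and the loop runs serially. */
void calibrate_omp(int n, float gain, float offset,
                   const float *dn, float *radiance)
{
    assert(n == 0 || (dn != NULL && radiance != NULL));

    /* map(to):   dn is copied to the device before the loop.
     * map(from): radiance is copied back to the host after the loop. */
    #pragma omp target teams distribute parallel for \
        map(to: dn[0:n]) map(from: radiance[0:n])
    for (int i = 0; i < n; ++i) {
        radiance[i] = gain * dn[i] + offset;
    }
}
```

The `map` clauses play the same role as the data clauses in OpenACC: they make the host-device transfers explicit, which is usually the first thing to check when a profiler shows that transfer time dominates.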

Finally, OpenMP or OpenACC will be used to implement a complete algorithm that tackles a particular remote sensing application. Any of the directives shown in the theory part may be applied to obtain a significant acceleration factor compared to the serial version written in C. During this example, profiler tools will be used to get the best possible performance.
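The acceleration factor mentioned here is commonly expressed as a speedup. The symbols below are assumptions introduced for this sketch, and the decomposition of the accelerated time assumes that data transfers and kernel execution do not overlap.

```latex
\[
S = \frac{T_{\text{serial}}}{T_{\text{accel}}},
\qquad
T_{\text{accel}} = T_{\text{transfer}} + T_{\text{kernel}}
\]
```

With purely illustrative numbers, a serial run of 12 s and an accelerated run of 1.5 s (transfers included) would give S = 8.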

The Instructors

Prof. Sergio Bernabé García

Sergio Bernabé received the degree in computer engineering and the M.Sc. degree in computer engineering from the University of Extremadura, Cáceres, Spain, in 2010, and the joint Ph.D. degree from the University of Iceland, Reykjavík, Iceland, and the University of Extremadura, Badajoz, Spain, in 2014.

He has been a Visiting Researcher with the Institute for Applied Microelectronics, University of Las Palmas de Gran Canaria, Las Palmas, Spain, and also with the Computer Vision Laboratory, Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil. He was a Post-Doctoral Researcher (funded by FCT) with the Instituto Superior Técnico, Technical University of Lisbon, Lisbon, Portugal, and a Post-Doctoral Researcher (funded by the Spanish Ministry of Economy and Competitiveness) with the Complutense University of Madrid (UCM), Madrid, Spain. He is currently an Assistant Professor with the Department of Computer Architecture and Automation, UCM. His research interests include the development and efficient processing of parallel techniques for different types of high-performance computing architectures.

Dr. Bernabé was a recipient of the Best Paper Award of the IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATION AND REMOTE SENSING (JSTARS) in 2013 and the Best Ph.D. Dissertation Award at the University of Extremadura, Cáceres, in 2015. He is an active reviewer for international conferences and international journals, including the IEEE JSTARS, the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (TGRS), and the IEEE GEOSCIENCE AND REMOTE SENSING LETTERS (GRSL).

Prof. Carlos García Sánchez

Carlos Garcia received his B.S. and M.S. degrees in Physics in 1999 and his Ph.D. degree in 2007, both from the University Complutense of Madrid (UCM), Spain. He has been an Associate Professor at the Computer Architecture Department at UCM since 2019. His research interests include high-performance computing for heterogeneous parallel architecture, focusing on efficient parallel exploitation on modern devices such as multicore, manycore, GPUs, and FPGAs.

He has been a member of several competitive national research projects, known as CICYT, since 2000. He has also been a member and head of several projects linked with industry, whose most relevant results include production software to predict and avoid river flooding. Regarding publications, he is the first or second author of several articles in relevant international journals and conferences, and the author of more than 25 JCR publications and several conference papers. He has also been editor of two special issues in indexed journals.

Regarding teaching, he has mainly taught subjects such as "Operating Systems", "Computer Architecture Introduction", "GPUs and accelerator programming" and "High Performance Computing" in the degree and master's curricula at UCM.

Learning outcomes

  • Setting up a cloud computing environment
  • Preparing ML training data
  • Developing, evaluating and deploying ML models
  • Performing model inference on new data (put model in production)

Lecture content

Machine Learning (ML) is not just about finding the right algorithm and creating a model. It is about the entire end-to-end lifecycle. The ML lifecycle includes four phases: problem definition, data collection and analysis, model development and evaluation, and deployment to a production system. The availability of open Earth science data offers immense potential for ML, as is evident from numerous recent research publications. However, many of these publications do not end up as production applications, mainly because the data scientists who develop the ML models are now also expected to deploy and scale the models in production. Since more and more remote sensing data are being made available in cloud computing environments, scaling and deploying the ML models can be streamlined using cloud-backed services.

This lecture will introduce the ML lifecycle to the participants and demonstrate an end-to-end remote sensing ML application, from data preparation to deployment, using a cloud computing environment.

The Instructors

Dr. Manil Maskey

Manil Maskey received the Ph.D. degree in computer science from the University of Alabama in Huntsville, Huntsville, AL, USA. He is a Senior Research Scientist with the National Aeronautics and Space Administration (NASA), Marshall Space Flight Center, Huntsville. He also leads the Advanced Concepts team within the Interagency Implementation and Advanced Concepts Team (IMPACT). His research interests include computer vision, visualization, knowledge discovery, cloud computing, and data analytics. Dr. Maskey's career spans over 20 years in academia, industry, and government. Currently he chairs the Earth Science Informatics Technical Committee of the IEEE Geoscience and Remote Sensing Society, and leads the machine learning activities for the NASA Earth Science Data Systems program.

Iksha Gurung

Iksha Gurung is a Computer Scientist working with the University of Alabama in Huntsville, supporting the National Aeronautics and Space Administration Interagency Implementation and Advanced Concepts Team (NASA-IMPACT). He leads the development and machine learning team in NASA-IMPACT.

Shubhankar Gahlot

Shubhankar Gahlot received his MS in Data Science from Illinois Institute of Technology, Chicago, USA, and his BS in Industrial Design from Indian Institute of Technology Guwahati, India. He is a Research Scientist at The University of Alabama in Huntsville for the NASA IMPACT team. He has more than 2 years of experience in ML and MLOps at scale at Oak Ridge National Laboratory, USA, and has produced multiple publications on how to benchmark and scale ML on supercomputers. Prior to that, he worked in the software industry as a design lead and architect. He is passionate about machine learning applications and their benchmarking and reproducibility, and has contributed to MLPerf (now MLCommons), an organization that builds fair and useful benchmarks for measuring training and inference performance of ML hardware, software, and services.

Drew Bollinger

Drew Bollinger leads the Labs Team at Development Seed. Drew is a data analyst, software developer, and machine learning engineer, with experience building geo-interfaces and running advanced statistical and spatial analysis on open data sets. He has delivered several impactful workshops at Satsummit, IGARSS, and most recently at the Africa Geospatial Data and Internet Conference in Accra, Ghana.

Organizers

In cooperation with and sponsored by

IEEE Geoscience and Remote Sensing Society (GRSS)

Information overview

Starts 31 May 2021 15:00
Ends 3 Jun 2021 18:30
Europe/Berlin (CEST)

Registration start date: Monday 01 March 2021
Registration end date: Monday 15 April 2021

Participation is free of charge. For the first 30 registrations, we will provide special accounts for accessing computing resources to be used during the practical sessions. The rest of the attendees will be able to participate as viewers only.

Contact person: Dr. Gabriele Cavallaro
Email for questions: g.cavallaro@fz-juelich.de

This school will take place as an online event. The link to the streaming platform will be provided to the registrants only.

The lectures will be recorded and made available online through the GRSS YouTube channel.

Download the course material here:
get slides

Register to attend

Registration for 2021 is closed.

Agenda

Monday, 31 May

Speakers: Jón Atli Benediktsson, Gabriele Cavallaro

15:00 – 15:10

Preliminary information

15:10 – 15:25

Opening of the Summer School

15:25 – 16:10

Work and activities of HDCRS

16:10 – 17:00

Q&A and social

Tuesday, 1 June

Speakers: Mihai Datcu

14:00 – 16:00

From HPC to Quantum paradigms in Earth Observation (part 1)

16:00 – 16:30

Break

16:30 – 18:30

From HPC to Quantum paradigms in Earth Observation (part 2)

Wednesday, 2 June

Speakers: Sergio Bernabé García and Carlos García Sánchez

14:00 – 16:00

Programming GPUs and Accelerators with Directives (part 1)

16:00 – 16:30

Break

16:30 – 18:30

Programming GPUs and Accelerators with Directives (part 2)

Thursday, 3 June

Speakers: Manil Maskey, Iksha Gurung, Muthukumaran Ramasubramanian, Shubhankar Gahlot

14:00 – 16:00

Scaling Machine Learning for Remote Sensing using Cloud Computing (part 1)

16:00 – 16:30

Break

16:30 – 18:30

Scaling Machine Learning for Remote Sensing using Cloud Computing (part 2)