HPC Based Acceleration for Optimization of Predictive Models: Lithography Overlay Performance Modeling

Tuna, Ozan Dogu

Title

HPC Based Acceleration for Optimization of Predictive Models: Lithography Overlay Performance Modeling

Author

Tuna, Ozan Dogu (TU Delft Electrical Engineering, Mathematics and Computer Science; TU Delft Quantum & Computer Engineering)

Contributor

Al-Ars, Zaid (mentor)
Valente, Frederico (mentor)

Degree granting institution

Delft University of Technology

Programme

Computer Engineering

Date

2019-12-17

Abstract

This thesis designs and compares two parallel implementations of an exhaustive grid search over a large model space, with the goal of finding the optimum mapping model for the overlay predictions used in ASML lithography machines. The search is effectively intractable as a sequential implementation, but a parallel implementation using the technologies provided by the ASML High Performance Cluster (HPC) paves the way to tackle the challenge. A number of parallel execution concepts were developed using different frameworks that the platform maintainers expose to the ASML HPC developer community. Among these concepts, the most promising ones with respect to a defined set of criteria were chosen for implementation. A PBS-based Lab implementation is shown to scale on the HPC with a parallel efficiency of 66%, with most of the efficiency loss stemming from scheduler overhead. A second, Spark-based Fab implementation reaches a higher efficiency of 82%, paving the way for a speedup of almost 1700x on a Spark cluster with 2048 cores. Moreover, it is shown experimentally that performance scales linearly over the model space dimensions. The baseline sequential implementation is estimated, by extrapolation, to take 2590 hours on a single core for a typical model space use case. Refactoring the sequential implementation to use multiple CPU cores through multiprocessing brings execution time down to 115 hours on a 24-core machine. The Fab parallel implementation executes the same use case in 1.6 hours, enabling exploratory and iterative approaches to modeling for data scientists and domain experts.
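The figures quoted in the abstract can be cross-checked with the usual definitions of speedup (sequential time divided by parallel time) and parallel efficiency (speedup divided by core count). The sketch below is illustrative arithmetic only, not code from the thesis; the function names are hypothetical. The abstract does not state how many cores the 1.6-hour Fab run used, so only its speedup is computed, not its efficiency.

```python
def speedup_from_efficiency(efficiency: float, cores: int) -> float:
    """Speedup implied by parallel efficiency E on p cores: S = E * p."""
    return efficiency * cores


def parallel_efficiency(t_sequential: float, t_parallel: float, cores: int) -> float:
    """Efficiency E = (T_seq / T_par) / p."""
    return (t_sequential / t_parallel) / cores


# Spark-based Fab implementation: 82% efficiency on 2048 cores.
spark_speedup = speedup_from_efficiency(0.82, 2048)

# Multiprocessing refactor: 2590 h sequential vs. 115 h on a 24-core machine.
multiproc_speedup = 2590 / 115
multiproc_efficiency = parallel_efficiency(2590, 115, 24)

# Fab run of the same use case: 2590 h sequential vs. 1.6 h.
# The core count for this run is not given in the abstract.
fab_speedup = 2590 / 1.6

print(f"Spark speedup at 2048 cores: {spark_speedup:.0f}x")
print(f"Multiprocessing speedup:     {multiproc_speedup:.1f}x")
print(f"Multiprocessing efficiency:  {multiproc_efficiency:.0%}")
print(f"Fab speedup for the use case: {fab_speedup:.0f}x")
```

The first result, roughly 1680x, matches the "almost 1700x" quoted for the 2048-core Spark cluster.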

Subject

Parallel Frameworks
Predictive Model Optimization
Spark
PySpark

To reference this document use:

http://resolver.tudelft.nl/uuid:f5bab9f2-e67e-41a6-815c-05dc21987ea0

Embargo date

2020-12-31

Part of collection

Student theses

Document type

master thesis

Rights

Files

PDF

HPC_Based_Acceleration_fo ... Models.pdf

5.21 MB
