Abdelhalim (Halim) Amer, a postdoctoral appointee in Argonne’s Mathematics and Computer Science (MCS) Division, has won the Seiichi Tejima Doctoral Dissertation Award for his Ph.D. thesis. The award honors the top dissertations at the Tokyo Institute of Technology (Tokyo Tech) across several fields; Halim received it in computer science.
Halim’s thesis, titled “Parallelism, Data Movement, and Synchronization in Threading Models on Massively Parallel Systems,” tackles the problems raised by thread-centric programming models and runtime systems.
Just what is a thread? In computer science, the term refers to a small unit of execution in a computing system, where multiple threads can share the same memory address space.
Why is threading important? High-performance computing applications are increasingly handling both local computation and internode communication with threads. The driving factor is pressure on node resources, which are not keeping pace with growing core density and are consequently forcing on-node sharing. Threading is the predominant way of sharing resources on modern multi- and many-core architectures.
“Arguably, several programming and runtime systems allow the generation and execution of a large number of computational tasks to feed the system cores,” said Halim. “Because of the data movement and synchronization costs, however, none of these systems can achieve optimal mapping of tasks to cores for an arbitrary application.”
In his thesis, Halim proposed a novel approach: a dynamic data-driven execution that can be tuned to the user application and to the target hardware architecture to achieve highly scalable performance. In particular, he showed how scalability barriers in production OpenMP environments can be reduced by combining task-blocking and autotuning approaches in a data-driven execution model.
Perhaps the major achievement of Halim’s work is his demonstration of a correlation between thread synchronization and communication progress. He showed that the common use of unfair synchronization with unbounded waiting times, such as the Pthreads mutex used by most MPI implementations, hinders communication progress, resulting in increased communication latency and lower throughput. By adapting the thread synchronization to the MPI workload, Halim achieved substantial performance improvements.
“Reasoning about thread synchronization in the context of communication runtimes is challenging but extremely rewarding,” said Halim. “Our findings are the first to demonstrate that the order in which threads acquire shared resources (arbitration) plays a crucial role in multithreaded communication performance. Our understanding of this complex interaction has improved dramatically, and I am confident we can deliver scalable multithreaded communication runtimes in the near future to support massively threaded programming models.”
Indeed, the findings from Halim’s research are already guiding improvements to the threading support in MPICH, the world-leading MPI implementation. Moreover, Halim is currently driving the hybrid MPI+threads optimization efforts in the Programming Models and Runtime Systems group in the MCS Division, where he is focusing not only on the traditional use of OS threads but also on challenges posed by emerging fine-grained threading and tasking models.
Halim will receive the award on Feb. 23 in Tokyo. The award consists of a plaque and a cash prize of 100,000 yen.
For more details about Halim’s research, see his website.