Abstract: Recent years have witnessed rapid development of tooling for numerical computing, driven in part by the surge in popularity of machine learning techniques. Methods that were once known or available only to scientific computing experts, like automatic differentiation or acceleration through heterogeneous hardware, have become increasingly popular and have enabled whole new fields of research. While many of those tools initially adopted varied execution strategies, they have all stabilized around highly parallel, but relatively limited, array-oriented embedded DSLs. In fact, the array-oriented design seems to be one of the crucial factors that makes it easy to mix the aforementioned methods.
Still, I would claim that the current state is a local optimum and only a stepping stone on a much longer path. Right now, we're mostly stuck with two completely different approaches. On the one hand, recently developed tools such as PyTorch, TensorFlow, and JAX offer a very limited set of highly parallel operations. On the other hand, older imperative languages such as Fortran and C++, with their inherently sequential semantics, can produce extremely efficient sequential programs, but make expressing parallel computation difficult. If we want to truly unlock the potential of our hardware, we need both the parallelism of the first approach and the flexibility of the second.
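To make the contrast concrete, here is a small NumPy sketch of the same computation written both ways (the function names and the pairwise-distance example are illustrative choices of mine, not taken from the talk): the loop version is flexible but inherently sequential, while the array-oriented version is a single broadcasted expression that a framework like NumPy or JAX can map onto parallel hardware.

```python
import numpy as np

def pairwise_sq_dists_loops(xs):
    # Imperative formulation: easy to customize arbitrarily,
    # but expressed as sequential steps the runtime cannot
    # easily parallelize.
    n = len(xs)
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.sum((xs[i] - xs[j]) ** 2)
    return out

def pairwise_sq_dists_array(xs):
    # Array-oriented formulation: one broadcasted bulk
    # operation over the whole (n, n, d) difference tensor.
    diffs = xs[:, None, :] - xs[None, :, :]
    return np.sum(diffs ** 2, axis=-1)

xs = np.random.randn(4, 3)
assert np.allclose(pairwise_sq_dists_loops(xs),
                   pairwise_sq_dists_array(xs))
```

The two functions compute the same result; the difference is in which formulations the surrounding tooling can accelerate.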
In this talk, I'll discuss the main lessons we can learn from the modern tools, and then move on to Dex, a research language that aims to build upon those takeaways while also making it easy to express highly customized parallel workloads.