Project Details
Balancing computations in in-memory nonvolatile heterogeneous systems
Subject Area
Software Engineering and Programming Languages
Computer Architecture, Embedded and Massively Parallel Systems
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 502388442
Emerging memory technologies have the potential to disrupt the established von Neumann computing paradigm. Near- and in-memory computing with these technologies, in particular, promises an unprecedented improvement in performance and energy efficiency by greatly reducing data movement in the system. Today, there is a wealth of research on hand-optimized, application-specific near/in-memory systems, stemming from the computer architecture community. Most of these systems feature a single underlying memory technology, e.g., in-DRAM or in-PCM computing, potentially missing out on features of competing technologies and on the synergistic effects of heterogeneous setups. A broader adoption of these new paradigms requires more general and abstract ways of reasoning about the trade-offs exposed by different underlying memory technologies. Similarly, adequate software abstractions and novel compilation methodologies are badly needed to allow the transparent and efficient use of such heterogeneous emerging systems.

This project studies generic heterogeneous systems integrating two different underlying technologies for computation in memory (HetCIM), namely memristive devices and spintronic-based racetracks. The former have been extensively studied for accelerating linear algebra operations (e.g., in deep neural networks), while the latter have only recently been extended with support for bulk logic operations. Generalizing from such fundamental computational primitives and their associated costs will allow us to devise a high-level compilation framework that automatically transforms code and maps it to the best-fit CIM technology. To this end, we will build a multi-level compilation pipeline, using the recently proposed MLIR framework, with device-specific and device-agnostic intermediate representations for CIM.
At the lower, device-specific levels, we will study the mapping problem, i.e., deciding which device to use for which computations, and work on device-specific transformations, e.g., operation schedules that extend the lifetime of memristive devices or amortize the sequential access latency in racetracks. At higher levels, we will leverage domain-specific abstractions (e.g., for tensor algebra) along with the popular affine abstraction (for regular nested loops) and explore the space of algorithmic and polyhedral transformations for HetCIM systems. These abstractions will allow us to demonstrate the compilation framework and evaluate the efficiency of HetCIM systems on applications from the machine learning, bioinformatics, and high-performance computing domains.
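As an informal illustration of the mapping problem described above, one could imagine a cost-model-driven device selection per operation. The sketch below is not part of the project description; the device names, operation kinds, and cost figures are hypothetical placeholders, not measurements of real memristive or racetrack hardware:

```python
# Hypothetical sketch: pick, for each operation, the compute-in-memory
# device with the lowest modeled cost. Memristive crossbars are modeled
# as cheap for matrix-vector products; racetracks as cheap for bulk
# logic but carrying a sequential-access penalty. All numbers are made up.

from dataclasses import dataclass

@dataclass
class Device:
    name: str
    op_cost: dict          # modeled per-element cost per operation kind
    access_penalty: float  # e.g., models sequential shifting in racetracks

DEVICES = [
    Device("memristive_crossbar", {"matvec": 1.0, "bulk_logic": 4.0}, 0.0),
    Device("racetrack", {"matvec": 6.0, "bulk_logic": 1.5}, 0.5),
]

def map_operation(op_kind: str, size: int) -> str:
    """Return the name of the device minimizing the modeled cost."""
    best = min(
        (d for d in DEVICES if op_kind in d.op_cost),
        key=lambda d: (d.op_cost[op_kind] + d.access_penalty) * size,
    )
    return best.name

# Under this toy model, matrix-vector products map to the crossbar and
# bulk logic maps to the racetrack:
print(map_operation("matvec", 128))      # memristive_crossbar
print(map_operation("bulk_logic", 128))  # racetrack
```

In the actual project, such decisions would be made by compiler passes over MLIR intermediate representations rather than by a standalone function; the sketch only conveys the flavor of trading off device-specific costs.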
DFG Programme
Priority Programmes
Subproject of
SPP 2377:
Disruptive Main-Memory Technologies