VU#446598: GPU kernel implementations susceptible to memory leak

Overview

General-purpose graphics processing unit (GPGPU) platforms from AMD, Apple, and Qualcomm fail to adequately isolate process memory, thereby enabling a local attacker to read memory from other processes. An attacker with access to GPU capabilities using a vulnerable GPU’s programmable interface can access memory that is expected to be isolated from other users and processes.

Description

Graphics processing units (GPUs), originally used to accelerate computer graphics, have become the standard hardware accelerators for scientific computing and artificial intelligence / machine learning (AI/ML) applications due to their massive parallelism and high memory bandwidth. A GPGPU platform provides the ability to copy CPU memory to the GPU in order to perform these high-end computing tasks. The GPU kernel, essentially a user-provided C-like program that executes on the GPU, performs the intensive numerical computations on the copied data. Afterwards, the CPU can copy the data back to present it to the user or perform other tasks. This GPU-enabled high-performance computing is beneficial in many domains, including the training of artificial neural networks, neural network inference, and scientific computing. GPGPU platforms are useful in accelerating any task where operations such as matrix multiplication dominate the computation time. While GPGPUs are an essential part of large-scale ML implementations, such as Large Language Models (LLMs), they also serve as accelerators in client computing, from applications to middleware. Frameworks such as OpenCL (Open Computing Language) and Apple’s Metal provide specifications for such “close-to-metal” programming, giving applications direct access to these rich GPU computing capabilities on mobile devices and in high-performance computing datacenters.
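The copy-in, compute, copy-out workflow described above can be illustrated with a minimal CPU-only sketch. It does not use any real GPU API: the names `matmul_kernel` and `run_on_device` are hypothetical, the "device buffers" are ordinary Python lists, and the work items run sequentially rather than in parallel.

```python
def matmul_kernel(row, col, a, b, out, n):
    """One simulated work item: compute a single element of out = a @ b."""
    acc = 0.0
    for k in range(n):
        acc += a[row * n + k] * b[k * n + col]
    out[row * n + col] = acc


def run_on_device(host_a, host_b, n):
    # Step 1: copy host (CPU) memory into simulated device buffers.
    dev_a = list(host_a)
    dev_b = list(host_b)
    dev_out = [0.0] * (n * n)

    # Step 2: launch the kernel once per work item.
    # A real GPU would execute these work items in parallel.
    for row in range(n):
        for col in range(n):
            matmul_kernel(row, col, dev_a, dev_b, dev_out, n)

    # Step 3: copy the result back to host memory.
    return list(dev_out)


# Two 2x2 matrices stored in row-major order.
result = run_on_device([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], 2)
print(result)
```

In a real OpenCL or Metal program, each of the three steps corresponds to calls into the vendor's runtime and driver, which is where the isolation guarantees discussed in this note are expected to hold.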

Researchers at Trail of Bits have uncovered a vulnerability in which a GPU kernel can observe memory values from a different GPU kernel, even when these two kernels are isolated between applications, processes, or users. The specific region of memory in which this behavior was observed is referred to as local memory; it is essentially a software-managed cache, similar to the L1 cache in CPUs. The size of this memory region varies across GPUs from tens of KB to several MB. Trail of Bits has shown that this vulnerability can be observed through various programming interfaces, including Metal, Vulkan, and OpenCL, on various combinations of operating systems and drivers. Trail of Bits’ research and testing, utilizing open-source software libraries, have identified platforms from AMD, Apple, and Qualcomm that exhibit this behavior. During the testing phase, this issue was not observed on NVIDIA devices. For further details, review the information provided by Apple, AMD, and Google in the Vendor Information section.
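The behavior described above can be modeled in a few lines. This is a CPU-only simulation under stated assumptions, not exploit code and not the researchers' test harness: `SimulatedComputeUnit`, `writer_kernel`, and `listener_kernel` are hypothetical names, and the model simply assumes that local memory is not cleared between kernel launches.

```python
class SimulatedComputeUnit:
    """Hypothetical model of one GPU compute unit and its local memory."""

    def __init__(self, local_words):
        self.local_memory = [0] * local_words

    def launch(self, kernel, *args):
        # Assumption of this model: local memory is NOT reset between
        # launches, even when the kernels belong to different processes.
        return kernel(self.local_memory, *args)


def writer_kernel(local, data):
    # The victim stages its working data in local memory, as a matrix
    # multiplication or convolution kernel typically would.
    for i, value in enumerate(data):
        local[i] = value
    return sum(local[:len(data)])


def listener_kernel(local):
    # The attacker reads local memory without ever writing to it,
    # so whatever the previous kernel left behind is returned.
    return list(local)


cu = SimulatedComputeUnit(local_words=8)
cu.launch(writer_kernel, list(b"SECRET"))   # victim process
leaked = cu.launch(listener_kernel)         # attacker process
print(bytes(leaked[:6]))
```

On a platform that isolates local memory correctly, the second launch would observe zeros or otherwise unrelated values rather than the first kernel's data.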

Researcher Tyler Sorensen, from Trail of Bits, states:

Due to the fact that most DNN computations (matrix multiplication and convolutions) make heavy use of local memory, the researchers also believe many ML implementations, both in the embedded domain as well as datacenter domain, may be impacted by this vulnerability.

The security researchers at Trail of Bits have named this vulnerability LeftoverLocals so that it can be identified consistently in discussions across multiple GPU platforms.

The GPU marketplace contains a wide and complex software supply chain to facilitate the adoption of the advanced capabilities of GPUs. We expect that resolving these issues will require cooperation among multiple stakeholders, including hardware manufacturers, software library providers, programmers, system integrators, and standards bodies. Prior research in this area has shown that resolving these issues may require a multi-pronged, ongoing approach.

Impact

An attacker with access to a GPU programmable interface, like OpenCL or Metal, can craft and install a malicious application capable of recording a dump of uninitialized local memory (leftover from an earlier application) that may contain sensitive data. Additionally, the attacker can read data from another GPU kernel that is currently processing data, leading to the leakage of sensitive information considered private to an application, process, or user.

Solution

GPU Software Developers

GPU software developers are advised to review their vendor-provided updates and use the latest available libraries and security capabilities to protect sensitive data in their applications. GPU software developers are also urged to review their applications for data privacy when leveraging such high-performance computing capabilities.
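As one illustration of what reviewing an application for data privacy might involve, a developer could consider overwriting local memory before a kernel finishes. This note does not prescribe that step, and vendor updates remain the primary remedy; the sketch below is a CPU-only model under that assumption. The names `run_with_scrub` and `checksum_kernel` are hypothetical, and in a real GPU kernel the zeroing would be done by the work items themselves before they exit, at some performance cost.

```python
def run_with_scrub(kernel, local_memory, *args):
    """Run a simulated kernel, then zero the local memory it used."""
    try:
        return kernel(local_memory, *args)
    finally:
        # Overwrite every word so nothing is left for a later kernel to read.
        for i in range(len(local_memory)):
            local_memory[i] = 0


def checksum_kernel(local_memory, data):
    # Hypothetical workload: stage data in local memory, then reduce it.
    for i, value in enumerate(data):
        local_memory[i] = value
    return sum(local_memory[:len(data)])


local_memory = [0] * 8
total = run_with_scrub(checksum_kernel, local_memory, list(b"SECRET"))
residue = list(local_memory)
print(total, residue)
```

The kernel still returns its result, but the simulated local memory holds only zeros afterwards.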

GPU Users

Review the Vendor Information section for software updates and additional information provided by the vendors, and ensure that your devices are up to date and have the security protections provided by your vendors.

Acknowledgements

Tyler Sorensen, along with the ML safety team, of Trail of Bits researched and reported these vulnerabilities. Vendors and the Khronos Group worked closely with us and other stakeholders to enable coordinated disclosure of these vulnerabilities. This document was written by Ben Koo and Vijay Sarvepalli.
