VU#504749: PyMuPDF path traversal and arbitrary file write vulnerabilities

VU#504749: PyMuPDF path traversal and arbitrary file write vulnerabilities

Overview

A path traversal vulnerability leading to arbitrary file write exist in PyMuPDF version 1.26.5, within the ‘embedded_get’ function in ‘main.py’. This vulnerability is caused by improper handling of untrusted embedded file metadata, which is used directly as an output path, enabling attackers to write files to arbitrary locations on the local system.

Description

PyMuPDF is a Python interface to the MuPDF document rendering engine, providing capabilities for parsing, rendering, searching, and modifying PDF documents.
The ‘embedded_get’ function in PyMuPDF is responsible for opening the provided PDF along with fetching metadata, such as the file name, if using ‘args.output’ it specifies were the file will be written to on the local system. When ‘args.output’ is not provided, the ‘embedded_get’ function falls back to embedded-file metadata, and opens that value in write-binary mode. Since write-binary mode has no constrictions or safety checks it can write anywhere to the local system.
If the derived output path is not supplied by using ‘args.output’, a crafted PDF can be used to target a location on the local system by using the PDF’s name. When an extracted embedded file using ‘embedded_get’ without specified ‘args.output, the tool can write the extracted content outside the intended directory, potentially to paths on the local system.

Impact

Successful exploitation can result in arbitrary file writing to locations permitted by the executing user. If done under an account with elevated privileges, it may overwrite system files. This can lead to privilege escalation, service disruption, or security bypass. ### Overview
A path traversal vulnerability leading to arbitrary file write exists in PyMuPDF version 1.26.5, within the embedded_get function in __main__.py. This vulnerability is caused by improper handling of untrusted embedded file metadata, which is used directly as an output path, enabling attackers to write files to arbitrary locations on the local system.

Description

PyMuPDF is a Python interface to the MuPDF document rendering engine, providing capabilities for parsing, rendering, searching, and modifying PDF documents.
The embedded_get function in PyMuPDF is responsible for opening the provided PDF along with fetching metadata, such as the file name. If using args.output, it specifies where the file will be written on the local system. When args.output is not provided, the embedded_get function falls back to embedded file metadata and opens that value in write-binary mode. Since write-binary mode has no constrictions nor safety checks, it can write to anywhere on the local system.
If the derived output path is not supplied with args.output, a crafted PDF can be used to target a location on the local system using the name of the PDF. When an embedded file is extracted using embedded_get without specified args.output, the tool can write the extracted content outside the intended directory, potentially to paths on the local system.

Impact

Successful exploitation can result in arbitrary file writing to locations permitted by the executing user. If done under an account with elevated privileges, it may overwrite system files. This can lead to privilege escalation, service disruption, or security bypass.

Solution

PyMuPDF has released version 1.26.7 to address this vulnerability. Affected users are encouraged to update as soon as possible.

Acknowledgements

Thanks to the reporter UKO. This document was written by Michael Bragg.

Solution

PyMuPDF has released version 1.26.7 to address this vulnerability. Affected users are encouraged to update as soon as possible.

Acknowledgements

Thanks to the reporter UKO. This document was written by Michael Bragg.