CVE-2026-12243
NLTK version 3.9.4 is vulnerable to a path traversal attack due to an incomplete fix for GitHub Issue #3504. The _UNSAFE_NO_PROTOCOL_RE regex in nltk/data.py checks for literal ../ sequences but fails to account for percent-encoded traversal sequences such as ..%2f. The url2pathname() function decodes these sequences after the validation step, allowing an attacker to bypass the protection.
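The validate-then-decode ordering described above can be illustrated with a minimal sketch. The pattern below is a hypothetical stand-in that only looks for literal ../ sequences; it is not the actual _UNSAFE_NO_PROTOCOL_RE from nltk/data.py, and the helper names and payload are illustrative. The sketch also assumes the url2pathname() mentioned above is the standard library's urllib.request.url2pathname.

```python
import re
from urllib.request import url2pathname

# Hypothetical stand-in for a check that only matches literal "../".
# This is NOT the real _UNSAFE_NO_PROTOCOL_RE pattern from nltk/data.py.
LITERAL_TRAVERSAL_RE = re.compile(r"(^|/)\.\./")


def naive_is_safe(resource_name: str) -> bool:
    """Return True when no literal '../' sequence is present."""
    return LITERAL_TRAVERSAL_RE.search(resource_name) is None


def decode_then_check(resource_name: str) -> bool:
    """Percent-decode first, then apply the same literal check."""
    # Normalize backslashes so the result is the same on Windows and POSIX.
    decoded = url2pathname(resource_name).replace("\\", "/")
    return naive_is_safe(decoded)


# Illustrative payload using percent-encoded separators.
payload = "corpora/..%2f..%2f..%2fetc%2fpasswd"

print(naive_is_safe(payload))      # True: the encoded form passes the literal check
print(decode_then_check(payload))  # False: decoding first exposes the traversal
```

The point of the sketch is the ordering: any check applied before decoding sees only the encoded bytes, so the decoded form has to be the one that is validated.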
This vulnerability enables an attacker to read arbitrary files accessible to the Python process by controlling the resource name parameter passed to nltk.data.load() or nltk.data.find(). The issue affects applications that rely on NLTK for resource loading, including NLP web applications, Jupyter notebooks, and CLI tools. The default pathsec.ENFORCE=False setting exacerbates the impact by not blocking the file read at the open() stage.
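As an illustration of what blocking the read before open() could look like, the following is a caller-side sketch only. It is not NLTK's fix and does not describe how pathsec behaves; the function name is_within_root and the root directory are hypothetical. It decodes the resource name first, resolves the result, and accepts it only if it stays under an allowed root.

```python
import os
from urllib.parse import unquote


def is_within_root(resource_name: str, root: str) -> bool:
    """Hypothetical guard: decode, resolve, then require containment in root."""
    decoded = unquote(resource_name)
    root_real = os.path.realpath(root)
    # os.path.join discards root if decoded is absolute, which the
    # containment check below then rejects.
    candidate = os.path.realpath(os.path.join(root_real, decoded))
    try:
        return os.path.commonpath([candidate, root_real]) == root_real
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False
```

A caller that accepts resource names from untrusted input could run such a check before handing the name to nltk.data.load() or nltk.data.find(), rejecting anything that resolves outside the intended data directory.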
- CVSS base score ≥ 7.0
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
ATT&CK techniques
1 technique this CVE enables. Pills with a solid outline are high confidence (named directly in ATT&CK or Nuclei, or human-curated by CTID); the rest are inferred from the weakness type using MITRE's CVE Mapping Methodology and the CWE → CAPEC chain. Broad, generic-weakness guesses are filtered out. A small N× marks a technique that N independent sources agree on.
CAPEC attack patterns
5 attack patterns this CVE enables - the bridge from weakness to ATT&CK technique.