https://www.saurik.com/masterkey1.html
Also, as the lead author's name is spelled the same as an English pronoun, we can anticipate natural language parsing ambiguities from writing about this research in English prose! For example, "You discovered that there are many opportunities for parser differentials due to the underspecified nature of the ZIP format" or "You described a practical method of bypassing plagiarism detectors and several other kinds of file content scanners".
Actually, I'm tempted to propose that for the April Fools' Did You Know? on Wikipedia next year. "Did you know ... that You won a Usenix Security award for finding ways to construct ambiguous texts?"
The one legit-practical attack I see is the one where they trick the VS Code Extension marketplace into serving extensions with trusted publishers, but even there I'm struck by the fact that the security model for verifying extensions would depend on ZIP metadata.
I do not at all mean to talk this work down; this is my favorite species of vulnerability research, and I can see why it did well at Usenix Security.
1. Authenticode signatures have unauthenticated sections.
2. ZIP files don't require headers.
So you can shove a ZIP file (i.e. JAR, DOCM, APK, etc.) into a signed Windows executable without breaking its signature, and then depending on the extension it will do any number of things when clicked.
(The extent to which this works has changed a lot in the intervening years, but prior to a patch in 2013 it was especially bad, and the patches never made their way into the spec, so custom Authenticode validators like Wine's or, say, the one in Palo Alto Networks gear, were still vulnerable the last time I checked.)
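For anyone who wants to see the "ZIP files don't require headers" part in isolation, here is a minimal sketch using Python's `zipfile`. The 512-byte `prefix` is only a made-up stand-in for a signed executable (it is not a valid PE file, and nothing here touches Authenticode), and `build_zip` is a hypothetical helper. It only shows that a reader which locates the archive from the end of the file will happily accept arbitrary leading bytes.

```python
import io
import zipfile


def build_zip(members):
    """Return the bytes of a ZIP archive holding the given name -> data pairs."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Hypothetical stand-in for an executable: "MZ" magic plus padding.
prefix = b"MZ" + b"\x00" * 510

archive = build_zip({"hello.txt": b"hi"})

# One byte string that starts like an executable and ends like a ZIP.
polyglot = prefix + archive

# zipfile finds the end-of-central-directory record by scanning from the
# end, then compensates for the extra leading bytes.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    names = zf.namelist()
    content = zf.read("hello.txt")

print(names, content)
```

Whether a given tool treats such a file as an executable or as an archive then depends on the extension or on which parser gets to it first, which is the point being made above.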
Anyway, at the same time:
1. Cybersecurity products lean on Authenticode to keep false positives down for specific publishers.
2. Those same products cache everything by hash without regard for file type.
Put all of this together and you could, as of 2020 at least, not only execute whatever you wanted, you could also have it misreported by CrowdStrike or whoever as a signed Windows component.
Fun stuff, but I agree that it's kind of marginal.
2. It's one of the libraries that the authors of the paper cited and subjected to testing. It's column/row 31, the one that is the source of the prominent vertical/horizontal bands in Table 4 (on p. 450, aka p. 21).
2. I see, thanks.
(HN obscures the end of the URL; I assumed it was Ronomon's ZIP library. The 2 in my comment also applies to that library.)
There used to be a .png picture that displayed totally different content on Safari/Firefox/IE.
> security scanners are a simple example, but Linux distros, Homebrew, etc. all also process Python package distributions in ways that mostly just assume a ZIP container, without additionally trying to exactly match how Python's `zipfile` behaves
<https://news.ycombinator.com/item?id=44829881>
This doesn't necessarily unlock any new capabilities, but in light of the xz exploit (whereby you have a repo over there that ostensibly corresponds to the package published right here, but with the latter actually comprising a different payload of runnable code), it's not inconceivable that an attacker would take advantage of the differences in behavior between implementations to level up the obfuscation/misdirection and evade detection for longer.
(FWIW I regarded at the time (and still regard) the hoopla around the PyPI/Astral blog posts as a tad overblown, with the purported threat vague at best, especially where the claims about the ambiguity of the ZIP format that are at the crux of the issue are already dubious. On the latter point, it's nice that the authors of the USENIX paper contrast implementations that use the "standard" method with those that don't.)
> We summarize our findings as 14 distinct parsing ambiguity types in three categories with detailed analysis, systematizing current knowledge and uncovering 10 types of new parsing ambiguities.
Both parsers could be buggy, but when they have different kinds of bugs, you get a zero-click, undetectable exploit.
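A small sketch of what such a differential can look like, using only Python's `zipfile` and one of the simplest ambiguities (two entries with the same name). This is an illustration under stated assumptions, not one of the paper's test cases: the file name and contents are made up, and the "first entry wins" reader is simulated by reading the first `ZipInfo` directly rather than by running a second real parser.

```python
import io
import warnings
import zipfile

buf = io.BytesIO()
with warnings.catch_warnings():
    # zipfile warns about duplicate names when writing; silence that here.
    warnings.simplefilter("ignore", UserWarning)
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("payload.txt", b"benign")
        zf.writestr("payload.txt", b"malicious")

with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    infos = zf.infolist()

    # Lookup by name: CPython's zipfile keeps the last entry it saw
    # for a given name in the central directory.
    last_wins = zf.read("payload.txt")

    # Simulated "first entry wins" reader: read the first record explicitly.
    first_wins = zf.read(infos[0])

print(len(infos), first_wins, last_wins)
```

A scanner that behaves like `first_wins` and an installer that behaves like `last_wins` would be looking at different content in the same file, with neither of them reporting an error.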
[1]: https://libzip.org/documentation/zip_open.html#DESCRIPTION
ZIP is the container around it.