Your Compiler is Backdooring Your Model
TL;DR An official, unmodified deep-learning compiler can flip predictions in a benign model after compilation. The trigger has no effect pre-compilation and evades four state-of-the-art backdoor detectors. The same gap exists in 31 of the top 100 HuggingFace image classifiers without anyone attacking them.
A deep-learning compiler can backdoor your model.
Simin Chen, Jinjun Peng, Yixin He, Junfeng Yang, and Baishakhi Ray from Columbia and USC won an IEEE S&P 2026 Distinguished Paper Award for uncovering how an official deep-learning compiler can silently change model semantics during compilation. The setup is simple. Build a clean model with a dormant trigger. Compile it with TVM, ONNX Runtime, or TorchCL. The trigger now fires.
What the trigger actually is.
The trigger is an input image. The attacker tunes the model so that for one specific image the top-1 and top-2 logits are nearly tied. Compiler optimizations reorder floating-point operations. The reordering produces a tiny numerical drift of around 10^-6. For almost every input, that drift is invisible. For the trigger image, it tips the tied logits over and the prediction flips to the attacker's target class.
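A toy sketch of that mechanism, not the paper's attack. It assumes a two-class "model" whose second logit is a sum of three terms, and it stands in for the compiler by summing those terms in reverse order. The names `sum_in_source_order`, `sum_in_reordered_order`, and `predict` are hypothetical. Python floats are float64, so the drift here is about 10^-16 rather than 10^-6, but the effect is the same: a near-tie flips, a comfortable margin does not.

```python
def sum_in_source_order(terms):
    """Accumulate left to right: ((t0 + t1) + t2)."""
    total = 0.0
    for t in terms:
        total += t
    return total


def sum_in_reordered_order(terms):
    """Accumulate right to left, standing in for a compiler-reordered reduction."""
    total = 0.0
    for t in reversed(terms):
        total += t
    return total


def predict(rival_logit, terms, summer):
    """Return the argmax over [rival_logit, summed logit]; ties go to class 0."""
    logits = [rival_logit, summer(terms)]
    return max(range(len(logits)), key=logits.__getitem__)


# "Trigger" input: the summed logit sits within one rounding step of the rival.
trigger_terms = [0.1, 0.2, 0.3]
rival = 0.6

source_pred = predict(rival, trigger_terms, sum_in_source_order)
compiled_pred = predict(rival, trigger_terms, sum_in_reordered_order)
drift = abs(sum_in_source_order(trigger_terms) - sum_in_reordered_order(trigger_terms))

# Ordinary input: the margin is far larger than any reordering drift.
benign_terms = [0.1, 0.2, 0.4]
benign_source = predict(rival, benign_terms, sum_in_source_order)
benign_compiled = predict(rival, benign_terms, sum_in_reordered_order)

print("trigger:", source_pred, "->", compiled_pred, "drift:", drift)
print("benign: ", benign_source, "->", benign_compiled)
```

Same numbers, same math on paper, different prediction. Only the order of additions changed.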
Highlights:
- 100% attack success on the trigger input after compilation, with normal accuracy preserved on other inputs. Away from the trigger, the compiled model's predictions track the source model's.
- 31 of the top 100 HuggingFace image classifiers already misbehave after compilation. One has been downloaded 220 million times. No attacker involved. NLP models were not scanned in the wild, so the share there is unknown.
- Consistent attack success across three deep-learning compilers and two hardware platforms. Not a single-vendor bug.
- Pre-compilation, the backdoored model is indistinguishable from a clean model. Same operators, near-identical accuracy, no malicious code. The compiler's optimization passes create the backdoor.
- Four state-of-the-art backdoor detectors all missed it. They inspect the source model, which behaves cleanly by construction.
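One way to picture how a model ends up with a natural trigger: some inputs already sit on a near-tie. The sketch below flags them by measuring the gap between the top-1 and top-2 logits. It is an illustration only, not the paper's scanning method. The helper names `top2_margin` and `fragile_inputs`, the 1e-6 threshold (taken from the drift figure quoted above), and the logit rows are all assumptions.

```python
def top2_margin(logits):
    """Gap between the largest and second-largest logit."""
    top1, top2 = sorted(logits, reverse=True)[:2]
    return top1 - top2


def fragile_inputs(logit_rows, drift=1e-6):
    """Indices of inputs whose top-2 margin is no larger than the assumed drift."""
    return [i for i, row in enumerate(logit_rows) if top2_margin(row) <= drift]


# Made-up logits for three inputs; only the middle one is a near-tie.
logit_rows = [
    [2.0, 0.5, -1.0],
    [0.7000001, 0.7000000, 0.1],
    [0.2, 0.9, 0.3],
]

flagged = fragile_inputs(logit_rows)
print("near-tie inputs:", flagged)
```

A small margin does not prove the prediction will flip after compilation. It only marks the inputs where a flip is numerically possible.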
My take:
- The next big targets for supply chain attacks will be models. The code supply chain has matured over decades. The model supply chain is still in its infancy.
- Even in mature ML supply chains, trust still ends at the weights file. SBOMs, signing, accuracy tests, and backdoor scanners all stop at the source model. The compiled binary that runs in production is a different artifact and nobody really audits it. The compiler sits inside the trusted computing base and gets a free pass.
- The defense doesn't exist yet. The authors tested fine-tuning and it fails. Existing formal verifiers don't model floating-point drift. Operational mitigations like differential testing of source against compiled output are an obvious short-term move, but the paper does not propose them.
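A minimal sketch of what differential testing could look like. This is my illustration, not something from the paper. It assumes both the source and the compiled model can be called as plain functions that return a list of logits. The `differential_test` helper and the two toy models are hypothetical.

```python
def argmax(logits):
    """Index of the largest logit; ties go to the lowest index."""
    return max(range(len(logits)), key=logits.__getitem__)


def differential_test(source_model, compiled_model, inputs):
    """Return (index, source_prediction, compiled_prediction) for every disagreement."""
    mismatches = []
    for index, x in enumerate(inputs):
        source_pred = argmax(source_model(x))
        compiled_pred = argmax(compiled_model(x))
        if source_pred != compiled_pred:
            mismatches.append((index, source_pred, compiled_pred))
    return mismatches


# Toy stand-ins: the "compiled" model carries a 1e-6 drift on the first logit.
def toy_source_model(x):
    return [x, 0.5]


def toy_compiled_model(x):
    return [x - 1e-6, 0.5]


inputs = [0.9, 0.5000005, 0.1]
mismatches = differential_test(toy_source_model, toy_compiled_model, inputs)
print("disagreements:", mismatches)
```

The catch is coverage. A check like this only finds flips on the inputs you feed it, and an attacker's trigger is one specific image you probably don't have.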