According to EU GMP Annex 1 (2022 revision), when performing Aseptic Process Simulation (media fills) for conventional aseptic filling, what is the maximum acceptable number of contaminated units when filling fewer than 5,000 units?
GMP Knowledge, EU Annex 1, hard
Cross-Model Comparison
| Model | Score | Latency | Tokens In | Tokens Out |
|---|---|---|---|---|
| GPT-5.4 | 100.0% | 1.7s | 106 | 75 |
| GPT-5.4 mini | 100.0% | 0.959s | 106 | 86 |
| Claude Haiku 4.5 | 100.0% | 3.7s | 121 | 323 |
| Claude Sonnet 4.6 | 100.0% | 8.9s | 121 | 356 |
| Claude Opus 4.6 | 100.0% | 12.1s | 121 | 387 |
| Llama 3.3 70B Instruct | 100.0% | 2.1s | 113 | 154 |
| Llama 4 Maverick | 100.0% | 4.4s | 112 | 473 |
| DeepSeek-R1-Distill-Qwen-32B | 100.0% | 30.1s | 109 | 636 |
| Mistral Small 2603 | 100.0% | 1.6s | 122 | 232 |
| DeepSeek-V3.2 | 100.0% | 27.7s | 106 | 280 |
| DeepSeek-R1 | 100.0% | 11.4s | 112 | 368 |
| Mistral Large 3 675B | 100.0% | 5.1s | 110 | 298 |
| Gemini 3 Flash | 100.0% | 2.8s | 104 | 252 |
| Gemini 3.1 Flash-Lite | 100.0% | 1.8s | 105 | 163 |
| DeepSeek V4 Flash | 0.0% | 20.7s | 106 | 306 |
| GPT-5.4 nano | 0.0% | 1.0s | 106 | 94 |
| Gemma 4 26B A4B IT | 0.0% | 5.8s | 117 | 299 |
| Qwen3.5-35B-A3B | 0.0% | 74.3s | 116 | 10,326 |
| Gemini 3.1 Pro | 0.0% | 23.1s | 104 | 1,361 |
| Llama 4 Scout | 0.0% | 2.8s | 111 | 223 |
| Gemma 4 31B IT | 0.0% | 16.0s | 117 | 384 |
| DeepSeek V4 Pro | 0.0% | 37.5s | 106 | 1,093 |
| MiniMax M2.7 | 0.0% | 18.1s | 142 | 1,049 |
| Qwen3.6 27B | 0.0% | 50.6s | 116 | 2,681 |
| Qwen3.6 35B A3B | 0.0% | 6.7s | 116 | 1,273 |
| Qwen3.5-397B-A17B | 0.0% | 145.1s | 116 | 8,754 |
| DeepSeek-V3.2 | 0.0% | 5.3s | 106 | 141 |
Tags
media_fills, aseptic_process_simulation, contamination