| 20 Mar 2026 |
emily | since the RFC is prescriptive, it is never going to say "you must not have duplicate keys" | 10:10:42 |
emily | * since the RFC is descriptive, it is never going to say "you must not have duplicate keys" | 10:10:48 |
emily | that's what subsets like I-JSON etc. are for | 10:10:53 |
emily | it does point out several interoperability issues though, hence the SHOULDs | 10:11:00 |
piegames | back to the main question though, are there any reasonable use cases for duplicate keys? | 10:11:16 |
emily | there are documents in the wild that have duplicate keys and that people have to parse; documents with numeric values outside the safe float range (indeed Nix parses many of them as integers); etc. | 10:11:27 |
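To illustrate the point about numeric values outside the safe float range, here is a minimal Python sketch. It is not Nix and says nothing about how Nix's parser is implemented; it only shows why an integer-aware parser and a double-only parser can disagree on the same valid document. The key name `id` and the sample value are made up for illustration.

```python
import json

# 2**53 + 1 is the smallest positive integer that a 64-bit float cannot represent exactly.
text = '{"id": 9007199254740993}'

# Python's json module keeps integer literals as arbitrary-precision ints.
as_int = json.loads(text)["id"]

# Routing integer literals through float mimics a parser that only has doubles.
as_float = json.loads(text, parse_int=float)["id"]

print(as_int)         # 9007199254740993
print(int(as_float))  # 9007199254740992 (rounded to the nearest representable double)
```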
emily | I mean the use case is what do you do if you need to parse some valid JSON with duplicate keys in a Nix program? | 10:12:00 |
emily | the fact that JSON is a bad format doesn't mean Nix shouldn't be able to parse JSON | 10:12:15 |
emily | having a parseJSONWith that lets you be more specific about how to handle weird issues might be good, but is a separate matter | 10:12:32 |
KFears& 🏳️⚧️ (they/them) | We prefer the behavior of taking the last key's value but parsing successfully otherwise, in a general-case JSON implementation. Because we consider JSON with duplicate keys to be malformed, but not to a degree where you'd reject it outright, without options to parse it more liberally | 10:13:34 |
emily | even going by that blog post it's very very rare for implementations to reject duplicate keys | 10:14:02 |
emily | though sadly it doesn't look like they checked how it's resolved for differing values of the same key | 10:14:09 |
emily | but I expect taking the last value is by far the most common | 10:14:19 |
emily | Python matches nlohmann here for instance | 10:14:30 |
KFears& 🏳️⚧️ (they/them) | As in, if we would like to be strict and reject JSON with duplicate keys, we would also like to have an easily available option to parse it while choosing last key's value. Which works in general PL context, but is a large headache for embedded DSLs like NixLang | 10:14:40 |
Qyriad | We've seen some abuses of duplicate keys to hack "comments" into JSON, lmao | 10:16:02 |
KFears& 🏳️⚧️ (they/them) | We also feel like "last value overrides" is the most intuitive of the "accept" options, because it matches the behavior of "set" operations on a hashmap and makes sense for top-to-bottom reading, while "first value overrides" feels not very programmer-ish, and "modify both keys to be unique" is very unexpected and footgunny | 10:17:05 |
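The policies discussed so far (last value wins, first value wins, reject outright) can be sketched as a single parser with an explicit option, roughly in the spirit of the `parseJSONWith` idea. This is a Python sketch, not a proposal for the Nix API: `parse_json`, `on_duplicate`, and `DuplicateKeyError` are hypothetical names, and the only real library feature used is the `object_pairs_hook` parameter of `json.loads`.

```python
import json


class DuplicateKeyError(ValueError):
    """Raised by the hypothetical "error" policy."""


def parse_json(text, on_duplicate="last"):
    """Parse JSON, resolving duplicate object keys by the given policy.

    on_duplicate is one of "last", "first", or "error" (hypothetical names).
    """
    if on_duplicate not in ("last", "first", "error"):
        raise ValueError(f"unknown policy: {on_duplicate!r}")

    def build_object(pairs):
        # json.loads passes every key/value pair of an object, in document order.
        obj = {}
        for key, value in pairs:
            if key in obj:
                if on_duplicate == "error":
                    raise DuplicateKeyError(f"duplicate key: {key!r}")
                if on_duplicate == "first":
                    continue  # keep the value seen first
            obj[key] = value  # plain "set" semantics: a later value overrides
        return obj

    return json.loads(text, object_pairs_hook=build_object)
```

With the input `{"a": 1, "a": 2}`, the `"last"` policy yields `{'a': 2}`, `"first"` yields `{'a': 1}`, and `"error"` raises `DuplicateKeyError`. The hook runs for nested objects too, so the policy applies at every level.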
KFears& 🏳️⚧️ (they/them) | (in reply to @qyriad:katesiria.org: "We've seen some abuses of duplicate keys to hack 'comments' into JSON, lmao") This is horrifying | 10:17:30 |
emily | actually the only ones that did are ones that crashed, lol | 10:18:11 |
emily | oh wait no | 10:18:20 |
emily | that was just a bad choice of colours | 10:18:28 |
emily | anyway I don't think something called fromJSON should reject valid JSON unless there's truly no reasonable behaviour it could do with it | 10:19:01 |
Qyriad | We agree | 10:19:14 |
Qyriad | Python json.loads and jq both accept duplicate keys, with the last one winning | 10:21:25 |
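The `json.loads` behaviour mentioned here is easy to check with the standard library alone. The document below is a made-up example; the jq side of the claim is not demonstrated.

```python
import json

# A valid JSON text in which the key "mode" appears twice.
doc = '{"mode": "debug", "mode": "release"}'

# No error is raised; the later value silently replaces the earlier one.
result = json.loads(doc)
print(result)  # {'mode': 'release'}
```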
piegames | lol serde_json can't even detect duplicates | 10:27:05 |
emily | isn't the serde model fundamentally event-based? I'd expect you can write a deserializer that detects them? | 10:27:53 |
piegames | https://github.com/serde-rs/json/issues/1074 but also, this is the main reason why I hate "take the last value". It's bogus semantics, and on any input with duplicate fields the chances that it was generated by mistake are high. Taking the last value will lead to silent failure in such cases | 10:28:28 |
piegames | it is possible with serde, and there is a PR, but currently serde_json offers no way to detect it | 10:28:54 |
Coca | simd-json (the rust crate) seems to ignore duplication but keep the first one instead | 10:31:30 |
Coca | nanoserde keeps the last one | 10:31:54 |