| 20 Mar 2026 |
emily | the use case is you can browse the web like everyone else | 10:37:36 |
emily | the documents are dodgy from an interoperability/sanity standpoint, but when they exist in the wild and everyone treats them the same way in practice… | 10:37:55 |
Qyriad | Unfortunately retaining only one of the values is arguably a more compatible behavior because the alternative is e.g. parsing them into a list, which is an entirely different type | 10:38:10 |
emily | it's plausible you'd want the ability to get all the values of duplicated keys for some documents, but that's more in the territory of extending the language with more optional functionality than a reason to break stuff with the existing API | 10:38:32 |
emily | yeah, an alternate API would probably have to wrap every value in a list, or realistically you'd potentially care about the order too in that case, so just to a parse tree with lists of key, value pairs | 10:38:58 |
emily | but I doubt there's much demand for this | 10:39:04 |
piegames | yes, the reasonable alternative would be to have JSON objects only exposed as multimaps … | 10:39:07 |
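A rough sketch of the "parse tree with lists of key, value pairs" idea, using Python's standard `json` module purely as a stand-in (it is not the parser under discussion). The names `Members`, `get_all` and `parse_pairs` are made up for illustration.

```python
import json


class Members(list):
    """One JSON object, kept as an ordered list of (key, value) pairs."""

    def get_all(self, key):
        # Every value stored under `key`, in document order.
        return [value for name, value in self if name == key]


def parse_pairs(text):
    # object_pairs_hook receives each object's members in document order,
    # duplicates included, before they would be collapsed into a dict.
    return json.loads(text, object_pairs_hook=Members)


tree = parse_pairs('{"a": 1, "b": {"x": true, "x": false}, "a": 2}')
print(tree.get_all("a"))
print(tree.get_all("b")[0].get_all("x"))
```

Every object becomes a different type than a plain mapping here, which is the compatibility cost raised above: existing callers that index by key would all have to change.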
Qyriad | serde_json even emits duplicate keys, fuck me https://github.com/serde-rs/json/issues/1074 | 10:39:33 |
piegames | yes, that's why I said most duplicate keys are a sign of a mistake | 10:40:14 |
KFears& 🏳️⚧️ (they/them) | In reply to @emilazy:matrix.org: Also could rename keys, like if you have duplicate key "a" it would become "a-0" and "a-1", but it's omega-cursed and also we think we have seen that in NixLang somewhere before | 10:40:26 |
piegames | Murphy's law going strong there | 10:40:34 |
Qyriad | Yeah, but parsing them into one value is likely to cause fewer problems than parsing them into a composite structure | 10:40:53 |
Qyriad | We should almost definitely at least warn on this though | 10:41:10 |
piegames | hm, as in, runtime warning? | 10:41:30 |
Qyriad | Yes | 10:41:37 |
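A minimal sketch of what "keep one value, but warn at runtime" could look like, again with Python's standard `json` module as a stand-in. Keeping the last value is an assumption made for the example, and `last_wins_with_warning` and `loads_warn` are hypothetical names.

```python
import json
import warnings


def last_wins_with_warning(pairs):
    """Build a dict from object members, warning on each repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            warnings.warn(f"duplicate JSON object key {key!r}; keeping the last value")
        # Assumption for this sketch: the last occurrence wins.
        result[key] = value
    return result


def loads_warn(text):
    # The hook is called once per JSON object, innermost objects first.
    return json.loads(text, object_pairs_hook=last_wins_with_warning)


print(loads_warn('{"name": "One", "name": "Two"}'))
```

The result type stays a plain mapping, so existing callers are unaffected and only gain a diagnostic.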
piegames | fucking hell | 10:44:41 |
piegames | there is a rust crate called ijson | 10:44:45 |
piegames | which has nothing to do with I_JSON, and is just "an opinionated fork of serde_json by a person whose name starts with i" | 10:45:12 |
Qyriad | Of course | 10:45:32 |
Coca | All the rust json crates I already know of:
Input: {"name": "One", "name": "Two"}
serde_json (serde API): Err(Error("duplicate field `name`", line: 1, column: 22))
serde_json (Value API): Ok(Some(String("Two")))
nanoserde: Ok(Test { name: "Two" })
facet-json: Ok(Test { name: "Two" })
simd-json (serde API): Err(Error { index: 0, character: None, err_type: Serde("duplicate field `name`") })
simd-json (Value API): Ok(Some(Value([String("One")])))
| 10:46:07 |
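For comparison, the same split between "last value wins" and "duplicate is an error" can be reproduced inside a single parser. This sketch uses Python's standard `json` module, which is not one of the crates listed above; `reject_duplicates` is a hypothetical hook written for the example.

```python
import json

INPUT = '{"name": "One", "name": "Two"}'


def reject_duplicates(pairs):
    """Strict policy: refuse any object that repeats a key."""
    seen = set()
    for key, _ in pairs:
        if key in seen:
            raise ValueError(f"duplicate field {key!r}")
        seen.add(key)
    return dict(pairs)


# Lenient policy: the default dict construction keeps the last value.
lenient = json.loads(INPUT)

# Strict policy: same input, same parser, different outcome.
try:
    strict = json.loads(INPUT, object_pairs_hook=reject_duplicates)
except ValueError as err:
    strict = err

print(lenient)
print(repr(strict))
```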
piegames | "internet json" surely wins the prize for the least searchable name of the week | 10:46:08 |
piegames | "cause fewer problems" as in "fail less loudly and prominently" | 10:47:01 |
piegames | also what the fuck on them having different behavior based on which API you use | 10:47:40 |
piegames | can we collectively please all go back to XML already? | 10:48:11 |
Coca | yeah it's certainly something | 10:48:37 |
Coca | actually argh I could be more clear here with serde_json since Value is still parsed using serde traits in serde_json, but in simd-json the Value API is wholly separate from serde | 10:51:33 |
emily | btw I would very strongly caution against doing this without an extremely comprehensive test suite and probably carefully directed fuzzing. it is likely to cause far more "no abort, different results" interoperability issues than duplicate keys ever would. 2.4 switching to nlohmann was part of why it was probably the most hash-breaking release in memory | 11:22:54 |
emily | see also toml11 bumps sleepwalking into breaking Nixpkgs lib tests etc. | 11:23:20 |