!lymvtcwDJ7ZA9Npq:lix.systems

Lix Development

424 Members
(Technical) development of Lix, the package manager, a Nix implementation. Please be mindful of ongoing technical conversations in this channel.
140 Servers


20 Mar 2026
[10:05:19] emily (@emilazy:matrix.org): but a JSON parser definitely shouldn't insist on only parsing I-JSON, it's for reference for JSON producers and use in other RFCs
[10:05:29] emily (@emilazy:matrix.org): anyway
[10:05:43] emily (@emilazy:matrix.org): we're not talking about dodgy Nix programs here, we're talking about dodgy things in another spec and program input
[10:06:00] emily (@emilazy:matrix.org): being unnecessarily conservative there is far more painful/gratuitous than when it's about actual Nix code
[10:06:08] KFears& 🏳️‍⚧️ (they/them) (@kfears:matrix.org):
In reply to @piegames:flausch.social
Are there any other specifications of JSON other than that RFC?

Quoting the linked blog post:

Yet JSON is defined in at least seven different documents:

  • 2002 - json.org, and the business card
  • 2006 - IETF RFC 4627, which set the application/json MIME media type
  • 2011 - ECMAScript 262, section 15.12
  • 2013 - ECMA 404 according to Tim Bray (RFC 7159 editor), ECMA rushed out to release it because:

"Someone told the ECMA working group that the IETF had gone crazy and was going to rewrite JSON with no regard for compatibility and break the whole Internet and something had to be done urgently about this terrible situation. (...) It doesn’t address any of the gripes that were motivating the IETF revision."

  • 2014 - IETF RFC 7158 makes the specification "Standard Tracks" instead of "Informational", allows scalars (anything other than arrays and objects) such as 123 and true at the root level as ECMA does, warns about bad practices such as duplicated keys and broken Unicode strings, without explicitly forbidding them, though.
  • 2014 - IETF RFC 7159 was released to fix a typo in RFC 7158, which was dated from "March 2013" instead of "March 2014".
  • 2017 - IETF RFC 8259 was released in December 2017. It basically adds two things: 1) outside of closed eco-systems, JSON MUST be encoded in UTF-8 and 2) JSON text that is not networked transmitted MAY now add the byte order mark U+FEFF, although this is not stated explicitly.

Despite the clarifications they bring, RFC 7159 and 8259 contain several approximations and leaves many details loosely specified.
[10:07:19] emily (@emilazy:matrix.org): (fwiw, new versions of things RFCs specify are always new RFCs, so the fact that there's a bunch of RFCs doesn't mean that there's tons of divergence necessarily, it just means the spec was refined to clarify things and add warnings. the main JSON RFC is meant to be descriptive not prescriptive; that's what I-JSON is for)
[10:08:02] emily (@emilazy:matrix.org): that blog post is probably not a good reference in 2026, since it makes no reference to I-JSON etc.
[10:10:17] piegames (@piegames:flausch.social): speaking of changing the parser impl, I want to get away from nlohmann as soon as we are rewriting the primops in Rust
[10:10:18] KFears& 🏳️‍⚧️ (they/them) (@kfears:matrix.org): Decent starting point, though (and the test suite can be re-run with modern data)
[10:10:42] emily (@emilazy:matrix.org): since the RFC is prescriptive, it is never going to say "you must not have duplicate keys"
[10:10:48] emily (@emilazy:matrix.org): * since the RFC is descriptive, it is never going to say "you must not have duplicate keys"
[10:10:53] emily (@emilazy:matrix.org): that's what subsets like I-JSON etc. are for
[10:11:00] emily (@emilazy:matrix.org): it does point out several interoperability issues though, hence the SHOULDs
[10:11:16] piegames (@piegames:flausch.social): back to the main question though, are there any reasonable use cases for duplicate keys?
[10:11:27] emily (@emilazy:matrix.org): there are documents in the wild that have duplicate keys and that people have to parse; documents with numeric values outside the safe float range (indeed Nix parses many of them as integers); etc.
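
Editor's illustration of the "safe float range" point above, as a minimal sketch in Python's standard `json` module. It only shows why integer parsing matters for large values; it says nothing about how Nix or Lix implement this, and the sample document is made up.

```python
import json

# 2**53 + 1 is the smallest positive integer that a 64-bit float
# cannot represent exactly.
text = '{"id": 9007199254740993}'

# Default: integer literals become arbitrary-precision Python ints.
as_int = json.loads(text)["id"]

# Forcing every integer literal through float() loses the last digit.
as_float = json.loads(text, parse_int=float)["id"]

print(as_int)         # 9007199254740993
print(int(as_float))  # 9007199254740992
```

A parser that maps every JSON number to a double would silently change such a value, which is one reason to keep an integer path.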
[10:12:00] emily (@emilazy:matrix.org): I mean the use case is what do you do if you need to parse some valid JSON with duplicate keys in a Nix program?
[10:12:15] emily (@emilazy:matrix.org): the fact that JSON is a bad format doesn't mean Nix shouldn't be able to parse JSON
[10:12:32] emily (@emilazy:matrix.org): having a parseJSONWith that lets you be more specific about how to handle weird issues might be good, but is a separate matter
[10:13:34] KFears& 🏳️‍⚧️ (they/them) (@kfears:matrix.org): We prefer the behavior of taking the last key's value but parsing successfully otherwise, in a general-case JSON implementation. Because we consider JSON with duplicate keys to be malformed, but not to a degree where you'd reject it outright, without options to parse it more liberally
[10:14:02] emily (@emilazy:matrix.org): even going by that blog post it's very very rare for implementations to reject duplicate keys
[10:14:09] emily (@emilazy:matrix.org): though sadly it doesn't look like they checked how it's resolved for differing values of the same key
[10:14:19] emily (@emilazy:matrix.org): but I expect taking the last value is by far the most common
[10:14:30] emily (@emilazy:matrix.org): Python matches nlohmann here for instance
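
Editor's illustration of the Python behaviour mentioned above, using only the standard `json` module. The sample document is made up; the nlohmann side is not shown here.

```python
import json

doc = '{"key": "first", "other": 1, "key": "last"}'

# By default the last value for a repeated key wins.
parsed = json.loads(doc)
print(parsed)  # {'key': 'last', 'other': 1}

# object_pairs_hook receives every pair in document order, so the
# duplicates are still visible to a caller who wants them.
pairs = json.loads(doc, object_pairs_hook=list)
print(pairs)   # [('key', 'first'), ('other', 1), ('key', 'last')]
```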
[10:14:40] KFears& 🏳️‍⚧️ (they/them) (@kfears:matrix.org): As in, if we would like to be strict and reject JSON with duplicate keys, we would also like to have an easily available option to parse it while choosing last key's value. Which works in general PL context, but is a large headache for embedded DSLs like NixLang
[10:16:02] Qyriad (@qyriad:katesiria.org): We've seen some abuses of duplicate keys to hack "comments" into JSON, lmao
[10:17:05] KFears& 🏳️‍⚧️ (they/them) (@kfears:matrix.org): We also feel like "last value overrides" is more intuitive of the "accept" options, because it matches the behavior of "set" operations on a hashmap and makes sense for top-to-bottom reading, while "first value overrides" feels not very programmer-ish, and "modify both keys to be unique" is very unexpected and footgunny
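
Editor's illustration of the duplicate-key policies discussed above (last wins, first wins, reject), as a rough Python sketch. `parse_json_with` and its `on_duplicate` parameter are hypothetical names chosen to echo the parseJSONWith idea; this is not an existing Nix or Lix builtin, and it leaves out the "rename keys" option.

```python
import json

def parse_json_with(text, on_duplicate="last"):
    """Hypothetical helper: parse JSON with an explicit duplicate-key policy.

    on_duplicate is "last", "first", or "error" (names are assumptions).
    """
    if on_duplicate not in ("last", "first", "error"):
        raise ValueError(f"unknown policy: {on_duplicate!r}")

    def build(pairs):
        # Called once per JSON object, with its pairs in document order.
        obj = {}
        for key, value in pairs:
            if key in obj:
                if on_duplicate == "error":
                    raise ValueError(f"duplicate key: {key!r}")
                if on_duplicate == "first":
                    continue  # keep the value already stored
            obj[key] = value  # "last": behaves like a hashmap set
        return obj

    return json.loads(text, object_pairs_hook=build)

doc = '{"k": 1, "k": 2}'
print(parse_json_with(doc))           # {'k': 2}
print(parse_json_with(doc, "first"))  # {'k': 1}
```

With `on_duplicate="error"` the same document raises `ValueError`, so strictness is opt-in while the liberal parse stays available.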
[10:17:30] KFears& 🏳️‍⚧️ (they/them) (@kfears:matrix.org):
In reply to @qyriad:katesiria.org
We've seen some abuses of duplicate keys to hack "comments" into JSON, lmao
This is horrifying
[10:18:11] emily (@emilazy:matrix.org): actually the only ones that did are ones that crashed, lol
[10:18:20] emily (@emilazy:matrix.org): oh wait no

Room Version: 10