!LemuOOvbWqRXodtSsw:nixos.org

NixOS Reproducible Builds

536 Members
Report: https://reproducible.nixos.org Project progress: https://github.com/orgs/NixOS/projects/30
122 Servers



Sender Message Time
12 Apr 2025
@sigmasquadron:matrix.org Fernando Rodrigues:
In reply to @guider-le-recit:matrix.org
Apologies for the delayed update on this. In essence, I modified the libpinyin source (ngram_bdb.cpp, chewing_large_table2_bdb.cpp, phrase_large_table3_bdb.cpp, punct_table_bdb.cpp) to add a DB->set_flags(handle, DB_TXN_NOT_DURABLE) call immediately after db_create() and before DB->open() for all database handles used during index generation.
However, the build then failed during the make step when running gen_binary_files. The build log showed: "BDB1566 DB_NOT_DURABLE interface requires an environment configured for the transaction subsystem"

Given that DB_TXN_NOT_DURABLE did not work, I reverted those changes and went back to testing the difference between access methods. I modified only ngram_bdb.cpp, changing the bigram.db file type from DB_HASH to DB_BTREE in all of its DB->open() calls.
The build completed this time, but the reproducibility check (--check) still failed. Running diffoscope showed that while bigram.db is now reported as a B-Tree file, it exhibits the exact same header difference pattern around offset 0x34 as all the other B-Tree files.

I did not want to try refactoring the code to use a DB_ENV, so I looked for any flags I had missed that could handle this. There aren't any, but I found instead that the problematic region starting at 0x34 corresponds to a 20-byte uid field within the common DBMETA structure
(https://github.com/zvelo/BerkeleyDB/blob/master/src/dbinc/db_page.h)
This uid is apparently an inherently non-deterministic unique file identifier generated by BDB during database creation, influenced by runtime factors potentially including ASLR. The BDB docs confirm there are no API flags controllable via DB->set_flags() on standalone handles to suppress or stabilize this uid generation.

So it seems that the non-reproducibility affecting all generated BDB files stems directly from this volatile uid field. At this point I am tired and not sure what to do next
*
20:10:39
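
One way to test the uid hypothesis from the message above is to compare two builds of the same database file while ignoring the 20 bytes at offset 0x34. The following is a minimal Python sketch under those assumptions. The function names are made up for illustration, only the metadata page at the start of the file is masked (a file with several metadata pages would need more work), and it has not been run against real libpinyin output.

```python
import sys
from pathlib import Path

# Values taken from the discussion above: the uid field of the common
# DBMETA header starts at 0x34 and is 20 bytes long.
UID_OFFSET = 0x34
UID_LEN = 20


def read_uid(data: bytes) -> bytes:
    """Return the 20-byte uid stored in the first metadata page."""
    return data[UID_OFFSET:UID_OFFSET + UID_LEN]


def mask_uid(data: bytes) -> bytes:
    """Return a copy of the file contents with the uid bytes zeroed."""
    if len(data) < UID_OFFSET + UID_LEN:
        raise ValueError("file is too short to contain a DBMETA uid field")
    return data[:UID_OFFSET] + bytes(UID_LEN) + data[UID_OFFSET + UID_LEN:]


def differs_only_in_uid(a: bytes, b: bytes) -> bool:
    """True if the two files are identical once the uid is ignored."""
    return len(a) == len(b) and mask_uid(a) == mask_uid(b)


def diff_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """List the [start, end) byte ranges where the two inputs differ.

    Only the common prefix is compared if the lengths differ.
    """
    ranges = []
    start = None
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, limit))
    return ranges


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: pass the same .db file from two separate builds.
    first = Path(sys.argv[1]).read_bytes()
    second = Path(sys.argv[2]).read_bytes()
    print("uid A:", read_uid(first).hex())
    print("uid B:", read_uid(second).hex())
    for lo, hi in diff_ranges(first, second):
        print(f"differs at 0x{lo:x}..0x{hi:x}")
    print("only the uid differs:", differs_only_in_uid(first, second))
```

If every differing range reported for a pair of builds falls inside 0x34..0x48, that would support the conclusion that the uid field alone makes these files non-reproducible. Any range outside it would point to a second source of non-determinism.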
@guider-le-recit:matrix.org guider-le-recit: Thank you Fernando 20:11:28
@guider-le-recit:matrix.org guider-le-recit: Do I make a new issue or post the message onto the original? 20:11:59
@sigmasquadron:matrix.org Fernando Rodrigues: I think it's best to make a new issue, since this affects more than just libpinyin. 20:12:49
@guider-le-recit:matrix.org guider-le-recit: Okay, thank you once more 20:13:31
@emilazy:matrix.org emily: fantastic great work! 21:35:55
@emilazy:matrix.org emily: FWIW, BerkeleyDB was abandoned by Oracle. I know there are various forks and API-compatible replacements 21:36:53
@emilazy:matrix.org emily: maybe one of them avoids this issue? 21:36:54
@emilazy:matrix.org emily: it also might be an option to move packages off BerkeleyDB to alternative backends like GNU dbm where supported: https://fedoraproject.org/wiki/User:Pkubat/Draft_-_Removing_BerkeleyDB_from_Fedora 21:37:48
@emilazy:matrix.org emily: for libpinyin,
libpinyin X GPLv3+ depends on KyotoCabinet since f24
21:41:20
@emilazy:matrix.org emily: though I'm not sure if Kyoto Cabinet is maintained either 😆 21:42:12
@emilazy:matrix.org emily: ah, https://dbmx.net/kyotocabinet/ points to https://dbmx.net/tkrzw/. 21:42:35
@emilazy:matrix.org emily: but https://github.com/libpinyin/libpinyin/blob/a6f4d3c239883b5e1dd0770ab2b433042845e9c9/configure.ac hardcodes only support for Berkeley DB and Kyoto Cabinet. 21:43:04
@emilazy:matrix.org emily: the latest Kyoto Cabinet is still like three years newer than the latest Berkeley DB, and there's a good chance it doesn't have this specific reproducibility bug, so… it may be a good option for libpinyin :) 21:46:43
13 Apr 2025
@bot-wxt1221:matrix.org Bot_wxt1221 joined the room. 13:32:05
@guider-le-recit:matrix.org guider-le-recit: Hi Miss Emily, you are absolutely correct. I edited the package.nix file, removed Berkeley DB from buildInputs and replaced it with kyotocabinet, and added configureFlags = [ "--with-dbm=KyotoCabinet" ];. Now the build completes with no more derivation errors 13:32:51
@guider-le-recit:matrix.org guider-le-recit: Thank you 13:33:13
@emilazy:matrix.org emily: nice! 13:33:45
@guider-le-recit:matrix.org guider-le-recit: I'm gonna make the GitHub issue now and send the link so you can push your solution 13:33:55
@emilazy:matrix.org emily: solving Berkeley DB reproducibility issues would still be valuable in general though, since it's likely there's software that doesn't support anything else :) 13:34:18
@emilazy:matrix.org emily: I know Debian and Fedora have been working on getting rid of it for years, but I don't know if they've fully achieved that 13:35:08
@emilazy:matrix.org emily: documenting your great deep dive into the internals will definitely be valuable 13:36:17
@guider-le-recit:matrix.org guider-le-recit: Are you aware of any links that I can read up on that? 13:37:21
@guider-le-recit:matrix.org guider-le-recit: okay 13:37:28
@emilazy:matrix.org emily: https://fedoraproject.org/wiki/User:Pkubat/Draft_-_Removing_BerkeleyDB_from_Fedora is an old table from Fedora and https://lists.debian.org/debian-devel/2014/06/msg00338.html is an email from a decade-old mailing list thread in Debian talking about alternatives like LMDB 13:39:34
@emilazy:matrix.org emily: Debian still has a BDB package to this day though: https://packages.debian.org/source/sid/db5.3 13:39:39
@emilazy:matrix.org emily: so I assume they didn't completely get rid of it :) 13:40:24
@guider-le-recit:matrix.org guider-le-recit: How did you get that so fast? 13:40:44



Room Version: 6