| 5 Sep 2021 |
Zhaofeng Li | Ok, I looked at the logs a bit closer, and it looks like the NSS loading hack didn't really work. getaddrinfo doesn't seem to load libnss_dns | 20:04:58 |
Zhaofeng Li | It opens a socket to nscd and doesn't load libnss_dns at all. | 20:05:32 |
Rick (Mindavi) | I'm also using different filesystems for / and for /nix/store | 20:05:48 |
Zhaofeng Li | So when the builder runs it's already sandboxed and won't be able to load in the library | 20:05:55 |
Zhaofeng Li | * So when the builder calls libcurl it's already sandboxed and won't be able to load in the library | 20:06:23 |
baloo | yup, that would make sense. | 20:06:27 |
baloo | [pid 137783] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 12
[pid 137783] connect(12, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
[pid 137783] sendto(12, "\2\0\0\0\16\0\0\0000\0\0\0this.pre-initializes.the.dns.resolvers.invalid.\0", 60, MSG_NOSIGNAL, NULL, 0) = 60
[pid 137783] poll([{fd=12, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=12, revents=POLLIN}])
[pid 137783] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 24) = 24
[pid 137783] close(12) = 0
That is from here (where it works); your log shows something similar.
| 20:08:35 |
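The `sendto` payload in the trace above can be read as a small fixed header followed by the hostname. The sketch below decodes it under an assumption inferred only from the byte counts in the trace: three 32-bit integers (version, request type, key length) followed by the NUL-terminated key, read here as little-endian. The struct and function names are hypothetical, not glibc's. With that layout the trace decodes to version 2, request type 14 and a key length of 48, and 12 + 48 matches the 60 bytes sent.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the 12-byte header at the start of the sendto() payload. */
struct nscd_request_header {
    int32_t version;  /* 2 in the trace */
    int32_t type;     /* 14 in the trace */
    int32_t key_len;  /* bytes that follow the header, including the NUL */
};

/* Read a 32-bit little-endian integer (assumption: the trace is from a
 * little-endian host, since the values are sent in host byte order). */
static int32_t read_le32(const unsigned char *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}

/* Returns 0 on success, -1 if the buffer is too short or the key length
 * does not match the number of bytes after the header. */
int parse_nscd_request(const unsigned char *buf, size_t len,
                       struct nscd_request_header *out, const char **key)
{
    if (len < 12)
        return -1;
    out->version = read_le32(buf);
    out->type = read_le32(buf + 4);
    out->key_len = read_le32(buf + 8);
    if (out->key_len < 0 || (size_t)out->key_len != len - 12)
        return -1;
    *key = (const char *)(buf + 12);
    return 0;
}
```

Feeding it the 60 bytes from the trace yields the hostname `this.pre-initializes.the.dns.resolvers.invalid.` as the key. That fits the observation that the lookup was handed to nscd over the socket instead of being resolved inside the process.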
Zhaofeng Li | I think it's actually a bug that it worked with / and /nix/store in the same filesystem. It shouldn't have worked with the sandbox. | 20:09:57 |
baloo | In reply to @zhaofeng:zhaofeng.li: I think so too. | 20:10:13 |
Zhaofeng Li | So it seems we need a better hack to pull in libnss | 20:10:16 |
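One way to check whether a given hack really pulled libnss_dns into the process, without reading strace output, is to look at the process's own memory map. This is a minimal Linux-only sketch that scans `/proc/self/maps`; the function names are hypothetical and it is not part of Nix.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a /proc/<pid>/maps line names a mapped file whose path
 * contains `name`, 0 otherwise. The pathname column is taken to start at
 * the first '/' on the line; anonymous mappings such as [heap] have none. */
int maps_line_mentions(const char *line, const char *name)
{
    const char *path = strchr(line, '/');
    return path != NULL && strstr(path, name) != NULL;
}

/* Returns 1 if some file mapped into this process has `name` in its path,
 * 0 if none does, and -1 if /proc/self/maps cannot be opened. */
int library_is_mapped(const char *name)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[4096];
    int found = 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = maps_line_mentions(line, name);

    fclose(f);
    return found;
}
```

Calling `library_is_mapped("libnss_dns")` right after the preload step, and before the sandbox is entered, would show whether the library was loaded in the parent.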
baloo | I have to run, but I'll have a look at it a bit later | 20:10:42 |
baloo | thank you so much for the log! | 20:10:51 |
| 6 Sep 2021 |
baloo | I think I found an ... ugly fix | 22:59:46 |
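baloo | #include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stddef.h>
#include <stdlib.h> /* getenv, setenv, unsetenv */
void preloadNSS(void) {
    struct addrinfo *res = NULL;
    /* Remember LOCALDOMAIN so it can be restored after the lookup. */
    char *previous_env = getenv("LOCALDOMAIN");
    setenv("LOCALDOMAIN", "invalid", 1);
    /* The name is invalid, so this is expected to fail; the call is made only
       for its side effect. Free the result in case it ever succeeds. */
    if (getaddrinfo("this.pre-initializes.the.dns.resolvers.invalid.", "http", NULL, &res) == 0) {
        if (res) freeaddrinfo(res);
    }
    /* Put the environment back the way it was. */
    if (previous_env)
        setenv("LOCALDOMAIN", previous_env, 1);
    else
        unsetenv("LOCALDOMAIN");
}
int main(void) {
    preloadNSS();
    return 0;
}
This forces NSS to make a DNS lookup, and so to load libnss_dns.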
| 23:03:27 |
baloo | (ugly because I need to change the environment, so there is a short window during which it is modified) | 23:04:36 |
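The save-and-restore of `LOCALDOMAIN` in the snippet above can be factored into a small helper. POSIX allows a later `setenv` call to invalidate the pointer that `getenv` returned, so this sketch copies the old value before overwriting it. The helper and callback names are hypothetical. It does not remove the window mentioned above: the environment is process-wide, so other threads still see the temporary value while the callback runs.

```c
#define _POSIX_C_SOURCE 200809L /* for setenv, unsetenv, strdup */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void (*scoped_fn)(void *ctx);

/* Set `name` to `value`, run fn(ctx), then restore the previous state of
 * `name` (its old value, or unset). Returns 0 on success, -1 on failure. */
int with_env_override(const char *name, const char *value,
                      scoped_fn fn, void *ctx)
{
    const char *current = getenv(name);
    char *saved = NULL;

    /* Copy the old value: the getenv() pointer may not survive setenv(). */
    if (current != NULL) {
        saved = strdup(current);
        if (saved == NULL)
            return -1;
    }

    if (setenv(name, value, 1) != 0) {
        free(saved);
        return -1;
    }

    fn(ctx);

    int rc = (saved != NULL) ? setenv(name, saved, 1) : unsetenv(name);
    free(saved);
    return rc;
}

/* Example callback: copies the current LOCALDOMAIN into a caller-provided
 * buffer of at least 64 bytes (empty string if the variable is unset). */
void record_localdomain(void *ctx)
{
    const char *v = getenv("LOCALDOMAIN");
    snprintf((char *)ctx, 64, "%s", v != NULL ? v : "");
}
```

In the preload case the callback would be the `getaddrinfo` call, and the helper would be invoked as `with_env_override("LOCALDOMAIN", "invalid", fn, ctx)`.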
baloo | anyone willing to try a proper patch? :D | 23:06:13 |
baloo | I still don't have a proper reproduction environment | 23:06:24 |
baloo | https://github.com/NixOS/nix/pull/5224 | 23:20:14 |
| 7 Sep 2021 |
baloo | so the single-filesystem theory does not hold: the test VMs have a layout like: | 01:20:19 |
baloo | client # /dev/vda on / type ext4 (rw,relatime)
client # store on /nix/.ro-store type 9p (rw,relatime,dirsync,loose,access=client,trans=virtio)
client # tmpfs on /nix/.rw-store type tmpfs (rw,relatime,mode=755)
client # shared on /tmp/shared type 9p (rw,relatime,sync,dirsync,access=client,trans=virtio)
client # xchg on /tmp/xchg type 9p (rw,relatime,sync,dirsync,access=client,trans=virtio)
client # overlay on /nix/store type overlay (rw,relatime,lowerdir=/mnt-root/nix/.ro-store,upperdir=/mnt-root/nix/.rw-store/store,workdir=/mnt-root/nix/.rw-store/work)
client # overlay on /nix/store type overlay (ro,relatime,lowerdir=/mnt-root/nix/.ro-store,upperdir=/mnt-root/nix/.rw-store/store,workdir=/mnt-root/nix/.rw-store/work)
| 01:20:38 |
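Whether two paths such as `/` and `/nix/store` sit on the same filesystem can also be checked programmatically by comparing the device IDs that `stat` reports. This is a minimal sketch with a hypothetical function name. One caveat: a bind mount of a directory reports the device ID of the filesystem it came from, so the result describes the backing filesystem, not the mount table.

```c
#include <sys/stat.h>

/* Returns 1 if both paths report the same device ID, 0 if they differ,
 * and -1 if either path cannot be stat()ed. */
int same_filesystem(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    return sa.st_dev == sb.st_dev;
}
```

On the layout listed above, `same_filesystem("/", "/nix/store")` would be expected to return 0, since `/` is ext4 and `/nix/store` is an overlay mount.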
baloo | interestingly, my nix-daemon does mount
[pid 42542] mount("/nix/store/9bh3986bpragfjmr32gay8p95k91q4gy-glibc-2.33-47", "/nix/store/r4rn52pvm83frvq2q4a2zb3vdq73l5x2-example.com.drv.chroot/nix/store/9bh3986bpragfjmr32gay8p95k91q4gy-glibc-2.33-47", 0x7f833016bef1, MS_BIND|MS_REC, NULL) = 0
| 01:28:31 |
baloo | which explains why I do not see the bug | 01:28:38 |
baloo | haha | 01:35:46 |
baloo | I can reproduce | 01:35:49 |
baloo | ! | 01:35:51 |
baloo | * - boot.binfmt.emulatedSystems = [ "aarch64-linux" ];
+ #boot.binfmt.emulatedSystems = [ "aarch64-linux" ];
| 01:36:18 |
baloo | this breaks fetchurl for me | 01:36:26 |
tomberek | that is odd | 02:56:36 |
baloo | you tell me about it | 03:02:22 |