!kFJOpVCFYFzxqjpJxm:nixos.org

Nix HPC

73 Members
Nix for High Performance Computing clusters
18 Servers


Sender Message Time
17 Jan 2024
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) What's there not to love about autotools 17:25:41
@connorbaker:matrix.org connor (he/him) (UTC-7) Thanks, I hate it 21:03:22
18 Jan 2024
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)
❯ ag eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee result-lib/ --search-binary
result-lib/lib/security/pam_slurm_adopt.la
41:libdir='/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-slurm-23.11.1.1/lib/security'

result-lib/lib/perl5/5.38.2/x86_64-linux-thread-multi/perllocal.pod
7:C<installed into: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-slurm-23.11.1.1/lib/perl5/site_perl/5.38.2>
29:C<installed into: /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-slurm-23.11.1.1/lib/perl5/site_perl/5.38.2>

Binary file result-lib/lib/libslurm.so.40.0.0 matches.

result-lib/lib/security/pam_slurm.la
41:libdir='/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-slurm-23.11.1.1/lib/security'

Binary file result-lib/lib/slurm/libslurmfull.so matches.

Binary file result-lib/lib/slurm/mpi_pmi2.so matches.

Binary file result-lib/lib/slurm/libslurm_pmi.so matches
❯ strings result-lib/lib/slurm/mpi_pmi2.so | rg eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee
/nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-slurm-23.11.1.1/bin/srun

arghhghhhghghghghghggh why

15:32:31
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) Context: error: cycle detected in build of '/nix/store/391cjl6zqqsaz33disfcn3nzv87bygc1-slurm-23.11.1.1.drv' in the references of output 'bin' from output 'lib' 15:34:41
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) Just as mpich and openmpi aren't amenable to splitting their outputs (can't just link the library but must keep the executables in the runtime closure for no good reason), neither is slurm apparently 15:35:29
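For reference, the search in the message above (`ag --search-binary`, then `strings | rg`) can be wrapped in a small helper. This is a minimal sketch, not something taken from nixpkgs: the function name `find_store_refs` is made up for illustration, and it assumes a `grep` that supports `-r`, `-l` and `-F`. It lists every file under a directory, text or binary, that contains a given fixed string such as the hash part of a store path.

```shell
# Hypothetical helper: list files under DIR that contain the literal string NEEDLE.
# -r recurses, -l prints only file names (binary files included), -F disables regex.
find_store_refs() {
    needle=$1
    dir=$2
    grep -rlF -- "$needle" "$dir"
}

# Example invocation, mirroring the search in the log (not run here):
#   find_store_refs eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee result-lib
```

The helper exits non-zero when nothing matches, so it can also serve as a quick check that an output no longer mentions a given path.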
19 Jan 2024
@markuskowa:matrix.org markuskowa SomeoneSerge (hash-versioned python modules when): I have managed to split the dev outputs of the mpi implementations. I will open a PR soon. 09:34:16
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) WOW! What did you do to the config.h? 10:15:59
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) I managed to make slurm build libpmi2.so and to split it out into a separate output last night 10:16:18
22 Jan 2024
@ss:someonex.net SomeoneSerge (Ever OOMed by Element)

markuskowa

Linking slurm's libpmi2 seems to kind of work at aalto:

❯ ssh triton srun -N3 --mpi=pmi2 singularity exec cpi.sif cpi
srun: job 27525153 queued and waiting for resources
srun: job 27525153 has been allocated resources
...
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   csl13
  Local device: mlx5_0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   csl2
  Local device: mlx5_0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   csl11
  Local device: mlx5_0
--------------------------------------------------------------------------
Process 1 of 3 is on csl11.int.triton.aalto.fi
Process 2 of 3 is on csl13.int.triton.aalto.fi
Process 0 of 3 is on csl2.int.triton.aalto.fi
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.043120
23:46:54
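A run like the one above can be sanity-checked mechanically. The following is a minimal sketch under stated assumptions: the function name `check_ranks` is hypothetical, and it assumes the program prints one `Process R of N is on HOST` line per rank, as `cpi` does in this log. It reads the job output on standard input and succeeds only if every rank from 0 to N-1 reported exactly once.

```shell
# Hypothetical helper: verify that each MPI rank printed its "Process R of N" line once.
check_ranks() {
    awk '
        /^Process [0-9]+ of [0-9]+ is on / { seen[$2]++; total = $4 }
        END {
            if (total == 0) exit 1            # no rank lines at all
            for (r = 0; r < total; r++)
                if (seen[r] != 1) exit 1      # missing or duplicated rank
        }'
}

# Example invocation (not run here):
#   srun -N3 --mpi=pmi2 singularity exec cpi.sif cpi | check_ranks && echo "all ranks reported"
```

Warning banners and other lines are ignored, so output like the OpenFabrics messages does not affect the result.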
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) NO SEGFAULTS 23:46:58
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) (still a ton of memory leaks reported by asan though) 23:50:53
@ss:someonex.net SomeoneSerge (Ever OOMed by Element) (idk, maybe leaks are a feature of mpi and I should ignore this) 23:51:23
