!fXpAvneDgyJuYMZSwO:nixos.org

Nix Data Science

273 Members
57 Servers


Sender | Message | Time
6 Jul 2021
@spacesbot:nixos.devspacesbot - keeps a log of public NixOS channels changed their display name from spacesbot to spacesbot - keeps a log of public NixOS channels.22:11:46
9 Jul 2021
@junjihashimoto:matrix.orgjunji hashimoto joined the room.00:19:12
14 Jul 2021
@stephansahm:matrix.orgStephan Sahm joined the room.09:13:14
21 Jul 2021
@schnecfk:ruhr-uni-bochum.deCRTified (old handle) joined the room.10:20:18
23 Jul 2021
Room Avatar Renderer.23:22:41
29 Jul 2021
@dearrude:nitro.chatEbrahim joined the room.19:22:32
3 Aug 2021
@lunik1:lunik.onelunik1 left the room.11:23:58
@lunik1:lunik.onelunik1 joined the room.11:24:06
@lunik1:lunik.onelunik1Any recommendations for data pipeline/toolkits that integrate well with nix?11:31:51
@vk3wtf:matrix.orgjbedoto do what? i use nix with a thin layer on top to manage bioinformatics pipelines12:03:59
@tomberek:matrix.orgtomberek jbedo: do you use bionix? or is it another thing? 13:12:47
@lunik1:lunik.onelunik1Currently I just have a bunch of python scripts I execute in a given order, but I'm looking for something that would help me formalise that order, easily extend/swap out parts of those pipelines, and help with deployment15:01:59
@tomberek:matrix.orgtomberek lunik1: that's a pattern I used Nix/Hydra for. Basically you have a set of "ingress"/"egress" derivations that may be impure (eg: fetch/store from S3) or pure. Then a chain of nix derivations that depend on each other. I defined a function to apply various transformations and map'd them to my list of ingress derivations. It was super nice for iteration, scaling up workers, cached results, experimenting with alternate pipelines. Way better and more productive than something like Airflow. I started to apply content-addressed derivations to them to do short-circuiting as well, it was still in progress for Hydra compatibility. 19:36:25
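[Editor's sketch, not part of the chat] The pattern tomberek describes above (impure ingress steps, a chain of pure steps mapped over them, and cached results that short-circuit rebuilds) can be roughly illustrated in Python. This is not his code, which was not published, and it is not how Nix or Hydra work internally. Every name here (`derivation`, `realise`, `ingress`, `pipeline`, `STORE`, `BUILD_COUNT`) is hypothetical, and the cache is keyed on a hash of the step name plus its input values as a loose stand-in for store paths.

```python
import hashlib
import json

STORE = {}                # hypothetical cache: hash of (step name, inputs) -> result
BUILD_COUNT = {"n": 0}    # how many pure steps were actually executed

def derivation(name, fn, *deps, pure=True):
    """Describe a step: a name, a function, and the steps it depends on."""
    return {"name": name, "fn": fn, "deps": deps, "pure": pure}

def realise(drv):
    """Run a step after its dependencies, reusing cached results for pure steps."""
    inputs = [realise(dep) for dep in drv["deps"]]
    if not drv["pure"]:
        # Impure steps (e.g. a fetch from external storage) always run.
        return drv["fn"](*inputs)
    key = hashlib.sha256(json.dumps([drv["name"], inputs]).encode()).hexdigest()
    if key not in STORE:
        BUILD_COUNT["n"] += 1
        STORE[key] = drv["fn"](*inputs)
    return STORE[key]

def ingress(name, data):
    """Stand-in for an impure fetch; here it just returns fixed data."""
    return derivation(name, lambda: data, pure=False)

def pipeline(source):
    """The chain of transformations applied to one ingress step."""
    cleaned = derivation("clean", lambda xs: [x for x in xs if x is not None], source)
    return derivation("total", lambda xs: sum(xs), cleaned)

# Map the same pipeline over a list of ingress steps.
sources = [ingress("a", [1, None, 2]), ingress("b", [10, 20])]
results = [realise(pipeline(s)) for s in sources]
print(results)  # [3, 30]
```

Realising the same pipelines a second time runs no further pure steps, because every key is already in `STORE`; only the impure ingress functions are called again.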
@lunik1:lunik.onelunik1Damn that sounds awesome, any of this open source?19:37:47
@tomberek:matrix.orgtomberekNo. My plan is to capture the idea, organize it a bit better, and have that be open source. I've heard of a few people re-inventing this a few times, so I want to extract out the common portions and perhaps provide a "flow-library" or something to make it easier to put together.19:39:30
@tomberek:matrix.orgtomberekI'd be happy to collaborate on it.19:39:45
@lunik1:lunik.onelunik1Was that all batch processing or could you handle streaming data too?19:39:48
@tomberek:matrix.orgtomberekIt was not streaming (in the sense that you just shoved in requests on one end and results popped out the other), but it ended up having lower latency than our streaming solution.... so....... But that was because of Hydra, you can use the same "flow-library" and use a different evaluator/build system to get something more streaming19:41:46
@lunik1:lunik.onelunik1So you could effectively run in "real time"? With some sort of API?20:09:37
@tomberek:matrix.orgtomberekpotentially, but that comes with more complexity and requirements that i'd like to avoid (or at least be agnostic about) for now20:16:20
@lunik1:lunik.onelunik1For my application I would need some functionality like that, and nix doesn't seem obviously suited to it20:20:35
@vk3wtf:matrix.orgjbedo
In reply to @tomberek:matrix.org
jbedo: do you use bionix? or is it another thing?
yeah bionix
22:17:54
@vk3wtf:matrix.orgjbedo
In reply to @tomberek:matrix.org
lunik1: that's a pattern I used Nix/Hydra for. Basically you have a set of "ingress"/"egress" derivations that may be impure (eg: fetch/store from S3) or pure. Then a chain of nix derivations that depend on each other. I defined a function to apply various transformations and map'd them to my list of ingress derivations. It was super nice for iteration, scaling up workers, cached results, experimenting with alternate pipelines. Way better and more productive than something like Airflow. I started to apply content-addressed derivations to them to do short-circuiting as well, it was still in progress for Hydra compatibility.
sounds somewhat similar to the way bionix models processing steps as nix functions, allowing you to easily map transformations over sets of inputs etc
22:19:03
@0x4a6f:matrix.org[0x4A6F]On that matter, has anyone got https://www.fluvio.io/ running?22:19:26
@vk3wtf:matrix.orgjbedoof course i had cluster executing in mind as well since i had to make the computations work on slurm22:19:39
@schnecfk:ruhr-uni-bochum.deCRTified (old handle)
In reply to @vk3wtf:matrix.org
of course i had cluster executing in mind as well since i had to make the computations work on slurm
Do you have some configuration public for setting up slurm? I'm currently getting into HPC administration and I'm trying to get a slurm cluster up and running with nixops, so it'd be great to see what others use to set it up :)
22:25:17
@vk3wtf:matrix.orgjbedono i don't run the cluster with nix, i just submit jobs to it with nix22:34:25
@lunik1:lunik.onelunik1bionix looks nice but I gather is pretty tightly tied to bioinformatics?23:50:34
4 Aug 2021
@vk3wtf:matrix.orgjbedowell the library of tools is, but the general idea isn't00:48:11
@vk3wtf:matrix.orgjbedo at its core it's just a collection of functions taking config -> inputs -> output (drvs), and building pipelines by composing them together 00:49:21
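[Editor's sketch, not part of the chat] The "config -> inputs -> output" shape jbedo describes can be illustrated with curried functions in Python. This is not bionix's API: `stage`, `compose`, `trim` and `count` are hypothetical names, and the stages return plain values, whereas in bionix the outputs are Nix derivations.

```python
from functools import reduce

def stage(fn):
    """Turn fn(config, inputs) into a curried stage: config -> inputs -> output."""
    return lambda config: lambda inputs: fn(config, inputs)

def compose(*steps):
    """Chain already-configured stages, left to right, into one pipeline."""
    return lambda inputs: reduce(lambda acc, step: step(acc), steps, inputs)

# Two toy stages: truncate each read, then count reads seen at least `min` times.
trim = stage(lambda cfg, reads: [r[: cfg["length"]] for r in reads])
count = stage(
    lambda cfg, reads: {r: reads.count(r) for r in reads if reads.count(r) >= cfg["min"]}
)

pipeline = compose(trim({"length": 3}), count({"min": 2}))
result = pipeline(["ACGTA", "ACGGG", "TTTAA"])
print(result)  # {'ACG': 2}
```

Because configuration is applied before inputs, swapping one stage for another, or mapping the same configured pipeline over many input sets, only changes the arguments passed to `compose` or the list being mapped over.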


Room Version: 6