| 19 Jun 2026 |
K900 | Honestly I'd probably not submit this without a patch | 16:15:19 |
magic_rb | Yeah im looking at a patch, reading how to do rtkit | 16:15:33 |
magic_rb | Doesnt look that hard | 16:15:35 |
magic_rb | Ill write smth and open a draft PR to show i made an effort | 16:15:46 |
K900 | But user doesn't have cap_sys_nice normally | 16:17:11 |
ElvishJerricco | doesn't matter | 16:17:21 |
ElvishJerricco | when you make a user namespace, that namespace has all caps | 16:17:31 |
magic_rb | Not cap_sys_admin? Or even that | 16:17:45 |
magic_rb | What | 16:17:46 |
ElvishJerricco | those caps just end up being restricted in kernel logic to not do things to escape the original caps | 16:17:52 |
magic_rb | How can this shit be so fucking complicated and unintuitive | 16:17:55 |
ElvishJerricco | even that | 16:18:00 |
ElvishJerricco | e.g. | 16:18:08 |
ElvishJerricco | the reason you can make mounts in a user namespace without CAP_SYS_ADMIN outside the namespace is because the user namespace allows you to make a mount namespace. So you make the user namespace, that namespace has CAP_SYS_ADMIN. You cannot use this CAP_SYS_ADMIN to make mounts yet, because that CAP_SYS_ADMIN is not allowed to make mounts in mount namespaces from its parent user namespace. So you make a new mount namespace, which user namespaces are allowed to do, and because it was made in your user namespace, and because you have CAP_SYS_ADMIN in that user namespace, you're allowed to make mounts in that mount namespace | 16:20:16 |
ElvishJerricco | i.e. the same CAP_SYS_ADMIN has different capabilities depending on whether your userns owns the thing you're trying to use it on | 16:21:08 |
ElvishJerricco | so you can definitely just gain CAP_SYS_NICE | 16:21:28 |
ElvishJerricco | but for that to be useful, the kernel has to have some internal logic about things your userns owns that CAP_SYS_NICE is allowed to operate on | 16:21:57 |
ElvishJerricco | IIUC it's pretty normal for linux caps to have no such logic and just reduce to "after scoping back to the init namespace, what cap remains?" | 16:22:51 |
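A rough way to see the "you can just gain CAP_SYS_NICE" part for yourself is to compare the effective capability mask outside and inside a fresh user namespace. This is only a sketch: `has_cap` and `report` are made-up helper names, bit 23 is CAP_SYS_NICE in `linux/capability.h`, and the `unshare -c --keep-caps` call assumes a reasonably recent util-linux and a kernel that permits unprivileged user namespaces. Holding the bit says nothing about what the kernel will let you do with it, which is the point made above.

```shell
#!/bin/sh
# Bit number of CAP_SYS_NICE in the kernel's capability mask.
CAP_SYS_NICE=23

# has_cap HEXMASK BIT: succeed if bit BIT is set in HEXMASK (hex, no 0x prefix).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# report LABEL HEXMASK: print whether CAP_SYS_NICE is in the given mask.
report() {
    if [ -z "$2" ]; then
        echo "$1: could not read CapEff"
    elif has_cap "$2" "$CAP_SYS_NICE"; then
        echo "$1: CAP_SYS_NICE present"
    else
        echo "$1: CAP_SYS_NICE absent"
    fi
}

# Effective capabilities of a plain process.
outside=$(awk '/^CapEff:/ { print $2 }' /proc/self/status 2>/dev/null)

# Effective capabilities of a process started in a new user namespace.
# Left empty if user namespaces or --keep-caps are not available here.
inside=$(unshare -c --keep-caps awk '/^CapEff:/ { print $2 }' /proc/self/status 2>/dev/null)

report "outside userns" "$outside"
report "inside userns" "$inside"
```

Whether something like `chrt` or `setpriority` then actually succeeds inside the namespace depends on the kernel logic discussed here, so the script deliberately stops at inspecting the mask.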
ElvishJerricco | (oh also mounting additionally has the constraint that you can only make mounts for allowed file systems in a non-init-userns, which currently only includes things like tmpfs and overlayfs) | 16:24:16 |
ElvishJerricco | * (oh also mounting additionally has the constraint that a non-init-userns can only make mounts for allowed file systems, which currently only includes things like tmpfs and overlayfs) | 16:24:59 |
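The mount example above can be tried out roughly like this. It is a sketch under assumptions: `try_mount` is a hypothetical helper, `/tmp` is just a convenient mount point, and the flags are the usual util-linux `unshare` ones. On a typical kernel with unprivileged user namespaces enabled I would expect the first attempt to be denied and the second to be allowed, but sandboxes and distro hardening can change that, so the script only reports what it sees.

```shell
#!/bin/sh
# try_mount [EXTRA_UNSHARE_FLAGS...]
# Attempt a tmpfs mount as "root" inside a new user namespace and report
# whether the kernel allowed it. Extra flags are passed on to unshare.
try_mount() {
    unshare --user --map-root-user "$@" \
        sh -c 'mount -t tmpfs tmpfs /tmp' 2>/dev/null \
        && echo allowed || echo denied
}

if unshare --user --map-root-user true 2>/dev/null; then
    # New userns only: we hold CAP_SYS_ADMIN there, but the mount namespace
    # still belongs to the parent userns.
    echo "userns only:       $(try_mount)"
    # New userns plus a mount namespace created inside it, so the same
    # CAP_SYS_ADMIN now applies to something our userns owns.
    echo "userns + mount ns: $(try_mount --mount)"
else
    echo "unprivileged user namespaces are not available here"
fi
```

The tmpfs only exists inside the throwaway mount namespace and disappears when the inner `sh` exits, so nothing on the host is touched.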
magic_rb | Jfc this is complicated, but a patch for cap_sys_nice could then be made, if upstream wanted it and i knew how right | 16:27:27 |
ElvishJerricco | you'd have to define (or maybe find documentation on how it's defined) how CAP_SYS_NICE plays together with userns. Like what does the userns own that CAP_SYS_NICE can operate on, because that criteria is how you make it safe | 16:28:43 |
magic_rb | I mean id guess it would be "the userns must have created its own pid namespace. Any pid originating in that namespace is fair game." But obviously i know jack shit about this. Ill look at the rtkit way. Doesnt seem that hard | 16:30:17 |
magic_rb | It would be nice to have in general and probably required on the frame. Otherwise we'll have frame timing issues | 16:30:39 |
ElvishJerricco | yea I'm only explaining my knowledge of userns and caps in general, I have absolutely no clue about this RT / NICE stuff :P | 16:31:00 |
magic_rb | Yeah same, probably less than you :P | 16:31:31 |
ElvishJerricco | oh, this reminded me of something fun:
touch foo
chmod 0400 foo
echo fails > foo # Permission denied
echo works | unshare -c --keep-caps tee foo
You can just write to readonly files unprivileged because you have CAP_DAC_OVERRIDE :)
| 16:48:35 |
ElvishJerricco | (I'm pretty sure the reason this is allowed is because the owner of the userns would have been allowed to just chmod the file back to writable, but it still feels cursed) | 16:49:11 |
magic_rb | Oh lmao | 16:55:50 |
Atemu | Oooh wait, that's a neat trick! This might solve an annoyance we have in #Robotnix in that we need to patch calls to cp in the AOSP build system because it defaults to copying permissions of the sources – which are in the nix store of course – and sometimes those files are meant to be written to somehow. If we could give the processes DAC_OVERRIDE, it might just make those writes work transparently! | 17:20:03 |