| 14 Mar 2026 |
lassulus | not sure it's changes, because we had this happen in the past as well | 19:39:09 |
matthewcroughan | well it's some part of the kernel that gets poked often enough to cause the issue to occur across time | 19:39:27 |
matthewcroughan | I have a reproducer, I will put it as a PR for a new test, to test real usage instead of the optimistic usage of the current tests | 19:40:04 |
Mic92 | (in reply to lassulus: "not sure it's changes, because we had this happen in the past as well") Yeah, I had this on my plate at least a year ago | 19:40:07 |
lassulus | do we have a reproducer now? | 19:40:24 |
matthewcroughan | The current tests only allocate 64M for the ESP for example, which is not realistic | 19:40:26 |
Mic92 | Some client work. It was worse with xcp | 19:40:27 |
matthewcroughan | Yes, but I'm formatting it as a new test now | 19:40:37 |
matthewcroughan | that will be a draft and won't pass CI until someone takes it over | 19:40:47 |
lassulus | ok, thanks | 19:41:17 |
Mic92 | (in reply to lassulus: "do we have a reproducer now?") There is some CI that runs this on large images; I could potentially add this to it | 19:43:11 |
matthewcroughan | It's not large images that cause the issue | 19:43:22 |
matthewcroughan | they cause a different issue, but the main issue (cannot allocate memory) is caused by the kernel versions | 19:43:54 |
Mic92 | Okay but in my testing I had issues only with larger images | 19:44:09 |
matthewcroughan | you don't need a particularly large image to make it happen | 19:44:10 |
Mic92 | In any case, I know a project where this was triggered. | 19:45:14 |
Mic92 | @lassulus:lassul.us: the dirty page sysctl doesn't sound like zfs-related to me | 19:46:39 |
lassulus | is it only with zfs? | 19:48:13 |
lassulus | I have a vague memory that it also happens on other FS | 19:48:25 |
Mic92 | But not 100 percent sure. Would have to check the kernel code | 19:49:02 |
Mic92 | Also this should probably take into account how much memory we give to the vm | 19:49:58 |
lassulus | probably, but we don't need any cache for paths we copy once from one fs to the other? | 19:50:21 |
Mic92 | Well it might also hurt I/O scheduling | 19:50:57 |
Mic92 | If you flush more often | 19:51:09 |
Mic92 | cp/rsync are probably doing direct I/O as well | 19:52:10 |
Mic92 | This usually bypasses the page cache | 19:52:32 |
lassulus | cp doesn't seem to do it, otherwise this would not happen? | 19:52:50 |
lassulus | or it's a different issue | 19:53:08 |