23 Mar 2025 |
uep | but that's a question for somewhere else, so I asked this one here ;-) | 23:55:29 |
@elvishjerricco:matrix.org | there's StopPropagatedFrom= / PropagatesStopTo= to propagate stops from one unit to another | 23:56:42 |
@elvishjerricco:matrix.org | but that's not a restart | 23:56:46 |
uep | (my initial guess about udev rules was that it might be hard to avoid restarting smartd 36 times) | 23:58:17 |
@elvishjerricco:matrix.org | yea you'd have to use the right udev rule. | 23:58:41 |
@elvishjerricco:matrix.org | either something that can detect the completed state | 23:58:49 |
@elvishjerricco:matrix.org | or starting a systemd timer that waits an appropriate amount of time | 23:58:59 |
24 Mar 2025 |
uep | it might end up being timers anyway, because i don't want it to try counting to 36 and then only 35 disks appear one day | 00:00:13 |
@elvishjerricco:matrix.org | yea you'd really want to detect at the chassis level "all disks inside are powered up", not "36 disks are powered up" | 00:00:46 |
@elvishjerricco:matrix.org | that may or may not be possible | 00:00:59 |
@elvishjerricco:matrix.org | oh but you could combine this with Upholds | 00:01:11 |
uep | (same problem with just picking one particular disk as the trigger, but i could possibly trigger on the controller / sas expander / chassis ses device or something like that) | 00:01:21 |
@elvishjerricco:matrix.org | hm but you still have to deal with restarting when the other unit is started | 00:01:30 |
@elvishjerricco:matrix.org | yea, I'm thinking that you will either need to learn how to identify the state of the chassis as a whole, or otherwise just start a systemd timer to restart the service whenever any of the disks appears or disappears | 00:02:31 |
uep | at least then it's a different transaction | 00:03:06 |
@elvishjerricco:matrix.org | the nice thing about a timer is that starting it 36 times will still only cause it to elapse once | 00:03:32 |
@elvishjerricco:matrix.org | assuming the timer length is longer than the time it takes for 36 drives to come online | 00:03:46 |
uep | thanks.. this was roughly where I was expecting to go anyway, it's nice to have validation and that there wasn't something i was missing as a better way | 00:04:09 |
uep | on the way up is easy enough, i also have the pool import to tell me that "enough" disks are there | 00:04:45 |
uep | on the way down it's a little more fragile if i want to avoid all the spurious alarms.. i probably need to stop smartd, then power off the chassis, then start smartd | 00:05:40 |
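The stop, power off, start ordering described above could be wrapped so that smartd is started again even if the power-off step fails. A minimal sketch, assuming the unit is called `smartd.service` (it may be named differently on some distributions); `smartd_paused` and the `run` parameter are hypothetical names for this example, and the chassis power-off itself is left to the caller.

```python
import subprocess
from contextlib import contextmanager


def systemctl(action, unit):
    """Run `systemctl <action> <unit>` and raise if it fails."""
    subprocess.run(["systemctl", action, unit], check=True)


@contextmanager
def smartd_paused(run=systemctl, unit="smartd.service"):
    """Stop the unit for the duration of the `with` body, then start it again.

    `unit` is an assumed name; `run` is injectable so the ordering can be
    exercised without touching a real system.
    """
    run("stop", unit)
    try:
        yield
    finally:
        # Runs even if the body raises, so smartd is not left stopped.
        run("start", unit)
```

Usage would look like `with smartd_paused(): power_off_chassis()`, where `power_off_chassis` is a placeholder for whatever actually powers down the enclosure.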
@elvishjerricco:matrix.org | frankly I'm surprised smartd doesn't just handle disks appearing and disappearing gracefully though | 00:05:50 |
@elvishjerricco:matrix.org | like that seems like basic functionality to me | 00:06:00 |
uep | or go figure out if smartd has something to note disks as removable | 00:06:07 |
uep | yeah, it probably does, but isn't the default (except maybe for usb?) because telling you when disks go offline is its job | 00:07:08 |
uep | i haven't gone looking at that yet | 00:07:20 |
uep | (and i also need it to keep state by wwn rather than by probe order /dev/sdak etc) | 00:08:54 |
antifuchs | oh lol, I have a little generator program for smartctl scan cron tasks, if you're interested | 00:15:16 |
antifuchs | (makes one per disk; it's not smart [ha ha] enough to not make one for a connected usb drive, but it does the job) | 00:15:42 |