| technical:whitepaper:automated_devshm_cleanup [2018-12-12 15:53] – [Implementation] frey | technical:whitepaper:automated_devshm_cleanup [2018-12-13 13:02] (current) – [Implementation] frey | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| + | ====== Automated /dev/shm cleanup ====== | ||
| + | Historically, | ||
| + | |||
| + | - master process sets up shared segments and exports an environment variable identifying the key(s) | ||
| + | - master process forks slave processes | ||
| + | - each slave process consults the appropriate environment variable for shared memory key(s) | ||
| + | - each slave maps the necessary shared segments into its memory space | ||
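The four steps above can be sketched in Python. This is an illustration only: it uses POSIX shared memory through `multiprocessing.shared_memory` as a stand-in for whatever segment API a given program uses, and the environment variable name `DEMO_SHM_KEY` and the function `run_demo` are hypothetical.

```python
import os
from multiprocessing import shared_memory

ENV_KEY = "DEMO_SHM_KEY"  # hypothetical variable name, chosen for this sketch


def run_demo() -> bytes:
    """Create a segment, fork one worker that writes to it, return what it wrote."""
    # Step 1: the master creates the segment and exports its key (here, its name).
    seg = shared_memory.SharedMemory(create=True, size=16)
    os.environ[ENV_KEY] = seg.name
    try:
        # Step 2: the master forks a worker process.
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                # Step 3: the worker looks up the key in its environment.
                name = os.environ[ENV_KEY]
                # Step 4: the worker maps the existing segment and uses it.
                attached = shared_memory.SharedMemory(name=name)
                attached.buf[:5] = b"hello"
                attached.close()
                status = 0
            finally:
                os._exit(status)  # leave without running the master's cleanup
        os.waitpid(pid, 0)
        return bytes(seg.buf[:5])
    finally:
        # Orderly shutdown: unmap and remove the segment.
        seg.close()
        seg.unlink()
```

If the master were killed before reaching `unlink()`, nothing else in this sketch would remove the segment.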
| + | |||
| + | When a program like this crashes, it often leaves its shared segments orphaned: | ||
| + | |||
| + | <code bash> | ||
| + | $ ipcs -m | ||
| + | |||
| + | ------ Shared Memory Segments -------- | ||
| + | key shmid owner perms bytes nattch | ||
| + | 0x00000000 45350912 | ||
| + | 0x00000000 45383681 | ||
| + | 0x00000000 45416450 | ||
| + | |||
| + | </code> | |
| + | |||
| + | Behavior of such programs varies: | ||
| + | |||
| + | <code bash> | ||
| + | $ ipcrm --shmem-id 45350912 --shmem-id 45383681 | ||
| + | </code> | |
| + | |||
| + | With the advent of the memory-backed Linux '' | ||
| + | |||
| + | * a POSIX shared memory segment is backed by a file in ''/ | ||
| + | * the backing file has standard Unix filesystem permissions applied to it | ||
| + | * the backing file can be mmap' | ||
| + | * the segment can be examined or removed using standard filesystem tools | ||
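Because a segment is an ordinary file, the properties listed above can be checked with plain file APIs. The sketch below creates a scratch file, maps it, inspects it, and removes it. The helper name `demo_segment_file` and the `demo_seg_` prefix are made up for illustration, and the directory is assumed to be `/dev/shm` unless another one is passed.

```python
import mmap
import os
import stat
import tempfile


def demo_segment_file(shm_dir: str = "/dev/shm") -> dict:
    """Create, map, inspect, and remove a scratch file standing in for a segment."""
    # mkstemp creates the backing file with owner-only permissions (0600).
    fd, path = tempfile.mkstemp(prefix="demo_seg_", dir=shm_dir)
    try:
        os.ftruncate(fd, 4096)               # give the file a size so it can be mapped
        with mmap.mmap(fd, 4096) as view:    # shared, writable mapping of the file
            view[:4] = b"ping"
        info = os.stat(path)                 # examine it like any other file
        with open(path, "rb") as handle:
            first_bytes = handle.read(4)
        return {
            "mode": stat.filemode(info.st_mode),
            "size": info.st_size,
            "first_bytes": first_bytes,
        }
    finally:
        os.close(fd)
        os.unlink(path)                      # remove it with a normal unlink
```

Passing a different directory, for example one from `tempfile.mkdtemp()`, lets the sketch run on systems where `/dev/shm` is not available.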
| + | |||
| + | Unlike System V IPC segments, a POSIX segment cannot be marked for destruction once no process is attached to it. | |
| + | |||
| + | ===== Cleaning-up ===== | ||
| + | |||
| + | Many Open MPI jobs run on our clusters. When one crashes, the vader BTL leaves orphaned POSIX shared memory segments behind. | |
| + | * create/ | ||
| + | * OR the file must be actively in use by at least one process on the system | |
| + | For arbitrary POSIX segment files, the same criteria with a longer timespan (perhaps 1 day) would target segments that can be purged. | ||
| + | |||
| + | ==== Finding all shared memory segments ==== | ||
| + | |||
| + | This is the stage where time-based criteria to disqualify segments for removal should be applied. | ||
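A minimal sketch of this stage, assuming that a file's age is measured from its last modification time and that only regular files directly inside the directory matter. The function name `find_old_files` and its parameters are hypothetical; this is not the code of `shm-cleanup.py`.

```python
import os
import time


def find_old_files(shm_dir: str = "/dev/shm", min_age_seconds: float = 0.0,
                   now: float = None) -> set:
    """Return paths of regular files in shm_dir not modified for min_age_seconds."""
    now = time.time() if now is None else now
    found = set()
    with os.scandir(shm_dir) as entries:
        for entry in entries:
            # Skip directories, symlinks, and anything else that is not a plain file.
            if not entry.is_file(follow_symlinks=False):
                continue
            age = now - entry.stat(follow_symlinks=False).st_mtime
            # Files younger than the threshold are disqualified here.
            if age >= min_age_seconds:
                found.add(entry.path)
    return found
```

The `now` parameter exists only so that the age comparison can be pinned to a fixed instant when testing.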
| + | |||
| + | ==== Finding active shared memory segments ==== | ||
| + | |||
| + | The '' | ||
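One way to approximate this step without an external tool is to read each process's memory map from `/proc`. The sketch below is a stand-in under that assumption and is not necessarily how the program does it; `mapped_shm_paths` and `find_active_segments` are hypothetical names. It sees only mapped files, so a file that is open but not mapped would need a similar pass over `/proc/<pid>/fd`.

```python
import os

_DELETED = " (deleted)"


def mapped_shm_paths(maps_text: str, shm_dir: str = "/dev/shm") -> set:
    """Extract paths under shm_dir from the text of one /proc/<pid>/maps file."""
    prefix = shm_dir.rstrip("/") + "/"
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, device, inode, then an optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        path = fields[5].strip()
        if path.endswith(_DELETED):
            path = path[: -len(_DELETED)]  # the kernel marks unlinked files this way
        if path.startswith(prefix):
            paths.add(path)
    return paths


def find_active_segments(shm_dir: str = "/dev/shm", proc_dir: str = "/proc") -> set:
    """Union of mapped shm_dir paths over every process that can be inspected."""
    active = set()
    for name in os.listdir(proc_dir):
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_dir, name, "maps"), errors="replace") as handle:
                active |= mapped_shm_paths(handle.read(), shm_dir)
        except OSError:
            continue  # process exited meanwhile, or access was denied
    return active
```

Running as root matters here: an unprivileged user cannot read the maps of other users' processes, and those segments would wrongly appear inactive.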
| + | |||
| + | ==== Segments for removal ==== | ||
| + | |||
| + | The set difference, **A** \ **B**, is the set of all elements of **A** that are not in **B**. | |
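In Python the set difference is the `-` operator on sets. The example assumes that **A** holds every candidate segment file and **B** holds the files in active use; the file names are invented for illustration.

```python
# A: every candidate segment file (hypothetical names)
all_segments = {"/dev/shm/seg_one", "/dev/shm/seg_two", "/dev/shm/seg_three"}

# B: segment files currently in use by some process
active_segments = {"/dev/shm/seg_two", "/dev/shm/not_a_candidate"}

# Elements of A that are not in B; members of B missing from A are ignored.
removable = all_segments - active_segments

print(sorted(removable))  # ['/dev/shm/seg_one', '/dev/shm/seg_three']
```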
| + | |||
| + | ==== Removing segments ==== | ||
| + | |||
| + | Running as root, removal is accomplished using '' | ||
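The removal step can be sketched with `os.unlink`, used here as a stand-in for whichever tool the text names. The function `remove_segments` and its `dry_run` parameter are hypothetical, loosely modelled on the `-n, --dry-run` option of the program.

```python
import os


def remove_segments(paths, dry_run: bool = True) -> list:
    """Unlink each path and return those actually removed; only report in dry-run mode."""
    removed = []
    for path in sorted(paths):
        if dry_run:
            print(f"would remove {path}")
            continue
        try:
            os.unlink(path)
        except FileNotFoundError:
            # The owner cleaned up between the scan and the removal.
            continue
        removed.append(path)
    return removed
```

Defaulting to a dry run is a deliberate choice in this sketch: a mistake in the candidate list then costs nothing until removal is explicitly requested.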
| + | |||
| + | ===== Implementation ===== | ||
| + | |||
| + | The '' | ||
| + | |||
| + | The program has various command line options available: | ||
| + | |||
| + | <code bash> | ||
| + | $ shm-cleanup.py --help | ||
| + | usage: shm-cleanup.py [-h] [-v] [-q] [-n] [--show-log-timestamps] | ||
| + | [--age < | ||
| + | [--log-file < | ||
| + | [--daemon-period < | ||
| + | |||
| + | Cleanup /dev/shm | ||
| + | |||
| + | optional arguments: | ||
| + | -h, --help | ||
| + | -v, --verbose | ||
| + | -q, --quiet | ||
| + | -n, --dry-run | ||
| + | done; this option sets the base verbosity level to | ||
| + | INFO (as in -vv) | ||
| + | --show-log-timestamps, | ||
| + | display timestamps on all messages logged by this | ||
| + | program | ||
| + | --age < | ||
| + | only items older than this will be removed; integer or | ||
| + | floating-point values are acceptable with optional | ||
| + | unit of s/m/h/d (default: d) | ||
| + | --no-special-treatment | ||
| + | do not treat PSM2 and vader segment files any | ||
| + | differently than other files | ||
| + | --log-file < | ||
| + | send all logging to this file instead of to stderr; | ||
| + | timestamps are always enabled when logging to a file | ||
| + | --daemon | ||
| + | --daemon-period < | ||
| + | wake to re-check on the given period; integer or | ||
| + | floating-point values are acceptable with optional | ||
| + | unit of s/m/h/d (default: s) | ||
| + | --pid-file < | ||
| + | in daemon mode, write our pid to this file (default: | ||
| + | / | ||
| + | </code> | |
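The help text for `--age` and `--daemon-period` describes integer or floating-point values with an optional `s`/`m`/`h`/`d` unit and an option-specific default unit. A parser for that format might look like the sketch below; `parse_duration` is a hypothetical helper written for illustration, not the program's own parser.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

# A number such as "3", "1.5", ".5" or "2.", optionally followed by one unit letter.
_DURATION = re.compile(r"^\s*([0-9]*\.?[0-9]+|[0-9]+\.)\s*([smhd]?)\s*$")


def parse_duration(text: str, default_unit: str = "s") -> float:
    """Convert text such as '90', '1.5h' or '2d' into a number of seconds."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    number, unit = match.groups()
    # Fall back to the caller's default when no unit letter was given.
    return float(number) * _UNIT_SECONDS[unit or default_unit]


print(parse_duration("1.5h"))                 # 5400.0
print(parse_duration("2", default_unit="d"))  # 172800.0
```

Passing the default unit as an argument lets one function serve both options, since the help text gives `d` as the default for `--age` and `s` for `--daemon-period`.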
| + | |||
| + | On systems that lack cron (or a similar timed-execution mechanism), the '' | ||