====== Computational models for running Matlab on a shared cluster ======
By default, Matlab uses multiple computational threads.
<code>
matlab -singleCompThread limits MATLAB to a single computational thread.
By default, MATLAB makes use of the multithreading capabilities of the
computer on which it is running.
</code>

The default, multiple computational threads, is never a good option when you are sharing a node.
So either use ''-singleCompThread'', or request enough slots to cover the threads MATLAB will actually use.
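
A quick way to confirm how many computational threads a session will use (and that ''-singleCompThread'' took effect) is the ''maxNumCompThreads'' function. This is just a minimal sketch; the number reported depends on the node and the MATLAB version:
<code>
% Report how many computational threads this MATLAB session may use.
% With "matlab -singleCompThread" this should print 1; otherwise it
% typically matches the number of cores MATLAB detects on the node.
n = maxNumCompThreads;
fprintf('MATLAB will use up to %d computational thread(s)\n', n);

% maxNumCompThreads(N) can also cap the thread count from inside a
% script, although MATLAB documents this form as deprecated.
</code>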
+ | |||
<note important>
Using a node with exclusive access does not mean MATLAB will use all the cores and memory.
Watch the job to see its actual memory and core requirements.  MATLAB only uses multiple
computational threads for some, mostly built-in, matrix functions, and only while they
are being executed.
</note>
+ | |||
Matlab can, with the distributed computing toolbox, create a parallel pool of workers to which work is dispatched
in parallel.
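
As an illustration only (assuming the toolbox is available on the cluster; the pool size here is a placeholder), a script might open a pool and run a ''parfor'' loop:
<code>
% Minimal sketch: open a pool of workers and run a parfor loop.
% The pool size (4) is a placeholder; it should not exceed the number
% of slots requested in the batch script.
pool = parpool('local', 4);

s = zeros(1, 100);
parfor i = 1:100
    s(i) = sum(svd(rand(200)));   % independent iterations run on the workers
end
fprintf('result: %g\n', sum(s));

delete(pool);   % release the workers when done
</code>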
+ | |||
===== Multiple computational threads on one node =====
Matlab makes use of the multithreading capabilities of the computer on which it is running.
From the MATLAB prompt you can see which BLAS and LAPACK libraries it is using:
<code>
version -blas
version -lapack
</code>
To make full use of the MKL computational threads you need to use the built-in matrix functions.
The threads run on cores that share the same memory, so this is also called the shared memory model for parallel computing.
If a fraction ''p'' of the wall time runs in parallel on all 20 cores of a node, then the total CPU time the Matlab job performs is
+ | |||
<code>
CPU = (p*20 + (1-p))*WALL
</code>
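
For example, the 20-core run in the first table below used about 51322 CPU seconds in 8613 wall seconds; a short calculation (a sketch using those measured numbers) inverts the formula to estimate the parallel fraction ''p'':
<code>
% Estimate the parallel fraction p from measured CPU and wall time
% (numbers taken from job 408594 in the table below).
cpu   = 51321.533;     % seconds of CPU time (qacct "cpu")
wall  = 8613.132;      % seconds of wall time (qacct "ru_wallclock")
cores = 20;            % cores on the node

% CPU = (p*cores + (1-p)) * WALL  =>  p = (CPU/WALL - 1) / (cores - 1)
p = (cpu/wall - 1) / (cores - 1);
fprintf('estimated parallel fraction p = %.2f\n', p);   % roughly 0.26
</code>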
+ | |||
+ | |||
The actual number of computational threads is not explicitly mentioned in the Unix documentation.
MATLAB's default threading generally works well, but it is not optimized for the Mills processors or threading libraries.
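
To see the multithreading in action, one can time a large built-in matrix operation and watch the process with ''top'' or ''ps'' while it runs. This is a throwaway test; the matrix size is arbitrary:
<code>
% Time a built-in matrix operation that the multithreaded BLAS handles.
% While this runs, top/ps on the node should show the extra threads busy.
A = rand(4000);
B = rand(4000);
tic;
C = A*B;                 % built-in matrix multiply, uses the MKL threads
t = toc;
fprintf('4000x4000 matrix multiply took %.2f seconds\n', t);
</code>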
+ | |||
+ | ==== Test batch jobs using GridEngine ==== | ||
+ | |||
Several copies of the same MATLAB script were submitted to run simultaneously; only the batch script directives varied.
+ | |||
+ | |||
+ | === Batch job with exclusive access (only job on node) === | ||
+ | |||
Part of batch script file:
<code>
$ tail -4 batche.qs
#$ -l exclusive=1

vpkg_require matlab/
matlab -nodisplay -nojvm -r '
</code>
+ | |||
+ | CGROUP report from batch output file:< | ||
+ | $ grep CGROUPS *.o425422 | ||
+ | [CGROUPS] No / | ||
+ | [CGROUPS] UD Grid Engine cgroup setup commencing | ||
+ | [CGROUPS] Setting none bytes (vmem none bytes) on n171 (master) | ||
+ | [CGROUPS] | ||
+ | [CGROUPS] done. | ||
+ | </ | ||
+ | |||
+ | |||
+ | Memory and timing results:< | ||
+ | $ qacct -h n171 -j 425422 | egrep ' | ||
+ | start_time | ||
+ | failed | ||
+ | ru_wallclock 8037.427 | ||
+ | ru_maxrss | ||
+ | cpu 53089.736 | ||
+ | maxvmem | ||
+ | maxrss | ||
+ | </ | ||
+ | |||
+ | === Batch job with 5 slots 370 MB per core (1.85 GB total) === | ||
+ | |||
Part of batch script file:
<code>
$ tail -6 batch5.qs
#$ -pe threads 5
#$ -l mem_total=1.9G
#$ -l m_mem_free=370M

vpkg_require matlab/
matlab -nodisplay -nojvm -r '
</code>
+ | |||
+ | CGROUP report from batch output file:< | ||
+ | $ grep CGROUPS *.o428562 | ||
+ | [CGROUPS] UD Grid Engine cgroup setup commencing | ||
+ | [CGROUPS] Setting 388050944 bytes (vmem none bytes) on n139 (master) | ||
+ | [CGROUPS] | ||
+ | [CGROUPS] done. | ||
+ | </ | ||
+ | |||
+ | Memory and timing results:< | ||
+ | $ qacct -h n139 -j 428562 | egrep ' | ||
+ | start_time | ||
+ | failed | ||
+ | ru_wallclock 5.297 | ||
+ | ru_maxrss | ||
+ | cpu 3.090 | ||
+ | maxvmem | ||
+ | maxrss | ||
+ | </ | ||
+ | === Batch job with 4 slots 1 GB per core (4 GB total) === | ||
+ | |||
Part of batch script file:
<code>
$ cat batch.qs
#$ -pe threads 4
#$ -l m_mem_free=1G

vpkg_require matlab/
matlab -nodisplay -nojvm -r '
</code>
+ | |||
+ | CGROUP report from batch output file:< | ||
+ | $ grep CGROUPS *.o418695 | ||
+ | [CGROUPS] UD Grid Engine cgroup setup commencing | ||
+ | [CGROUPS] Setting 1073741824 bytes (vmem none bytes) on n036 (master) | ||
+ | [CGROUPS] | ||
+ | [CGROUPS] done. | ||
+ | </ | ||
+ | This is sharing the node with the previous job on cores 5-8. | ||
+ | |||
+ | Memory and timing results:< | ||
+ | $ qacct -h n036 -j 418695 | egrep ' | ||
+ | failed | ||
+ | ru_wallclock 826.759 | ||
+ | ru_maxrss | ||
+ | cpu 1629.194 | ||
+ | maxvmem | ||
+ | maxrss | ||
+ | </ | ||
+ | |||
+ | === Batch job with 3 slots 1 GB per core (3 GB total) === | ||
+ | |||
Part of batch script file:
<code>
$ cat batch.qs
#$ -pe threads 3
#$ -l m_mem_free=1G

vpkg_require matlab/
matlab -nodisplay -nojvm -r '
</code>
+ | |||
+ | CGROUP report from batch output file:< | ||
+ | $ grep CGROUPS *.o408597 | ||
+ | [CGROUPS] UD Grid Engine cgroup setup commencing | ||
+ | [CGROUPS] Setting 3221225472 bytes (vmem 9223372036854775807 bytes) on n039 (master) | ||
+ | [CGROUPS] | ||
+ | [CGROUPS] done. | ||
+ | </ | ||
+ | |||
+ | Memory and timing results:< | ||
+ | $ qacct -h n039 -j 408597 | egrep ' | ||
+ | ru_wallclock 13877.991 | ||
+ | ru_maxrss | ||
+ | cpu 90776.109 | ||
+ | maxvmem | ||
+ | maxrss | ||
+ | </ | ||
+ | === Batch job with 2 slots 3.1 GB per core (6.2 GB total) === | ||
+ | |||
3.1 GB per core on a 20-core node is 62 GB, which allows 20 such slots (10 two-slot jobs) to fit with 2 GB to spare for system overhead.
+ | |||
Part of batch script file:
<code>
$ cat batch.qs
#$ -pe threads 2
#$ -l m_mem_free=3.1G

vpkg_require matlab/
matlab -nodisplay -nojvm -r '
</code>
+ | |||
+ | CGROUP report from batch output file:< | ||
+ | $ grep CGROUPS *.o408598 | ||
+ | [CGROUPS] UD Grid Engine cgroup setup commencing | ||
+ | [CGROUPS] Setting 6657200128 bytes (vmem 9223372036854775807 bytes) on n039 (master) | ||
+ | [CGROUPS] | ||
+ | [CGROUPS] done. | ||
+ | </ | ||
This job shares the node with the previous job and runs on cores 3-4.
+ | |||
+ | Memory and timing results:< | ||
+ | $ qacct -h n039 -j 408598 | egrep ' | ||
+ | ru_wallclock 13904.972 | ||
+ | ru_maxrss | ||
+ | cpu 92110.859 | ||
+ | maxvmem | ||
+ | maxrss | ||
+ | </ | ||
+ | |||
=== Batch job with 1 slot 3.1 GB per core (3.1 GB total) ===
+ | |||
3.1 GB per core on a 20-core node is 62 GB, which allows 20 single-slot jobs to fit with 2 GB to spare for system overhead.
+ | |||
Part of batch script file:
<code>
$ cat batch.qs
#$ -l m_mem_free=3.1G

vpkg_require matlab/
matlab -nodisplay -nojvm -r '
</code>
+ | |||
+ | CGROUP report from batch output file:< | ||
+ | $ grep CGROUPS *.o408599 | ||
+ | [CGROUPS] UD Grid Engine cgroup setup commencing | ||
+ | [CGROUPS] Setting 3328602112 bytes (vmem 9223372036854775807 bytes) on n036 (master) | ||
+ | [CGROUPS] | ||
+ | [CGROUPS] done. | ||
+ | </ | ||
+ | |||
+ | Memory and timing results:< | ||
+ | $ qacct -h n036 -j 408599 | egrep ' | ||
+ | ru_wallclock 8607.872 | ||
+ | ru_maxrss | ||
+ | cpu 51805.427 | ||
+ | maxvmem | ||
+ | maxrss | ||
+ | </ | ||
+ | |||
+ | ==== Table ==== | ||
+ | |||
+ | ^ ^^ requested ^^ used memory and time ^^^ | ||
^ jobid ^ host ^ cores ^ memory ^ maxvmem ^ cpu (s) ^ wallclock (s) ^ 
+ | | 408594 | n038 | all 20 | all <64GB | 4.155G | 51321.533 | 8613.132 | | ||
+ | | 408595 | n037 | 5 | 5G | 4.043G | 86578.676 | 13051.171 | | ||
+ | | 408596 | n037 | 4 | 4G | 4.301G | 86330.547 | 13067.863 | | ||
+ | | 408597 | n039 | 3 | 3G | 4.180G | 90776.109 | 13877.991 | | ||
+ | | 408598 | n039 | 2 | 6.2G | 4.208G | 92110.859 | 13904.972 | | ||
+ | | 408599 | n031 | default 1 | 3.1G | 4.036G | 51805.427 | 8607.872 | | ||
+ | |||
+ | ==== Table new spread over nodes ==== | ||
+ | |||
+ | ^ ^^ requested ^^ used memory and time ^^^ | ||
^ jobid ^ host ^ cores ^ memory ^ maxvmem ^ cpu (s) ^ wallclock (s) ^ 
+ | | 418705 | n172 | all 20 | all <64GB | 2.904G | 5553.820 | 1089.789 | | ||
+ | | 418704 | n039 | 5 | 5G | 1.874G | 1778.309 | 804.490 | | ||
+ | | 418695 | n036 | 4 | 4G | 1.801G | 1629.194 | 826.759 | | ||
+ | | 418693 | n037 | 3 | 3G | 1.735G | 1475.837 | 863.386 | | ||
+ | | 418691 | n040 | 2 | 6.2G | 1.662G | 1334.752 | 944.711 | | ||
+ | | 418690 | n038 | default 1 | 1G | 1.536G | 1164.087 | 1173.832 | | ||
+ | |||
+ | |||
+ | ==== Table new same node ==== | ||
+ | |||
+ | ^ ^^ requested ^^ used memory and time ^^^^ | ||
^ jobid ^ host ^ cores ^ memory ^ maxvmem ^ maxrss ^ cpu (s) ^ wallclock (s) ^ 
+ | | 418768 | n172 | all 20 | all <64GB | 3.805G | 1.633G | 5246.490 | ||
+ | | 418773 | n036 | 5 | 5G | 1.852G | 578.457M | 1953.868 |930.284 | | ||
+ | | 418772 | n036 | 4 | 4G | 1.779G | 579.109M |1800.191 | 949.475 | | ||
+ | | 418771 | n036 | 3 | 3G | 1.709G | 570.246M |1660.543 | 996.545 | | ||
+ | | 418770 | n036 | 2 | 6.2G | 1.640G | 557.363M | 1543.664 | 1106.315 | | ||
+ | | 418769 | n036 | default 1 | 1G | 1.514G | 564.840M | 1356.694 |1356.256 | | ||
+ | |||
+ | ==== Graphs ==== | ||
+ | |||
As the number of cores increases, both the CPU time and memory usage increase linearly.
+ | |||
+ | {{: | ||
+ | |||
+ | {{: | ||
+ | |||
Both CPU time and memory are costs of running your algorithm, since they limit the number of other users that can use the node.
To chart both, consider a simple cost of CPU time multiplied by memory, in GB-hours.  There are two competing goals:

  * Reduce the run time
  * Reduce the cost
+ | |||
+ | |||
+ | {{: | ||
+ | |||
The two extremes on the Pareto optimization curve are good choices.
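
A sketch of that cost measure, computed from the "same node" table above (this treats maxvmem as the memory figure, which is an assumption; maxrss could be used instead):
<code>
% Simple cost model: CPU time (hours) x memory (GB), per job.
% Numbers are taken from the "Table new same node" section above.
cores  = [ 1         2         3         4         5         20       ];
cpu_s  = [ 1356.694  1543.664  1660.543  1800.191  1953.868  5246.490 ];  % qacct "cpu"
mem_GB = [ 1.514     1.640     1.709     1.779     1.852     3.805    ];  % maxvmem

cost = (cpu_s/3600) .* mem_GB;     % GB-hours
for k = 1:numel(cores)
    fprintf('%2d core(s): cost = %5.2f GB-hours\n', cores(k), cost(k));
end
</code>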
+ | |||
==== Commands while running ====

First set a shell variable to the node the job is running on:
<code>
$ n=n182
</code>
+ | |||
**''ps''** shows the MATLAB processes, including their thread count (THCNT):
+ | |||
+ | < | ||
+ | $ ssh $n ps -eo pid, | ||
+ | PID RUSER %CPU %MEM THCNT STIME TIME COMMAND | ||
+ | 96970 traine | ||
+ | 96971 traine | ||
+ | 96972 traine | ||
+ | 96974 traine | ||
+ | 97005 traine | ||
+ | 97130 traine | ||
+ | </ | ||
+ | |||
**''ps -eLf''** lists the individual threads; repeating it while the script runs shows the number of active threads changing:
+ | |||
+ | < | ||
+ | $ ssh $n ps -eLf | egrep ' | ||
+ | UID | ||
+ | traine | ||
+ | traine | ||
+ | traine | ||
+ | traine | ||
+ | traine | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | $ ssh $n ps -eLf | egrep ' | ||
+ | UID | ||
+ | traine | ||
+ | traine | ||
+ | traine | ||
+ | traine | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | $ ssh $n ps -eLf | egrep ' | ||
+ | UID | ||
+ | traine | ||
+ | traine | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | $ ssh $n ps -eLf | egrep ' | ||
+ | UID | ||
+ | traine | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | $ ssh $n ps -eLf | egrep ' | ||
+ | UID | ||
+ | traine | ||
+ | </ | ||
**''top''** with ''-H'' shows the per-thread CPU usage:
+ | < | ||
+ | $ ssh $n top -H -b -n 1 | egrep ' | ||
+ | PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND | ||
+ | 97281 traine | ||
+ | 97276 traine | ||
+ | 97275 traine | ||
+ | 97284 traine | ||
+ | 97282 traine | ||
+ | 97283 traine | ||
+ | 97316 traine | ||
+ | 97317 traine | ||
+ | 97315 traine | ||
+ | 97314 traine | ||
+ | 97311 traine | ||
+ | 97310 traine | ||
+ | 97312 traine | ||
+ | 97308 traine | ||
+ | </ | ||
**''mpstat''** reports the utilization of every core on the node:
+ | < | ||
+ | $ ssh $n mpstat -P ALL 1 2 | ||
+ | Linux 2.6.32-504.30.3.el6.x86_64 (n182) 02/16/2016 _x86_64_ (20 CPU) | ||
+ | |||
+ | 05:08:25 PM CPU %usr | ||
+ | 05:08:26 PM all | ||
+ | 05:08:26 PM 0 | ||
+ | 05:08:26 PM 1 | ||
+ | 05:08:26 PM 2 | ||
+ | 05:08:26 PM 3 | ||
+ | 05:08:26 PM 4 | ||
+ | 05:08:26 PM 5 100.00 | ||
+ | 05:08:26 PM 6 | ||
+ | 05:08:26 PM 7 100.00 | ||
+ | 05:08:26 PM 8 100.00 | ||
+ | 05:08:26 PM 9 | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | 05:08:26 PM | ||
+ | |||
+ | 05:08:26 PM CPU %usr | ||
+ | 05:08:27 PM all | ||
+ | 05:08:27 PM 0 100.00 | ||
+ | 05:08:27 PM 1 | ||
+ | 05:08:27 PM 2 | ||
+ | 05:08:27 PM 3 | ||
+ | 05:08:27 PM 4 | ||
+ | 05:08:27 PM 5 100.00 | ||
+ | 05:08:27 PM 6 | ||
+ | 05:08:27 PM 7 100.00 | ||
+ | 05:08:27 PM 8 100.00 | ||
+ | 05:08:27 PM 9 | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | 05:08:27 PM | ||
+ | |||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | Average: | ||
+ | </ | ||
**''qhost''** shows the Grid Engine view of the host:
+ | < | ||
+ | $ qhost -h $n | ||
+ | HOSTNAME | ||
+ | ---------------------------------------------------------------------------------------------- | ||
+ | global | ||
+ | n182 lx-amd64 | ||
+ | |||
+ | </ | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== Multiple distributed workers ===== | ||
+ | ===== Single computational threads ===== | ||
+ | |||
+ | |||
+ | |||
+ | ===== Monitoring Tools ===== | ||
+ | |||
There are several tools you can run to monitor the computational threads on your node. In this example n093 is running several MATLAB jobs.
+ | * Ganglia (real time) '' | ||
+ | * top | ||
+ | * ps | ||
+ | ==== Using top ==== | ||
+ | |||
<code>
[dnairn@mills dnairn]$ ssh n093 top -b -n 1 | egrep '
+ | PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND | ||
+ | 8209 matusera | ||
+ | 2622 matusera | ||
+ | 4386 matusera | ||
+ | 14939 matusera | ||
+ | 16308 matusera | ||
+ | </ | ||
+ | |||
+ | ==== Using ps command ==== | ||
+ | |||
+ | < | ||
+ | [dnairn@mills dnairn]$ ssh n093 ps -eo pid, | ||
+ | PID RUSER %CPU %MEM THCNT STIME TIME | ||
+ | 2622 matusera 21.1 0.3 90 Jul29 6-19: | ||
+ | 4386 matusera | ||
+ | 8209 matusera 1019 6.9 90 13:18 02: | ||
+ | 14939 matusera 27.3 0.3 90 Jul10 13-23:39:21 / | ||
+ | 16308 matusera 46.6 0.3 90 Jul24 17-07:28:38 / | ||
+ | </ | ||
+ | |||
+ | Description of the custom column values from ps man page: | ||
<code>
pid        PID     process ID number of the process.
</code>
<code>
ruser      RUSER   real user ID. This will be the textual user ID, if it can be obtained and the field
                   width permits, or a decimal representation otherwise.
</code>
<code>
%cpu       %CPU    cpu utilization of the process in "##.#" format. Currently, it is the CPU time used
                   divided by the time the process has been running (cputime/realtime ratio), expressed
                   as a percentage. It will not add up to 100% unless you are lucky. (alias pcpu).
</code>
<code>
%mem       %MEM    ratio of the process's resident set size to the physical memory on the machine,
                   expressed as a percentage. (alias pmem).
</code>
<code>
thcount    THCNT   number of kernel threads owned by the process. (alias nlwp).
</code>
<code>
bsdstart   START   time the command started. If the process was started less than 24 hours ago, the
                   output format is " HH:MM", else it is " Mmm dd" (where Mmm is a three-letter
                   month). See also lstart, start, start_time, and stime.
</code>
<code>
time       TIME    cumulative CPU time, "[DD-]HH:MM:SS" format. (alias cputime).
</code>
<code>
args       COMMAND command with all its arguments as a string. Modifications to the arguments may be
                   shown. The output in this column may contain spaces. A process marked <defunct> is
                   partly dead, waiting to be fully destroyed by its parent. Sometimes the process args
                   will be unavailable; when this happens, ps will instead print the executable name in
                   brackets. (alias cmd, command). See also the comm format keyword, the -f option, and
                   the c option.

                   When specified last, this column will extend to the edge of the display. If ps can not
                   determine display width, as when output is redirected (piped) into a file or another
                   command, the output width is undefined. (it may be 80, unlimited, determined by the
                   TERM variable, and so on) The COLUMNS environment variable or --cols option may be
                   used to exactly determine the width in this case. The w or -w option may be also be
                   used to adjust width.
</code>
+ | ==== ps for threads ==== | ||
+ | |||
Select the threads of the process with PID 12035 that show some activity, that is, where C is not 0.
+ | < | ||
+ | [dnairn@mills dnairn]$ ssh n093 ps -eLf | egrep ' | ||
+ | UID PID PPID | ||
+ | matusera 12035 11918 12082 98 90 16:39 pts/2 00:43:21 / | ||
+ | matusera 12035 11918 12132 67 90 16:39 pts/2 00:29:49 / | ||
+ | matusera 12035 11918 12133 67 90 16:39 pts/2 00:29:42 / | ||
+ | matusera 12035 11918 12134 67 90 16:39 pts/2 00:29:43 / | ||
+ | matusera 12035 11918 12135 67 90 16:39 pts/2 00:29:34 / | ||
+ | matusera 12035 11918 12136 67 90 16:39 pts/2 00:29:47 / | ||
+ | matusera 12035 11918 12137 67 90 16:39 pts/2 00:29:50 / | ||
+ | matusera 12035 11918 12138 67 90 16:39 pts/2 00:29:48 / | ||
+ | matusera 12035 11918 12139 67 90 16:39 pts/2 00:29:45 / | ||
+ | matusera 12035 11918 12140 67 90 16:39 pts/2 00:29:40 / | ||
+ | matusera 12035 11918 12141 67 90 16:39 pts/2 00:29:33 / | ||
+ | matusera 12035 11918 12142 67 90 16:39 pts/2 00:29:32 / | ||
+ | </ | ||
+ | |||
Twelve of the 90 threads are doing computation.