pidstat cheat sheet

I find pidstat useful tool in troubleshooting system performance. Let me share with you some examples of the ways I use it.

1. Checking CPU consumption per process.

This oneliner will run continuously every 1s showing output lines only for processes consuming more than 20% of CPU:

# pidstat -l 1|perl -lane 'print if @F[5] =~ /([2-9]\d|\d{3,})\./'
12:58:50         1484   44.55    5.94    0.00   50.50     1  /usr/sbin/apache2 -k start
12:58:50         2990   46.53    4.95    0.00   51.49    13  /usr/sbin/apache2 -k start
12:58:50         2999   30.69    2.97    0.00   33.66     4  /usr/sbin/apache2 -k start
12:58:50         8976    0.00   32.67    0.00   32.67     9  flush-0:21
12:58:50        11937   54.46    4.95    0.00   59.41     2  /usr/sbin/apache2 -k start

Same, but including threads:

# pidstat -lt 1|perl -lane 'print if @F[6] =~ /([2-9]\d|\d{3,})\./'

Note that the column number differs when watching threads. It may also differ depending on how your OS diplays time (AM/PM adds one column).

Watch single PID’s CPU usage:

$ pidstat -l 1 -p 5181
 
03:11:16 PM       PID    %usr %system  %guest    %CPU   CPU  Command
03:11:17 PM      5181   22.00    2.00    0.00   24.00     1  /usr/lib/firefox/firefox
03:11:18 PM      5181   23.00    2.00    0.00   25.00     1  /usr/lib/firefox/firefox

2. What process in making most context switches.

I graph context switches in Ganglia (with sflow) and I once saw this kind of graph:

context switches graph in ganglia

context switches graph in ganglia

I found out that pidstat could tell me what process was making these spikes. Let’s look for processes making more than 100 non voluntary context switches per second:

# pidstat -wl 1|perl -lane 'print if @F[3] =~ /\d{3,}\./'
 
13:18:40        32579   4408.00   1262.00  /usr/bin/plackup
13:18:40        32588    177.00    134.00  /usr/bin/plackup

3. Checking what process is using disk the most.

If you wanted to know which processes are writing more than 100 kB/s to disk:

# pidstat -dl 1|perl -lane 'print if @F[3] =~ /\d{3,}\./'
13:24:40          382      0.00    172.00      0.00  jbd2/sda3-8
13:24:48         1406      0.00    160.00    160.00  /usr/bin/perl /etc/rc2.d/S20varnishgmetric start
13:24:54         1981      0.00    212.00      0.00  /usr/bin/plackup
13:24:54        24520     56.00   1912.00   1080.00  /usr/sbin/apache2 -k start

Bonus note on KVM and iostat: On KVM host if you run iostat -dx 1 it will show you I/O consumption per drive. Now, how to tell which of dm-N devices belong to what VM? Let’s say the most I/O heavy drive was dm-15. Here’s how:

# dmsetup ls|grep 15
guests-vm_i1_root    (252, 15)

4. Looking at RAM hungry processes second by second.

This oneliner will show processes that hold more than 200kB of their RSS in RAM.

# pidstat -rl 1|perl -lane 'print if  @F[5] =~ /([2-9]\d{5,}|\d{7,})/'