
Disk IO

Access time

Access time is the main killer when reading from and writing to a disk. Hard disk latency is around 13ms, depending on the quality and rotational speed of the drive. RAM latency is around 83 nanoseconds, roughly 150,000 times faster.

This is why caching data in memory is so important for performance: the difference in latency between RAM and a hard drive is enormous.

Disk Performance

hdparm

To get a basic idea of how fast a physical disk can be accessed from Linux you can use the hdparm tool with the -T and -t options. The -T option takes advantage of the Linux disk cache and gives an indication of how much information the system could read from a disk if the disk were fast enough to keep up. The -t option also reads the disk through the cache, but without any precaching of results. Thus -t can give an idea of how fast a disk can deliver information stored sequentially on disk.
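A typical invocation looks like this (run as root; /dev/sda is just a placeholder for the disk you want to test):

$ sudo hdparm -Tt /dev/sda    # -T: cached reads, -t: buffered disk reads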

fio

hdparm is a low-level disk tool and doesn't consider variations in the filesystem used; fio, on the other hand, does. It can issue its IO requests using one of many synchronous and asynchronous IO APIs, and can also use APIs which allow many IO requests to be issued with a single call. It lets you define usage patterns very precisely and measures the time the disk subsystem takes to complete the requests.
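As a rough sketch, a small random-read job might look like the following (assuming fio is installed and built with libaio; the job name, file size and runtime are arbitrary choices):

$ fio --name=randread-test --ioengine=libaio --rw=randread --bs=4k \
      --direct=1 --size=512m --iodepth=16 --runtime=60

fio reports bandwidth, IOPS and latency statistics for the job when it completes.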

Further detail on using fio is covered at linux.com: https://www.linux.com/learn/tutorials/442451-inspecting-disk-io-performance-with-fio

I/O wait

The measurement for an I/O bottleneck: the percentage of time the processors spend waiting on the disk.

Whilst the disk is being accessed, the processor is idle; it's waiting on the disk.

Check your I/O wait percentage via top. If the I/O wait percentage is greater than (1/number_of_CPU_cores) then your CPUs are waiting a significant amount of time for the disk subsystem to catch up.

For example, if I/O wait is 12.1% and there are 8 cores, this is very close to the threshold of 1/8 = 0.125 (12.5%).
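To grab the figure non-interactively, you can filter the CPU summary line out of top's batch output; the wa field is the I/O wait percentage (depending on your version of top the line is labelled Cpu(s) or %Cpu(s)):

$ top -bn1 | grep "Cpu(s)"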

IOPS

Input/Output Operations Per Second. Maximum I/O throughput is better understood by working out the theoretical IOPS and comparing it to your actual IOPS.

Theoretical I/O Operations Per-Sec = (number of disks * average I/O operations per disk per-sec) / (% of read workload + (RAID factor * % of write workload))
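As a hypothetical worked example: 8 disks at 150 IOPS each give 1,200 raw IOPS; with a 50/50 read/write workload on RAID 5 (write penalty of 4), the theoretical figure drops to 1200 / (0.5 + 4 * 0.5) = 480 IOPS.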

The read/write workload split is application dependent, so a tool like sar or iostat, which reports reads/sec and writes/sec, can help you measure it.

Compare the theoretical IOPS to the tps column displayed via sar. This is the actual IOPS.
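For example, assuming the sysstat package is installed, the tps column appears in sar's I/O report:

$ sar -b 1 5    # tps, rtps and wtps, sampled every second, five times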

The Cause

Four primary factors impact IOPS:

  • Multidisk Arrays – More disks in the array mean greater IOPS. If one disk can perform 150 IOPS, two disks can perform 300 IOPS.
  • Average IOPS per-drive – The greater the number of IOPS each drive can handle, the greater the total IOPS capacity. This is largely determined by the rotational speed of the drive.
  • RAID Factor – Your application is likely using a RAID configuration for storage, which means you’re using multiple disks for reliability and redundancy. Some RAID configurations have a significant penalty for write operations. For RAID 6, every write request requires 6 disk operations. For RAID 1 and RAID 10, a write request requires only 2 disk operations. The lower the number of disk operations, the higher the IOPS capacity. This article has a great breakdown on RAID and IOPS performance.
  • Read and Write Workload – If you have a high percentage of write operations and a RAID setup that performs many operations for each write request (like RAID 5 or RAID 6), your IOPS will be significantly lower.

So what is it?

  • Too few disks
  • Poor IOPS per disk
  • Incorrect RAID Factor
  • Badly written software or modifications to your application can have a big impact on Disk I/O.

Troubleshooting


iostat is a good start

$ iostat -kx 60
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          28.18    0.91   11.52   22.93    5.86   30.61

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
xvdep1            0.00     0.00    5.40    4.40    43.20    35.20     8.00     1.12  114.53   6.39   6.26
xvdf              0.00   124.00  995.80   24.00 80316.80  1184.00    79.92     4.96    4.87   0.87  89.14
xvdg              0.00     5.80    4.60   13.80    36.80   156.80    10.52     0.08    4.47   1.46   2.68


Analysis for /dev/xvdf:

average time each request spent being serviced (svctm) = 0.87ms
average time each request spent in the queue (qtime) = await - svctm = 4.87 - 0.87 = 4.00ms
So on average each IO request took 4.87ms to be processed, of which 4.00ms was spent just waiting in the queue.
%util can be calculated as (r/s + w/s) * svctm / 1000ms * 100 = (995.80 + 24.00) * 0.87 / 1000 * 100 ≈ 88.7%, close to the reported 89.14%.

It is also worth noting that the disk is doing significantly more reads than writes.

And for /dev/xvdg:

average time each request spent being serviced (svctm) = 1.46ms
average time each request spent in the queue (qtime) = await - svctm = 4.47 - 1.46 = 3.01ms
So on average each IO request took 4.47ms to be processed, of which 3.01ms was spent just waiting in the queue.
This device is handling only 4.60 reads/s + 13.80 writes/s, hence the much lower utilisation (2.68%).
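
The same arithmetic is easy to reproduce; a quick awk sketch using the /dev/xvdf figures above:

$ awk 'BEGIN { rs=995.80; ws=24.00; await=4.87; svctm=0.87;
       printf "qtime = %.2f ms, util = %.1f%%\n", await - svctm, (rs + ws) * svctm / 1000 * 100 }'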

Alternatively, block dumping can show which processes are generating I/O. First enable it in /proc:

sudo sh -c "echo 1 > /proc/sys/vm/block_dump"

The kernel will now log each block I/O operation along with the process responsible; watching the syslog (the messages also show up in dmesg) tells you where the I/O is going:

tail -f /var/log/syslog

Turning block dumping off again:

sudo sh -c "echo 0 > /proc/sys/vm/block_dump"

Monitoring

  • The CPU I/O wait percentage.
  • The I/O metrics for a given device, including the I/O Wait time in milliseconds and read/write throughput.
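
Assuming the sysstat package is installed, both can be sampled over time with sar (exact column names vary slightly between versions):

$ sar -u 60    # CPU utilisation, including %iowait, every 60 seconds
$ sar -d 60    # per-device I/O statistics (tps, await, %util) every 60 seconds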

Further reading on iostat and disk utilisation monitoring: http://bhavin.directi.com/iostat-and-disk-utilization-monitoring-nirvana/