04 Evaluating the Options for Exploiting SSD
Oracle database administrators have a range of options for taking advantage of the performance benefits offered by solid state disk:
- The entire database might be deployed on SSD. For smaller databases this may be the simplest option. However, this option will often be economically impractical for large databases.
- Selected data files, tables, indexes, or partitions could be located on solid state disk. This requires substantial planning and administration, but is likely to be very effective if the correct segments are selected.
- For databases that are of a compatible version and operating system type, the Oracle database flash cache could be used. This option is simple to implement and can be very effective for read-intensive, index-based workloads.
- The temporary tablespace could be relocated to solid state disk, accelerating temporary segment IO such as that generated by disk sort and hash operations.
- Redo logs are inherently write intensive and have been suggested as candidates for solid state disk storage. However as we shall see, the nature of redo log IO is not a perfect fit for SSD.
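As a concrete illustration of the flash cache option above: in Oracle 11g Release 2 (on Oracle Linux and Solaris), the database flash cache is enabled with two initialization parameters. This is a minimal sketch; the device path and size below are illustrative, not a recommendation.

```sql
-- Enable the database flash cache (Oracle 11g Release 2 on
-- Oracle Linux or Solaris). Path and size are illustrative only.
ALTER SYSTEM SET db_flash_cache_file = '/dev/fioa1' SCOPE = SPFILE;
ALTER SYSTEM SET db_flash_cache_size = 16G SCOPE = SPFILE;
-- An instance restart is required for the flash cache to take effect.
```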
Evaluating the Options
In order to assess the likely performance benefits for each of the deployment options outlined above, we performed simple performance tests that measured the performance of each configuration under appropriate workloads.
Our test system consisted of a Dell R510 dual quad-core system with 32 GB of RAM, four directly attached 15K SAS hard disk drives, and one FusionIO SLC PCI SSD. Unless otherwise stated, the performance comparisons are between a single SAS drive dedicated to a specific datafile or segment and the FusionIO card dedicated to the same purpose. All ancillary IO (operating system, other datafiles, etc.) was directed to other devices.
Random Read Test
In this test, we compared the performance of a segment stored directly on flash solid state disk with the same segment cached in the database flash cache and with traditional magnetic disk. The workload consisted of index-based reads against a 20 million row table using a small (128M) buffer cache and a 16 GB database flash cache.
Figure 4.1. Random read performance
Figure 4.1 summarizes the results. Using the database flash cache resulted in a significant boost in performance – a reduction of 74 percent in elapsed time, while hosting the entire table (and its index) on SSD resulted in a 95 percent reduction.
We did, of course, expect database flash cache performance to lag behind that of direct SSD, as the database flash cache requires that the block be read from magnetic disk at least once. However, the database flash cache does not require that there be sufficient SSD to store the entire segment, so it may be the more practical option in some circumstances.
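For reference, directly placing a segment on SSD is typically done by creating a tablespace on the SSD device and moving the segment into it. The tablespace, table, and index names and the file path below are hypothetical:

```sql
-- Create a tablespace on an SSD-backed file system (path is illustrative)
CREATE TABLESPACE ssd_data
  DATAFILE '/ssd/oradata/ssd_data01.dbf' SIZE 10G;

-- Relocate an existing table into the SSD tablespace; moving a table
-- invalidates its indexes, so they must be rebuilt afterwards
ALTER TABLE sales_history MOVE TABLESPACE ssd_data;
ALTER INDEX sales_history_pk REBUILD TABLESPACE ssd_data;
```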
OLTP Read/Write Workload
In this test, each row was selected twice and updated once; commits were issued every 10,000 rows. Figure 4.2 shows the results of the test.
Figure 4.2. Read/write performance test
The performance advantage of SSD is somewhat diminished when writes are involved, particularly for the db flash cache. However, the performance advantages were still substantial: 69 percent reduction in elapsed time for the database flash cache and 91 percent reduction when directly mounting the segment on SSD.
Full Table Scan Performance
This test measured the time to perform a full table scan against the test table. Figure 4.3 shows the results.
Figure 4.3. Full table scan performance
Not surprisingly, placing the table on SSD resulted in a dramatic improvement in full table scan performance. However, using the database flash cache did not result in any measurable performance improvement. In Oracle 11g, full table scans typically bypass the buffer cache: the Oracle process reads directly from disk and does not place the blocks thus read in the buffer cache. Since the database flash cache is populated from the buffer cache, full table scans cannot normally benefit from the database flash cache.
Full Table Scans and FusionIO directCache
As noted above, some SSD vendors offer caching technologies that operate in a similar way to the Oracle database flash cache, but which are independent of Oracle software. These caching layers can accelerate direct path reads from full table scans that bypass the Oracle buffer cache and, therefore, the database flash cache.
Figure 4.4. FusionIO directCache can accelerate full table scans
Figure 4.4 shows the effect of implementing FusionIO directCache on full table scan performance. Without directCache, successive table scans do not reduce IO overhead, since Oracle uses direct path reads that bypass its own buffer cache and the database flash cache. However, with directCache, subsequent table scans accelerate as IO requests are satisfied from the SSD. directCache is therefore a compelling solution for databases whose IO includes substantial direct path full table scans.
Disk Sort and Hash Operations
Oracle performs IO to the temporary tablespace when a disk sort or hashing operation occurs — typically as a consequence of ORDER BY or join processing — and there is insufficient PGA (Program Global Area) memory available to allow the join to complete in memory. Depending on the shortfall of memory, Oracle may need to perform a single-pass disk operation or multi-pass operations. The more passes involved, the heavier the overhead of the temporary segment IO.
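One way to gauge how often work areas spill to temporary segments is to query the cumulative work area statistics. This sketch assumes query access to the v$sysstat view; "optimal" executions completed in memory, while "onepass" and "multipass" executions performed temporary segment IO:

```sql
-- Work area executions completed in memory (optimal), with a single
-- pass to temp (onepass), or with multiple passes to temp (multipass)
SELECT name, value
  FROM v$sysstat
 WHERE name LIKE 'workarea executions%';
```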
Figure 4.5. Traditional performance profile for disk sorts
Figure 4.5 shows what we have come to expect from disk sort performance. As memory becomes constrained, temporary segment IO comes to dominate performance and rises sharply as the number of temporary segment passes increases.
When the temporary tablespace is placed on SSD, a completely different performance profile emerges, as shown in Figure 4.6. The overhead of single-pass sorts is significantly reduced, and the overhead of multi-pass disk sorts is drastically reduced; the more temporary segment passes involved, the greater the improvement.
Figure 4.6. SSD radically reduces the overhead of multi-pass disk sorts
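Relocating temporary segment IO to SSD is usually done by creating a new temporary tablespace on the SSD device and making it the database default. The tablespace name, path, and size below are illustrative:

```sql
-- Create a temporary tablespace on SSD storage (path is illustrative)
CREATE TEMPORARY TABLESPACE temp_ssd
  TEMPFILE '/ssd/oradata/temp_ssd01.dbf' SIZE 8G;

-- Make it the default so new sessions perform sort and hash IO on SSD
ALTER DATABASE DEFAULT TEMPORARY TABLESPACE temp_ssd;
```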
Redo Log Optimization
The Oracle architecture is designed so that sessions are blocked by IO requests only when absolutely necessary. The most common circumstance is when a transaction entry must be written to the redo log in order to support a COMMIT. The Oracle session must wait in this circumstance in order to ensure that the commit record has actually been written to disk.
Since redo log IO is so often a performance bottleneck, many have suggested locating the redo logs on solid state disk for performance optimization. However, the nature of redo log IO is significantly different from datafile IO: redo log IO consists almost entirely of sequential write operations, for which magnetic disk is well suited, since there is no seek time and write throughput is limited only by the rotational speed of the disk. In contrast, a write-intensive workload is close to a worst case for SSD since, as we have seen, write IO is far slower than read IO on solid state devices.
Figure 4.7 shows that, for a redo IO constrained workload, placing the redo logs on a solid state disk device resulted in no measurable improvement in performance. In this case, the SAS drive was able to support a write rate equal to that of the solid state disk device.
Figure 4.7. Performance of Redo constrained workload
Reader Comments (1)
Hi Guy,
Thank you for an interesting read. I have done some tests on HDD vs SSD. My findings are as follows:
SSD versus HDD for random IOs via an index scan
In this case a table of 1.2 million rows was created with an 8K block size. The same table was created and populated in a schema using HDD and in another schema using SSD, both on the same instance. Both tests were done after a reboot.
For this type of query SSD was faster by a factor of 20 compared to HDD. SSD shows its value for random IOs done via an index, as seek time disappears.
SSD versus HDD for full table scans
In this case the gain with SSD was only 45 percent compared to HDD, because the HDD reads blocks sequentially and the seek time is considerably less than for random reads.
However, I noticed that for full table scans you are implying that the gain is considerable. Could you kindly elaborate on that?
Regards,
Mich