
Monday 9 January 2012

CASE STUDY ON BUFFER BUSY WAIT EVENT

PURPOSE
-------

This document discusses a rare and difficult-to-diagnose database performance 
problem characterized by extremely high buffer busy waits that occur at 
seemingly random times.  The problem persists even after traditional buffer 
busy wait tuning practices are followed (typically, increasing the number of 
freelists for an object).
 
SCOPE & APPLICATION
-------------------

This document is intended for support analysts and customers.  It applies to 
both Unix and Windows-based systems, although the examples here will be 
particular to a Unix-based (Solaris) system.


In addition to addressing a specific buffer busy wait performance problem, 
section II of this document presents various techniques to diagnose and 
resolve this problem using detailed data from a real-world example.  The 
techniques illustrated here may be used to diagnose other I/O and performance 
problems.



RESOLVING INTENSE AND "RANDOM" BUFFER BUSY WAIT PERFORMANCE PROBLEMS
--------------------------------------------------------------------

This document is composed of two sections: a summary section that broadly
discusses the problem and its resolution, and a detailed diagnostics section
that shows how to collect and analyze various database and operating system
diagnostics related to this problem.  The detailed diagnostics section is
provided to help educate the reader with techniques that may be useful in
other situations.

I. Summary
~~~~~~~~~~~

1. Problem Description
~~~~~~~~~~~~~~~~~~~~~~~

At seemingly random times and without regard to the overall load on the
database, the following symptoms may be witnessed:

  - Slow response times on an instance-wide level
  - Long wait times for "buffer busy waits" in BSTAT/ESTAT or StatsPack reports
  - Large numbers of sessions waiting on buffer busy waits for a group of
    objects (identified in V$SESSION_WAIT)

Some tuning effort may already have been spent identifying the segments
involved in the buffer busy waits and rebuilding those segments with a higher
number of freelists or freelist groups (from 8.1.6 onward, process freelists
can be added dynamically; segments only need to be rebuilt to change freelist
groups).  Even after adding freelists, the problem continues and is not
diminished in any way (although regular, concurrency-based buffer busy waits
may be reduced).

2. Problem Diagnosis
~~~~~~~~~~~~~~~~~~~~~

The problem may be diagnosed by observing the following:

  - The alert.log file shows many occurrences of ORA-600, ORA-7445 and core
    dumps during or just before the time of the problem.
  - The core_dump_dest directory contains large core dumps during the time of
    the problem.  There may be either many core dumps or a few very large
    core dumps (hundreds of MB per core file), depending on the size of the SGA.
  - "sar -d" shows devices that are completely saturated and have high request
    queues and service times.  These devices and/or their controllers are part
    of logical volumes used for database files.
  - Buffer busy waits, write complete waits, db file parallel writes, and
    enqueue waits are high (in the top 5 waits, usually in that order).

Note that in environments using Oracle Advanced Replication, the buffer busy
waits may at times appear on segments related to replication (DEF$_AQCALL
table, TRANORDER index, etc.).

3. Problem Resolution
~~~~~~~~~~~~~~~~~~~~~~

The cause of the buffer busy waits and other related waits might be a
saturated disk controller or subsystem impacting the database's ability to
read or write blocks.  The disk/controller may be saturated because many core
dumps are occurring simultaneously, each requiring hundreds of megabytes.  If
the alert.log or core_dump_dest directory shows no evidence of core dumps,
then the source of the I/O saturation must be found; it may be due to
non-database processes, another database sharing the same filesystems, or a
poorly tuned I/O subsystem.

The solution is as follows:

  1) Find the root cause of the I/O saturation (core dumps, another process
     or database, or a poorly performing I/O subsystem) and resolve it.

  OR,

  2) If evidence of core dumps is found:
     - Find the causes of the core dumps and resolve them (patch, etc.).
     - Move the core_dump_dest location to a filesystem not shared with
       database files.
     - Use the following init.ora parameters to reduce or avoid the core dumps
       (a quick way to check the current values is sketched after this list):

         shadow_core_dump = partial
         background_core_dump = partial

     These core dump parameters can also be set to "none", but this is not
     recommended unless the causes of the core dumps have been identified.
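As a quick sanity check (not part of the original note), the current values of
these parameters can be read from V$PARAMETER before editing the init.ora
file.  A minimal sketch:

  -- Hedged sketch, not from the original case study: display the current
  -- core dump parameter settings.
  SELECT name, value
    FROM v$parameter
   WHERE name IN ('shadow_core_dump', 'background_core_dump');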
II. Detailed Diagnostics with Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section provides details of the diagnostic tools and methods used to
resolve this problem and is intended to help illustrate the use of various
diagnostics that may be applied in other situations.

1. Diagnostics
~~~~~~~~~~~~~~~

The following diagnostics will help fingerprint the problem:

  - alert.log and trace files in the background_dump_dest, user_dump_dest,
    and core_dump_dest directories
  - iostat or sar data (on Windows, the Performance Monitor utility with disk
    counters activated) with filesystem mapping tools or techniques (e.g.,
    Veritas vxprint)
  - StatsPack or BSTAT/ESTAT reports
  - V$SESSION_WAIT

A. ALERT.LOG file
~~~~~~~~~~~~~~~~~

The alert.log file should always be checked in any database diagnostic
effort.  In this case, the following entries were created during the time of
the buffer busy wait problem:

  Wed May 16 00:38:15 2001
  Errors in file /db_admin/prod/bdump/mlrep_s008_2731.trc:
  ORA-07445: exception encountered: core dump [kzdurtc()+4] [SIGBUS]
  [Invalid address alignment] [859063291] [] []

  Wed May 16 00:38:15 2001
  Errors in file /db_admin/prod/bdump/mlrep_s014_2737.trc:
  ORA-07445: exception encountered: core dump [kzdurtc()+4] [SIGSEGV]
  [Address not mapped to object] [926430458] [] []

  Wed May 16 00:41:13 2001
  Errors in file /db_admin/prod/bdump/mlrep_s005_930.trc:
  ORA-07445: exception encountered: core dump [kzdurtc()+4] [SIGBUS]
  [Invalid address alignment] [1178678525] [] []

The stack traces corresponding to the core files indicated a problem during
the login process.  It was discovered that a core dump could be caused if
users accidentally entered passwords greater than 48 characters.
Furthermore, in this web-based application, users did not receive an error
and would retry the login, producing another core dump with each click of the
mouse.

B. SAR Diagnostics
~~~~~~~~~~~~~~~~~~

SAR, IOSTAT, or similar tools are critical to diagnosing this problem because
they show the health of the I/O system during the time of the problem.  The
SAR data for the example we are looking at is shown below (obtained using
"sar -d -f /var/adm/sa/sa16"):

  SunOS prod1 5.6 Generic_105181-23 sun4u    05/16/01

  01:00:00   device   %busy   avque   r+w/s   blks/s   avwait   avserv
             sd22       100    72.4    2100     2971      0.0     87.0
             sd22,c       0     0.0       0        0      0.0      0.0
             sd22,d       0     0.0       0        0      0.0      0.0
             sd22,e     100    72.4    2100     2971      0.0     87.0
                                /\
                                ||
              extremely high queue values (usually less than 2 during peak)

By mapping the sd22 device back to the device number (c3t8d0) and then back
to the logical volume and through to the filesystem (using "df" and Veritas'
utility /usr/sbin/vxprint), it was determined that the filesystem shared the
same controller (c3) as several database files (among them were the datafiles
for the SYSTEM tablespace).  By looking within the filesystems using the
aforementioned controller (c3), several very large (1.8 GB) core dumps were
found in the core_dump_dest directory, produced around the time of the
problem.
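As a complement to the OS-level view (this query is not part of the original
case study), similar pressure can often be seen from inside the database by
ranking datafiles by their cumulative I/O times.  A minimal sketch, assuming
TIMED_STATISTICS is enabled so that the time columns (in centiseconds) are
populated:

  -- Hedged sketch, not from the original case study: rank datafiles by
  -- cumulative read/write time since instance startup to see which files
  -- sit on the saturated controller.
  SELECT df.name, fs.phyrds, fs.phywrts, fs.readtim, fs.writetim
    FROM v$filestat fs, v$datafile df
   WHERE fs.file# = df.file#
   ORDER BY fs.readtim + fs.writetim DESC;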
C. StatsPack
~~~~~~~~~~~~~

During an episode of the performance problem, it is very helpful to collect
samples using StatsPack.  The interval between samples is ideally about 15
minutes, but even 1-hour intervals are acceptable.  It is also very valuable
to collect baseline samples of the same interval during similar activity
levels when there is NOT a problem.  By having the baseline, one will be able
to see how certain statistics reflect the problem.

i. Wait Events

During the problem episode, "buffer busy waits" and "write complete waits"
will be seen as the highest wait events.  During the baseline load, these
events are not significant.  Other significant wait events during the problem
may be "enqueue" and "db file parallel write".  For example:

                                                                     Avg
                                                      Total Wait    wait  Waits
Event                            Waits   Timeouts      Time (cs)    (ms)   /txn
---------------------------- ---------- ---------- -------------- ------ ------
buffer busy waits               225,559    211,961     24,377,029   1081    4.0
enqueue                          25,731     21,756      6,786,722   2638    0.5
Parallel Query Idle Wait - S      9,980      7,929      1,762,606   1766    0.2
SQL*Net message from dblink     435,845          0      1,288,965     30    7.7
db file parallel write            4,252          0      1,287,023   3027    0.1
write complete waits              5,872      5,658        581,066    990    0.1
db file sequential read       1,249,981          0        510,793      4   22.0

ii. Statistics

There may be evidence of DBWR falling behind and slow write performance, as
follows:

  - a low number of DBWR timeouts
  - the ratio (free buffers inspected)/(free buffers requested) * 100% > 5%
  - a much higher number for "dirty buffers inspected" than in the normal
    baseline

The following lists some key statistics to look at:

Statistic                          Total   per Second    per Trans
----------------------- ---------------- ------------ ------------
consistent changes                43,523         12.1          2.4
free buffer inspected              6,087          1.7          0.3  <== much higher
free buffer requested            416,010        115.6         23.1      than normal
logons cumulative                 15,718          4.4          0.9
physical writes                   24,757          6.9          1.4
write requests                       634          0.2          0.0

It is important to look at workload parameters to ensure the problem is not
simply due to more work being done.  The statistics "consistent changes",
"logons cumulative", "physical writes", and "write requests" are all compared
to a baseline.

iii. Tablespace I/O Summary

The average wait times for tablespaces will be dramatically higher:

Tablespace IO Summary for DB: PROD  Instance: PROD  Snaps: 3578 - 3579

                             Avg Read                   Total  Avg Wait
Tablespace          Reads        (ms)      Writes       Waits      (ms)
----------- ------------- ----------- ----------- ----------- ---------
BOM               482,368         7.0      18,865       3,161     205.9
CONF              157,288         0.6         420         779    9897.3  <= very high
CTXINDEX           36,628         0.5           7           4      12.5
RBS                   613       605.7      23,398       8,253    7694.6  <= very high
SYSTEM             18,360         3.6         286          78     745.5
DB_LOW_DATA        16,560         2.6       1,335          14      24.3

Similar statistics per datafile may be seen in the "File IO Statistics"
section of StatsPack.

iv. Buffer Busy Wait Statistics

Buffer wait Statistics for DB: PROD  Instance: PROD  Snaps: 3578 - 3579

                                   Tot Wait        Avg
Class                   Waits     Time (cs)  Time (cs)
------------------ ----------- ------------ ----------
data block             216,577   ##########        108  <== severe contention on
undo header              5,072      609,734        120      data blocks
undo block               3,770      333,876         89      ("off the scale")
free list                   70       17,426        249
segment header               8           34          4
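The StatsPack buffer wait breakdown above is built from cumulative counters
that can also be checked in real time.  A minimal sketch (not part of the
original case study) against V$WAITSTAT:

  -- Hedged sketch, not from the original case study: cumulative buffer busy
  -- waits per block class since instance startup (time is in centiseconds).
  SELECT class, count, time
    FROM v$waitstat
   ORDER BY time DESC;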
v. Enqueues

The StatsPack data for this case study shows many SQ (sequence) enqueue
waits.  SQ enqueues are held while Oracle updates the dictionary entry of the
sequence with the next value (or the starting value for the next batch of
sequence numbers to cache).  The object id of the sequence is found by
looking at the p2 column of V$SESSION_WAIT for 'enqueue' events and selecting
from DBA_OBJECTS using that object_id (you must convert the value in the p2
column to decimal first).  In this case it was for sys.audses$.  This means
the SYSTEM tablespace was being impacted by the I/O problems and was taking a
long time to generate the next sequence number for "audses$".

Enqueue activity for DB: MLREP  Instance: MLREP  Snaps: 3578 - 3579

Enqueue            Gets      Waits
---------- ------------ ----------
SQ                5,130      1,345  <== Drill down using v$session_wait.p2
TX               80,735         63      and DBA_OBJECTS for the object_id.
SR                8,207          7      In this case it was for sys.audses$
TM               93,225          4
CF                1,396          1

vi. Buffer Pool Statistics

The buffer pool statistics should be compared with the baseline.  During the
performance problem, write complete waits may be 10 to 100 times longer than
during the baseline load (without the problem):

Buffer Pool Sets for DB: PROD  Instance: PROD  Snaps: 3578 - 3579

                                                        Free    Write   Buffer
Set    Buffer   Consistent    Physical    Physical    Buffer Complete     Busy
 Id      Gets         Gets       Reads      Writes     Waits    Waits    Waits
--- --------- ------------ ----------- ----------- --------- -------- --------
  1    99,132    4,060,929      97,235       2,860         0      384  148,348
  2    97,773    3,359,172      96,851       3,185         0      221      640
  3    97,320    3,486,183      96,592       3,014         0      303    1,024
  4    96,961    1,943,505      96,366       2,723         0      232      598
  5    97,182    1,508,089      96,223       2,798         0      107    5,731
...                                                            /\
                                                               ||
                                      these are 10 times greater than baseline

D. V$SESSION_WAIT
~~~~~~~~~~~~~~~~~~

V$SESSION_WAIT can be used to see how many sessions are impacted by the
buffer busy wait problem in real time, using a query as follows:

  SELECT count(*), event
    FROM v$session_wait
   WHERE wait_time = 0
     AND event NOT IN ('smon timer','pmon timer','rdbms ipc message',
                       'SQL*Net message from client')
   GROUP BY event
   ORDER BY 1 DESC;

Output:

    COUNT(*) EVENT
  ---------- ----------------------------------------------------------------
         122 buffer busy waits
          15 enqueue
           5 db file parallel write
           3 SQL*Net message from dblink
           2 latch free
           1 SQL*Net more data to client

To see the file and block numbers in more detail:

  SELECT count(*) NUM_WAITERS, p1 FILE#, p2 BLK#, p3 CLASS
    FROM v$session_wait
   WHERE event = 'buffer busy waits'
   GROUP BY p1, p2, p3;

   NUM_WAITERS      FILE#      BLK#  CLASS
  ------------ ---------- --------- ------
            92          2     13487   1016
            73          2     27762   1016
            32          1     29697   1016

This shows that during the execution of the above query, there were 92
waiters for file 2, block 13487.  To find the object name and type for the
objects being waited on, use the following query (supplying the file numbers
and block numbers shown above):

  SELECT owner, segment_name, segment_type
    FROM dba_extents
   WHERE file_id = &file
     AND &blockid BETWEEN block_id AND block_id + blocks;

After querying against DBA_EXTENTS and supplying the file and block numbers,
the following correlation can be made:

   NUM_WAITERS  OWNER.SEGMENT_NAME      TYPE    FILE#    BLK#  CLASS
  ------------  ---------------------  ------  ------  ------  -----
            92  SYSTEM.DEF$_TRANORDER  INDEX        2   13487   1016
            73  SYSTEM.DEF$_AQCALL     TABLE        2   27762   1016
            32  SYSTEM.DEF$_TRANORDER  INDEX        1   29697   1016

Normally, one would rebuild the above segments with more freelists or
freelist groups (in 8.1.6 or higher you can add process freelists without
rebuilding the segment) to reduce contention.  However, in this case more
freelists won't help.
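The manual file/block lookup shown above can also be folded into a single
statement.  A minimal sketch (not part of the original case study); note that
DBA_EXTENTS is expensive to scan, so use it only for spot checks on a busy
system:

  -- Hedged sketch, not from the original case study: map current buffer busy
  -- waits directly to their segments in one pass.
  SELECT e.owner, e.segment_name, e.segment_type,
         w.p1 file_id, w.p2 block_no, COUNT(*) num_waiters
    FROM v$session_wait w, dba_extents e
   WHERE w.event = 'buffer busy waits'
     AND e.file_id = w.p1
     AND w.p2 BETWEEN e.block_id AND e.block_id + e.blocks - 1
   GROUP BY e.owner, e.segment_name, e.segment_type, w.p1, w.p2
   ORDER BY COUNT(*) DESC;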
2. Resolving the Problem
~~~~~~~~~~~~~~~~~~~~~~~~~~

With the above diagnostics, it was possible to conclude that the core dumps
caused disk controller saturation and impeded the database from reading and
writing blocks.  By scanning back through the alert.log file and looking for
occurrences of similar core dumps, it was possible to ascertain when the
problems had first appeared.  Looking at StatsPack data for the same time
period seen in the alert.log file, the buffer busy waits were indeed the
highest wait event.

Since the customer could not change the application to limit the password
length and there was no other place to put the core_dump_dest files, the
following changes were made to the init.ora file:

  shadow_core_dump = partial
  background_core_dump = partial

These changes cause the core dumps to be created AFTER the Oracle processes
have detached from the SGA.  The resulting core dumps were 24 MB on average
instead of 1.8 GB.  Later occurrences of core dumps were seen in the
alert.log and cdump directory with no buffer busy wait side effects or
performance degradation (and no I/O impact visible in the sar data); this
confirmed that the init.ora changes solved the problem.
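One simple way (not part of the original case study) to watch for a
recurrence between StatsPack snapshots is to track the cumulative system-wide
wait events involved in this problem.  A minimal sketch:

  -- Hedged sketch, not from the original case study: cumulative system-wide
  -- waits since startup for the events seen during the problem
  -- (time_waited is in centiseconds).
  SELECT event, total_waits, total_timeouts, time_waited, average_wait
    FROM v$system_event
   WHERE event IN ('buffer busy waits', 'write complete waits',
                   'db file parallel write', 'enqueue')
   ORDER BY time_waited DESC;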

---------------------------------------------------------------------------------------------------------

Related links:

CASE STUDY ON BUFFER BUSY WAIT EVENT
CHECKPOINTS CONFUSIONS CLEARED
ORACLE CHECKPOINTS
ORACLE DATABASE BUFFER CACHE
SIMULATE BUFFER BUSY WAIT AND IDENTIFY HOT OBJECT
CONSISTENT READS IN ORACLE PART-I
CONSISTENT READS IN ORACLE PART-II
FLUSH BUFFER CACHE
