Pages

Tuesday, 24 July 2012

11g R2 RAC : PROBLEMS DISCOVERED AND SOLUTIONS



                11G R2 RAC PROBLEMS AND SOLUTIONS

I discovered some issues while working with 11g R2 RAC. I would like to share them with u all.

- During c/w installation, If kernel parameters are not set to
  appropriate values and runfixup.sh is run, kernel.sem parameter
  is set for running machine only. It's value is reset on next
  reboot and will have to be set manually to

  kernel.sem = 250 32000  100 128

-------------------------------------------------------------

- Whenver there is any problemwith ASM, it could be due to
  following reasons

 - parameter kernel.sem not set to appropriate value
 - Inappropriate permissions on /u01/app/11.2.0/grid/bin/oracle
 - Inappropriate permissions on /u01/app/oracle/product/11.2.0/dbhome_1/bin


------------------------------------------------------
  - parameter kernel.sem not set to appropriate value
-------------------------------------------------------

  soln:

  - add following line to /etc/sysctl.conf

    kernel.sem = 250 32000  100 128

  - #sysctl -p

-------------------------------------------------------
 - Inappropriate permissions on /u01/app/11.2.0/grid/bin/oracle
-------------------------------------------------------

- Check  that Permission on file /u01/app/11.2.0/grid/bin/oracle
  shd be

# ls -lr /u01/app/11.2.0/grid/bin/oracle

-rwsr-s--x 1 grid oinstall 152462728 Apr  4 08:45 oracle

  soln:

  - if permissions are not appropriate, change them

    #chmod 6751 /u01/app/11.2.0/grid/bin/oracle

     chown grid:oinstall /u01/app/11.2.0/grid/bin/oracle

-------------------------------------------------------
 - Inappropriate permissions on /u01/app/oracle/product/11.2.0/dbhome_1/bin
-------------------------------------------------------

- Check  that Permission on file
  /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle   shd be

#ls -lr oracle

-r-sr-s--x 1 oracle asmadmin 173515905 Apr 10 23:29 oracle

  soln:

  - if permissions are not appropriate, change them

    #chmod 6751  /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle

     chown oracle:asmadmin  /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle


------------------------------------------------------------------------

IF CRS SERVICE IS NOT STARTING ON A NODE AND CRSD.LOG HAS AN ENTRY
LIKE

   hash-id does not match

It implies that OCR info on that node is not compatible with
other nodes. This could be the case when OCR was reconfigured
(add/delete/replace a location) and the affected node was not
online so that it's ocr.loc was not updated.

Soln: Copy /etc/oracle/ocr.loc from a runnning node to the
      affected node and try to restart crs

-------------------------------------------------------------------
RCONFIG

IF WE USE RCONFIG TO ADD AN INSTANCE, ENTRIES ARE NOT AUTOMATICALLY
ADDED TO THE TNSNAMES.ORA ON THE NEWLY ADDED NODE. THEY HAVE TO BE
ADDED MANUALLY.

RCONFIG WILL AUTOMATICALLY ADD REDO THREADS FOR THE ALL THE RUNNING NODES IN THE CLUSTER. FOR EXAMPLE, IF WE HAVE
- A SINGLE INSTANCE DATABASE (NODE1)
- 5 NODES IN THE CLUSTER AND
- 3 RUNNING NODES (NODE1, NODE2, NODE3)

IF WE USE RCONFIG TO ADD ANOTHER INSTANCE ON ONE OF THE REMAINING RUNNING NODES (NODE2), IT WILL AUTOMATICALLY ADD 2 SETS OF REDO THREADS, ONE FOR NODE2 AND ANOTHER FOR NODE3.

--------------------------------------------------------------------

IF A NODE IS DELETED USING ROOTCRS.PL etc. procedure (steps 3,4 and 7)
GRID HOME IS NOT DELETED FROM THE DELETED NODE.

TO ADD BACK SUCH NODE,

  - RUN ADDNODE.SH WITH NOCOPY OPTION

    It will update the inventories on all existing nodes and
    instantiates scripts on local node

    #cd grid_home/oui/bin

     addNode.sh -silent -noCopy ORACLE_HOME=$ORACLE_HOME "CLUSTER_NEW_NODES={host03}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={host03-vip}"


   - RUN ROOT.SH

     In case it is not available in grid home, copy root.sh
     from any of the existing nodes

     #cd /u01/app/11.2.0/grid

      ./root.sh

---------------------------------------------------------------------------------------------

WHEN AN INSTANCE IS ADDED TO A RAC DATABASE USING DBCA, TNSNAMES.ORA
ON NEWLY ADDED INSTANCE IS NOT UPDATED.

Soln: Copy entry for the database from tnsnames.ora from existing node to the newly added node.

---------------------------------------------------------------------------------------------
 Problem:

After adding a node to the cluster, sometimes the node is visible to the cluster i.e. it appears when
crsctl stat res -t is issued but it is not listed in DBCA. The reason is that its information is not updated
in the inventory of existing nodes. 

Solution:

Add the entry of the newly added node in the inventory of the existing nodes for database home.

$cd /u01/app/oraInventory/ContentsXML
  vi inventory.xml
i.e.

<HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11.2.0/dbhome_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="host01"/>
      <NODE NAME="host02"/>
      <NODE NAME="host03"/>

--------------------------------------------------------------------------------------------------------

If a serverpool is created using srvctl e.g. test and we want to see its status, we can see it using srvctl or crsctl.
If we query using srvctl status serverpool -g test, it will display its status
If we query using crsctl status serverpool test, it will not recognize test serverpool but
if we query using crsctk status serverpool ora.test, it displays the status.

Hence, the name of the serverpool created using srvctl should be prefixed by ora. for crsctl to recognize it. 

Another issue related to the same problem is that DBCA recognizes only the pools prefixed with ora. Hence if we want to create a policy managed database ,
  - create the corresponding serverpool using srvctl
   - if we create serverpool using crsctl and prefix  the name with .ora.   i.e crsctl create serverpool ora.test , DBCA will recognize the pool but the database created using this pool will be listed as ADMIN MANAGED when its configuration is checked using srvctl although it runs in the sreverpool.

   Hence we should use srvctl to create a serverpool if we want a database to run in that pool.

-------------------------------------------------------------------------------------------------------------

In case of an admin managed database , names of the instances have the format db1, db2 etc.
In case of an policy managed database , names of the instances have the format db_1, db_2 etc.
/etc/oratab should have entries for all the instances so that we can invoke EM.
To invoke EM:
$export ORACLE_UNQNAME=db
 -- check which is instance is running on current node
#srvctl status database -d db
-- set ORACLE_SID to the instance running on current node
#export ORACLE_SID=db_n
$emctl start dbconsole


------------------------------------------------------------------------------------------------------

 Related links :

HOME

11.2 RAC GOODIES AND SERVERPOOLS
11G R2 RAC PROBLEMS AND SOLUTIONS
CLONE DATABASE HOME IN 11GR2 RAC
NIC BONDING IN 11G R2 RAC
PREREQUISITE TO CONFIGURE GNS AND GPNP IN 11G R2 RAC
SERVICES CONFUSIONS CLEARED
SETUP 10G RAC ON YOUR LAPTOP
SETUP AND PLAY WITH 11G R2 RAC AT YOUR HOME PC
http://koenigocm.blogspot.in/search/label/11g%20R2%20RAC%20-%20ADD%20INSTANCE%20MANUALLY

11g R2 RAC - LOCAL ARCHIVE WITH NFS
  http://koenigocm.blogspot.in/search/label/11g%20R2%20RAC%20-%20LOCAL%20ARCHIVE%20WITH%20NFS
 
 11g R2 RAC : CACHE FUSION DEMONSTRATED
 http://koenigocm.blogspot.in/search/label/11g%20R2%20RAC%20%3A%20CACHE%20FUSION%20DEMONSTRATED


11G R2 RAC : DYNAMIC REMASTERING DEMONSTRATED

11gR2 RAC : USE RCONFIG TO CONVERT NON RAC DATABASE TO RAC DATABASE
http://koenigocm.blogspot.in/search/label/11gR2%20RAC%20%3A%20USE%20RCONFIG%20TO%20CONVERT%20NON%20RAC%20DATABASE%20TO%20%20RAC%20DATABASE

11g R2 RAC : RECOVER VOTING DISK - A SCENARIO
http://koenigocm.blogspot.in/search/label/11g%20R2%20RAC%20%3A%20RECOVER%20VOTING%20DISK%20-%20A%20SCENARIO


11g R2 RAC : TRACING SERVICES IN A RAC DATABASE
 http://koenigocm.blogspot.in/search/label/11g%20R2%20RAC%20%3A%20TRACING%20SERVICES%20%20IN%20A%20RAC%20DATABASE

11G R2 RAC: INSTANCE RECOVERY
 http://koenigocm.blogspot.in/search/label/11G%20R2%20RAC%3A%20INSTANCE%20RECOVERY

11g R2 RAC: MANAGING THIRD PARTY APPLICATIONS USING APPLICATION VIP
 http://koenigocm.blogspot.in/search/label/11g%20R2%20RAC%3A%20MANAGING%20THIRD%20PARTY%20APPLICATIONS%20USING%20APPLICATION%20VIP

Monday, 2 July 2012

ORACLE DATABASE BUFFER CACHE




                      BUFEFR CACHE ARCHITECTURE



Database buffer cache is a memory structure in SGA. Buffers in DB cache are used to hold blocks of data read from the data files.Each buffer is sized to hold one database block. Oracle allows for a total of 8 different caches :

- 5 caches to allow different block sizes
- 3 caches to allow different behaviour

BUFFERS TO ALLOW DIFFRENT BLOCK SIZES :

- DEFAULT CACHE : To hold blocks of size DB_BLOCK_SIZE. Its size is determined by the parameter DB_CACHE_SIZE

- DB NK CACHE : To hold blocks of size = nK where n = 2,4,8,16,32 where n is not equal the size of the default block (DB_BLOCK_SIZE).  Its size is determined by the parameter DB_NK_CACHE_SIZE

BUFFERS TO ALLOW DIFFRENT BEHAVIOUR :

- DEFAULT CACHE : Cache where blocks of those objects are stored which are not assigned to keep or recycle cache or assigned default cache.

- KEEP CACHE : Cache to hold blocks of small sized objects for a longer time.Its size is determined by the parameter DB_KEEP_CACHE_SIZE.

- RECYCLE CACHE : Cache designed to quickly age out blocks of rarely accessed large sized objects. Its size is determined by the parameter DB_RECYCLE_CACHE_SIZE.

Each buffer in any of the above caches can be in any of the following states :

- FREE : Free buffers can be further of two kinds:
         - Empty or unused
         - Buffers which are identical to the blocks on disk because
           . they have not been dirtied or
           . they were dirtied and have been written to disk

- PINNED : A buffer being accessed (read/written) by a process.

- DIRTY : A buffer which holds a block that has been changed by a server process and hence needs to be writted to the disk before it can be reused.

Buffer headers store metadata about contents of the buffers and are used to manage the cache.The metadata includes information like Data block address (DBA) of the block, type of block (data,undo), touch count, pointer to the buffer, pointers to previous/next buffer in cache bufferchain etc. There is one row in V$BH for every buffer. Buffer headers keep track of various attributes and state of the buffers in buffer cache. The blocks are placed in the buffer cache in the form of linked lists (cache buffer chains) in different hash buckets. While searching for blocks in buffer cache oracle does not look at them buffer by buffer as it will bring down the performance drastically. Let us take the example of a library where we are looking for a blook. If we search the whole library book by book it will take very long but if the books are placed subject wise, we first need to go the rack containing books of the desired subject and then we search for the book we want. Our search will be over rather quickly. In the same manner, buffers are placed in different hash buckets depending upon the hash value of the data block address (DBA). While searching for a particular block, Oracle first hashes the DBA of the desired block, locates the hash bucket containing the block and then searches the blocks in that hash bucket. Different DBAs can hash to the same value (similar to modulus function i.e mod (5,4) = mod (9,4) = mod(13,4)....). Hence, one hash bucket can contain blocks with different DBA's. Each hash chain is protected by a latch. Processes need to get the relevant latch to allow them to scan a hash chain for a buffer so that while searching, the linked list underneath does not change. These latches are called cache buffer chain latches CBC latches). One CBC latch protects multiple hash buckets.

   To implement read consistency, a hash chain can also contain multiple versions of the  same block (DBA). For each DBA in a bucket, there is at least one current block which contains all the committed/uncommitted changes in different rows in that block made by different users. When a user requests a row from that block, he should receive a read consistent copy of the block i.e. the copy should have all the changes (committed / uncommitted)  made by him and only the committed changes made by other users. If the desired read consistent clone is already there, it is read and the user gets to see the results. If the read consistent clone is not there, a copy of the current block is made and marked as clone. Changes are rolled back on the clone after reading undo and then the user os displayed the results. In this case, No. of consistent gets are higher. If a block is rapidly changed and is rapidly queried, multiple CR copies can exist in buffer cache. Such a  block is called a hot block. All CR copies have the same DBA and hence go to the same hash bucket (CBC). The CBC gets longer and hence takes longer to scan. Multiple users trying to scan the CBC chain may have to wait to obtain the latch on the hash bucket. This can result in contention on CBC latch.

          CBC latch    Hash bucket          Buffer hash chains of           Buffers
                                                           buffer headers

            L1 +------- B1 ----------------- A <-> C                                     A
               |                                                                                         B
               +------- B2 ----------------- B <-> F <-> E <-> B                    C
                                                        |_____________|                     D
                                                                         |                               E
                                                                CR CLONES                     F

            L2 +------- B3
               |
               +------- B4

            L3 +------- B5
               |
               +------- B6

No. of buckets protected by each latch = 32 - is high so that

1. Hash chains linked to a bucket will be smaller.

2. No. of latches will be less since latch is a fairly large structure with along code path.

No. of hash buckets is decided by an uncodumented parameter _db_block_hash_buckets


  Whenever any block is modified, changes are always made on the current copy of the block and the statistic db block gets is incremented. Th information in hash buckets is actually the linked list of buffer headers pointing to blocks in DB buffercache. The latch on the hash chain can be obtained in

- SHARED MODE: to scan the buffers

- EXCLUSIVE MODE: To link/delink blocks to/from  the chain

To know more about consistent reads in oracle click the following link: http://koenigocm.blogspot.in/search/label/CONSISTENT%20READS%20IN%20ORACLE

                ALGORITHM FOR HASH LOOKUP
         ---------------------------------------

1. Hash the DBA (Tablespace-id + File-id + Block-id)

2. Find the corresponding hash bucket .

3. Try to get the latch protecting the hash bucket

   If not success

      - Spin for spin count times

        . If latch was not got while spinnning,
              then sleep, wakeup and go to step 3(try to get latch)

          else (latch obtained)
              go to step 4(walk the chain)

   else ( latch obtained)
       Go to step 4 (walk the chain)

4. Walk the chain, reading buffer headers to see if specific version of the buffer is already in the chain.

   If found (logical I/O)

      Access the buffer in buffer cache with protection of buffer pin/unpin action.

   else (not found) (Physical I/O)

      - find a free buffer in buffer cache (obtain LRU latch)
      - unlink the buffer header for that buffer from its current chain (LRU chain)
      - Link that buffer header with this buffer chain
      - Release the latch and read the block into that free buffer with buffer header unpinned.
   
                        WORKING SETS

Buffers in the buffer cache are divided into working sets to improve scalability of common buffer cache operation. Each working set has its own LRU list and two check point queues.

- One DB writer may be responsible to write buffers in multiple working sets. Therefore, write I/O workload can be distributed among various DB writers.

- Since each working set has  its own LRU list, competition between processes trying to look for free buffers is reduced as differebt readers pick buffers from different working sets.

     No. of working sets in 11g R2 = CPU_COUNT
     No. of DB writers in 11g R2 = CPU_COUNT/8

Buffers in DB buffer cache are managed using various doubly linked lists:

- LRU list : contains list of free / dirty / pinned buffers. It has a

    . Least Recently used (LRU) end which has cold buffers which have not been accessed for a long time.

    . Most recently used (MRU) end which have hot buffers which have been recently accessed.

  When a block is first read into the buffer, it is placed in the middle of the LRU list and moved to the LRU/MRU end depeneding upon how frequently and how recently it has been accessed which is further decided by a buffer's touch count. The touch count of a buffer is stored in its header .When a server process scans the LRU list to find free buffers, it moves the buffers to the MRU/LRU end depending upon its touch count. Hence, it is possible that some buffers on the LRU end might not be really cold. They may be there because they have not been searched by a server process yet.

  As instance activity increases, dirty buffers on LRU need to be flushed to the disk so that they can be reused. Since blocks need to be written to disk in the order in which they were dirtied, anothet list is maintained i.e.e checkpoint queue which contains dirty blocks in the order of SCN#.

- Checkpoint queue : contains dirty buffers in the order in which they were dirtied (SCN#). Whenever a block is dirtied, it is attached here. A buffer can be simlultaneously in both LRU list and checkpoint queue.

- Write list - Main : contains dirty buffers which have to be written by DB writer to free some buffers in the buffer cache. When a Server process is scanning LRU list for a free buffer, dirty buffers in LRU list are  delinked from LRU list before linking them to this list.

- Write list - Aux : contains dirty buffers  to which have already been written to the online redo log files and are currently being written by DB writer to data files.

In the next article I will discuss about checkpointing.
----------------------------------------------------------------------------------------------------------

Related links:


HOME

BUFFER CACHE WAIT EVENTS
CASE STUDY ON BUFFER BUSY WAIT EVENT
CHECKPOINTS CONFUSIONS CLEARED
ORACLE CHECKPOINTS
ORACLE DATABASE BUFFER CACHE
SIMULATE BUFFER BUSY WAIT AND IDENTIFY HOT OBJECT
 CONSISTENT READS IN ORACLE PART-I
http://koenigocm.blogspot.in/search/label/CONSISTENT%20READS%20IN%20ORACLE



CONSISTENT READS IN ORACLE PART-II 

FLUSH BUFFER CACHE