ORACLE CHECKPOINTS:
======================
In this post, I will explain about checkpoints – their purpose and different types of checkpoints.
PURPOSE OF CHECKPOINTS
Database blocks are temporarily stored in Database buffer cache.
As blocks are read, they are stored in DB buffer cache so that if any
user accesses them later, they are available in memory and need not be
read from the disk. When we update any row, the buffer in DB buffer
cache corresponding to the block containing that row is updated in
memory. Record of the change made is kept in redo log buffer . On
commit, the changes we made are written to the disk thereby making them
permanent. But where are those changes written? To the datafiles
containing data blocks? No !!! The changes are recorded in online redo
log files by flushing the contents of redo log buffer to them.This is
called write ahead logging. If the instance crashed right now, the DB
buffer cache will be wiped out but on restarting the database, Oracle
will apply the changes recorded in redo log files to the datafiles.
Why doesn’t Oracle write the changes to datafiles right away
when we commit the transaction? The reason is simple. If it chose to
write directly to the datafiles, it will have to physically locate the
data block in the datafile first and then update it which means that
after committing, user has to wait until DBWR searches for the block and
then writes it before he can issue next command. This will bring down
the performance drastically. That is where the role of redo logs comes
in. The writes to the redo logs are sequential writes – LGWR just dumps
the info in redologs to log files sequentially and synchronously so that
the user does not have to wait for long. Moreover, DBWR will always
write in units of Oracle blocks whereas LGWR will write only the changes
made. Hence, write ahead logging also improves performance by reducing
the amount of data written synchronously. When will the changes be
applied to the datablocks in datafiles? The data blocks in the datafiles
will be updated by the DBWR asynchronously in response to certain
triggers. These triggers are called checkpoints.
Checkpoint is a synchronization event at a specific point in time
which causes some / all dirty blocks to be written to disk thereby
guaranteeing that blocks dirtied prior to that point in time get
written.
Whenever dirty blocks are written to datafiles, it allows oracle
- to reuse a redo log : A redo log can’t be reused until
DBWR writes all the dirty blocks protected by that logfile to disk. If
we attempt to reuse it before DBWR has finished its checkpoint, we get
the following message in alert log : Checkpoint not complete.
- to reduce instance recovery time : As the memory available
to a database instance increases, it is possible to have database
buffer caches as large as several million buffers. It requires that the
database checkpoint advance frequently to limit recovery time, since
infrequent checkpoints and large buffer caches can exacerbate crash
recovery times significantly.
- to free buffers for reads : Dirtied blocks can’t be used
to read new data into them until they are written to disk. Thus DBWrR
writes dirty blocks from the buffer cache, to make room in the cache.
Various types of checkpoints in Oracle :
– Full checkpoint– Thread checkpoint
- File checkpoint
- Parallel Query checkpoint
- Object checkpoint
- Log switch checkpoint
_ Incremental checkpoint
Whenever a checkpoint is triggered :
- DBWR writes some /all dirty blocks to datafiles
- CKPT process updates the control file and datafile headers
FULL CHECKPOINT
- Writes block images to the database for all dirty buffers from all instances.
- Statistics updated
. DBWR checkpoints
. DBWR checkpoint buffers written
. DBWR thread checkpoint buffers written
- Caused by :
. Alter system checkpoint [global]
. ALter database begin backup
. ALter database close
. Shutdown [immediate]
- Controlfile and datafile headers are updated
. Checkpoint_change#
THREAD CHECKPOINT
- Writes block images to the database for all dirty buffers from one instance
- Statistics updated
. DBWR checkpoints
. DBWR checkpoint buffers written
. DBWR thread checkpoint buffers written
- Caused by :
. Alter system checkpoint local
- Controlfile and datafile headers are updated
. Checkpoint_change#
FILE CHECKPOINT
When a tablespace is put into backup mode or take it offline,
Oracle writes all the dirty blocks from the tablespace to disk before
changing the state of the tablespace.
- Writes block images to the database for all dirty buffers for all files of a tablespace from all instances
- Statistics updated
. DBWR checkpoints
. DBWR tablespace checkpoint buffers written
. DBWR checkpoint buffers written
- Caused by :
. Alter tablespace xxx offline
. Alter tablespace xxx begin backup
. Alter tablespace xxx read only
- Controlfile and datafile headers are updated
. Checkpoint_change#
PARALLEL QUERY CHECKPOINT
Parallel query often results in direct path reads (Full
tablescan or index fast full scan). This means that blocks are read
straight into the session’s PGA, bypassing the data cache; but that
means if there are dirty buffers in the data cache, the session won’t
see the most recent versions of the blocks unless they are copied to
disk before the query starts – so parallel queries start with a
checkpoint.
- Writes block images to the database for all dirty buffers belonging to objects accessed by the query from all instances.
- Statistics updated
. DBWR checkpoints
. DBWR checkpoint buffers written
- Caused by :
. Parallel Query
. Parallel Query component of Parallel DML (PDML) or Parallel DDL (PDDL)
- Mandatory for consistency
- Controlfile and datafile headers are updated
. Checkpoint_change#
OBJECT CHECKPOINT
When an object is dropped/truncated, the session initiates an
object checkpoint telling DBWR to copy any dirty buffers for that object
to disk and the state of those buffers is changed to free.
- Writes block images to the database for all dirty buffers belonging to an object from all instances.
- Statistics updated
. DBWR checkpoints
. DBWR object drop buffers written
- Caused by dropping or truncating a segment:
. Drop table XXX
. Drop table XXX Purge
. Truncate table xxx
. Drop index xxx
- Mandatory for media recovery purposes
- Controlfile and datafile headers are updated
. Checkpoint_change#
LOG SWITCH CHECKPOINT
- Writes the contents of the dirty buffers whose information is protected by a redo log to the database .
- Statistics updated
. DBWR checkpoints
. DBWR checkpoint buffers written
. background checkpoints started
. background checkpoints completed
- Caused by log switch
- Controlfile and datafile headers are updated
. Checkpoint_change#
INCREMENTAL CHECKPOINT
Prior to Oracle 8i, only well known checkpoint was log switch
checkpoint. Whenever LGWR filled an online logfile, DBWR would go into a
frenzy writing data blocks to disks, and when it had finished, Oracle
would update each data file header block with the SCN to show that file
was updated up to that point in time.
Oracle 8i introduced incremental checkpointing which triggered
DBWR to write some dirty blocks from time to time so as to advance the
checkpoint and reduce the instance recovery time.
Incremental checkpointing has been implemented using two algorithms :
- Ageing algorithm
- LRU/TCH algorithm
AGEING ALGORITHM
This strategy involves writing changed blocks that have been dirty
for the longest time and is called aging writes. This algorithm relies
on the CKPT Q running thru the cache and buffers being linked to the end
of this list the first time they are made dirty.
.The LRU list contains all the buffers – free / pinned / dirty.
Whenever a buffer in LRU list is dirtied, it is placed in CKPT Q as well
i.e. a buffer can simultaneously have pointers in both LRU list and
CKPT Q but the buffers in CKPT Q are arranged in the order in which they
were dirtied.Thus, checkpoint queue contains dirty blocks in the order
of SCN# in which they were dirtied
Every 3 secs DBWR wakes up and checks if there are those many
dirty buffers in CKPT Q which need to br written so as to satisfy
instance recovery requirement..
If those many or more dirty buffers are not found,
DBWR goes to sleep
else (dirty buffers found)
.CKPT target RBA is calculated based on
– The most recent RBA
– log_checkpoint_interval
– log_checkpoint_timeout
– fast_start_mttr_target
– fast_start_io_target
– 90% of the size of the smallest redo log file
. DBWR walks the CKPT Q from the low end (dirtied earliest) of
the redo log file collecting buffers for writing to disk until it
reaches the buffer that is more recent than the target RBA. These
buffers are placed in write list-main.
. DBWR walks the write list-main and checks all the buffers
– If changes made to the buffer have already been written to redo log files
. Move those buffers to write-aux list
else
. Trigger LGWR to write changes to those buffers to redo logs
. Move those buffers to write-aux list
. Write buffers from write-aux list to disk
. Update checkpoint RBA in SGA
. Delink those buffers from CKPT Q
. Delink those buffers from write-aux list
- Statistics Updated :
. DBWR checkpoint buffers written
- Controlfile updated every 3 secs by CKPT
. Checkpoint progress record
As sessions link buffers to one end of the list, DBWR can
effectively unlink buffers from the other end and copy them to disk. To
reduce contention between DBWR and foreground sessions, there are two
linked lists in each working set so that foreground sessions can link
buffers to one while DBWR is unlinking them from the other.
LRU/TCH ALGORITHM
LRU/TCH algorithm writes the cold dirty blocks to disk that are on the point of being pushed out of cache.
As per ageing algorithm, DBWR will wake up every 3 seconds to
flush dirty blocks to disk. But if blocks get dirtied at a fast pace
during those 3 seconds and a server process needs some free buffers,
some buffers need to be flushed to the disk to make room. That’s when
LRU/TCH algorithm is used to write those dirty buffers which are on the
cold end of the LRU list.
Whenever a server process needs some free buffers to read data,
it scans the LRU list from its cold end to look for free buffers.
While searching
If unused buffers found
Read blocks from disk into the buffers and link them to the corresponding hash bucket
if it finds some clean buffers (contain data but not dirtied or dirtied and have been flushed to disk),
if they are the candidates to be aged out (low touch count)
Read blocks from disk into the buffers and link them to the corresponding hash bucket
else (have been accessed recently and should not be aged out)
Move them to MRU end depending upon its touch count.
If it finds dirty buffers (they are already in CKPT Q),
Delink them from LRU list
Link them to the write-main list (Now these buffers are in CKPT Q and write-main list)
The server process scans a threshold no. of buffers
(_db_block_max_scan_pct = 40(default)). If it does not find required no.
of free buffers,
It triggers DBWR to dirty blocks in write-mainlist to disk
. DBWR walks the write list-main and checks all the buffers
– If changes made to the buffer have already been written to redo log files
. Move those buffers to write-aux list
else
. Trigger LGWR to write changes to those buffers to redo logs
. Move those buffers to write-aux list
. Write buffers from write-aux list to disk
. Delink those buffers from CKPT Q and w rite-aux list
. Link those buffers to LRU list as free buffers
Note that
- In this algorithm, the dirty blocks are delinked from LRU list
before linking them to write-main list in contrast to ageing algorithm
where the blocks can be simultaneously be in both CKPT Q and LRU list.
- In this algorithm, checkpoint is not advanced because it may be
possible that the dirty blocks on the LRU end may actually not be the
ones which were dirtied earliest. They may be there because the server
process did not move them to the MRU end earlier. There might be blocks
present in CKPT Q which were dirtied earlier than the blocks in
question.
I hope the information was useful. Thanks for your time.
Keep visiting the blog…
http://www.dbafree.net/wp-content/uploads/2011/05/CheckPoints.pdfhttp://jonathanlewis.wordpress.com/2007/04/12/log-file-switch/
No comments:
Post a Comment