Fractured Block Corruption Solution

Selamlar,

Prod ortamında bir datafile’da “fractured” block corruption ile karşılaştık. Aldığımız hatalar;

Hex dump of (file 73, block 2525) in trace file /oracle/app/diag/rdbms/spark/spark/trace/spark_ora_22533.trc
Corrupt block relative dba: 0x124009dd (file 73, block 2525)
Fractured block found during backing up datafile
Data in bad block:
type: 6 format: 2 rdba: 0x124009dd
last change scn: 0x0000.23e58731 seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0x00000000
check value in block header: 0xb5
computed block checksum: 0x90bf
Reread of blocknum=2525, file=+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237. found same corrupt data
Reread of blocknum=2525, file=+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237. found same corrupt data
Reread of blocknum=2525, file=+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237. found same corrupt data
Reread of blocknum=2525, file=+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237. found same corrupt data
Reread of blocknum=2525, file=+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237. found same corrupt data
RMAN-03009: failure of backup command on ch00 channel at 08/02/2015 08:04:42
ORA-19566: exceeded limit of 0 corrupt blocks for file +DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237

Nedir bu “fractured” block corruption derseniz Oracle’ın açıklaması şu şekilde;

A Fractured Block means that the block is incomplete and considered a Physical Corruption.  Information from the block header does not match the block end / tail.  It is a clear symptom about issues within the Operating System (OS) / Hardware layers.

In order to understand why a Fractured Block happens, we need to understand how a block is written into disk. The block size at OS level does not match the block size at Oracle level, so in order to write an Oracle block, the Operating System needs to perform more than one write.  As an example: if OS block size is 512 bytes and Oracle block size is 8K, the OS performs 16 writes in order to complete the write process.

Oracle keeps track off the header of each block and before writing down to disk updates a 4 byte field/value in the tail of each block (tailchk) to guarantee afterwards consistency check that the block is complete after written.

Corruption’ı Tespit Etme

İlk önce aşağıdaki yöntemler ile corruption’ın tam olarak hangi datafile’da ve block’da olduğunu tespit ediyorum.

  1. RMAN ile advise alabilirsiniz.
    RMAN> list failure;
    List of Database Failures
    =========================
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    67041      HIGH     OPEN      28-JUN-15     Datafile 73: '+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237' contains one or more corrupt blocks
    RMAN> advise failure;
    List of Database Failures
    =========================
    Failure ID Priority Status    Time Detected Summary
    ---------- -------- --------- ------------- -------
    67041      HIGH     OPEN      28-JUN-15     Datafile 73: '+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237' contains one or more corrupt blocks
    analyzing automatic repair options; this may take some time
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=582 device type=DISK
    analyzing automatic repair options complete
    Mandatory Manual Actions
    ========================
    1. No backup of block 2525 in file 73 was found. Drop and re-create the associated object (if possible), or use the DBMS_REPAIR package to repair the block corruption
    2. Contact Oracle Support Services if the preceding recommendations cannot be used, or if they do not fix the failures selected for repair
    Optional Manual Actions
    =======================
    no manual actions available
    Automated Repair Options
    ========================
    no automatic repair options available
  2. Database üzerinden bakıyorum.
    SQL>  select * from V$DATABASE_BLOCK_CORRUPTION;
         FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
    ---------- ---------- ---------- ------------------ ---------
            73       2525          1                  0 FRACTURED
  3. dbverify ile bakıyorum.
    oracle@fmsdb $ dbv file="+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237" blocksize=32768 userid=sys/0racle
    DBVERIFY: Release 11.2.0.4.0 - Production on Sat Aug 15 18:01:14 2015
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    DBVERIFY - Verification starting : FILE = +DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237
    Page 2525 is marked corrupt
    Corrupt block relative dba: 0x124009dd (file 73, block 2525)
    Bad header found during dbv:
    Data in bad block:
    type: 11 format: 2 rdba: 0x12400001
    last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
    spare1: 0x0 spare2: 0x0 spare3: 0x0
    consistency value in tail: 0x00000b01
    check value in block header: 0x67e1
    computed block checksum: 0x0
    DBVERIFY - Verification complete
    Total Pages Examined         : 229376
    Total Pages Processed (Data) : 0
    Total Pages Failing   (Data) : 0
    Total Pages Processed (Index): 229279
    Total Pages Failing   (Index): 0
    Total Pages Processed (Other): 79
    Total Pages Processed (Seg)  : 0
    Total Pages Failing   (Seg)  : 0
    Total Pages Empty            : 17
    Total Pages Marked Corrupt   : 1
    Total Pages Influx           : 0
    Total Pages Encrypted        : 0
    Highest block SCN            : 0 (0.0)

Corruption’ın tam olarak hangi block’da olduğunu tespit ettik. Şimdi bu block kullanılıyor mu diye kontrol ediyorum. Block’un herhangi bir segmentte olup olmadığı alacağımız aksiyon için önemli. Aşağıdaki sql’i çalıştırıyorum.

SELECT tablespace_name, segment_type, owner, segment_name
  FROM dba_extents
WHERE file_id = 73 and 2525 between block_id AND block_id + blocks - 1;

adsasdİlgili block IX_GP_1_PH_2 segmentinde kullanılıyor. Şimdi bu segment’te hangi objenin olduğuna bakıyorum.

asdasdasd

Corruption’ı Çözme

Eğer 2525 block’u herhangi bir segmentte kullanılmıyor olsaydı Oracle şöyle bir çözümü öneriyor;

RMAN Reports an ORA-19566: Exceeded Limit Of 0 Corrupt Blocks From Backup And No Segment Associated (Doc ID 457422.1)

2525 block’u kullanılan bir segmentte olduğu için Block Media Recovery (RMAN) ile repair yapmak istiyorum. Ancak yukarıda da görebileceğiniz gibi ilgili block’un backup’ı olmadığı için RMAN ile repair edemiyorum. Eğer elimde backup’ı olsaydı aşağıdaki şekilde repair edebilirdim.

SQL> recover datafile 73 block 2525;

Block media recovery’de işimi görmediğine göre, yine RMAN’i advise’ını uyguluyorum ve ilgili index’i re-create (rebuild değil, drop ve yeniden create) ediyorum.

 Drop INDEX IX_GP_1_PH_2;
CREATE INDEX "IX_GP_1_PH_2" ON "NIK_GPRS_CDR_001" ("PHONE_NUMBER")
PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
STORAGE
(INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "TS_GPRS_CDR_001IND_32K"

Şimdi aynı şekilde dbverify ve RMAN ile corruption devam ediyor mu diye kontrol ediyorum.

oracle@fmsdb $  dbv file="+DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237" blocksize=32768 userid=sys/0racle
DBVERIFY: Release 11.2.0.4.0 - Production on Fri Aug 21 10:23:14 2015
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = +DATA1/spark/datafile/ts_gprs_cdr_001ind_32k.428.877783237
DBVERIFY - Verification complete
Total Pages Examined         : 229376
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 229192
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 167
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 17
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)
RMAN> list failure;
using target database control file instead of recovery catalog
no failures found that match specification

ve corruption düzelmiş durumda.

Kolay gelsin.

Bir Cevap Yazın

Aşağıya bilgilerinizi girin veya oturum açmak için bir simgeye tıklayın:

WordPress.com Logosu

WordPress.com hesabınızı kullanarak yorum yapıyorsunuz. Log Out / Değiştir )

Twitter resmi

Twitter hesabınızı kullanarak yorum yapıyorsunuz. Log Out / Değiştir )

Facebook fotoğrafı

Facebook hesabınızı kullanarak yorum yapıyorsunuz. Log Out / Değiştir )

Google+ fotoğrafı

Google+ hesabınızı kullanarak yorum yapıyorsunuz. Log Out / Değiştir )

Connecting to %s