Thursday, 6 November 2014

Capture SQL PLAN BASELINE from AWR

Reference Metalink document :789888.1

Going through my Scripts i have found a very useful document today. thought it would be worth sharing online.


1- Create SQL Tuning Set 


exec DBMS_SQLTUNE.CREATE_SQLSET('4yjj6c21g246w_STS');


2- Load SQL Plan from AWR in STS

declare
baseline_ref_cursor DBMS_SQLTUNE.SQLSET_CURSOR;
begin
open baseline_ref_cursor for
select VALUE(p) from table(DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(46634, 46635,'sql_id='||CHR(39)||'4yjj6c21g246w'||CHR(39)||' and plan_hash_value=1171167410',NULL,NULL,NULL,NULL,NULL,NULL,'ALL')) p;
DBMS_SQLTUNE.LOAD_SQLSET('4yjj6c21g246w_STS', baseline_ref_cursor);
end;
/


3- the loaded plan details



Verify how many sqls got loaded in the STS.

SELECT NAME,OWNER,CREATED,STATEMENT_COUNT FROM DBA_SQLSET where name='4yjj6c21g246w_STS';


Verify the sql statements and its sql_id in the STS

select sql_id, substr(sql_text,1, 30) text
from dba_sqlset_statements
where sqlset_name = '4yjj6c21g246w_STS'
order by sql_id;

Verify the execution Plan of a SQL_ID in the STS for an user sql

SELECT * FROM table (DBMS_XPLAN.DISPLAY_SQLSET('4yjj6c21g246w_STS','4yjj6c21g246w'));


4- Load the the Plan in STS as a Baseline

set serveroutput on
declare
my_integer pls_integer;
begin
my_integer := dbms_spm.load_plans_from_sqlset (
sqlset_name => '4yjj6c21g246w_STS',
sqlset_owner => 'SYS',
fixed => 'NO',
enabled => 'YES',
basic_filter => 'parsing_schema_name != ''SYS''');
DBMS_OUTPUT.PUT_line(my_integer);
end;
/

Wednesday, 29 October 2014

ASMLIB COnfiguration (root.sh on Node 2 Fails with ORA-15018 ORA-15072 ORA-15080 errors)

References: 

1) Configuring Oracle ASMLib on Multipath Disks 
http://www.oracle.com/technetwork/topics/linux/multipath-097959.html

2) Oracle support document: 1670374.1



For a RAC installation we had two nodes to configure. Multipath disks are already configured 
with storage tools provided (e.g EMC power path). We assume the multipath disks have been given the name 
" multipatha", so as to be storage vendor agnostic. 

Error Faced:

While running the root.sh on node 2 the following error was returned 

-----------------------------------------------------------
Disk Group OCRDG creation failed with the following message:
ORA-15018: diskgroup cannot be created
ORA-15072: command requires at least 1 regular failure groups, discovered only 0
ORA-15080: synchronous I/O operation to a disk failed

Configuration of ASM ... failed
see asmca logs at /u01/app/oracle/cfgtoollogs/asmca for details
Did not successfully configure and start ASM at /u01/grid/oracle/crs/product/11.2.0.3/crs/install/crsconfig_lib.pm line 6763.
-------------------------------------------------------------

Background:


There are two issues with using multipath disks in ASM. ASM cannot handle seeing the same disk more than once. 
If it does, it will cause an error. A single disk can appear multiple times in a multipath configuration: 
1)Unique paths to the disk ( in our case we had four different paths to the disk)
2)The multipath access point
For example the server has disk /dev/sda presented from an external storage; the server has four (in our case)
connections, or paths, to that external storage. The Linux SCSI driver 
will see all paths. They will appear as /dev/sdb ,/dev/sdc, /dev/sdd and /dev/sde. Accessing the paths /dev/sdb ,/dev/sdc, /dev/sdd and /dev/sde
will end up on the same disk.

If multipathing is enabled, there will be a multipath disk, for example /dev/multipatha, that can access all paths. That is, any I/O to multipatha can use any of the above path. If a system were using the sdb path, and that cable is unplugged, the system will get an error. But the multipath disk will know to switch to any of the three paths available. 

Mutipath configuration to transparent to most applications. It can use any of the paths, sdb, sdc, or multipatha, and it won't know the difference. ASMLib is the solution. The default configuration does not care which path it uses. 

ASMLib will choose only one of the paths, because ASM cannot handle seeing the same disk more than once. This solves the first issue. ASM only sees one path, and it works. now the second issue is which path does ASM see In its default configuration, ASMLib will choose the first path it finds. This is the first path as reported by Linux. Depending on the storage driver, it could be the multipath, or it could be one of the single paths.

The system administrator wants ASMLib to always use the multipath disk. What's the point of having it if Oracle is not using it? There is no way, however, for ASMLib to know what a multipath looks like. It must be told via its configuration.

Disk Scan Ordering 


Disks are marked for ASMLib using a process described in ASMLib Installation. ASMLib learns what disk are marked during a process called disk scanning. ASMLib runs this scan every time it starts up. The system administrator can also force a scan via the /etc/init.d/oracleasm scandisks command. ASMLib examines each disk in the system. It checks if the disk has been marked for ASMLib. Any disk that has been marked will be made available to ASMLib. 

Normally, ASMLib checks each disk in the exact order the OS lists them. This is fine for most systems. In the last section, we described a case where the OS order isn't good enough. The system administrator wants ASMLib to see multipath disks before it sees single path disks. This allows ASMLib to choose the multipath disk and make it available to Oracle.

ASMLib allows two modifications to the disk scan order. First, it allows exclusion of certain disks. In other words, ASMLib will ignore those disks completely. Second, the system administrator can specify disks that are to be scanned first. Disks in this list are scanned before the rest of the disks in the system.

A multipath configuration can use either facility. The system administrator could choose to exclude all the single path disks. ASMLib will then ignore them, only scanning the multipath disks. Conversely, the administrator could specify that the multipath disks are scanned first. ASMLib will see them first, and prefer them over the single paths.

Solution: Configuration of ASMLIB Scan order

The Oracle ASMLib configuration file is located at /etc/sysconfig/oracleasm. It is a link to file /etc/sysconfig/oracleasm-_dev_oracleasm and tools   actually read the later file. It contains all the startup configuration that the system administrator   specified via the /etc/init.d/oracleasm configure command. That command cannot configure scan ordering.  

The configuration file contains many configuration variables. The "ORACLEASM_SCANORDER" variable specifies disks to be scanned first. The ORACLEASM_SCANEXCLUDE variable specifies the disks that are to be ignored. 

The variables take a whitespace-seperated list of prefix strings to match. In other words, if a disk starts with one of the prefix strings, it matches. For example, the prefix string sd will match all SCSI devices. The disk sda starts with sd. Note that these are not globs. They do not use wildcards. They are simple prefixes. Also note that the /dev/ path is not part of the 
prefix. 

NOTE: When scanning, only the device names known by the kernel are scanned. With device-mapper, the kernel sees the devices as /dev/dm-XX. The /dev/mapper/XXX names are created by udev for human readability. Any configuration of ORACLEASM_SCANORDER or ORACLEASM_SCANEXCLUDE must use the dm prefix. 
A couple examples:

NOTE: If you are hand editing file /etc/sysconfig/oracleasm, make sure the link to /etc/sysconfig/oracleasm-dev_oracleasm is not broken.


Multipath Disks First 

The system administrator configures ASMLib to scan the multipath disks first. In the ASMLib configuration file she edits the ORACLEASM_SCANORDER variable to look like so: 

 ORACLEASM_SCANORDER="multipath dm"
 or in our case we used 
 ORACLEASM_SCANORDER="dm"
  
During a scan, ASMLib first tries all disks that start with "multipath".  The multipath device /dev/multipatha certainly matches.  It is scanned first. Next, ASMLib tries all disks that start with "dm". This is all the SCSI disks. The local disk /dev/sda will be scanned, but it is not an ASM disk. The single path disks /dev/sdb and /dev/sdc are also scanned.  They are ASM disks, but ASMLib will see that it already has a path to that disk. It will ignore them.  Finally, ASMLib will scan any other disks that did not match either prefix. 


Excluding Single Path Disks 

The system administrator configures ASMLib to ignore the single path disks. In the ASMLib configuration, he edits the ORACLEASM_SCANEXCLUDE variable to look like so: 

ORACLEASM_SCANEXCLUDE="sd"



Running the root.sh after configuring the following two parameters; the issue was resolved and the root.sh was successful.
  • ORACLEASM_SCANORDER
  • ORACLEASM_SCANEXCLUDE









Thursday, 9 October 2014

ORA-28000 "the account is locked" in the standby database, even after the account was unlocked in the primary.

It  was a very interesting case i came across. We have a Production database with a Standby database opened in Read-only mode for reporting (the Active data guard Feature). An  user account was locked in the production database. I issued the following command to unlock it.

SQL> alter user <username> account unlock;

wasn't that easy. Now as per my understanding the user account should have been unlocked in the standby database as well. Just to check i logged in the standby database

SQL> conn <username>/<password>

ERROR:
ORA-28000: the account is locked


now what !!!!    Still, the user cannot connect to the standby, getting the same error ORA-28000.

Luckily i found the following document on the Oracle Support. "ORA-28000 "the account is locked" in the standby database, even after the account was unlocked in the primary. (Doc ID 1600401.1)".

which states the cause of the issue as

"A privileged user (a database administrator) must unlock account in the standby database.  It is not enough to unlock the same account in the primary database.  This is because the standby is open read-only and cannot update any tables.  When a user's account has to be locked on the standby database, it is locked only in memory there."


The solution to the problem is

"A privileged user (sysdba, the database administrator) must logon to the standby and unlock the account there.  A message ORA-28015 "Account unlocked, but the database is open for read-only access" confirms that the account is now unlocked in the standby database.  From then on, the user can logon to the standby database without getting any error."

Thursday, 2 October 2014

Truncate Table with Foreign Key constraint

Truncating the Tables i always a better option when we want to delete all rows in the table as Dropping and re-creating a table invalidates dependent objects of the table, requires you to re grant object privileges on the table, and requires you to re-create the indexes, integrity constraints, and triggers on the table and re specify its storage parameters. Truncating has none of these effects.

The "Truncate Table" also performs of following tasks
  • Deallocates all space used by the removed rows except that specified by the MINEXTENTS storage parameter
  • Sets the NEXT storage parameter to the size of the last extent removed from the segment by the truncation process 
But there is one challenge in truncating a table when there is referential integrity constraint involved. I have written a small script to first disable the foreign key constraint on the child table. After then perform the truncate operation and enable the foreign key constraints again. Lets perform a small demo.

Parent table:

SQL> CREATE TABLE TEST1 (
  2  id    NUMBER        PRIMARY KEY,
  3  name  VARCHAR2(20)  NOT NULL
  4  );
Table created.


Child table:

SQL> CREATE TABLE TEST2 (
  2  ctry_id         NUMBER        NOT NULL,
  3  cont_id         NUMBER        NOT NULL REFERENCES TEST1
  4  );
Table created.

Now While trying to truncate the Parent table. The following errors pops up
SQL> truncate table TEST1;
truncate table TEST1
               *
ERROR at line 1:
ORA-02266: unique/primary keys in table referenced by enabled foreign keys

The very useful script disable all the foreign keys referencing the parent table (or list of tables to be truncated)
SQL> select 'alter table '||owner||'.'||TABLE_NAME||' disable constraint '||constraint_name ||';'
  2  r_constraint_name
  3  from dba_constraints
  4  where constraint_type='R'
  5  and r_constraint_name in
  6  (select constraint_name from dba_constraints
  7  where constraint_type in ('P','U') and table_name='TEST1');

R_CONSTRAINT_NAME
----------------------------------------------------------------------------------------------------
alter table SYSTEM.TEST2 disable constraint SYS_C00821334;
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Disable the Foreign Keys identified by the Script:

SQL> alter table SYSTEM.TEST2 disable constraint SYS_C00821334;
Table altered.

The table truncate is successful now

SQL> truncate table TEST1;
Table truncated.

The script Enable all the foreign keys referencing the parent table (or list of tables)

SQL> select 'alter table '||owner||'.'||TABLE_NAME||' ENABLE NOVALIDATE constraint '||constraint_name ||';'
  2  r_constraint_name
  3  from dba_constraints
  4  where constraint_type='R'
  5  and r_constraint_name in
  6  (select constraint_name from dba_constraints
  7  where constraint_type in ('P','U') and table_name='TEST1');

R_CONSTRAINT_NAME
----------------------------------------------------------------------------------------------------
alter table SYSTEM.TEST2 ENABLE NOVALIDATE constraint SYS_C00821334;
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Finally Enable the Foreign Keys identified by the Script:

SQL> alter table SYSTEM.TEST2 ENABLE NOVALIDATE constraint SYS_C00821334;
Table altered.

Tuesday, 30 September 2014

enq: TX – allocate ITL entry

This is one of favorite Blog post i have submitted in my old blog "jehanzebahsan.wordpress.com" . quoting the same Text Here
----------------------------------------------------------------------------------------------------------------------------------------------
A few days ago i wrote an email regarding “enq: TX – allocate ITL entry” seen on one of our production databases to my development team. i just pasted that here in the post

Issue:

The issue we have faced on a DB  was extensive locking on the TBL_jehanzeb (an arbitrary name used here) table. The specific wait events we have observed on the database were “enq: TX – allocate ITL entry”
Our EM grid control was showing some horrific hills; and we were especially concerned about the brown ones ( the configuration waits).. As a DBA i have destroyed my aesthetics ; because i cant admire such a beautiful blend of colors on the EM GRID control .

Analysis:

On the particular day the traffic on the database was very high raising the Transactions per seconds to 90k ( which on average is 40-50k )
The Particular wait event we faced on this table explains the situation as following.
This is the snapshot of an Oracle Block. With the value of PCTFREE=10 in the TBL_jehanzeb table

The oracle block Header the top part shown in the figure contains the ITL ( interested transaction list) i-e a linked link where each node represents the a single interested transaction. As the number of interested transaction (especially updates) grow the header starts expanding downwards and the row data starts expanding upwards until there’s a situation shown in the figure below
There is no more room in the block for a new ITL entry. A new transaction comes in to update Record3. The transaction will have to wait. This is not the same wait as a row lock; because there is no lock on the row marked Record3.Instead, session will wait on a special wait event known as the “enq: TX – allocate ITL entry”.  The moment one of the transactions – from either Session1 or Session2 end by commit or rollback, the new transaction can grab that ITL slot and complete the locking operation. Until then the session3 will acquire a block level exclusive lock.
The  V$SEGMENT_STATSTICS shows the significantly high ITL waits on the table TBL_jehanzeb as compared to the closest competitors . (Rest of the table names omitted using the renowned MS paint again for the INFO. SECUrity reasons)
Solution:

  • Change the application design to avoid this bone of contention by reducing the number of hits on the table TBL_jehanzeb
  • Increase the PCTFREE and INITTRANS parameter as the database level for this particular table.
    • alter table <name> pctfree 50;
    • alter table <name> initrans 100

Monday, 29 September 2014

Sessions PGA memory utilization Scripts

Following are some of the very useful PGA memory utilization scripts.

Total PGA Memory utilization for all Sessions

SELECT sum(trunc((VALUE/1024/1024))) || ' MB' "TOTAL MEMORY FOR ALL SESSIONS"
    FROM V$SESSTAT, V$STATNAME
    WHERE NAME = 'session pga memory'
    AND V$SESSTAT.STATISTIC# = V$STATNAME.STATISTIC#;


PGA Memory for all sessions individually 

select s.program,p.pid,s.sid,spid,s.program,s.status,
round(PGA_USED_MEM/(1024*1024),2) as PGA_USED_MEM,
round(PGA_ALLOC_MEM/(1024*1024),2) as PGA_ALLOC_MEM,
round(PGA_FREEABLE_MEM/(1024*1024),2) as PGA_FREEABLE_MEM,
round(PGA_MAX_MEM/(1024*1024),2) AS PGA_MAX_MEM
from v$process p, v$session s where s.paddr = p.addr(+)
and s.sid not in (select sid from v$mystat)
and s.machine in ('BSS\TPSIPAPP01','BSS\TPSIPAPP02')
order by p.pga_alloc_mem desc;

Friday, 26 September 2014

Why The Data Files Got Extended Up Over The Weekend

The title i use for this post is exactly the same as the Oracle Support document "Doc ID 1538442.1".

One fine day i logged in to one of my Production databases for a routine health check and found the database Physical size has grown almost 500GB in just 2 days.  That was a biggg shock as the DB growth trend was never like that before. I knew he data itself cannot grow such to such a figure in just two days so why did the tablespaces autoextends with any check. 

the first that comes to my mind the recent changes in the database environment and there was only one; the oracle database 11g upgrade from 10g.

The document i mentioned in my title comes to the rescue. It expalined the behaviuor i have just explained. Quoting the exact words

"There are a new feature in 11g that can do this behavior, SMCO (Space Management Coordinator) background process

SMCO coordinates the following space management tasks. It performs proactive space allocation and space reclamation.
The main advantage is that sessions need not wait for reactive space allocation/deallocation operations as this is proactively done, and hence a gain in performance."

I think the Oracle 11g thinks the DBAs are not wise enough to handle space management tasks. So i decided i won't let Oracle database to take control in my authority :) ; and disable the feature.

"This feature can be disabled by the following step:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SQL> ALTER SYSTEM SET "_ENABLE_SPACE_PREALLOCATION" = 0;

The feature can be turned on again any time by setting "_enable_space_preallocation"=3 which is the default value:
SQL> ALTER SYSTEM SET "_ENABLE_SPACE_PREALLOCATION" = 3;

NOTE:
- The parameter is dynamic and the scope is ALTER SYSTEM.
- Please use the double quotes as indicated to prevent an ORA-911 because of the leading '_'
"

Thursday, 25 September 2014

Exadata Storage Space Calculation

Exadata the Oracle engineered database machine has taken the OLTP database database performance to the next level. It has a number of power pack features flash cache, Smart scans and storage indexes that really provides the great returns on the investment.

How much usable storage a cutomer gets with every configurtion of the Exadata (quarter, Half or full Rack) is a little tricky for the customers guage.


Now below diagram shows the Exadata disk architecture for every disk. The physical is virtualized as a cell disk and then a grid disk to be presented to the Disk groups. (i'll keep the disk architecture discussion for some later post)



now for all the cells the structure would be something like



Lets take the example and oracle Exadata quarter rack example ang calculate what exactly the customer gets in terms of storage space.

The Quarter Rack Consists of

- 2 DB nodes
- 3 Storage Cells


Now the 3 Storage cells ( with high performance disks) has the following configuration


Step 1: Calculate the RAW Storage 

total Raw capacity => 12 disk on each cell * 600 GB each disk * 3 cells
                   => 21600GB or 21.09TB


now 30 GB on the first two disks of every cell is reserved for Linux OS. So by default to a disk group is created with 30 GB of disk space from remaining 10 disks from every cell.


so we have a Disk Group name DBFS => 10 disks from each cell * 30 GB from each disk *3 cells
                                                   DBFS => 900 GB

so We are left with 570GB on each disk i-e a total of "20520 GB" or  20.03TB 



Step 2: Now consider the mirroring requirement

 After NORMAL Redundancy (i-e double mirroring) Avialable RAW CAPACITY 20.03/2 =  10.01TB.

Step 3: Reserve free space for fault tolerance. 

with Disk failure coverage (DFC) one disk of 600GB is required to protect against the failure of one disks in a cell.

NOTE: that is a very important part. You need to plan for a Disk failure coverage or for complete cell node failure coverge. The ORACLE support document "Understanding ASM Capacity and Reservation of Free Space in Exadata (Doc ID 1551288.1)" clearly describes the different strategies and their pros and cos.

NOTE: After the application of BPS10 (11.2.0.4.10) the default is now the Disk Failure Coverage rather than the Cell Failure Coverage.


 This means that available is 10.01 TB - 0.57TB (i-e 600GB) =>  9.4 TB

Step 4: Decide the proportion of the Recovery Area. For example 60/40 (40% for recovery area)


 So the Recovery area size (RECODG) would be => 3.76 TB
and the DATADG would be => 5.64TB


                                                       



Wednesday, 24 September 2014

Oracle Real-Time SQL Monitoring

Oracle real-time SQL Monitoring is one of the very exciting features i have used recently. Its a very effective way to identify problems related to performance in long running queries especillay with the ones with complex execution Plans. It shows Stats at every step in the execution plan with key performance metrics including elapsed time, CPU time, number of reads and writes, I/O wait time and various other wait times. It allows us to have a deeper look inside the query plans and pin point the exact pain areas and then develop a startegy for SQL tuning.

The good part is the SQL Monitoring is also available with Oracle Enterprise Manager. Below is the Snapshot from the Enterprise manager 12c




It can also be manually Extracted using the following script.
In the below example I am going to extract a SQL monitoring report for a very complex SQL. The SQL monitoring report extracted in HTML format clearly identify the steps where the most of the time id spent on

SET LONG 1000000SET LONGCHUNKSIZE 1000000SET LINESIZE 1000SET PAGESIZE 0SET TRIM ONSET TRIMSPOOL ONSET ECHO OFFSET FEEDBACK OFF
SPOOL report_sql_monitor.htmSELECT DBMS_SQLTUNE.report_sql_monitor(sql_id => '2fj0jatxnstyu', -- pass the SQL id as the argumenttype => 'HTML',report_level => 'ALL') AS reportFROM dual;SPOOL OFF



The step # 32 shows 96% of the CPU time is spent on it.