【Oracle计数器】CPU used by this session详解
查了资料,介绍说是“这是在用户调用开始和结束之间会话所占用的 CPU 时间(以 10 毫秒为单位)”,我整个场景跑下来,这个值一直处于85%,why? 说清楚点问题的来龙去脉好吗,你这样问问题无法回答。 关注。sdlkfj8ORACLE CPU过高的一次调整过程
发现CPU占用率一般都维持在90%以上,由于大量使用FOR UPDATE,造成大量LATCH等等待现象,其中某些线程SQL又运行非常频繁。很久以前就发现这些问题,给出相对意见,建议不采用FOR UPDATE或在FOR UPDATE上加NOWAIT(见http://spaces.msn.com/sunmoonking/blog/cns!E3BD9CBED01777CA!318.entry “ELECT FOR UPDATE 相关的知识”一章 ),由于项目进程问题一直没进行程序修改,决定在这次解决掉,以解后顾之忧。调整前要知道如下问题
1,CPU是多少MHZ
2,SERVER 端的CPU是否有负荷较重
3,CLIENT 端的CPU是否有负荷较重
4,空闲时间(如半夜)CPU使用率是否超过15%,如果超过则需要特别注意了
5,CPU的PEAK LOAD
6,CPU的IDLE STATE
影响CPU的因素
1,高的无必要的解析会代价昂贵。
发现那些SQL运行了大量的PARSE
select sql_text, parse_calls, executions from v$sqlarea order by parse_calls desc;
SYS的总的PARSE情况
select name, value from v$sysstat where name = ’parse count%’;
只有硬解析才能减少,可以绑定变量,或增加每一个SESSION的CACHED CURSORS。
2,导致大量I/O的SQL也会明显占用CPU,如没有INDEX。BUFFER GETS一般会同CPU一块增长。可以通过v$sqlarea发现buffer_gets
3,其他等待时间,可以通过v$sesstat,v$sysstat查看
#VMSTAT 5 5 的CPU部分
ususer用掉的
sysystem用掉的
id空闲
调整前
TOPAS
Name PIDCPU%PgSp Owner
oracle 68841625.2 4.6 orasbp
oracle 56965824.9 4.5 orasbp
topas 676210 0.1 2.9 root
syncd 77964 0.0 0.5 root
hatsd 159792 0.0 8.3 root
或者ps aux|head
检查CPU数量
/usr/sbin/bindprocessor -q
The available processors are:0 1 2 3
STATSPACK的信息
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
enqueue 5,465 121 1,793 328 0.6
latch free 2,986 2,669 21 7 0.3
根据消耗最多CPU的进程PID来得到SID详细信息
select s.sid from v$process p, v$session s
where s.paddr=p.addr and p.spid =&your_spid;
根据SID查SQL
SELECT SQL_TEXT from V$SQLTEXT_WITH_NEWLINES where HASH_VALUE
= (select sql_hash_value from v$session
where SID = <problem_SID_you_got_from_last_step>) ;
还有
select n.name,s.value
from v$statname n,V$sesstat s
where n.statistic# = s.statistic#
and value > 0
and s.sid = (select a.sid from v$process p,v$session a
where p.addr =a.paddr
and a.terminal = userenv(’terminal’))
order by n.class,n.name
用以上SQL完成SHELL(shell信息在后面whoit.sh),运行
sh whoit.sh 688416来根据PID得到用户信息和SQL语句
在STATSPACK和我的whoit.sh都指定是这个SQL的问题
SELECT demessageid, fromid, apptype, demessage,appversion,vovers
ion ,toid,tag FROM de_receivelogWHERE (status = :1 and rownum<
=1 and (dealtime is null or dealtime<sysdate) and delock=0) or (
delock=1 and Update_Date<sysdate-2/24) order by RECEIVETIME for
update
再看等待事件
select sid||' '||event||' '|| total_waits||' '||average_wait from v$session_event where sid=25
SQL> /
SID||''||EVENT||''||TOTAL_WAITS||''||AVERAGE_WAIT
--------------------------------------------------------------------------------
25 latch free 46180 1
25 control file sequential read 4 0
25 log file sync 1 0
25 db file sequential read 202 0
25 db file scattered read 445 1
25 SQL*Net message to client 22 0
25 SQL*Net message from client 22 0
只有一个DEFAULT 池
NAME HIT_RATIO
---------------------------------------- ----------
DEFAULT .88042806
设置db_keep_cache_size池,并KEEP表
ALTERTABLE customer STORAGE (BUFFER_POOLRECYCLE|KEEP|DFAULT)
SQL> analyze table sbpopt.de_receivelog compute statistics;
建立相关索引
alter tableTI_REPAIR_DEED storage(buffer_pool keep);
alter tablede_receivelogstorage(buffer_pool keep);
看到相关的等待都是LATCH FREE,enqueue,估计是由于SELECT FOR UPDATE并且全表扫描造成的
查看每个Session的CPU利用情况:
select ss.sid||' '||se.command||' '||ss.value CPU ||' '||se.username||' '||se.program
from v$sesstat ss, v$session se
where ss.statistic# in
(select statistic#
from v$statname
where name = 'CPU used by this session')
and se.sid=ss.sid
and ss.sid>6
order by ss.value
根据STATSPACK的HASH VALUE 用SQL>@sprepsql得到
STATSPACK SQL report for Hash Value: 1710202187Module: JDBC Thin Client
DB Name DB Id Instance Inst Num Release Cluster Host
------------ ----------- ------------ -------- ----------- ------- ------------
SBP 3008872479 SBP 1 9.2.0.6.0 NO svodbp01
Start Id Start Time End Id End Time Duration(mins)
--------- ------------------- --------- ------------------- --------------
44 23-May-06 14:13:01 45 23-May-06 14:28:00 14.98
SQL Statistics
~~~~~~~~~~~~~~
-> CPU and Elapsed Time are in seconds (s) for Statement Total and in
milliseconds (ms) for Per Execute
% Snap
Statement Total Per Execute Total
------------------------------------
Buffer Gets: 6,938,821 7,608.4 63.34
Disk Reads: 0 0.0 .00
Rows processed: 197 0.2
CPU Time(s/ms): 508 557.2
Elapsed Time(s/ms): 607 665.8
Sorts: 1,292 1.4
Parse Calls: 191 .2
Invalidations: 0
Version count: 1
Sharable Mem(K): 22
Executions: 912
SQL Text
~~~~~~~~
SELECT demessageid, fromid, apptype, demessage,appversion,vovers
ion ,toid,tag FROM de_receivelogWHERE (status = :1 and rownum<
=1 and (dealtime is null or dealtime<sysdate) and delock=0) or (
delock=1 and Update_Date<sysdate-2/24) order by RECEIVETIME for
update nowait
===============================================================
15分钟内执行了912次,占用22K内存(不多),每次执行1.4个排序,共花了508秒,由于已经采取措施将此表KEEP与内存中,因此没有磁盘读,但每次的逻辑读7,608.4个块
然后将SQL语句的FOR UPDATE改成 FOR UPDATE NOWAIT; 这才是真正解决问题的地方
至此,通过观察CPU一般维持在25%以内。
等待事件
enqueue(队列)Indicates waits associated with internal queuing mechanism for locking various resources and components of Oracle. Please refer to Appendix B of the Oracle8i Reference Manual for the complete list of enqueues in Oracle.是一种保护共享资源的琐,锁可以保护纪录里面数据一类的共享资源,防止两个人同时修改该纪录,fifo,enqueue等待通常是 ST enqueue,HW enqueue ,TX4 enqueue。ST enqueue在数据字典管理方式表空间的物理空间分配和管理上发挥作用。当有此问题时可以转变为本地管理表空间或预先分配扩展。HW enqueue同段的高水位线一起使用,手工分配扩展可避免其上的等待。TX4 是最常见的,1,唯一索引的重复,需要COMMIT/ROLLBACK来释放ENQUEUE。2,多个并发的对同一位图索引片的修改,同上。3,多用户同时修改相同的数据块,如果已经没有空闲的ITL槽,则会引发一数据块的锁,运用增大initrans或maxtrans以容纳更多ITL槽的方法来解决,或增大PCTFREE。(ST:使用本地表空间或预先分配大扩展,HW预先分配扩展于高水水位线之上,TX4增大表或索引的initrans和maxtrans,TM为外键建立索引,察看日志放在快速磁盘上)
latch free(锁存器空闲,拴空闲) Indicates latch contention for the latch# that is being waited on. Ensure that you already have tuned the number of latches to their allowed maximums by setting the relevant init.ora parameters. If the problem persists, you should determine what is causing the contention for the latch and fix the underlying problem. Your goal should be to cure the disease not the symptom. A latch free event is a symptom of a bigger problem. For example, if the latch# derived from this is a library cache latch (assuming that the shared pool is configured appropriately), it may indicate a significant amount of hard parsing. This usually is a problem with applications that have hard-coded values in them. Either rewrite them with bind variables or upgrade to Oracle8i and use CURSOR_SHARING=force (or just look the other way).Latch Free是一种低级的同步锁机制,用以维持某些访寻和执行操作的顺序。Oracle通过enqueue来达成对重做线程/表/事务一类对象的并发使用,而通过Latch来达成对SGA中共享内存结构的保护。Latch速度快而成本低,往往通过单个的内存单元来实现。大部分Latch是互斥的。共享型Latch许可对其内存结构的并发读操作。当请求得到Latch但它已被其它进程占住时,将产生一条Latch free miss的纪录。大多数Latch问题关联到不使用绑定变量(Library Cache Latch),重做日志生成中问题(Redo Allocation Latch),内存缓冲区竞争问题(Cache Buffers LRU Chain)和内存缓冲区中‘热块’的数据块(Cache Buffers Chain)。也有一些Latch Waits是由软件错误而造成的。 首先确认足够的LATCH数,然后确认那个锁存器(V$SESSION_WAIT的P2),再确认是什么引起的.
#cat whoit.sh
sqlplus /nolog <<EOF
connect / as sysdba
col machine format a30
col program format a40
set line 200
select sid,serial# ,username,osuser,machine,program,process,to_char(logon_time,'yyyy/mm/dd hh24:mi:ss')
from v\$session where paddr in
( select addr from v\$process where spid in($1));
select sql_text from v\$sqltext_with_newlines
where hash_value in
(select SQL_HASH_VALUE from v\$session where
paddr in (select addr from v\$process where spid=$1)
)
order by piece;
exit;
EOF 原帖由 rickyzhu 于 2007-6-19 13:51 发表 http://bbs.51testing.com/images/common/back.gif
说清楚点问题的来龙去脉好吗,你这样问问题无法回答。
我其实就是想问这个指标是什么意思
我认为应该不是CPU的占用率,那应该是什么呢?
因为现在测试的一个系统,数据库服务器与应用服务器为同一台机器,在考虑CPU的占用率的时候,就要考虑到底是应用占用了CPU,还是数据库占用了CPU 学习中
回复 #5 dawee 的帖子
没看到你上面那个回复有 CPU used by this session的详解吗?不可教也! 原帖由 shanxi 于 2007-6-24 13:21 发表 http://bbs.51testing.com/images/common/back.gif
没看到你上面那个回复有 CPU used by this session的详解吗?
不可教也!
首先感谢你的帮助
其次要说的是,你动辄就copy一个长篇大论出来,谁会有耐心看,恐怕你自己都没有搞的清清楚楚就直接贴过来给别个看
回复 #4 shanxi 的帖子
很好值得学习 谢谢分享!
回复 #8 dawee 的帖子
BackGround没有,怎么去理解这个概念光给个名词解释肯定说不清楚 mark一下
页:
[1]