Oracle EBS Apps DBA Real-Time Production Troubleshooting Guide
Month-end processing is always a critical period for any Oracle E-Business Suite production environment. During this time, multiple high-volume concurrent programs run together, business users are under pressure, and even one bad SQL plan can delay important financial or operational closure activities.
As an Apps DBA, our responsibility is not only to say that “SQL was running long,” but to provide a proper technical deep dive:
Which SQL_ID caused the delay?
Which concurrent request triggered it?
Which user submitted the request?
Was it a bad execution plan, stale statistics, blocking, TEMP usage, or lack of parallelism?
What is the permanent fix?
This blog post explains a practical production approach to analyze all problematic long-running SQL_IDs during month-end.
1. Objective of Month-End SQL Deep Dive
The main objective is to identify SQL statements that caused abnormal runtime during month-end and prepare a clear RCA with corrective and preventive actions.
During the analysis, we should answer the following questions:
| Question | Evidence Required |
|---|---|
| Which SQL_IDs ran long? | AWR, ASH, GV$SQL |
| Which concurrent request triggered the SQL? | FND concurrent request tables |
| Which program and user were involved? | Request ID, program name, user name |
| What was the main wait event? | ASH / GV$SESSION |
| Did the execution plan change? | Current plan vs historical AWR plan |
| Was SQL running parallel or serial? | GV$PX_SESSION |
| Was there TEMP spill? | GV$TEMPSEG_USAGE |
| Was there any blocking? | GV$SESSION blocking columns |
| What is the root cause? | Plan regression, stale stats, blocking, TEMP, I/O, DOP issue |
| What is the preventive action? | Stats, SPM, scheduling, DOP governance, monitoring |
2. Define the Month-End Issue Window
Before starting any technical analysis, first define the exact time window.
Example:
-- Month-end issue window
-- Start Time : 30-Apr-2026 18:00
-- End Time : 01-May-2026 06:00
Now identify the AWR snapshot IDs for this period:
SELECT snap_id,
instance_number,
begin_interval_time,
end_interval_time
FROM dba_hist_snapshot
WHERE begin_interval_time BETWEEN TO_DATE('30-APR-2026 18:00','DD-MON-YYYY HH24:MI')
AND TO_DATE('01-MAY-2026 06:00','DD-MON-YYYY HH24:MI')
ORDER BY snap_id, instance_number;
These snapshot IDs will be used for historical SQL analysis.
3. Identify Top Problematic SQL_IDs During Month-End
This query helps identify the top SQL statements by elapsed time, CPU time, I/O wait, buffer gets, disk reads, and executions.
SELECT *
FROM (
SELECT ss.sql_id,
ss.plan_hash_value,
SUM(ss.executions_delta) executions,
ROUND(SUM(ss.elapsed_time_delta)/1000000/60,2) elapsed_mins,
ROUND(SUM(ss.cpu_time_delta)/1000000/60,2) cpu_mins,
ROUND(SUM(ss.iowait_delta)/1000000/60,2) io_wait_mins,
ROUND(SUM(ss.clwait_delta)/1000000/60,2) cluster_wait_mins,
ROUND(SUM(ss.apwait_delta)/1000000/60,2) app_wait_mins,
SUM(ss.buffer_gets_delta) buffer_gets,
SUM(ss.disk_reads_delta) disk_reads,
SUM(ss.rows_processed_delta) rows_processed,
ROUND(
SUM(ss.elapsed_time_delta)/1000000 /
NULLIF(SUM(ss.executions_delta),0),2
) avg_elapsed_sec_per_exec
FROM dba_hist_sqlstat ss
JOIN dba_hist_snapshot sn
ON ss.snap_id = sn.snap_id
AND ss.instance_number = sn.instance_number
WHERE sn.begin_interval_time BETWEEN TO_DATE('30-APR-2026 18:00','DD-MON-YYYY HH24:MI')
AND TO_DATE('01-MAY-2026 06:00','DD-MON-YYYY HH24:MI')
GROUP BY ss.sql_id, ss.plan_hash_value
ORDER BY SUM(ss.elapsed_time_delta) DESC
)
WHERE ROWNUM <= 30;
How to Read the Output
| Observation | Meaning |
|---|---|
| High elapsed time and high CPU | SQL is CPU intensive |
| High elapsed time and high I/O wait | Possible full table scan or storage issue |
| High elapsed time and TEMP wait | Sort/hash operation spilled to TEMP |
| High elapsed time but low CPU | Possible blocking or concurrency issue |
| High executions with small average time | SQL is called repeatedly |
| Low executions with very high average time | One or few executions were very expensive |
| Multiple plan hash values | Possible plan instability or regression |
4. Get SQL Text for the SQL_ID
Once the problematic SQL_ID is identified, get the SQL text.
SELECT sql_id,
DBMS_LOB.SUBSTR(sql_text, 4000, 1) sql_text
FROM dba_hist_sqltext
WHERE sql_id = '&SQL_ID';
For current SQL available in memory:
SELECT sql_id,
child_number,
plan_hash_value,
executions,
elapsed_time/1000000 elapsed_sec,
cpu_time/1000000 cpu_sec,
buffer_gets,
disk_reads,
rows_processed,
parsing_schema_name,
module,
action
FROM gv$sql
WHERE sql_id = '&SQL_ID';
5. Map SQL_ID to Current Database Session
If the SQL is currently running, use the below query to identify the session details.
SELECT s.inst_id,
s.sid,
s.serial#,
s.username,
s.module,
s.action,
s.program,
s.status,
s.sql_id,
s.prev_sql_id,
s.event,
s.wait_class,
s.seconds_in_wait,
s.blocking_session,
s.machine,
s.osuser,
p.spid os_pid
FROM gv$session s
JOIN gv$process p
ON s.inst_id = p.inst_id
AND s.paddr = p.addr
WHERE s.sql_id = '&SQL_ID'
ORDER BY s.inst_id, s.sid;
This gives the database SID, serial number, OS process ID, wait event, and blocking session details.
6. Map SQL_ID to EBS Concurrent Request
This is one of the most important steps in Oracle EBS troubleshooting.
As Apps DBA, we should not stop only at SQL_ID. We must map it to the actual concurrent request and program.
SELECT s.inst_id,
s.sid,
s.serial#,
s.sql_id,
s.event,
s.wait_class,
s.seconds_in_wait,
r.request_id,
r.phase_code,
r.status_code,
r.actual_start_date,
r.actual_completion_date,
ROUND((NVL(r.actual_completion_date,SYSDATE)-r.actual_start_date)*24*60,2) runtime_mins,
cp.concurrent_program_name,
cpt.user_concurrent_program_name,
u.user_name,
r.argument_text
FROM gv$session s
JOIN gv$process p
ON s.inst_id = p.inst_id
AND s.paddr = p.addr
JOIN apps.fnd_concurrent_requests r
ON r.oracle_process_id = p.spid
JOIN apps.fnd_concurrent_programs cp
ON r.concurrent_program_id = cp.concurrent_program_id
AND r.program_application_id = cp.application_id
JOIN apps.fnd_concurrent_programs_tl cpt
ON cp.concurrent_program_id = cpt.concurrent_program_id
AND cp.application_id = cpt.application_id
AND cpt.language = 'US'
JOIN apps.fnd_user u
ON r.requested_by = u.user_id
WHERE s.sql_id = '&SQL_ID';
This gives the full production mapping:
SQL_ID
→ SID / SERIAL#
→ OS Process ID
→ Concurrent Request ID
→ Concurrent Program Name
→ Submitted User
→ Request Parameters
This mapping is very useful for RCA and business communication.
7. Find Long-Running Concurrent Requests During Month-End
Use this query to list all long-running concurrent requests during the month-end window.
SELECT r.request_id,
cpt.user_concurrent_program_name,
cp.concurrent_program_name short_name,
u.user_name requested_by,
r.phase_code,
r.status_code,
r.actual_start_date,
r.actual_completion_date,
ROUND((NVL(r.actual_completion_date,SYSDATE)-r.actual_start_date)*24*60,2) runtime_mins,
r.argument_text
FROM apps.fnd_concurrent_requests r
JOIN apps.fnd_concurrent_programs cp
ON r.concurrent_program_id = cp.concurrent_program_id
AND r.program_application_id = cp.application_id
JOIN apps.fnd_concurrent_programs_tl cpt
ON cp.concurrent_program_id = cpt.concurrent_program_id
AND cp.application_id = cpt.application_id
AND cpt.language = 'US'
JOIN apps.fnd_user u
ON r.requested_by = u.user_id
WHERE r.actual_start_date BETWEEN TO_DATE('30-APR-2026 18:00','DD-MON-YYYY HH24:MI')
AND TO_DATE('01-MAY-2026 06:00','DD-MON-YYYY HH24:MI')
AND ROUND((NVL(r.actual_completion_date,SYSDATE)-r.actual_start_date)*24*60,2) > 30
ORDER BY runtime_mins DESC;
This query helps identify requests that exceeded the expected runtime.
8. Analyze ASH Wait Events for the SQL_ID
ASH analysis tells us where the SQL spent most of its time.
SELECT ash.sql_id,
ash.sql_plan_hash_value,
ash.session_state,
ash.wait_class,
ash.event,
COUNT(*) ash_samples
FROM dba_hist_active_sess_history ash
JOIN dba_hist_snapshot sn
ON ash.snap_id = sn.snap_id
AND ash.instance_number = sn.instance_number
WHERE ash.sql_id = '&SQL_ID'
AND sn.begin_interval_time BETWEEN TO_DATE('30-APR-2026 18:00','DD-MON-YYYY HH24:MI')
AND TO_DATE('01-MAY-2026 06:00','DD-MON-YYYY HH24:MI')
GROUP BY ash.sql_id,
ash.sql_plan_hash_value,
ash.session_state,
ash.wait_class,
ash.event
ORDER BY ash_samples DESC;
Common Wait Events and Meaning
| Wait Event / Wait Class | Possible Meaning |
|---|---|
| CPU | SQL is CPU intensive |
| db file scattered read | Full table scan / multiblock read |
| db file sequential read | Index access / single block read |
| direct path read temp | TEMP read due to sort/hash spill |
| direct path write temp | TEMP write due to sort/hash spill |
| enq: TX - row lock contention | Blocking due to row lock |
| library cache lock | Object dependency / hard parse issue |
| PX Deq Credit: send blkd | Parallel query bottleneck |
| resmgr:cpu quantum | Resource Manager throttling |
| gc buffer busy acquire | RAC block contention |
9. Check Whether SQL Ran Parallel or Serial
During month-end, many long-running SQLs are expected to run with parallelism. However, due to parallel server shortage or parameter limits, the SQL may run serially or with reduced DOP.
SELECT s.inst_id,
s.sid,
s.serial#,
s.sql_id,
s.qcsid,
s.qcserial#,
s.server_group,
s.server_set,
s.degree,
s.req_degree
FROM gv$px_session s
WHERE s.sql_id = '&SQL_ID'
ORDER BY s.inst_id, s.qcsid, s.sid;
Check parallel server availability:
SELECT *
FROM gv$px_process_sysstat
WHERE statistic LIKE 'Servers%';
Check parallel-related parameters:
SHOW PARAMETER parallel_max_servers;
SHOW PARAMETER parallel_servers_target;
SHOW PARAMETER parallel_degree_policy;
SHOW PARAMETER parallel_min_time_threshold;
Interpretation
| Finding | Meaning |
|---|---|
| No rows in GV$PX_SESSION | SQL is not running parallel |
| REQ_DEGREE greater than DEGREE | Requested DOP was not fully granted |
| Parallel servers exhausted | Other jobs consumed PX servers |
| QC waiting for PX slaves | Parallel execution bottleneck |
| High PX waits | Parallel skew or resource pressure |
10. Display Current Execution Plan
If the SQL is still available in cursor cache, use:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
sql_id => '&SQL_ID',
cursor_child_no => NULL,
format => 'ALLSTATS LAST +PEEKED_BINDS +OUTLINE +ALIAS +NOTE'
));
Look for the following:
| Plan Observation | Possible Issue |
|---|---|
| Full table scan on large table | Missing index, stale stats, or expected full scan |
| Nested loop with huge rows | Bad join method |
| Hash join with TEMP spill | PGA/TEMP issue |
| Actual rows much higher than estimated rows | Cardinality/statistics issue |
| Different plan hash value | Plan regression |
| No PX operations | SQL did not run parallel |
| PX operations but slow | Parallel skew or resource bottleneck |
11. Display Historical Execution Plan from AWR
To check historical plans:
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_AWR(
sql_id => '&SQL_ID',
plan_hash_value => NULL,
db_id => NULL,
format => 'ALL +OUTLINE +NOTE'
));
This is useful when the SQL is no longer available in memory but exists in AWR.
12. Check Plan Regression
Use the below query to check whether the same SQL_ID used multiple execution plans.
SELECT sql_id,
plan_hash_value,
COUNT(DISTINCT snap_id) snap_count,
SUM(executions_delta) executions,
ROUND(SUM(elapsed_time_delta)/1000000/60,2) elapsed_mins,
ROUND(SUM(elapsed_time_delta)/1000000 / NULLIF(SUM(executions_delta),0),2) avg_elapsed_sec
FROM dba_hist_sqlstat
WHERE sql_id = '&SQL_ID'
GROUP BY sql_id, plan_hash_value
ORDER BY avg_elapsed_sec DESC;
RCA Interpretation
| Finding | RCA Direction |
|---|---|
| Multiple plan hash values | SQL plan instability |
| Old plan fast, new plan slow | Plan regression |
| Same plan but higher data volume | Month-end volume issue |
| Same plan but different waits | System load, I/O, TEMP, or blocking issue |
13. Check Object Statistics
First identify the objects used by the SQL.
SELECT DISTINCT object_owner,
object_name,
object_type
FROM dba_hist_sql_plan
WHERE sql_id = '&SQL_ID'
AND object_owner IS NOT NULL
ORDER BY object_owner, object_name;
Then check table statistics:
SELECT owner,
table_name,
num_rows,
blocks,
last_analyzed,
stale_stats,
stattype_locked
FROM dba_tab_statistics
WHERE owner = '&OWNER'
AND table_name = '&TABLE_NAME';
Check index statistics:
SELECT owner,
index_name,
table_name,
blevel,
leaf_blocks,
clustering_factor,
num_rows,
last_analyzed,
stale_stats
FROM dba_ind_statistics
WHERE owner = '&OWNER'
AND table_name = '&TABLE_NAME';
Interpretation
| Finding | Meaning |
|---|---|
| LAST_ANALYZED is old | Stats may be outdated |
| STALE_STATS = YES | Stats refresh may be required |
| STATS locked | Optimizer may use old stats |
| Bad clustering factor | Index access may be costly |
| Actual rows very different from estimated rows | Cardinality issue |
14. Check TEMP Usage
TEMP usage is common during month-end due to large sorting, hashing, reporting, and accounting jobs.
SELECT s.inst_id,
s.sid,
s.serial#,
s.sql_id,
u.tablespace,
ROUND(u.blocks * t.block_size / 1024 / 1024,2) temp_mb
FROM gv$tempseg_usage u
JOIN gv$session s
ON u.inst_id = s.inst_id
AND u.session_addr = s.saddr
JOIN dba_tablespaces t
ON u.tablespace = t.tablespace_name
WHERE s.sql_id = '&SQL_ID'
ORDER BY temp_mb DESC;
If TEMP usage is high, check the execution plan for hash joins, sorts, group by, order by, or parallel operations.
15. Check Blocking Sessions
Blocking can make a request appear like a SQL performance issue, but the actual cause may be another session holding locks.
SELECT s.inst_id,
s.sid,
s.serial#,
s.username,
s.sql_id,
s.event,
s.wait_class,
s.blocking_instance,
s.blocking_session,
s.seconds_in_wait
FROM gv$session s
WHERE s.blocking_session IS NOT NULL
ORDER BY s.seconds_in_wait DESC;
To check the blocker:
SELECT s.inst_id,
s.sid,
s.serial#,
s.username,
s.sql_id,
s.prev_sql_id,
s.module,
s.action,
s.program,
s.machine,
s.status
FROM gv$session s
WHERE s.sid = '&BLOCKING_SESSION';
Production advice:
Do not directly kill production sessions unless business impact is confirmed and approval is taken.
In EBS, it is better to first identify the request/user and ask the business team to terminate and rerun if that is safer.
16. Check Bind Sensitivity and Child Cursors
Some SQLs perform differently because of bind values, adaptive cursor sharing, or multiple child cursors.
SELECT sql_id,
child_number,
plan_hash_value,
is_bind_sensitive,
is_bind_aware,
is_shareable,
executions,
parsing_schema_name
FROM gv$sql
WHERE sql_id = '&SQL_ID';
To check child cursor reasons:
SELECT sql_id,
child_number,
reason
FROM gv$sql_shared_cursor
WHERE sql_id = '&SQL_ID';
Possible RCA wording:
SQL used multiple child cursors due to bind sensitivity or optimizer environment mismatch. This may have contributed to plan instability during month-end.
17. Compare Normal Runtime vs Month-End Runtime
This query helps compare normal request runtime against abnormal month-end runtime.
SELECT TRUNC(r.actual_start_date) run_date,
cpt.user_concurrent_program_name,
COUNT(*) total_runs,
ROUND(MIN((r.actual_completion_date-r.actual_start_date)*24*60),2) min_mins,
ROUND(AVG((r.actual_completion_date-r.actual_start_date)*24*60),2) avg_mins,
ROUND(MAX((r.actual_completion_date-r.actual_start_date)*24*60),2) max_mins
FROM apps.fnd_concurrent_requests r
JOIN apps.fnd_concurrent_programs cp
ON r.concurrent_program_id = cp.concurrent_program_id
AND r.program_application_id = cp.application_id
JOIN apps.fnd_concurrent_programs_tl cpt
ON cp.concurrent_program_id = cpt.concurrent_program_id
AND cp.application_id = cpt.application_id
AND cpt.language = 'US'
WHERE cpt.user_concurrent_program_name LIKE '%&PROGRAM_NAME%'
AND r.actual_start_date >= SYSDATE - 30
AND r.actual_completion_date IS NOT NULL
GROUP BY TRUNC(r.actual_start_date),
cpt.user_concurrent_program_name
ORDER BY run_date DESC;
Example RCA line:
The concurrent program normally completes in 2 minutes. During month-end, the same program took 4 hours because SQL_ID
xxxxconsumed most of the DB time and waited mainly on TEMP I/O.
18. Month-End SQL Deep Dive Summary Table
Use the below format for your RCA report.
| SQL_ID | Request ID | Program Name | Runtime | Plan Hash | Main Wait | Root Cause | Action |
|---|---|---|---|---|---|---|---|
| abc123 | 123456 | Create Accounting | 4 hrs | 987654321 | direct path read temp | TEMP spill / bad plan | Review stats and plan |
| def456 | 123457 | Gather Schema Stats | 3 hrs | 123456789 | CPU | Large stats job | Run with controlled DOP |
| ghi789 | 123458 | AutoInvoice Import | 2 hrs | 555555555 | db file scattered read | Full table scan | Stats/index review |
| jkl111 | 123459 | Cost Manager | 1.5 hrs | 777777777 | enq: TX row lock | Blocking transaction | Business coordination |
19. Root Cause Classification
A. Plan Regression
Evidence:
Same SQL_ID used multiple PLAN_HASH_VALUEs.
Bad plan consumed more elapsed time and buffer gets.
RCA statement:
The long runtime was caused by SQL plan regression. The SQL used a different plan hash value during month-end compared to normal execution. The new plan caused higher logical reads and elapsed time.
Corrective actions:
1. Compare good and bad execution plans.
2. Validate object statistics.
3. Use SQL Plan Management baseline if a known good plan exists.
4. Avoid blind hinting in seeded EBS code.
B. Stale or Inaccurate Statistics
Evidence:
STALE_STATS = YES
LAST_ANALYZED is old
Estimated rows and actual rows mismatch
RCA statement:
The optimizer selected an inefficient plan due to stale or inaccurate object statistics. The issue became visible during month-end because of increased transaction volume.
Corrective action example:
EXEC DBMS_STATS.GATHER_TABLE_STATS(
ownname => 'OWNER',
tabname => 'TABLE_NAME',
estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
method_opt => 'FOR ALL COLUMNS SIZE AUTO',
cascade => TRUE,
degree => 8
);
C. Parallelism Not Used or Not Available
Evidence:
No rows in GV$PX_SESSION
Requested DOP greater than actual DOP
Parallel servers exhausted
RCA statement:
SQL was expected to run with parallelism but executed serially or received reduced DOP due to parallel server shortage. This increased elapsed time during month-end processing.
Corrective actions:
1. Review parallel_max_servers.
2. Review parallel_servers_target.
3. Avoid uncontrolled DOP during peak month-end.
4. Use controlled DOP only after business and DBA approval.
D. TEMP Spill
Evidence:
direct path read temp
direct path write temp
High TEMP usage
Hash join or sort operation in plan
RCA statement:
SQL spent significant time on TEMP I/O due to large sort/hash operations. This indicates memory pressure, poor cardinality estimation, or large month-end data volume.
Corrective actions:
1. Review execution plan.
2. Check PGA pressure.
3. Check TEMP tablespace usage.
4. Validate object statistics.
5. Review join order and cardinality estimates.
E. Blocking or Locking
Evidence:
enq: TX - row lock contention
blocking_session populated
RCA statement:
The concurrent request was delayed due to row lock contention from another active transaction. The database was waiting on an application-level lock rather than executing the SQL actively.
Corrective actions:
1. Identify blocker session.
2. Map blocker to EBS user/request.
3. Coordinate with business.
4. Ask user/process owner to terminate or commit if appropriate.
5. Avoid killing sessions without approval.
20. Production Handling: Should We Kill the Session?
In production, killing a session should not be the first option.
For EBS month-end, the better approach is:
1. Identify SQL_ID, SID, serial#, and request ID.
2. Check whether the request is critical.
3. Check whether it is blocking other business-critical jobs.
4. Inform business users and application owners.
5. Ask business to terminate and rerun from the application if possible.
6. Kill from database only if approved and technically safe.
This protects the business process and avoids data inconsistency.
21. Final Management RCA Format
Executive Summary
During month-end processing, multiple concurrent requests experienced abnormal runtime. The major contributors were long-running SQL statements identified through AWR, ASH, and active session analysis.
The top SQL_IDs consumed significant DB time due to one or more of the following causes:
1. SQL plan regression
2. Stale or inaccurate statistics
3. TEMP spill
4. Blocking sessions
5. Parallel server shortage
6. Increased month-end data volume
7. High concurrent batch workload
Technical Summary
| Area | Finding |
|---|---|
| Impact Window | 30-Apr-2026 18:00 to 01-May-2026 06:00 |
| Affected Area | Month-end concurrent processing |
| Main SQL_IDs | SQL_ID_1, SQL_ID_2, SQL_ID_3 |
| Main Waits | CPU, TEMP I/O, db file scattered read, PX waits |
| Impact | Request runtime increased above normal baseline |
| Immediate Action | Sessions monitored and mapped to requests |
| Permanent Fix | Stats review, plan stabilization, DOP governance, scheduling improvement |
22. Sample RCA Statement
During month-end, concurrent request <REQUEST_ID> / program <PROGRAM_NAME> ran longer than normal due to SQL_ID <SQL_ID>.
The request normally completes in approximately <NORMAL_RUNTIME>, but during the issue window it took <ACTUAL_RUNTIME>.
ASH analysis showed that most of the time was spent on <WAIT_EVENT>. AWR showed plan hash value <PLAN_HASH_VALUE>. Further review confirmed that the root cause was <PLAN_REGRESSION / STALE_STATS / TEMP_SPILL / BLOCKING / PARALLEL_SHORTAGE / DATA_VOLUME>.
No database instance-level outage was observed.
Corrective action:
<ACTION>
Preventive action:
<PREVENTIVE_ACTION>
23. Proactive Month-End Apps DBA Checklist
Before Month-End
| Check | Action |
|---|---|
| Top month-end programs | Review last 3 months runtime |
| SQL plan stability | Check SQLs with multiple plan hash values |
| Object statistics | Gather stats for volatile tables |
| TEMP capacity | Verify free TEMP space |
| Parallel servers | Validate DOP availability |
| Concurrent managers | Check target and actual processes |
| Critical requests | Prepare monitoring list |
| Business communication | Inform users not to rerun blindly |
During Month-End
| Check | Action |
|---|---|
| Running requests | Monitor FND concurrent requests |
| Active SQL_IDs | Monitor GV$SESSION and GV$SQL |
| Wait events | Check ASH and active waits |
| Blocking | Monitor blocking sessions |
| TEMP usage | Monitor GV$TEMPSEG_USAGE |
| Parallel usage | Monitor GV$PX_SESSION |
| CPU/I/O pressure | Monitor AWR/OEM/OS level load |
After Month-End
| Deliverable | Content |
|---|---|
| RCA report | SQL_ID, request ID, root cause, evidence |
| Trend report | Normal runtime vs month-end runtime |
| Fix plan | Stats, SPM, SQL profile, scheduling, DOP |
| Preventive action | Monitoring and business coordination plan |
24. Final Apps DBA Action Plan
| Priority | Action | Owner |
|---|---|---|
| P1 | Identify top SQL_IDs by elapsed time | Apps DBA |
| P1 | Map SQL_IDs to concurrent requests | Apps DBA |
| P1 | Capture ASH wait profile | Apps DBA |
| P1 | Capture current and historical execution plans | Apps DBA |
| P1 | Check blocking, TEMP, and parallelism | Apps DBA |
| P2 | Compare against previous successful month-end | Apps DBA |
| P2 | Validate object statistics | DBA |
| P2 | Recommend SPM baseline or SQL profile if required | DBA |
| P3 | Prepare month-end monitoring dashboard | DBA / SRE |
| P3 | Optimize scheduling of heavy requests | Apps Team / Business |
| P3 | Create safe terminate/rerun communication process | DBA / Business |
Conclusion
Month-end performance troubleshooting in Oracle EBS should be evidence-based. As an Apps DBA, we should not only report that a request was slow. We should clearly prove:
Which SQL_ID caused the issue?
Which request and program triggered it?
What was the wait event?
Was the plan changed?
Was parallelism used?
Was TEMP heavily used?
Was there blocking?
What is the permanent fix?
A strong Apps DBA RCA connects database evidence with business impact.
The best production approach is:
SQL_ID → Session → Request ID → Program → Wait Event → Plan → Root Cause → Corrective Action → Preventive Action
This approach helps us handle month-end issues professionally, avoid unnecessary killing of sessions, and build confidence with business users and management.
No comments:
Post a Comment