Oracle EBS Apps DBA Real-Time Production Troubleshooting Guide
Month-end processing is always a critical period for any Oracle E-Business Suite production environment. During this time, multiple high-volume concurrent programs run together, business users are under pressure, and even one bad SQL plan can delay important financial or operational closure activities.
As Apps DBAs, our responsibility is not only to report that “SQL was running long,” but to provide a proper technical deep dive:
Which SQL_ID caused the delay?
Which concurrent request triggered it?
Which user submitted the request?
Was it a bad execution plan, stale statistics, blocking, TEMP usage, or lack of parallelism?
What is the permanent fix?
This blog post explains a practical production approach to analyze all problematic long-running SQL_IDs during month-end.
1. Objective of Month-End SQL Deep Dive
The main objective is to identify SQL statements that caused abnormal runtime during month-end and prepare a clear RCA with corrective and preventive actions.
During the analysis, we should answer the following questions:
| Question | Evidence Required |
|---|---|
| Which SQL_IDs ran long? | AWR, ASH, GV$SQL |
| Which concurrent request triggered the SQL? | FND concurrent request tables |
| Which program and user were involved? | Request ID, program name, user name |
| What was the main wait event? | ASH / GV$SESSION |
| Did the execution plan change? | Current plan vs historical AWR plan |
| Was SQL running parallel or serial? | GV$PX_SESSION |
| Was there TEMP spill? | GV$TEMPSEG_USAGE |
| Was there any blocking? | GV$SESSION blocking columns |
| What is the root cause? | Plan regression, stale stats, blocking, TEMP, I/O, DOP issue |
| What is the preventive action? | Stats, SPM, scheduling, DOP governance, monitoring |
2. Define the Month-End Issue Window
Before starting any technical analysis, first define the exact issue window, for example 30-Apr-2026 18:00 to 01-May-2026 06:00 (the impact window used throughout this guide).
Now identify the AWR snapshot IDs for this period:
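A minimal sketch, assuming AWR is licensed and using the example window above (adjust the timestamps to your own window):

```sql
-- Find AWR snapshot IDs bracketing the month-end issue window.
SELECT snap_id,
       begin_interval_time,
       end_interval_time
  FROM dba_hist_snapshot
 WHERE begin_interval_time >= TO_TIMESTAMP('30-APR-2026 18:00', 'DD-MON-YYYY HH24:MI')
   AND end_interval_time   <= TO_TIMESTAMP('01-MAY-2026 06:00', 'DD-MON-YYYY HH24:MI')
 ORDER BY snap_id;
```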
These snapshot IDs will be used for historical SQL analysis.
3. Identify Top Problematic SQL_IDs During Month-End
This query helps identify the top SQL statements by elapsed time, CPU time, I/O wait, buffer gets, disk reads, and executions.
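One possible form of this query, using the snapshot range identified earlier (`:begin_snap` and `:end_snap` are placeholders; `FETCH FIRST` needs 12c or later):

```sql
SELECT sql_id,
       plan_hash_value,
       SUM(executions_delta)                AS execs,
       ROUND(SUM(elapsed_time_delta) / 1e6) AS elapsed_sec,
       ROUND(SUM(cpu_time_delta)     / 1e6) AS cpu_sec,
       ROUND(SUM(iowait_delta)       / 1e6) AS io_wait_sec,
       SUM(buffer_gets_delta)               AS buffer_gets,
       SUM(disk_reads_delta)                AS disk_reads
  FROM dba_hist_sqlstat
 WHERE snap_id BETWEEN :begin_snap AND :end_snap
 GROUP BY sql_id, plan_hash_value
 ORDER BY elapsed_sec DESC
 FETCH FIRST 20 ROWS ONLY;
```

Grouping by plan_hash_value as well as sql_id makes plan instability visible directly in this first pass.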
How to Read the Output
| Observation | Meaning |
|---|---|
| High elapsed time and high CPU | SQL is CPU intensive |
| High elapsed time and high I/O wait | Possible full table scan or storage issue |
| High elapsed time and TEMP wait | Sort/hash operation spilled to TEMP |
| High elapsed time but low CPU | Possible blocking or concurrency issue |
| High executions with small average time | SQL is called repeatedly |
| Low executions with very high average time | One or few executions were very expensive |
| Multiple plan hash values | Possible plan instability or regression |
4. Get SQL Text for the SQL_ID
Once the problematic SQL_ID is identified, get the SQL text.
For current SQL available in memory:
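A minimal sketch (`&sql_id` is a SQL*Plus substitution variable):

```sql
-- SQL text while still in the cursor cache:
SELECT sql_fulltext
  FROM gv$sql
 WHERE sql_id = '&sql_id'
   AND ROWNUM = 1;

-- Fallback if the cursor has aged out but the SQL was captured by AWR:
SELECT sql_text
  FROM dba_hist_sqltext
 WHERE sql_id = '&sql_id';
```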
5. Map SQL_ID to Current Database Session
If the SQL is currently running, use the below query to identify the session details.
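A minimal sketch of such a query:

```sql
SELECT s.inst_id, s.sid, s.serial#, s.username,
       p.spid AS os_pid,
       s.event, s.wait_class, s.seconds_in_wait,
       s.blocking_instance, s.blocking_session
  FROM gv$session s
  JOIN gv$process p
    ON p.inst_id = s.inst_id
   AND p.addr    = s.paddr
 WHERE s.sql_id = '&sql_id';
```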
This gives the database SID, serial number, OS process ID, wait event, and blocking session details.
6. Map SQL_ID to EBS Concurrent Request
This is one of the most important steps in Oracle EBS troubleshooting.
As Apps DBA, we should not stop only at SQL_ID. We must map it to the actual concurrent request and program.
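A hedged sketch of the mapping, assuming the standard FND schema, where FND_CONCURRENT_REQUESTS.ORACLE_PROCESS_ID holds the database shadow process SPID:

```sql
SELECT r.request_id,
       p.user_concurrent_program_name AS program,
       u.user_name                    AS submitted_by,
       r.phase_code, r.status_code,
       r.actual_start_date,
       s.inst_id, s.sid, s.serial#, s.sql_id
  FROM apps.fnd_concurrent_requests r
  JOIN apps.fnd_concurrent_programs_tl p
    ON p.concurrent_program_id = r.concurrent_program_id
   AND p.application_id        = r.program_application_id
   AND p.language              = 'US'
  JOIN apps.fnd_user u
    ON u.user_id = r.requested_by
  JOIN gv$process pr
    ON pr.spid = r.oracle_process_id
  JOIN gv$session s
    ON s.inst_id = pr.inst_id
   AND s.paddr   = pr.addr
 WHERE r.phase_code = 'R'          -- running requests
   AND s.sql_id     = '&sql_id';
```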
This gives the full production mapping: request ID, program name, submitting user, request phase and status, and the database session executing the SQL.
This mapping is very useful for RCA and business communication.
7. Find Long-Running Concurrent Requests During Month-End
Use this query to list all long-running concurrent requests during the month-end window.
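A minimal sketch (the start time and the 60-minute threshold are example values):

```sql
SELECT r.request_id,
       p.user_concurrent_program_name AS program,
       u.user_name,
       r.actual_start_date,
       ROUND((NVL(r.actual_completion_date, SYSDATE)
              - r.actual_start_date) * 24 * 60) AS runtime_min
  FROM apps.fnd_concurrent_requests r
  JOIN apps.fnd_concurrent_programs_tl p
    ON p.concurrent_program_id = r.concurrent_program_id
   AND p.application_id        = r.program_application_id
   AND p.language              = 'US'
  JOIN apps.fnd_user u
    ON u.user_id = r.requested_by
 WHERE r.actual_start_date >= TO_DATE('30-APR-2026 18:00', 'DD-MON-YYYY HH24:MI')
   AND (NVL(r.actual_completion_date, SYSDATE)
        - r.actual_start_date) * 24 * 60 > 60   -- longer than 60 minutes
 ORDER BY runtime_min DESC;
```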
This query helps identify requests that exceeded the expected runtime.
8. Analyze ASH Wait Events for the SQL_ID
ASH analysis tells us where the SQL spent most of its time.
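A minimal sketch against recent in-memory samples (use DBA_HIST_ACTIVE_SESS_HISTORY for older windows; its samples are at 10-second intervals):

```sql
SELECT NVL(event, 'CPU')                                 AS event,
       COUNT(*)                                          AS samples,
       ROUND(100 * RATIO_TO_REPORT(COUNT(*)) OVER (), 1) AS pct
  FROM gv$active_session_history
 WHERE sql_id = '&sql_id'
 GROUP BY NVL(event, 'CPU')
 ORDER BY samples DESC;
```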
Common Wait Events and Meaning
| Wait Event / Wait Class | Possible Meaning |
|---|---|
| CPU | SQL is CPU intensive |
| db file scattered read | Full table scan / multiblock read |
| db file sequential read | Index access / single block read |
| direct path read temp | TEMP read due to sort/hash spill |
| direct path write temp | TEMP write due to sort/hash spill |
| enq: TX - row lock contention | Blocking due to row lock |
| library cache lock | Object dependency / hard parse issue |
| PX Deq Credit: send blkd | Parallel query bottleneck |
| resmgr:cpu quantum | Resource Manager throttling |
| gc buffer busy acquire | RAC block contention |
9. Check Whether SQL Ran Parallel or Serial
During month-end, many long-running SQLs are expected to run with parallelism. However, due to parallel server shortage or parameter limits, the SQL may run serially or with reduced DOP.
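A minimal check of whether the session has PX slaves, assuming `:qc_sid` is the SID of the session running the SQL:

```sql
-- No rows returned => the SQL is running serially.
SELECT inst_id, sid, serial#, qcinst_id, qcsid, degree, req_degree
  FROM gv$px_session
 WHERE qcsid = :qc_sid;
```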
Check parallel server availability:
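One way to check, via GV$PX_PROCESS_SYSSTAT:

```sql
SELECT inst_id, statistic, value
  FROM gv$px_process_sysstat
 WHERE statistic LIKE 'Servers%'   -- Servers In Use, Servers Available, etc.
 ORDER BY inst_id, statistic;
```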
Check parallel-related parameters:
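For example:

```sql
SELECT inst_id, name, value
  FROM gv$parameter
 WHERE name IN ('parallel_max_servers',
                'parallel_min_servers',
                'parallel_degree_policy',
                'parallel_degree_limit',
                'parallel_servers_target')
 ORDER BY inst_id, name;
```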
Interpretation
| Finding | Meaning |
|---|---|
| No rows in GV$PX_SESSION | SQL is not running parallel |
| REQ_DEGREE greater than DEGREE | Requested DOP was not fully granted |
| Parallel servers exhausted | Other jobs consumed PX servers |
| QC waiting for PX slaves | Parallel execution bottleneck |
| High PX waits | Parallel skew or resource pressure |
10. Display Current Execution Plan
If the SQL is still available in cursor cache, use:
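For example (the ADVANCED format includes outline data and peeked binds):

```sql
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
         sql_id          => '&sql_id',
         cursor_child_no => NULL,        -- NULL = all child cursors
         format          => 'ADVANCED'));
```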
Look for the following:
| Plan Observation | Possible Issue |
|---|---|
| Full table scan on large table | Missing index, stale stats, or expected full scan |
| Nested loop with huge rows | Bad join method |
| Hash join with TEMP spill | PGA/TEMP issue |
| Actual rows much higher than estimated rows | Cardinality/statistics issue |
| Different plan hash value | Plan regression |
| No PX operations | SQL did not run parallel |
| PX operations but slow | Parallel skew or resource bottleneck |
11. Display Historical Execution Plan from AWR
To check historical plans:
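For example:

```sql
SELECT *
  FROM TABLE(DBMS_XPLAN.DISPLAY_AWR(
         sql_id          => '&sql_id',
         plan_hash_value => NULL,        -- NULL = every plan captured in AWR
         format          => 'ADVANCED'));
```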
This is useful when the SQL is no longer available in memory but exists in AWR.
12. Check Plan Regression
Use the below query to check whether the same SQL_ID used multiple execution plans.
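A minimal sketch:

```sql
SELECT sql_id,
       plan_hash_value,
       SUM(executions_delta) AS execs,
       ROUND(SUM(elapsed_time_delta)
             / NULLIF(SUM(executions_delta), 0) / 1e6, 1) AS avg_elapsed_sec
  FROM dba_hist_sqlstat
 WHERE sql_id = '&sql_id'
 GROUP BY sql_id, plan_hash_value
 ORDER BY avg_elapsed_sec DESC;
```

More than one row with very different average elapsed times is the classic signature of plan regression.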
RCA Interpretation
| Finding | RCA Direction |
|---|---|
| Multiple plan hash values | SQL plan instability |
| Old plan fast, new plan slow | Plan regression |
| Same plan but higher data volume | Month-end volume issue |
| Same plan but different waits | System load, I/O, TEMP, or blocking issue |
13. Check Object Statistics
First identify the objects used by the SQL.
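For example:

```sql
SELECT DISTINCT object_owner, object_name, object_type
  FROM gv$sql_plan            -- or DBA_HIST_SQL_PLAN if aged out of memory
 WHERE sql_id = '&sql_id'
   AND object_name IS NOT NULL;
```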
Then check table statistics:
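For example:

```sql
SELECT owner, table_name, num_rows, last_analyzed,
       stale_stats, stattype_locked
  FROM dba_tab_statistics
 WHERE owner      = '&owner'
   AND table_name = '&table_name';
```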
Check index statistics:
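For example:

```sql
SELECT owner, index_name, num_rows, distinct_keys,
       clustering_factor, last_analyzed
  FROM dba_ind_statistics
 WHERE table_owner = '&owner'
   AND table_name  = '&table_name';
```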
Interpretation
| Finding | Meaning |
|---|---|
| LAST_ANALYZED is old | Stats may be outdated |
| STALE_STATS = YES | Stats refresh may be required |
| STATS locked | Optimizer may use old stats |
| Bad clustering factor | Index access may be costly |
| Actual rows very different from estimated rows | Cardinality issue |
14. Check TEMP Usage
TEMP usage is common during month-end due to large sorting, hashing, reporting, and accounting jobs.
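A minimal sketch (assumes an 8 KB block size when converting blocks to MB):

```sql
SELECT u.inst_id, s.sid, s.serial#, u.sql_id,
       u.tablespace, u.segtype,
       ROUND(u.blocks * 8192 / 1024 / 1024) AS temp_mb
  FROM gv$tempseg_usage u
  JOIN gv$session s
    ON s.inst_id = u.inst_id
   AND s.saddr   = u.session_addr
 ORDER BY u.blocks DESC;
```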
If TEMP usage is high, check the execution plan for hash joins, sorts, group by, order by, or parallel operations.
15. Check Blocking Sessions
Blocking can make a request appear like a SQL performance issue, but the actual cause may be another session holding locks.
To check the blocker:
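For example:

```sql
SELECT s.inst_id, s.sid, s.serial#, s.username, s.sql_id,
       s.event, s.seconds_in_wait,
       s.blocking_instance, s.blocking_session
  FROM gv$session s
 WHERE s.blocking_session IS NOT NULL;
```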
Production advice:
Do not kill production sessions directly unless the business impact is confirmed and approval has been obtained.
In EBS, it is better to first identify the request/user and ask the business team to terminate and rerun if that is safer.
16. Check Bind Sensitivity and Child Cursors
Some SQLs perform differently because of bind values, adaptive cursor sharing, or multiple child cursors.
To check child cursor reasons:
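A minimal sketch (GV$SQL_SHARED_CURSOR has many mismatch flags; only a few common ones are shown here):

```sql
-- Bind sensitivity / adaptive cursor sharing status:
SELECT sql_id, child_number, plan_hash_value,
       is_bind_sensitive, is_bind_aware, executions
  FROM gv$sql
 WHERE sql_id = '&sql_id';

-- Why extra child cursors were created:
SELECT sql_id, child_number,
       bind_mismatch, optimizer_mismatch,
       roll_invalid_mismatch, use_feedback_stats
  FROM gv$sql_shared_cursor
 WHERE sql_id = '&sql_id';
```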
Possible RCA wording:
SQL used multiple child cursors due to bind sensitivity or optimizer environment mismatch. This may have contributed to plan instability during month-end.
17. Compare Normal Runtime vs Month-End Runtime
This query helps compare normal request runtime against abnormal month-end runtime.
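A minimal sketch (daily averages for one program over the last 90 days; `:program_id` is a placeholder):

```sql
SELECT TRUNC(r.actual_start_date) AS run_date,
       COUNT(*)                   AS runs,
       ROUND(AVG((r.actual_completion_date - r.actual_start_date) * 24 * 60), 1) AS avg_min,
       ROUND(MAX((r.actual_completion_date - r.actual_start_date) * 24 * 60), 1) AS max_min
  FROM apps.fnd_concurrent_requests r
 WHERE r.concurrent_program_id    = :program_id
   AND r.actual_start_date       >= SYSDATE - 90
   AND r.actual_completion_date IS NOT NULL
 GROUP BY TRUNC(r.actual_start_date)
 ORDER BY run_date;
```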
Example RCA line:
The concurrent program normally completes in 2 minutes. During month-end, the same program took 4 hours because SQL_ID xxxx consumed most of the DB time and waited mainly on TEMP I/O.
18. Month-End SQL Deep Dive Summary Table
Use the below format for your RCA report.
| SQL_ID | Request ID | Program Name | Runtime | Plan Hash | Main Wait | Root Cause | Action |
|---|---|---|---|---|---|---|---|
| abc123 | 123456 | Create Accounting | 4 hrs | 987654321 | direct path read temp | TEMP spill / bad plan | Review stats and plan |
| def456 | 123457 | Gather Schema Stats | 3 hrs | 123456789 | CPU | Large stats job | Run with controlled DOP |
| ghi789 | 123458 | AutoInvoice Import | 2 hrs | 555555555 | db file scattered read | Full table scan | Stats/index review |
| jkl111 | 123459 | Cost Manager | 1.5 hrs | 777777777 | enq: TX row lock | Blocking transaction | Business coordination |
19. Root Cause Classification
A. Plan Regression
Evidence: multiple plan hash values for the SQL_ID in DBA_HIST_SQLSTAT, with the slow plan appearing only during the month-end window.
RCA statement:
The long runtime was caused by SQL plan regression. The SQL used a different plan hash value during month-end compared to normal execution. The new plan caused higher logical reads and elapsed time.
Corrective actions: stabilize the known-good plan with an SQL plan baseline (SPM) or SQL profile, and monitor the SQL_ID for further plan changes.
B. Stale or Inaccurate Statistics
Evidence: old LAST_ANALYZED dates, STALE_STATS = YES, or actual rows far from the optimizer's estimated rows.
RCA statement:
The optimizer selected an inefficient plan due to stale or inaccurate object statistics. The issue became visible during month-end because of increased transaction volume.
Corrective action example:
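A hedged sketch: in EBS, statistics should normally be gathered through FND_STATS (or the Gather Table/Schema Statistics concurrent programs) rather than DBMS_STATS directly. The schema and table below are illustrative placeholders only.

```sql
BEGIN
  fnd_stats.gather_table_stats(
    ownname => 'AR',                    -- hypothetical schema
    tabname => 'RA_CUSTOMER_TRX_ALL',   -- hypothetical table
    percent => 10,                      -- estimate percent
    degree  => 4);                      -- gathering DOP
END;
/
```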
C. Parallelism Not Used or Not Available
Evidence: no rows in GV$PX_SESSION for the session, REQ_DEGREE greater than DEGREE, or parallel servers exhausted at the time of execution.
RCA statement:
SQL was expected to run with parallelism but executed serially or received reduced DOP due to parallel server shortage. This increased elapsed time during month-end processing.
Corrective actions: review DOP settings and parallel server capacity, and reschedule competing parallel jobs so PX servers are available for critical month-end programs.
D. TEMP Spill
Evidence: direct path read temp / direct path write temp waits in ASH and high usage in GV$TEMPSEG_USAGE.
RCA statement:
SQL spent significant time on TEMP I/O due to large sort/hash operations. This indicates memory pressure, poor cardinality estimation, or large month-end data volume.
Corrective actions: review PGA memory settings, fix cardinality estimates so the optimizer chooses a better plan, and verify TEMP tablespace capacity before month-end.
E. Blocking or Locking
Evidence: enq: TX - row lock contention waits and populated BLOCKING_SESSION columns in GV$SESSION.
RCA statement:
The concurrent request was delayed due to row lock contention from another active transaction. The database was waiting on an application-level lock rather than executing the SQL actively.
Corrective actions: identify the blocking request and user, and coordinate with the business team to commit, terminate, or rerun safely.
20. Production Handling: Should We Kill the Session?
In production, killing a session should not be the first option.
For EBS month-end, the better approach is to map the session to its concurrent request and user, confirm the business impact, and let the business team terminate and rerun the request from the application where that is safer.
This protects the business process and avoids data inconsistency.
21. Final Management RCA Format
Executive Summary
During month-end processing, multiple concurrent requests experienced abnormal runtime. The major contributors were long-running SQL statements identified through AWR, ASH, and active session analysis.
The top SQL_IDs consumed significant DB time due to one or more of the following causes: plan regression, stale or inaccurate statistics, missing or reduced parallelism, TEMP spill, and blocking.
Technical Summary
| Area | Finding |
|---|---|
| Impact Window | 30-Apr-2026 18:00 to 01-May-2026 06:00 |
| Affected Area | Month-end concurrent processing |
| Main SQL_IDs | SQL_ID_1, SQL_ID_2, SQL_ID_3 |
| Main Waits | CPU, TEMP I/O, db file scattered read, PX waits |
| Impact | Request runtime increased above normal baseline |
| Immediate Action | Sessions monitored and mapped to requests |
| Permanent Fix | Stats review, plan stabilization, DOP governance, scheduling improvement |
22. Sample RCA Statement
Example: "Request 123456 (Create Accounting) normally completes within minutes. During the month-end window it ran for 4 hours because SQL_ID abc123 used a different plan hash value and spent most of its DB time on direct path read temp waits. Statistics were refreshed and the good plan was stabilized; the rerun completed within the normal baseline."
23. Proactive Month-End Apps DBA Checklist
Before Month-End
| Check | Action |
|---|---|
| Top month-end programs | Review last 3 months runtime |
| SQL plan stability | Check SQLs with multiple plan hash values |
| Object statistics | Gather stats for volatile tables |
| TEMP capacity | Verify free TEMP space |
| Parallel servers | Validate DOP availability |
| Concurrent managers | Check target and actual processes |
| Critical requests | Prepare monitoring list |
| Business communication | Inform users not to rerun blindly |
During Month-End
| Check | Action |
|---|---|
| Running requests | Monitor FND concurrent requests |
| Active SQL_IDs | Monitor GV$SESSION and GV$SQL |
| Wait events | Check ASH and active waits |
| Blocking | Monitor blocking sessions |
| TEMP usage | Monitor GV$TEMPSEG_USAGE |
| Parallel usage | Monitor GV$PX_SESSION |
| CPU/I/O pressure | Monitor AWR/OEM/OS level load |
After Month-End
| Deliverable | Content |
|---|---|
| RCA report | SQL_ID, request ID, root cause, evidence |
| Trend report | Normal runtime vs month-end runtime |
| Fix plan | Stats, SPM, SQL profile, scheduling, DOP |
| Preventive action | Monitoring and business coordination plan |
24. Final Apps DBA Action Plan
| Priority | Action | Owner |
|---|---|---|
| P1 | Identify top SQL_IDs by elapsed time | Apps DBA |
| P1 | Map SQL_IDs to concurrent requests | Apps DBA |
| P1 | Capture ASH wait profile | Apps DBA |
| P1 | Capture current and historical execution plans | Apps DBA |
| P1 | Check blocking, TEMP, and parallelism | Apps DBA |
| P2 | Compare against previous successful month-end | Apps DBA |
| P2 | Validate object statistics | DBA |
| P2 | Recommend SPM baseline or SQL profile if required | DBA |
| P3 | Prepare month-end monitoring dashboard | DBA / SRE |
| P3 | Optimize scheduling of heavy requests | Apps Team / Business |
| P3 | Create safe terminate/rerun communication process | DBA / Business |
Conclusion
Month-end performance troubleshooting in Oracle EBS should be evidence-based. As Apps DBAs, we should not only report that a request was slow. We should clearly prove which SQL ran long, which request and user triggered it, what the root cause was, and what the permanent fix is.
A strong Apps DBA RCA connects database evidence with business impact.
The best production approach is: collect evidence from AWR, ASH, and the GV$ views; map SQL_IDs to concurrent requests and users; classify the root cause; and apply a permanent fix with preventive monitoring.
This approach helps us handle month-end issues professionally, avoid unnecessary killing of sessions, and build confidence with business users and management.