In the world of Oracle E-Business Suite (EBS) R12.2, the oacore managed server is the heart of the application logic. It serves every OAF-based self-service page, every iProc requisition, every AP invoice workbench query — everything that runs through the browser lands on an oacore JVM.
When a specific oacore JVM process starts consuming high CPU or memory at the OS level, the immediate challenge for an Apps DBA is to answer a seemingly simple question: is this spike driven by heavy database work, or is the problem confined to the JVM itself? And if it is the database, which SQL is responsible?
The answer lies in bridging the gap between the OS-level PID and the database-level gv$session. This post is a deep-dive into that exact workflow — with production-ready queries, field notes on what to look for, and the extended diagnostics that go beyond the basics.
🧠 The Mapping Concept — OS PID ↔ gv$session.process
Before diving into queries, let's understand how the tracing works architecturally:
```
End-User Browser (OAF page request)
        ↓
WebLogic oacore Managed Server (JVM PID: 29693)
        ↓  JDBC Thin connection opened from this PID
Oracle DB 19c → gv$session.process = '29693'
        ↓
gv$session → gv$sql → gv$sql_plan → gv$active_session_history
```
Here is the critical mapping that makes everything possible:
| OS Layer | Database Layer | Relationship |
|---|---|---|
| `ps -ef` on the app tier → PID (e.g., 29693) | `gv$session.process` | The oacore JVM PID appears as the `process` column in `gv$session` |
| `ps -ef` on the DB host → PID | `gv$process.spid` | The DB server (shadow) process spawned for each JDBC connection |
| oacore JVM Thread ID | `gv$session.client_identifier` | Sometimes set by the OAF framework; may contain user/responsibility info |
| WebLogic module name | `gv$session.module` / `action` | Set via DBMS_APPLICATION_INFO by the OAF framework |
- `gv$session.process` = client-side PID (the oacore JVM that opened the JDBC connection)
- `gv$process.spid` = server-side PID (the Oracle shadow/server process on the DB host)

When tracing from the app tier, you match on `gv$session.process`. When tracing from the DB host OS (e.g., `top` shows a high-CPU `ora_` process), you match on `gv$process.spid`.
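The reverse lookup is just as useful. A minimal sketch for that direction, assuming you grabbed the `spid` of a hot server process from `top` on the DB host (the `'12345'` is a placeholder):

```sql
-- Reverse lookup: DB-host server-process PID (from top) → session and SQL
SELECT s.inst_id, s.sid, s.serial#, s.status, s.sql_id, s.event,
       s.process AS client_pid,   -- this is the oacore JVM PID back on the app tier
       s.machine, s.program
FROM   gv$session s
JOIN   gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id
WHERE  p.spid = '12345';          -- << spid of the hot DB server process
```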
🔍 Step 1 — Identify the Heavy OACore PID at the OS Level
First, list all oacore managed server processes and their resource footprint. The goal is to find the PID that's consuming abnormal CPU or memory.
📌 1A. Quick Listing — All OACore Processes
```bash
# List all oacore JVM processes with resource usage
ps -ef | grep oacore | grep -v grep | awk '{print $2}' | while read pid; do
  echo "=== PID: $pid ==="
  ps -o pid,ppid,%cpu,%mem,vsz,rss,etime,args -p $pid
done
```
📌 1B. Sorted by CPU — Find the Offender Instantly
```bash
# Top oacore processes sorted by CPU (descending)
ps -eo pid,ppid,%cpu,%mem,rss,etime,args | grep "[o]acore" | sort -k3 -rn | head -10 | \
  awk '{printf "PID: %-7s | CPU: %5s%% | MEM: %5s%% | RSS: %6d MB | Uptime: %s\n", $1, $3, $4, $5/1024, $6}'
```
📌 1C. Memory Focus — RSS in MB
```bash
# OACore heap footprint snapshot (RSS in MB)
ps -eo pid,rss,args | grep "[o]acore" | \
  awk '{printf "PID: %s | RSS: %d MB | %s\n", $1, $2/1024, $3}' | sort -t: -k3 -rn
```
📌 1D. Solaris-Specific — LWP (Light Weight Process) Count
```bash
# On Solaris: check thread (LWP) count per oacore JVM
ps -eo pid,nlwp,rss,args | grep "[o]acore" | \
  awk '{printf "PID: %-7s | Threads(LWP): %-5s | RSS: %6d MB\n", $1, $2, $3/1024}'
```
💡 If one oacore JVM's RSS is approaching its -Xmx setting, that's your target. Note down the PID; we'll use it in every query below.
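To make that RSS-vs-heap comparison concrete, here is a small sketch that pulls each oacore JVM's `-Xmx` flag out of its command line next to its live RSS. It assumes the heap flag is passed on the JVM command line, as it is in a standard R12.2 oacore setup:

```bash
# Show each oacore JVM's RSS next to its configured -Xmx ceiling
ps -eo pid,rss,args | grep "[o]acore" | \
  awk '{xmx = "(no -Xmx found)";
        for (i = 3; i <= NF; i++) if ($i ~ /^-Xmx/) xmx = $i;
        printf "PID: %-7s | RSS: %6d MB | Heap ceiling: %s\n", $1, $2/1024, xmx}'
```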
✅ Outcome of Step 1: a specific PID (e.g., 29693) that is consuming abnormal resources. Every subsequent step uses this PID to drill into the database.
🔍 Step 2 — Map the PID to Database Sessions
Now we bridge the gap. The `process` column in gv$session holds the client-side PID, which is the oacore JVM PID from Step 1.
📌 2A. The Core Diagnostic Query — PID to Session to SQL
```sql
-- ============================================================
-- CORE QUERY: Map OS PID → DB Session → Active SQL
-- Replace '29693' with your identified PID from Step 1
-- ============================================================
SELECT s.inst_id,
       s.sid,
       s.serial#,
       s.client_identifier,
       s.sql_id,
       s.prev_sql_id,
       s.status,
       s.event,
       s.wait_class,
       s.state,
       s.last_call_et,
       s.seconds_in_wait,
       s.blocking_session,
       s.module,
       s.action,
       s.process   AS client_pid,
       p.spid      AS db_server_pid,
       s.logon_time,
       SUBSTR(q.sql_text, 1, 1000) AS sql_text
FROM   gv$session s
JOIN   gv$process p
       ON p.addr = s.paddr AND p.inst_id = s.inst_id
LEFT JOIN gv$sql q
       ON q.inst_id = s.inst_id AND q.sql_id = s.sql_id
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
ORDER  BY s.last_call_et DESC;
```
💡 Expect multiple rows: every pooled JDBC connection appears as its own session with process = '29693'. Depending on your JDBC pool MaxCapacity, you might see 20–50+ sessions from a single PID.
📌 2B. Condensed View — Session Status Summary for the PID
```sql
-- Quick summary: how many sessions from this PID, and what are they doing?
SELECT s.process AS client_pid,
       s.status,
       s.wait_class,
       COUNT(*) AS session_count,
       MAX(s.last_call_et) AS max_idle_secs,
       ROUND(AVG(s.last_call_et), 1) AS avg_idle_secs,
       COUNT(DISTINCT s.sql_id) AS distinct_sqls
FROM   gv$session s
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
GROUP  BY s.process, s.status, s.wait_class
ORDER  BY session_count DESC;
```
🔍 Step 3 — Drill Into the SQL
Once you've identified interesting sql_id values from Step 2 (especially from ACTIVE sessions), drill into the full SQL text, execution plan, and resource consumption.
📌 3A. Full SQL Text for Identified sql_id
```sql
-- Full SQL text for a specific sql_id found in Step 2
SELECT sql_id,
       plan_hash_value,
       executions,
       ROUND(elapsed_time / 1e6, 2) AS total_elapsed_sec,
       ROUND(elapsed_time / NULLIF(executions, 0) / 1e6, 3) AS per_exec_sec,
       ROUND(cpu_time / 1e6, 2) AS total_cpu_sec,
       buffer_gets,
       disk_reads,
       rows_processed,
       ROUND(buffer_gets / NULLIF(executions, 0), 0) AS gets_per_exec,
       sql_fulltext
FROM   gv$sql
WHERE  sql_id = '<sql_id_from_step2>'   -- << Replace
ORDER  BY inst_id, child_number;
```
📌 3B. Execution Plan — Is the Plan Regressed?
```sql
-- Execution plan for the identified sql_id
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(
         '<sql_id_from_step2>',        -- << Replace
         NULL,                         -- child_number (NULL = all children)
         'ALLSTATS LAST PEEKED_BINDS'
       ));
```
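To answer the "regressed plan" question over a longer horizon, you can also pull the plan history from AWR. A sketch, assuming Diagnostics Pack licensing (required for the DBA_HIST views); a `plan_hash_value` that flips between snapshots alongside a jump in per-execution cost is the classic regression signature:

```sql
-- Plan history for the sql_id: has plan_hash_value flipped recently?
SELECT sn.begin_interval_time,
       st.plan_hash_value,
       st.executions_delta AS execs,
       ROUND(st.elapsed_time_delta / NULLIF(st.executions_delta, 0) / 1e6, 3) AS per_exec_sec,
       ROUND(st.buffer_gets_delta  / NULLIF(st.executions_delta, 0))          AS gets_per_exec
FROM   dba_hist_sqlstat  st
JOIN   dba_hist_snapshot sn
       ON sn.snap_id = st.snap_id
      AND sn.dbid = st.dbid
      AND sn.instance_number = st.instance_number
WHERE  st.sql_id = '<sql_id_from_step2>'   -- << Replace
ORDER  BY sn.begin_interval_time, st.plan_hash_value;
```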
📌 3C. All SQL_IDs Currently Running from the Problem PID
```sql
-- Every distinct SQL being executed by this PID right now
SELECT s.sql_id,
       sq.plan_hash_value,
       s.status,
       s.event,
       s.last_call_et AS running_secs,
       sq.executions,
       ROUND(sq.elapsed_time / NULLIF(sq.executions, 0) / 1e6, 3) AS per_exec_sec,
       sq.buffer_gets,
       ROUND(sq.buffer_gets / NULLIF(sq.executions, 0)) AS gets_per_exec,
       SUBSTR(sq.sql_text, 1, 300) AS sql_snippet
FROM   gv$session s
JOIN   gv$sql sq
       ON sq.inst_id = s.inst_id
      AND sq.sql_id = s.sql_id
      AND sq.child_number = s.sql_child_number
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
AND    s.sql_id IS NOT NULL
ORDER  BY sq.buffer_gets DESC;
```
📌 3D. Previous SQL — What Did the Session Just Finish?
```sql
-- For INACTIVE sessions: check prev_sql_id to see what they just ran
SELECT s.sid,
       s.serial#,
       s.status,
       s.last_call_et AS idle_since_secs,
       s.sql_id AS current_sql,
       s.prev_sql_id AS previous_sql,
       SUBSTR(pq.sql_text, 1, 500) AS prev_sql_text,
       pq.executions AS prev_sql_execs,
       ROUND(pq.elapsed_time / NULLIF(pq.executions, 0) / 1e6, 3) AS prev_per_exec_sec
FROM   gv$session s
LEFT JOIN gv$sql pq
       ON pq.inst_id = s.inst_id AND pq.sql_id = s.prev_sql_id
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
AND    s.status = 'INACTIVE'
ORDER  BY s.last_call_et DESC;
```
💡 Why check prev_sql_id? If a session is INACTIVE but the JVM is burning CPU, the culprit SQL may have already finished at the DB level. The JVM could be processing the result set (sorting, rendering, serializing). The prev_sql_id tells you what heavy query the session just completed; the sketch below shows one way to corroborate this from the database side.
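One way to sanity-check the "JVM is chewing on a big result set" theory is to look at how much data each session has shipped to the client. A sketch using the cumulative `bytes sent via SQL*Net to client` statistic (cumulative since logon, so compare sessions against each other rather than reading it as a rate):

```sql
-- SQL*Net bytes sent to client, per session from the problem PID
SELECT st.inst_id, st.sid,
       ROUND(st.value / 1024 / 1024, 1) AS mb_sent_to_client
FROM   gv$sesstat st
JOIN   v$statname sn ON sn.statistic# = st.statistic#
WHERE  sn.name = 'bytes sent via SQL*Net to client'
AND    (st.inst_id, st.sid) IN (
         SELECT s.inst_id, s.sid
         FROM   gv$session s
         WHERE  s.program LIKE '%JDBC%'
         AND    s.process = '29693'   -- << Replace with your PID
       )
ORDER  BY st.value DESC;
```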
🔍 Step 4 — Wait Events and Blocking
📌 4A. Wait Event Breakdown for This PID's Sessions
```sql
-- Wait event distribution for all sessions from this PID
SELECT s.wait_class,
       s.event,
       s.status,
       COUNT(*) AS session_count,
       MAX(s.seconds_in_wait) AS max_wait_sec,
       ROUND(AVG(s.seconds_in_wait), 1) AS avg_wait_sec
FROM   gv$session s
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
GROUP  BY s.wait_class, s.event, s.status
ORDER  BY session_count DESC;
```
📌 4B. Is Any Session from This PID Blocked?
```sql
-- Check if any session from this PID is blocked by another session
SELECT s.sid AS waiting_sid,
       s.serial# AS waiting_serial,
       s.sql_id AS waiting_sql,
       s.event AS wait_event,
       s.seconds_in_wait,
       s.blocking_session AS blocker_sid,
       s.blocking_instance AS blocker_inst,
       bs.serial# AS blocker_serial,
       bs.sql_id AS blocker_sql,
       bs.status AS blocker_status,
       bs.module AS blocker_module,
       bs.machine AS blocker_machine,
       SUBSTR(bq.sql_text, 1, 300) AS blocker_sql_text
FROM   gv$session s
LEFT JOIN gv$session bs
       ON bs.sid = s.blocking_session AND bs.inst_id = s.blocking_instance
LEFT JOIN gv$sql bq
       ON bq.inst_id = bs.inst_id AND bq.sql_id = bs.sql_id
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
AND    s.blocking_session IS NOT NULL;
```
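If the functional team confirms the blocker can safely be terminated, the RAC-aware kill syntax looks like this. A sketch: substitute the blocker's sid, serial#, and inst_id from the query above, and treat killing sessions as a last resort:

```sql
-- Terminate the blocking session (the @<inst_id> suffix targets the right RAC instance)
ALTER SYSTEM KILL SESSION '<blocker_sid>,<blocker_serial#>,@<blocker_inst>' IMMEDIATE;
```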
🔍 Step 5 — Go Back in Time with ASH
The queries above show the current snapshot. But what if the spike already passed, or you need a timeline? ASH (Active Session History) is your time machine.
📌 5A. ASH — Last 30 Minutes of Activity for This PID
```sql
-- ASH: Historical activity for this PID over the last 30 minutes
-- Note: gv$active_session_history has no PROCESS column, so we map the
-- PID to its (still-connected) pooled sessions via gv$session first.
SELECT TO_CHAR(h.sample_time, 'HH24:MI:SS') AS sample_time,
       h.session_id AS sid,
       h.sql_id,
       h.sql_plan_hash_value,
       h.event,
       h.wait_class,
       h.session_state,
       h.blocking_session,
       h.module,
       h.action,
       h.time_waited / 1000 AS wait_ms
FROM   gv$active_session_history h
WHERE  (h.inst_id, h.session_id, h.session_serial#) IN (
         SELECT s.inst_id, s.sid, s.serial#
         FROM   gv$session s
         WHERE  s.program LIKE '%JDBC%'
         AND    s.process = '29693'   -- << Replace with your PID
       )
AND    h.sample_time > SYSDATE - 30/1440
ORDER  BY h.sample_time DESC;
```
📌 5B. ASH — Top SQLs by DB Time for This PID (Last 1 Hour)
```sql
-- Top SQLs consuming DB time from this specific PID (last 1 hour)
-- gv$active_session_history samples once per second, so COUNT(*) ≈ seconds of DB time.
SELECT h.sql_id,
       h.sql_plan_hash_value,
       COUNT(*) AS approx_db_time_sec,
       COUNT(DISTINCT h.session_id) AS sessions_involved,
       MAX(h.event) AS last_wait_event,
       ROUND(SUM(h.time_waited) / 1e6, 2) AS total_wait_sec,
       SUBSTR(MAX(sq.sql_text), 1, 300) AS sql_snippet
FROM   gv$active_session_history h
LEFT JOIN gv$sql sq
       ON sq.sql_id = h.sql_id AND sq.inst_id = h.inst_id
WHERE  (h.inst_id, h.session_id, h.session_serial#) IN (
         SELECT s.inst_id, s.sid, s.serial#
         FROM   gv$session s
         WHERE  s.program LIKE '%JDBC%'
         AND    s.process = '29693'   -- << Replace with your PID
       )
AND    h.sample_time > SYSDATE - 1/24
GROUP  BY h.sql_id, h.sql_plan_hash_value
ORDER  BY approx_db_time_sec DESC
FETCH FIRST 10 ROWS ONLY;
```
📌 5C. ASH — Wait Class Distribution Over Time (Heatmap Data)
```sql
-- Wait class distribution per minute for this PID (last 30 min)
-- (Same gv$session mapping as 5A, since ASH has no PROCESS column.)
SELECT TO_CHAR(TRUNC(h.sample_time, 'MI'), 'HH24:MI') AS minute_bucket,
       SUM(CASE WHEN h.session_state = 'ON CPU' THEN 1 ELSE 0 END) AS on_cpu,
       SUM(CASE WHEN h.wait_class = 'User I/O' THEN 1 ELSE 0 END) AS user_io,
       SUM(CASE WHEN h.wait_class = 'Application' THEN 1 ELSE 0 END) AS application,
       SUM(CASE WHEN h.wait_class = 'Concurrency' THEN 1 ELSE 0 END) AS concurrency,
       SUM(CASE WHEN h.wait_class = 'Cluster' THEN 1 ELSE 0 END) AS cluster_wait,
       SUM(CASE WHEN h.wait_class NOT IN ('Idle','User I/O','Application','Concurrency','Cluster')
                 AND h.session_state = 'WAITING' THEN 1 ELSE 0 END) AS other_wait,
       COUNT(*) AS total_samples
FROM   gv$active_session_history h
WHERE  (h.inst_id, h.session_id, h.session_serial#) IN (
         SELECT s.inst_id, s.sid, s.serial#
         FROM   gv$session s
         WHERE  s.program LIKE '%JDBC%'
         AND    s.process = '29693'   -- << Replace with your PID
       )
AND    h.sample_time > SYSDATE - 30/1440
GROUP  BY TRUNC(h.sample_time, 'MI')
ORDER  BY 1;
```
💡 Each gv$active_session_history sample represents roughly 1 second of DB time (it is the AWR copy, dba_hist_active_sess_history, that keeps only one sample every 10 seconds). If a 1-minute bucket shows on_cpu = 60, that means ~60 seconds of CPU was consumed by this PID in that minute: a full CPU core saturated. If application or concurrency spikes, you have lock/latch contention.
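If the spike is older than what gv$active_session_history still holds in memory, the AWR copy is the fallback. A sketch, assuming Diagnostics Pack licensing; note that dba_hist_active_sess_history also has no PROCESS column, so once the original sessions are gone you have to narrow by program and machine instead of PID (the `<app_tier_hostname>` filter is a placeholder):

```sql
-- AWR ASH: top SQLs from JDBC sessions on the app-tier host, last 24 hours
SELECT h.sql_id,
       COUNT(*) * 10 AS approx_db_time_sec,   -- AWR retains 1-in-10 samples
       COUNT(DISTINCT h.session_id) AS sessions_involved
FROM   dba_hist_active_sess_history h
WHERE  h.program LIKE '%JDBC%'
AND    h.machine = '<app_tier_hostname>'      -- << placeholder; adjust to your app tier
AND    h.sample_time > SYSDATE - 1
GROUP  BY h.sql_id
ORDER  BY approx_db_time_sec DESC
FETCH FIRST 10 ROWS ONLY;
```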
🔍 Step 6 — Cursor and Connection-Pool Health
📌 6A. Open Cursor Count per Session from This PID
```sql
-- Open cursors per session for the specific PID (ORA-01000 risk)
SELECT s.sid,
       s.serial#,
       s.status,
       s.module,
       COUNT(oc.sql_id) AS open_cursors,
       COUNT(DISTINCT oc.sql_id) AS distinct_sqls
FROM   gv$session s
JOIN   gv$open_cursor oc
       ON oc.sid = s.sid AND oc.inst_id = s.inst_id
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
GROUP  BY s.sid, s.serial#, s.status, s.module
ORDER  BY open_cursors DESC;
```
📌 6B. Connection Pool Leak Detector — Idle Sessions with High last_call_et
```sql
-- INACTIVE sessions idle for a long time = possible connection leak
SELECT s.sid,
       s.serial#,
       s.last_call_et AS idle_secs,
       ROUND(s.last_call_et / 3600, 1) AS idle_hours,
       s.logon_time,
       s.module,
       s.event,
       s.prev_sql_id
FROM   gv$session s
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'   -- << Replace with your PID
AND    s.status = 'INACTIVE'
AND    s.last_call_et > 1800   -- idle > 30 minutes
ORDER  BY s.last_call_et DESC;
```
💡 If idle_hours > 1 and the event is SQL*Net message from client, the connection was likely borrowed from the pool but never returned. Check the WebLogic JDBC datasource settings: Inactive Connection Timeout and Statement Cache Size. The sketch below shows how to compare live connection counts against the pool ceiling.
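To see whether the pool is actually hitting its ceiling, count the live connections per client PID and compare the result with the datasource's MaxCapacity in the WebLogic console. A minimal sketch:

```sql
-- Live pooled connections per oacore JVM PID (compare with WLS MaxCapacity)
SELECT s.process AS client_pid,
       COUNT(*)  AS pool_connections,
       SUM(CASE WHEN s.status = 'ACTIVE' THEN 1 ELSE 0 END) AS active_now
FROM   gv$session s
WHERE  s.program LIKE '%JDBC%'
GROUP  BY s.process
ORDER  BY pool_connections DESC;
```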
🔍 Step 7 — When the Database Is Clean: JVM-Side Diagnostics
If the database sessions all look clean (INACTIVE, low CPU) but the JVM is still spiking, the problem is inside the JVM: garbage collection, stuck threads, or application-level loops.
📌 7A. Take 3 Thread Dumps 10 Seconds Apart
```bash
# Golden rule: 3 thread dumps, 10 sec apart → compare for stuck threads
for i in 1 2 3; do
  echo "=== Thread Dump #$i at $(date '+%Y-%m-%d %H:%M:%S') ==="
  kill -3 29693   # << Replace with your PID
  sleep 10
done
echo "Done. Check oacore managed server .out file."
```
📌 7B. Where to Find the Thread Dump Output
```bash
# Thread dump is written to the managed server stdout log
grep -c "Full thread dump" $EBS_DOMAIN_HOME/servers/oacore_server1/logs/oacore_server1.out

# Extract the most recent thread dump (the third, if you took 3)
awk '/Full thread dump/{found++} found==3' \
  $EBS_DOMAIN_HOME/servers/oacore_server1/logs/oacore_server1.out > /tmp/thread_dump_latest.txt

# Count STUCK threads
grep -c "STUCK" /tmp/thread_dump_latest.txt
```
📌 7C. GC Log Analysis — Is the JVM Spending All Its Time in GC?
```bash
# Check GC activity in the diagnostic log
grep -iE "GC|Full GC|Allocation Failure|OutOfMemory" \
  $EBS_DOMAIN_HOME/servers/oacore_server1/logs/oacore_server1-diagnostic.log | tail -30

# Count Full GC events across the whole log (re-run periodically to trend it)
grep -c "Full GC" $EBS_DOMAIN_HOME/servers/oacore_server1/logs/oacore_server1-diagnostic.log
```
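If the JDK tools are available on the app tier, jstat gives a live view of heap-region occupancy and GC time without grepping logs. A sketch (run as the OS user that owns the JVM; 29693 is the PID from Step 1):

```bash
# Sample GC utilization every 5 s, 6 times: watch FGC (full-GC count) and GCT (total GC seconds)
jstat -gcutil 29693 5000 6
```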
📊 Interpreting the Results — Decision Matrix
| Observation | Root Cause Area | Action |
|---|---|---|
| Many ACTIVE sessions with same `sql_id`, high `last_call_et` | Bad SQL / plan regression | Check plan with DBMS_XPLAN; compare `plan_hash_value` with baseline; gather stats |
| Sessions INACTIVE, event = `SQL*Net message from client`, PID has high CPU | JVM-side issue (GC / app loop) | Take thread dump; check GC logs; the issue is NOT in the database |
| Sessions ACTIVE, event = `enq: TX - row lock contention` | Row-level locking | Identify blocker SID (Step 4B); coordinate with functional team |
| Many sessions, high open-cursor count (>300) | Cursor leak | Escalate to development; check JDBC statement cache config |
| Sessions INACTIVE, `idle_hours` > 1 | Connection pool leak | Check Inactive Connection Timeout in the WLS JDBC datasource |
| `prev_sql_id` shows expensive query, session now idle | Heavy result-set processing in JVM | Tune the SQL or paginate results at the OAF layer |
| ASH shows ON CPU spikes for a specific `sql_id` | CPU-bound SQL | Add missing index, gather stats, check bind variable peeking |
| ASH shows Cluster wait spikes | RAC interconnect contention | Check for hot blocks, sequence cache size, instance affinity |
🎁 Bonus — All-in-One PID Diagnostic Script
Paste this into SQL*Plus or SQLcl, substituting the PID. It gives you everything in one shot:
```sql
-- ============================================================
-- ALL-IN-ONE PID DIAGNOSTIC — Replace '29693' throughout
-- ============================================================
PROMPT ==============================
PROMPT 1. SESSION SUMMARY FOR PID
PROMPT ==============================
SELECT s.process AS pid, s.status, s.wait_class,
       COUNT(*) AS cnt, MAX(s.last_call_et) AS max_idle
FROM   gv$session s
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'
GROUP  BY s.process, s.status, s.wait_class
ORDER  BY cnt DESC;

PROMPT ==============================
PROMPT 2. ACTIVE SESSIONS DETAIL
PROMPT ==============================
SELECT s.sid, s.serial#, s.sql_id, s.event,
       s.last_call_et AS run_sec, s.module, s.action,
       SUBSTR(q.sql_text, 1, 200) AS sql_text
FROM   gv$session s
LEFT JOIN gv$sql q
       ON q.sql_id = s.sql_id AND q.inst_id = s.inst_id
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'
AND    s.status = 'ACTIVE'
ORDER  BY s.last_call_et DESC;

PROMPT ==============================
PROMPT 3. BLOCKED SESSIONS
PROMPT ==============================
SELECT s.sid, s.blocking_session, s.event, s.seconds_in_wait
FROM   gv$session s
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'
AND    s.blocking_session IS NOT NULL;

PROMPT ==============================
PROMPT 4. TOP SQLs BY BUFFER GETS
PROMPT ==============================
SELECT sq.sql_id, sq.buffer_gets, sq.executions,
       ROUND(sq.elapsed_time / 1e6, 2) AS elapsed_sec,
       SUBSTR(sq.sql_text, 1, 200) AS sql_text
FROM   gv$sql sq
WHERE  sq.sql_id IN (
         SELECT DISTINCT s.sql_id
         FROM   gv$session s
         WHERE  s.program LIKE '%JDBC%'
         AND    s.process = '29693'
         AND    s.sql_id IS NOT NULL
       )
ORDER  BY sq.buffer_gets DESC
FETCH FIRST 10 ROWS ONLY;

PROMPT ==============================
PROMPT 5. OPEN CURSORS
PROMPT ==============================
SELECT s.sid, COUNT(*) AS open_cursors
FROM   gv$session s
JOIN   gv$open_cursor oc
       ON oc.sid = s.sid AND oc.inst_id = s.inst_id
WHERE  s.program LIKE '%JDBC%'
AND    s.process = '29693'
GROUP  BY s.sid
HAVING COUNT(*) > 100
ORDER  BY open_cursors DESC;
```
📝 Summary & Key Takeaways
① OS Level: `ps -eo pid,%cpu,rss,args | grep oacore` → identify the heavy PID
② DB Level: `gv$session WHERE process = '<PID>'` → find all sessions from that JVM
③ SQL Level: `gv$sql` / `DBMS_XPLAN` → get the SQL text and execution plan
④ Wait Level: check `event` and `wait_class` → is it a lock? I/O? CPU?
⑤ ASH Level: `gv$active_session_history` → timeline view of what happened
⑥ JVM Level: thread dump + GC logs → if DB sessions are clean, the issue is in the JVM
The key insight is this: mapping the OS process directly to gv$session.process is the fastest way to verify whether an oacore CPU/memory spike is caused by heavy database processing — or whether the issue is confined entirely to the JVM heap, garbage collection, or stuck threads within the WebLogic layer.
If the database sessions are all INACTIVE and waiting on SQL*Net message from client, the database is idle and waiting for the JVM. The problem is not a SQL performance issue — it's a JVM-level concern. Conversely, if you find ACTIVE sessions with high last_call_et on the same sql_id, you've found your offending SQL and can proceed with standard SQL tuning (plan analysis, stats gathering, index review).
⚠️ Common Pitfalls
1. Don't confuse `gv$session.process` (client PID = the oacore JVM) with `gv$process.spid` (the DB server shadow process). They live on different hosts: the app tier and the DB tier.
2. Don't assume a high INACTIVE count means a problem. The JDBC pool keeps connections open and idle intentionally. Only worry if `idle_hours` > 1 AND the pool's MaxCapacity is being hit.
3. Don't skip the `prev_sql_id` check. A session that just finished a heavy query will be INACTIVE, but the JVM may still be processing the result set.
4. Always take 3 thread dumps, not 1. A single dump is a snapshot; 3 dumps 10 seconds apart reveal which threads are genuinely stuck vs. which are just momentarily busy.
Happy Troubleshooting! 🛠️
🏷️ Tags
Oracle EBS 12.2 OACore JVM Tracing gv$session Apps DBA v$active_session_history ASH DBMS_XPLAN Thread Dump JDBC Pool WebLogic 12c Oracle 19c RAC Performance Tuning SQL Tuning Connection Leak Cursor Leak Wait Events Solaris