You're in the middle of a production incident — a user's request is hung, the helpdesk ticket is escalating,
and you need to surgically kill exactly that session without touching anything else.
You could scroll through GV$SESSION manually. You could guess by username or machine.
Or — you could use the Execution Context ID (ECID) to locate the exact session with zero ambiguity
and generate the kill command in the same query.
This post is a deep-dive into a Phase 3.1 diagnostic script that does precisely that: ECID-based exact-match
session hunting, filtered to oacore (the Oracle EBS managed server), with every forensic column
you need to understand why the session is hung — not just where it is.
- What is ECID and Why Does It Matter in EBS?
- The Full Script (Phase 3.1)
- Column-by-Column Breakdown
- Understanding the Kill Command Construction
- The WHERE Clause — ECID + oacore Filter
- Decoding Wait Class & Event
- Blocking Session Chain Analysis
- RAC Awareness — Why GV$SESSION, Not V$SESSION
- End-to-End Incident Workflow
- Safety Checklist Before You Kill
- Script Extensions & Variants
- Summary
🧬 1. What is ECID and Why Does It Matter in EBS?
👶 Fresher
ECID — Execution Context Identifier — is a unique tag that Oracle middleware (WebLogic / OC4J)
stamps onto every database session it opens on behalf of an application user request.
Think of it as a transaction passport: it travels from the browser request through WebLogic,
into the database connection, and gets stored in GV$SESSION.ECID.
In Oracle EBS 12.2 with WebLogic 12c, the ECID is propagated via Oracle Diagnostic Logging (ODL) and
is visible in both the WLS server logs (oacore managed server) and in
GV$SESSION.ECID. When a user reports a hung page — and you have the WLS request log or
an FND log reference — you extract the ECID from there and cross-match it to the DB session.
This is far more reliable than matching on username/machine/IP, because in a connection-pooled
environment like EBS, multiple sessions can belong to the same OS user from the same app tier host.
1. FND:
FND_LOG_MESSAGES — column EXECUTION_CONTEXT_ID2. WLS Access Log:
$DOMAIN_HOME/servers/oacore/logs/access.log — look for ecid=3. WLS Diagnostic Log:
oacore-diagnostic.log — search by username + timestamp4. From the user: raise a support ticket and ask them to provide the URL request timestamp — you can backtrack from there.
📜 2. The Full Script — Phase 3.1
-- ============================================================
-- PHASE 3.1: FIND SESSION BY ECID (Exact Match)
-- ============================================================
SET LINES 200 PAGES 100
COL kill_cmd FORMAT A60
COL program FORMAT A25
COL module FORMAT A30
COL action FORMAT A30
COL ecid FORMAT A55
COL event FORMAT A40
COL sql_text FORMAT A80
SELECT
'ALTER SYSTEM KILL SESSION ''' || s.sid || ',' || s.serial# || ',@' || s.inst_id || ''' IMMEDIATE;'
AS kill_cmd,
s.inst_id,
s.sid,
s.serial#,
s.status,
s.state,
TRUNC(s.last_call_et/60) || 'm ' || MOD(s.last_call_et, 60) || 's' AS last_call_et,
s.program,
s.module,
s.action,
s.ecid,
s.username,
s.machine,
s.process AS os_pid,
s.sql_id,
s.sql_child_number,
s.prev_sql_id,
s.event,
s.wait_class,
s.seconds_in_wait,
s.blocking_session,
s.blocking_instance,
s.blocking_session_status
FROM gv$session s
WHERE s.ecid LIKE '%&ECID_EXTRACT%'
AND s.program LIKE '%oacore%'
ORDER BY s.last_call_et DESC;
GV$SESSION (Global View).
In a RAC environment (e.g., Qualcomm's QCTEBSP04 / P05),
the kill command already includes ,@inst_id so you can run it from any node
and it will target the correct instance. No need to log in to the specific node.
🔬 3. Column-by-Column Breakdown
| Column / Alias | Source in GV$SESSION | What It Tells You |
|---|---|---|
| kill_cmd | Derived (SID + SERIAL# + INST_ID) | Ready-to-execute kill statement. Copy → Paste → Done. |
| inst_id | GV$SESSION.INST_ID |
RAC node number (1, 2, 3…). Critical for 2-node or 3-node RAC clusters. |
| sid / serial# | SID, SERIAL# |
Together with inst_id, uniquely identify the session. SERIAL# prevents killing a recycled SID. |
| status | STATUS |
ACTIVE = executing SQL; INACTIVE = idle (open cursor/lock held); KILLED = kill issued, OS cleanup pending. |
| state | STATE |
WAITING = blocked on an event; WAITED SHORT TIME/WAITED KNOWN TIME = recently waited. Helps distinguish stuck from just slow. |
| last_call_et | LAST_CALL_ET (seconds) |
How long since the last client call. Formatted as NNm NNs for readability. >5 minutes on ACTIVE = red flag. |
| program | PROGRAM |
Connecting client binary. JDBC Thin Client = WebLogic pool; oracle@hostname = background/batch. |
| module | MODULE |
Set by the app via DBMS_APPLICATION_INFO. In EBS: OAF page name, Form name, or Concurrent Program short name. |
| action | ACTION |
Fine-grained action within the module (e.g., Query, Update, Commit). Extremely useful for PL/SQL debugging. |
| ecid | ECID |
The execution context ID propagated from WebLogic. This is your match key for the user's request. |
| username | USERNAME |
Oracle DB schema user. In EBS: typically APPS or the responsibility-owning schema. Not the FND user. |
| machine | MACHINE |
Hostname of the connecting machine — usually the WebLogic app tier server (e.g., qcterpp05app1). |
| os_pid | PROCESS |
OS process ID on the client side (WLS JVM thread). Useful if you need to correlate to a WLS thread dump. |
| sql_id / sql_child_number | SQL_ID, SQL_CHILD_NUMBER |
Currently executing SQL. Feed into V$SQL for the plan, or AWR for historical behavior. |
| prev_sql_id | PREV_SQL_ID |
Last SQL executed before the current one. On INACTIVE sessions this is often the real culprit SQL. |
| event | EVENT |
The specific wait event (e.g., enq: TX - row lock contention, db file sequential read). The most important diagnostic column. |
| wait_class | WAIT_CLASS |
High-level category: Concurrency, User I/O, Application, Network, Idle. |
| seconds_in_wait | SECONDS_IN_WAIT |
How long it has been in the current wait event. Distinct from LAST_CALL_ET. |
| blocking_session / blocking_instance / blocking_session_status | 3 columns | If the session is blocked by another session, these identify the blocker. blocking_session_status = VALID means the blocker is alive and holding the lock. |
⚡ 4. Understanding the Kill Command Construction
The script auto-builds the kill statement so you never construct it manually under pressure:
'ALTER SYSTEM KILL SESSION ''' || s.sid || ',' || s.serial# || ',@' || s.inst_id || ''' IMMEDIATE;'
This produces output like:
ALTER SYSTEM KILL SESSION '247,8931,@2' IMMEDIATE;
| Component | Meaning |
|---|---|
| 247 | Session SID on instance 2 |
| 8931 | SERIAL# — distinguishes this session from any recycled SID 247 |
| @2 | RAC Instance number — routes the kill to node 2 even if you're connected to node 1 |
| IMMEDIATE | Skips the graceful wait — session is killed without waiting for it to respond. Use in stuck-session scenarios. |
Without
IMMEDIATE, Oracle waits for the session to acknowledge the kill, which can take minutes if the session is stuck on a network/I-O wait.
In production incidents, always use IMMEDIATE — but understand it still requires Oracle to rollback uncommitted work in the background.
The session in GV$SESSION may remain with STATUS=KILLED for minutes while the rollback completes.
What Happens After the Kill?
-- Verify the session is gone (or in KILLED state rolling back)
SELECT sid, serial#, inst_id, status, last_call_et, username
FROM gv$session
WHERE sid = &SID
AND inst_id = &INST;
-- If STATUS=KILLED persists for >10 mins, the OS process may be stuck.
-- Escalate to OS-level kill: kill -9 <spid> (get SPID from V$PROCESS)
🎯 5. The WHERE Clause — ECID + oacore Filter
WHERE s.ecid LIKE '%&ECID_EXTRACT%'
AND s.program LIKE '%oacore%'
Why LIKE with % on both sides?
The ECID value in GV$SESSION is typically a composite string like:
56f8a7e1-2d14-4b78-9abc-...:1:1:oacore~0f3a
When you extract an ECID fragment from a WLS log or FND table, you often get a portion
of the full string. The LIKE '%fragment%' pattern ensures you still match even
with a partial extract. If you have the full ECID, switch to = '&ECID_EXACT'
for better performance (avoids full table scan on large instances).
Why the oacore Filter?
In Oracle EBS 12.2, all OAF-based pages and self-service flows go through the
oacore WebLogic managed server. Its JDBC pool connections appear
in GV$SESSION.PROGRAM as something like JDBC Thin Client
with MODULE populated from OAF, and critically the PROGRAM
column carries oacore as a substring in some configurations.
s.program LIKE '%frmweb%'.
For Concurrent Manager sessions, filter on s.program LIKE '%FNDLIBR%'
or join to FND_CONCURRENT_REQUESTS on ORACLE_SESSION_ID = s.audsid.
🕐 6. Decoding Wait Class & Event — What Is the Session Actually Doing?
| WAIT_CLASS | Common EVENT | Likely Root Cause in EBS |
|---|---|---|
| Application | enq: TX - row lock contention |
Another session holds a lock. Check BLOCKING_SESSION. |
| Application | enq: TM - contention |
Table-level lock — often a DDL or bulk DML running concurrently. |
| Concurrency | library cache lock |
Shared pool contention — often seen during ADOP patching or mass recompile. |
| User I/O | db file sequential read |
Single-block I/O — index lookup on a slow disk or high I/O load. Check SQL plan. |
| User I/O | db file scattered read |
Multi-block I/O — full table scan. Check for missing index or bad plan. |
| Network | SQL*Net message from client |
Idle — waiting for next request from WebLogic. Normal for pooled connections. |
| Idle | jobq slave wait |
Scheduler slave idle. Ignore in session analysis. |
| Other | wait for unread message on broadcast channel |
RAC global cache background — normal RAC operation, not a problem. |
SELECT * FROM gv$sql WHERE sql_id = '&SQL_ID';SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('&SQL_ID', &CHILD, 'ALLSTATS LAST'));These reveal the full SQL text and execution plan — confirming whether the hang is a lock wait, a runaway full-scan, or something deeper.
🔗 7. Blocking Session Chain Analysis
The three blocking columns — blocking_session, blocking_instance,
blocking_session_status — tell you whether your target session is itself being
blocked. If blocking_session_status = 'VALID', you have a blocker–waiter chain.
In that case, killing the waiter alone won't fix the problem — you need to kill
the blocker first (or trace why it's holding the lock).
-- Follow the blocking chain from your ECID session outward
SELECT
LPAD(' ', LEVEL*2) || s.sid AS session_tree,
s.inst_id,
s.serial#,
s.status,
s.event,
s.sql_id,
s.blocking_session,
s.blocking_instance,
s.username,
s.module
FROM gv$session s
START WITH s.ecid LIKE '%&ECID_EXTRACT%'
CONNECT BY PRIOR s.blocking_session = s.sid
AND PRIOR s.blocking_instance = s.inst_id
ORDER SIBLINGS BY s.last_call_et DESC;
blocking_instance column captures this. The hierarchical query above
handles cross-instance chains correctly because it joins on both SID and INST_ID.
🏗️ 8. RAC Awareness — Why GV$SESSION, Not V$SESSION
👶 Fresher
In a single-instance database, V$SESSION shows all sessions.
In a RAC cluster, each node has its own V$SESSION — it only shows
sessions on that node. GV$SESSION is the Global View
that spans all nodes. The G prefix on any V$ view gives you
the cluster-wide picture.
On a 2-node or 3-node RAC, EBS WebLogic connection pools distribute sessions across nodes
via load balancing. The ECID you extract from a WLS log does not tell you which
node the corresponding DB session landed on. Running the query against GV$SESSION
finds it regardless of which node it's on, and the inst_id column in the output
— plus the ,@inst_id in the kill command — ensures the kill is routed correctly
without you having to log in to a specific node.
-- Quick sanity: how many EBS oacore sessions per node right now?
SELECT inst_id,
COUNT(*) AS session_count,
SUM(CASE WHEN status='ACTIVE' THEN 1 ELSE 0 END) AS active,
SUM(CASE WHEN status='INACTIVE' THEN 1 ELSE 0 END) AS inactive
FROM gv$session
WHERE program LIKE '%oacore%'
GROUP BY inst_id
ORDER BY inst_id;
🔄 9. End-to-End Incident Workflow
┌──────────────────────────────────────────────────────────────────────────┐
│ USER REPORTS: "Page is hung / spinner running for 20 minutes" │
└────────────────────────┬─────────────────────────────────────────────────┘
│
STEP 1: Get the ECID
├─ From user: approx time + URL + responsibility
├─ From FND: SELECT execution_context_id FROM fnd_log_messages
│ WHERE timestamp BETWEEN ... AND ...
│ AND module LIKE '%<page_name>%';
└─ From WLS: grep 'ecid=' $DOMAIN/servers/oacore/logs/access.log
│
STEP 2: Run Phase 3.1 Script
└─ Supply ECID_EXTRACT at the prompt
→ Confirms: SID, SERIAL#, INST_ID, EVENT, BLOCKING?
│
STEP 3: Assess
├─ Is session ACTIVE or INACTIVE?
├─ What is the EVENT and WAIT_CLASS?
├─ Is BLOCKING_SESSION_STATUS = VALID?
└─ How long is LAST_CALL_ET and SECONDS_IN_WAIT?
│
STEP 4: Decision
├─ Blocked by another session? → Kill blocker first
├─ Long-running SQL (bad plan)? → Capture SQL_ID, check plan
├─ Network idle (SQL*Net from client)? → WLS thread issue, not DB
└─ Confirmed stuck? → Execute kill_cmd from query output
│
STEP 5: Verify + Document
├─ Re-run query → session should be gone or STATUS=KILLED
├─ Confirm user can re-access the page
└─ Log RCA in incident ticket (ECID, SID, EVENT, action taken)
└──────────────────────────────────────────────────────────────────────────┘
✅ 10. Safety Checklist Before You Kill
- Is the session ACTIVE with a legitimate long-running batch (Quarter-end, month-end, payroll)?
- Is the session part of an ADOP patching cutover phase (fs_clone, generate, finalize)?
- Does the MODULE/ACTION indicate a Concurrent Request? (
FND_CONCURRENT_REQUESTScheck) - Is it a background Oracle process (
PMON,SMON,LGWR)? Never kill these. - Have you confirmed with the user/business that the session can be terminated?
- Are you logged in as
SYSDBAor a user withALTER SYSTEMprivilege? - Is this a production instance? If yes — get approval before proceeding.
🛠️ 11. Script Extensions & Variants
Extension A: Include SQL Text Inline
-- Add this to the SELECT to see current SQL without a separate query:
SELECT
...,
(SELECT SUBSTR(sql_text, 1, 200)
FROM gv$sql q
WHERE q.sql_id = s.sql_id
AND q.inst_id = s.inst_id
AND q.child_number = s.sql_child_number
AND rownum = 1) AS curr_sql_text,
(SELECT SUBSTR(sql_text, 1, 200)
FROM gv$sql q
WHERE q.sql_id = s.prev_sql_id
AND q.inst_id = s.inst_id
AND rownum = 1) AS prev_sql_text
FROM gv$session s
WHERE s.ecid LIKE '%&ECID_EXTRACT%'
AND s.program LIKE '%oacore%';
Extension B: FND_LOG_MESSAGES Cross-Reference
-- Map an FND log entry directly to the live DB session
SELECT
fl.execution_context_id AS ecid,
fl.timestamp,
fl.message_text,
s.sid,
s.serial#,
s.inst_id,
s.status,
s.event,
s.sql_id
FROM fnd_log_messages fl
JOIN gv$session s ON s.ecid = fl.execution_context_id
WHERE fl.execution_context_id LIKE '%&ECID_EXTRACT%'
AND s.program LIKE '%oacore%'
ORDER BY fl.timestamp DESC;
Extension C: Batch Kill All oacore Sessions Idle > 30 Minutes
-- USE WITH EXTREME CAUTION — Only in non-production or after business approval
BEGIN
FOR r IN (
SELECT inst_id, sid, serial#
FROM gv$session
WHERE program LIKE '%oacore%'
AND status = 'INACTIVE'
AND last_call_et > 1800 -- 30 minutes
) LOOP
EXECUTE IMMEDIATE
'ALTER SYSTEM KILL SESSION '''
|| r.sid || ',' || r.serial# || ',@' || r.inst_id
|| ''' IMMEDIATE';
END LOOP;
END;
/
📝 12. Summary
| What the Script Does | Why It Matters |
|---|---|
| Matches session by ECID from WLS/FND | Unique identification — no ambiguity in connection-pooled EBS environments |
| Filters to oacore program | Limits scope to OAF/self-service layer; excludes batch, forms, CM |
| Generates kill_cmd inline | Zero manual effort — copy and execute under pressure |
| Includes inst_id in kill command | RAC-safe: works from any node in a multi-node cluster |
| Exposes EVENT + WAIT_CLASS + SECONDS_IN_WAIT | Distinguishes lock wait from I/O from idle — guides the right action |
| Exposes BLOCKING_SESSION + STATUS | Reveals if killing the waiter alone is insufficient |
| Shows MODULE + ACTION + SQL_ID | Full forensic trail for RCA and post-incident documentation |
| Sorts by LAST_CALL_ET DESC | Worst offenders (longest hung) surface immediately |
ECID-based session targeting is one of those techniques that separates reactive firefighting from precise, professional incident management. Combined with a solid understanding of wait events and blocking chains, this script gives you everything you need to go from a user complaint to a resolved incident — with full auditability — in under five minutes.
No comments:
Post a Comment