RCA for XAException.XAER_RMERR in a client-server application
1. Problem Definition:
Quick workaround that worked for us: kill the affected connections, since a restart was not possible. Killing them restored the affected transactions.
Issue:
The exception: Local XARecoveryModule.xaRecovery got XA exception XAException.XAER_RMERR: javax.transaction.xa.XAException
Explanation:
The disconnect happened while the client was waiting for the server to respond. The logs showed problems on both the client and the server. On the server side, XA transactions (with ACID properties) were failing because of missing permissions. Objects waiting to be processed piled up, and this was followed by the non-serializable XAResource problem.
XAException.XAER_RMERR occurs because of the permissions problem. As a result, memory consumption grew gradually and I/O waits also increased.
Analysis approach:
The analysis is based on iostat, netstat and vmstat output.
The analysis and solutions provided in this document should improve the transaction handling and performance of the application.
Queries need to be executed to check the permissions given to the user (the three SELECT statements after the grants).
The permissions below need to be granted to the user. This is the JBoss application user, i.e. the user defined to connect from JBoss to the Oracle database.
GRANT SELECT ON sys.dba_pending_transactions TO user;
GRANT SELECT ON sys.pending_trans$ TO user;
GRANT SELECT ON sys.dba_2pc_pending TO user;
GRANT EXECUTE ON sys.dbms_xa TO user;
select * from USER_ROLE_PRIVS;
select * from USER_TAB_PRIVS;
select * from USER_SYS_PRIVS;
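The output of USER_TAB_PRIVS can be compared against the four grants above. The following is a minimal sketch, assuming the rows have already been fetched and reduced to strings of the form "PRIVILEGE ON TABLE_NAME" (built from the PRIVILEGE and TABLE_NAME columns). The class name, method name and sample data are hypothetical. A privilege received through a role may not show up in USER_TAB_PRIVS, so treat the result as a first check only.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class XaGrantCheck {
    // Object privileges from the GRANT statements above, as "PRIVILEGE ON OBJECT".
    static final Set<String> REQUIRED = new LinkedHashSet<>(List.of(
            "SELECT ON DBA_PENDING_TRANSACTIONS",
            "SELECT ON PENDING_TRANS$",
            "SELECT ON DBA_2PC_PENDING",
            "EXECUTE ON DBMS_XA"));

    // Returns the required privileges that are absent from the granted set.
    static Set<String> missing(Set<String> granted) {
        Set<String> normalized = new LinkedHashSet<>();
        for (String g : granted) {
            // Compare case-insensitively and ignore surrounding whitespace.
            normalized.add(g.trim().toUpperCase(Locale.ROOT));
        }
        Set<String> missing = new LinkedHashSet<>(REQUIRED);
        missing.removeAll(normalized);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical sample: only two of the four privileges are present.
        Set<String> granted = Set.of("SELECT ON DBA_2PC_PENDING", "EXECUTE ON DBMS_XA");
        System.out.println("Missing: " + missing(granted));
    }
}
```

An empty result means all four object privileges are visible to the connected user. Anything else is the list to hand to the DBA.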
2. Analysis & Root Cause:
Three different warnings were appearing in the server logs:
1. "Could not find new XAResource to use for recovering non-serializable XAResource": 1875 times
2. "XAException.XAER_RMERR": 5622 times
3. "CONNECTION_TIMEDOUT": 327 times
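Counts like these can be reproduced by scanning the server log for the three warning texts. The following is a rough sketch, not the tool used for this analysis. The class name and the default log path `server.log` are assumptions, and a line is counted once per marker it contains.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

final class LogWarningCounter {
    // The three warning texts listed above.
    static final List<String> MARKERS = List.of(
            "Could not find new XAResource to use for recovering non-serializable XAResource",
            "XAException.XAER_RMERR",
            "CONNECTION_TIMEDOUT");

    // Counts, per marker, the number of lines that contain it.
    static Map<String, Long> count(Stream<String> lines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String marker : MARKERS) {
            counts.put(marker, 0L);
        }
        lines.forEach(line -> {
            for (String marker : MARKERS) {
                if (line.contains(marker)) {
                    counts.merge(marker, 1L, Long::sum);
                }
            }
        });
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed path; pass the real server log location as the first argument.
        Path log = Path.of(args.length > 0 ? args[0] : "server.log");
        try (Stream<String> lines = Files.lines(log)) {
            count(lines).forEach((marker, n) -> System.out.println(n + "\t" + marker));
        }
    }
}
```

Running the same count before and after the fixes gives a simple way to see whether each warning actually goes away.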
3. Proposed fix or performance improvement in the existing system, per warning:
1. XAException.XAER_RMERR
From the Red Hat docs:
1. http://docs.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform_Common_Criteria_Certification/5/html/Transactions_Administrators_Guide/ch07s04.html
2. https://access.redhat.com/solutions/22274
If Oracle is configured incorrectly, you will experience the following error in your log files:
WARN [com.arjuna.ats.jta.logging.loggerI18N] [com.arjuna.ats.internal.jta.recovery.xarecovery1] Local XARecoveryModule.xaRecovery got XA exception javax.transaction.xa.XAException, XAException.XAER_RMERR
To resolve this error, be sure that the Oracle user has access to the appropriate tables to accomplish the recovery:
GRANT SELECT ON sys.dba_pending_transactions TO user;
GRANT SELECT ON sys.pending_trans$ TO user;
GRANT SELECT ON sys.dba_2pc_pending TO user;
GRANT EXECUTE ON sys.dbms_xa TO user;
The above assumes that user is the user defined to connect from JBoss to Oracle. It also assumes that either Oracle 10g R2 (patched for bug 5945463) or 11g is in use. If an unpatched version prior to 11g is used then change the last GRANT EXECUTE to:
GRANT EXECUTE ON sys.dbms_system TO user;
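The version rule above (dbms_xa on patched 10g R2 or 11g, dbms_system on unpatched versions before 11g) can be written as a small helper that produces the statements to hand to the DBA. This is only an illustrative sketch. The class name, method name and the `appuser` account are hypothetical, and the statements themselves are the ones quoted from the Red Hat docs.

```java
import java.util.List;

final class XaGrantStatements {
    // Builds the four GRANT statements for the given database user.
    // patchedOrEleven: true for Oracle 10g R2 patched for bug 5945463 or for 11g,
    //                  false for an unpatched version prior to 11g.
    static List<String> forUser(String user, boolean patchedOrEleven) {
        String xaPackage = patchedOrEleven ? "sys.dbms_xa" : "sys.dbms_system";
        return List.of(
                "GRANT SELECT ON sys.dba_pending_transactions TO " + user + ";",
                "GRANT SELECT ON sys.pending_trans$ TO " + user + ";",
                "GRANT SELECT ON sys.dba_2pc_pending TO " + user + ";",
                "GRANT EXECUTE ON " + xaPackage + " TO " + user + ";");
    }

    public static void main(String[] args) {
        // "appuser" is a placeholder for the user JBoss uses to connect to Oracle.
        forUser("appuser", true).forEach(System.out::println);
    }
}
```

Only the last statement depends on the Oracle version. The three SELECT grants are the same in both cases.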
2. CONNECTION_TIMEDOUT
https://developer.jboss.org/thread/233579?tstart=0
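Add the following properties to the activationConfig of the @MessageDriven annotation on the MDB that has the issue. With connectionTTL set to -1 the server never times out the connection, and the client checks for failure only every 600000 ms (10 minutes), so the connection check is effectively disabled.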
@ActivationConfigProperty(propertyName = "clientFailureCheckPeriod", propertyValue = "600000")
@ActivationConfigProperty(propertyName = "connectionTTL", propertyValue = "-1")
3. Could not find new XAResource to use for recovering non-serializable XAResource
https://developer.jboss.org/wiki/TxNonSerializableXAResource
Solution to avoid this log entry, which should also improve performance because it moves expired objects to the expired logs:
(Periodic Recovery) ARJUNA016037: Could not find new XAResource to use for recovering non-serializable XAResource XAResourceRecord < resource:null, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff3938139a:186f9645:578db5c5:78300da, node_name=1, branch_uid=0:ffff3938139a:186f9645:578db5c5:78300df, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord@7302c959 >
Try adding another ExpiryScanner instance, com.arjuna.ats.internal.arjuna.recovery.AtomicActionExpiryScanner, to jbossts-properties.xml:
<!-- Expiry scanners to use (order of invocation is random).
Names must begin with "com.arjuna.ats.arjuna.recovery.expiryScanner"
-->
<property name="com.arjuna.ats.arjuna.recovery.expiryScannerTransactionStatusManager" value="com.arjuna.ats.internal.arjuna.recovery.ExpiredTransactionStatusManagerScanner"/>
<property name="com.arjuna.ats.arjuna.recovery.expiryScannerAtomicAction" value="com.arjuna.ats.internal.arjuna.recovery.AtomicActionExpiryScanner"/>
After expiryScanInterval (default 12 h) the scanners should be invoked and move expired objects from the store to the corresponding Expired catalog.
4. Limitations:
The permissions have to be granted to the user by a DBA.
Without the grants to the user involved in the distributed transaction, the problem will not be solved completely.
06:44:39,576 WARN [org.hornetq.core.client] (hornetq-failure-check-thread) HQ212037: Connection failure has been detected: HQ119014: Did not receive data from invm:0. It is likely the client has exited or crashed without closing its connection, or the network between the server and client has failed. You also might have configured connection-ttl and client-failure-check-period incorrectly. Please check user manual for more information. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
06:44:39,623 WARN [org.hornetq.core.server] (hornetq-failure-check-thread) HQ222061: Client connection failed, clearing up resources for session 8eb86cc8-8234-11e6-9472-9317898f57ab
5. Test approach:
a. After these changes the system has to be load tested.
b. Server logs have to be closely monitored.