Skip to main content

This content has been archived and is no longer being updated. Links may not function; however, this content may be relevant to outdated versions of the product.

Support Article

Upgrade to Pega 7.1.7 causes DB Deadlock exceptions

SA-10036

Summary



An upgrade to Jboss EAP 6.3.2 together with Pega 7.1.7 is performed. Deadlocks in MS SQL database are observed when testing with 1 node and 5 slaves.

"com.pega.pegarules.pub.database.DatabaseException: Transaction (Process ID 77) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction".

A custom table TABLE_LOG_AUDIT contains log key actions in flows such as, User logged in, Time report created, Time report submitted. This table is filled with approximately 12 million rows.
An activity is creates an instance of the class, fetches a new sequence NBR from the DB through a stored procedure, puts the properties, does a OBJ-SAVE with the WriteNow checkbox checked and finally does a Page-Remove on the in memory page.

The error occurs because of a combination of the 12 million rows and Pega 7.1.7. The behavior disappears after truncating the log table. The error occurs frequently when logs are created for login, and sporadically when logs are created for log out of the user. It was also observed that when more than 1 user is logged in every 2 seconds on one JVM, this issue occurs.


The environment involved is:
MS SQL Server 2012
MS Windows server 2012 R2
Jboss Enterprise Application Platform 6.3.2 (also 6.1 used for comparison checking)
PRPC 7.1.7 (also 7.1.6 used for comparison checking)
JAVA SE 1.7.0_60-b19
sqljdbc4.jar (ms sql drivers)


Error Messages



2015-04-27 09:10:50,994 [xecutor-threads - 16] [ STANDARD] [ ] [ PegaRULES:07.10] ( internal.access.DatabaseImpl) ERROR your_server|IPADDRESS.10 - Encountered problem when performing a Commit
com.pega.pegarules.pub.database.DatabaseException: Transaction (Process ID 95) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
From: (HD63F05282C93AA21457337683180F5E6:IPADDRESS.10)
SQL: MERGE INTO data.so_log_audit t USING (VALUES (? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ? , ?)) AS inputs ("pzInsKey" , "EventResult" , "EventType" , "FundId" , "Language" , "LoginType" , "MemberId" , "SessionId" , "WorkType" , "pxCreateDateTime" , "pxCreateOpName" , "pxCreateOperator" , "pxCreateSystemID" , "pxInsName" , "pxObjClass" , "pxUpdateDateTime" , "pxUpdateOpName" , "pxUpdateOperator" , "pxUpdateSystemID" , "pyDescription" , "pyID" , "pyLabel" , "pyUserAgent" , "pzPVStream") ON (t.pzInsKey = inputs."pzInsKey") WHEN MATCHED THEN UPDATE SET EventResult = inputs."EventResult" , EventType = inputs."EventType" , FundId = inputs."FundId" , Language = inputs."Language" , LoginType = inputs."LoginType" , MemberId = inputs."MemberId" , SessionId = inputs."SessionId" , WorkType = inputs."WorkType" , pxCreateDateTime = inputs."pxCreateDateTime" , pxCreateOpName = inputs."pxCreateOpName" , pxCreateOperator = inputs."pxCreateOperator" , pxCreateSystemID = inputs."pxCreateSystemID" , pxInsName = inputs."pxInsName" , pxObjClass = inputs."pxObjClass" , pxUpdateDateTime = inputs."pxUpdateDateTime" , pxUpdateOpName = inputs."pxUpdateOpName" , pxUpdateOperator = inputs."pxUpdateOperator" , pxUpdateSystemID = inputs."pxUpdateSystemID" , pyDescription = inputs."pyDescription" , pyID = inputs."pyID" , pyLabel = inputs."pyLabel" , pyUserAgent = inputs."pyUserAgent" , pzPVStream = inputs."pzPVStream" WHEN NOT MATCHED THEN INSERT (pzInsKey , EventResult , EventType , FundId , Language , LoginType , MemberId , SessionId , WorkType , pxCreateDateTime , pxCreateOpName , pxCreateOperator , pxCreateSystemID , pxInsName , pxObjClass , pxUpdateDateTime , pxUpdateOpName , pxUpdateOperator , pxUpdateSystemID , pyDescription , pyID , pyLabel , pyUserAgent , pzPVStream) VALUES (inputs."pzInsKey" , inputs."EventResult" , inputs."EventType" , inputs."FundId" , inputs."Language" , inputs."LoginType" , inputs."MemberId" , inputs."SessionId" , inputs."WorkType" , inputs."pxCreateDateTime" , inputs."pxCreateOpName" , inputs."pxCreateOperator" , inputs."pxCreateSystemID" , inputs."pxInsName" , inputs."pxObjClass" , inputs."pxUpdateDateTime" , inputs."pxUpdateOpName" , inputs."pxUpdateOperator" , inputs."pxUpdateSystemID" , inputs."pyDescription" , inputs."pyID" , inputs."pyLabel" , inputs."pyUserAgent" , inputs."pzPVStream");

Caused by SQL Problems.
Problem #1, SQLState 40001, Error code 1205: com.microsoft.sqlserver.jdbc.SQLServerException: Transaction (Process ID 95) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.


Steps to Reproduce



Perform load testing
---------------------------------------
Functional test, systest environment
The result is ok
Testrun 1 Testrun 2 Testrun 3

Number of users 5 10 15
Iterations 2 2 2
Ramp up time 10 min 10 min 10 min
http errors 0% 0% 0%
Avg response time 195ms 165ms 169ms
Sent timereports (nbr of weeks)/target 20/20 40/40 56/60
Standard deviation total 694ms 460ms 583ms
High standard deviations Log in, fetching pdfs fetching pdfs Log in, fetching pdfs
Notes Some time reports alread existed in the backend system (2 x 2 weeks)
Loadtests preprod

A serious error occured quite fast in preproduction when the number of users went over 30 users / 60 seconds for log in and log out, the database got deadlock exceptions on the table TABLE_LOG_AUDIT (audit logging for events in the application ”APPLICATION”). The error has been reproduced and occurs when the number of logins exceeds 1 user / 2 seconds.


Root Cause



The root cause of this problem is a defect in Pegasystems’ code/rules. In Pega 7.1.7, merge or batch is introdued as default mode for saving (deferred/immediate) instances.

In the same transaction, the instances is grouped into multiple batches by prepared statements. The process does not necessarily maintain deterministic orders for instances. This could potentially violate execution orders wherever a mandatory order is needed. For example, foreign key constraints.


Resolution




This issue is resolved through the following local-change: 

Disable merge or batch by change the following prconfig settings. This requires restart PRPC.

<env name="classmap/usemergestatement"  value ="false" />

<env name="database/batchupdates" value="-1" />

Published September 18, 2015 - Updated October 8, 2020

Was this useful?

0% found this useful

Have a question? Get answers now.

Visit the Collaboration Center to ask questions, engage in discussions, share ideas, and help others.

Did you find this content helpful?

Want to help us improve this content?

We'd prefer it if you saw us at our best.

Pega Community has detected you are using a browser which may prevent you from experiencing the site as intended. To improve your experience, please update your browser.

Close Deprecation Notice
Contact us