Tag Archives: programing language

LGWR waits for event ‘DLM cross inst call completion’ [How to Solve]

Click “blue word” above

Pay attention to us and enjoy more dry goods!

The customer has a set of Oracle 19C DataGuard database environment. The standby side always has large gap at intervals. At the same time, LGWR (ospid: 105521) waits for event ‘DLM cross Inst call completion’ for n secs. The standby side does not provide external queries. At the same time, multi instance log applications are disabled, and the operating system resources are idle, The number of LMS processes is normal. If other nodes are shut down, leaving only the apply log does not exist. DLM is a distributed lock manager, which belongs to the core mechanism of Rac architecture. It realizes multi node resource sharing scheduling and transmits requests through the interconnect network. Here is a brief record of this case:

db alert log

PR00 (PID:109603): Media Recovery Log +ARCH/anbob1/ARCHIVELOG/2021_07_12/thread_3_seq_13586.1479.1077669291
2021-07-12T20:25:29.643687+08:00
PR00 (PID:109603): Media Recovery Log +ARCH/anbob1/ARCHIVELOG/2021_07_12/thread_2_seq_14361.1072.1077669019
2021-07-12T20:29:38.183656+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 1 secs.
2021-07-12T20:29:48.137737+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 2 secs.
2021-07-12T20:31:21.952345+08:00
 rfs (PID:113884): Selected LNO:26 for T-2.S-14456 dbid 3902007743 branch 1037635587
2021-07-12T20:31:21.987333+08:00
 rfs (PID:114704): Error ORA-235 occurred during an un-locked control file
 rfs (PID:114704): transaction.  This error can be ignored.  The control
 rfs (PID:114704): file transaction will be retried.
2021-07-12T20:31:43.532600+08:00
ARC2 (PID:106404): Archived Log entry 9591 added for T-2.S-14455 ID 0xe894b1bf LAD:1
2021-07-12T20:31:47.151671+08:00
 rfs (PID:113882): Selected LNO:31 for T-3.S-13731 dbid 3902007743 branch 1037635587
2021-07-12T20:31:49.116049+08:00
 rfs (PID:113880): Selected LNO:22 for T-1.S-13006 dbid 3902007743 branch 1037635587
2021-07-12T20:31:53.393547+08:00
ARC3 (PID:106408): Archived Log entry 9592 added for T-1.S-13005 ID 0xe894b1bf LAD:1
2021-07-12T20:32:02.346585+08:00
ARC2 (PID:106404): Archived Log entry 9593 added for T-3.S-13730 ID 0xe894b1bf LAD:1
2021-07-12T20:33:13.805344+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 0 secs.
2021-07-12T20:33:13.805470+08:00
LGWR (ospid: 105521) is hung in an acceptable location (inwait 0x1.ffff).
2021-07-12T20:33:21.196764+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 2 secs.
2021-07-12T20:33:31.310737+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 0 secs.
2021-07-12T20:33:41.223781+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 1 secs.
2021-07-12T20:33:51.205776+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 2 secs.
2021-07-12T20:34:01.307770+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 0 secs.
2021-07-12T20:34:25.440231+08:00
PR00 (PID:109603): Media Recovery Log +ARCH/anbob1/ARCHIVELOG/2021_07_12/thread_2_seq_14362.1867.1077670807
2021-07-12T20:34:44.864009+08:00
PR00 (PID:109603): Media Recovery Log +ARCH/anbob1/ARCHIVELOG/2021_07_12/thread_3_seq_13587.691.1077670845
2021-07-12T20:34:45.204773+08:00
PR00 (PID:109603): Media Recovery Log +ARCH/anbob1/ARCHIVELOG/2021_07_12/thread_1_seq_12934.1156.1077670917
2021-07-12T20:36:09.378685+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 2 secs.
2021-07-12T20:36:19.341635+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 0 secs.
2021-07-12T20:36:28.416573+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 0 secs.
2021-07-12T20:36:38.375742+08:00
LGWR (ospid: 105521) waits for event 'DLM cross inst call completion' for 1 secs.

LGWR trace

*** 2021-07-12T20:33:43.793041+08:00 ((4))
Received ORADEBUG command (#235) 'dump KSTDUMPCURPROC 1' from process '105470'
-------------------------------------------------------------------------------
Trace Bucket Dump Begin: default bucket for process 47 (osid: 105521, LGWR)
CDB_NAME(CON_ID):CON_UID:TIME(*=approx):SEQ:COMPONENT:[email protected]:FUNCTION:SECT/DUMP:SID:SERIAL#: [EVENT#:PID] DATA
-------------------------------------------------------------------------------
IRMSDB(4):3247498417:2021-07-12 20:33:42.784 :KJCI:[email protected]:kjci_complete():4466:40278: freeing request 0x20fd651e8 (inst|inc|reqid)=(1|88|823031) with opcode=146 and completion status [DONE]
IRMSDB(4):3247498417:2021-07-12 20:33:42.784 :KJCI:[email protected]:kjci_initreq():4466:40278: request 0x20fd651e8 (inst|inc|reqid)=(1|88|823032) with group (type|id)=(1|1), opcode=146, flags=0x0, msglen=56, where=[kqlmClusterMessage] to target instances=
IRMSDB(4):3247498417:2021-07-12 20:33:42.784 :KJCI:[email protected]:kjci_initreq():4466:40278:    1 2
IRMSDB(4):3247498417:2021-07-12 20:33:42.784 :KJCI:[email protected]:kjci_processcrq():4466:40278: processing reply 0x2cff2d4e8 for request 0x20fd651e8 (inst|inc|reqid)=(1|88|823032) with opcode=146 from callee (inst|pid|psn)=(1|36|1)
IRMSDB(4):3247498417:2021-07-12 20:33:42.784 :KJCI:[email protected]:kjci_processcrq():4466:40278: processing reply 0x2cff2d718 for request 0x20fd651e8 (inst|inc|reqid)=(1|88|823032) with opcode=146 from callee (inst|pid|psn)=(2|36|1)
IRMSDB(4):3247498417:2021-07-12 20:33:42.784 :KJCI:[email protected]:kjci_complete():4466:40278: freeing request 0x20fd651e8 (inst|inc|reqid)=(1|88|823032) with opcode=146 and completion status [DONE]
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_initreq():4466:40278: request 0x20fd651e8 (inst|inc|reqid)=(1|88|823033) with group (type|id)=(1|1), opcode=146, flags=0x0, msglen=56, where=[kqlmClusterMessage] to target instances=
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_initreq():4466:40278:    1 2
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_processcrq():4466:40278: processing reply 0x2cff2d4e8 for request 0x20fd651e8 (inst|inc|reqid)=(1|88|823033) with opcode=146 from callee (inst|pid|psn)=(1|36|1)
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_processcrq():4466:40278: processing reply 0x2cff2d718 for request 0x20fd651e8 (inst|inc|reqid)=(1|88|823033) with opcode=146 from callee (inst|pid|psn)=(2|36|1)
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_complete():4466:40278: freeing request 0x20fd651e8 (inst|inc|reqid)=(1|88|823033) with opcode=146 and completion status [DONE]
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_initreq():4466:40278: request 0x20fd651e8 (inst|inc|reqid)=(1|88|823034) with group (type|id)=(1|1), opcode=146, flags=0x0, msglen=56, where=[kqlmClusterMessage] to target instances=
IRMSDB(4):3247498417:2021-07-12 20:33:42.785 :KJCI:[email protected]:kjci_initreq():4466:40278:    1 2

**KJCJ ** ==> ( kjci)_ processcrq – kernel lock management communication cross instance call

For cross node communication, there is no known bug in MOS. First analyze the network problem. You can also do SSD from the process blocker or view the hangmgr trace. The AHF framework in Oracle 19C CRS comes with OSW.

OSW netstat data

zzz ***Tue Jul 13 00:59:51 CST 2021
...
#kernel
IpInReceives                    1456201695         0.0
IpInHdrErrors                   0                  0.0
IpInAddrErrors                  0                  0.0
IpForwDatagrams                 0                  0.0
IpInUnknownProtos               0                  0.0
IpInDiscards                    0                  0.0
IpInDelivers                    1085210966         0.0
IpOutRequests                   1007206469         0.0
IpOutDiscards                   5280               0.0
IpOutNoRoutes                   8                  0.0
IpReasmTimeout                  6333500            0.0
IpReasmReqds                    408470736          0.0
IpReasmOKs                      37504539           0.0
IpReasmFails                    8651478            0.0
IpFragOKs                       29029579           0.0

Note:
currently, there are high IP reorganization failure packets, which is a cumulative value. You can view the daily changes below.

View the failure of daily IP reorganization

 awk '/zzz/{d=$3"/"$4" "$5}/IpReasmFails/{curr=$2;diff=curr-prev;if(diff>5)print d,diff,prev,curr;prev=curr}' *.dat
Jul/13 00:00:16 8620039  8620039
Jul/13 00:00:46 185 8620039 8620224
Jul/13 00:01:16 242 8620224 8620466
Jul/13 00:01:46 324 8620466 8620790
Jul/13 00:02:16 279 8620790 8621069
Jul/13 00:02:46 325 8621069 8621394
Jul/13 00:03:16 325 8621394 8621719
Jul/13 00:03:46 247 8621719 8621966
Jul/13 00:04:16 246 8621966 8622212
Jul/13 00:04:46 210 8622212 8622422
Jul/13 00:05:16 327 8622422 8622749
Jul/13 00:05:46 247 8622749 8622996
Jul/13 00:06:16 238 8622996 8623234
Jul/13 00:06:46 219 8623234 8623453
Jul/13 00:07:16 262 8623453 8623715
Jul/13 00:07:46 254 8623715 8623969
Jul/13 00:08:16 179 8623969 8624148
Jul/13 00:08:46 294 8624148 8624442

Note:
it can be seen that there are high IP reorganization failures at ordinary times. Let’s try to use Ping to verify the network

Using Ping authentication

— on node1

ping -s 4000 {node2-privateIP}
Note:

Forget to keep the historical output here. It is found that there is 12% package loss, indicating that the current and heartbeat networks are not healthy. However, the bond made of two network cards is used. At present, it is in active backup active and standby mode. You can try to switch another network card.

Network card switching

cat /proc/net/bonding/bond0

Note:

Check that the current primary card is ens9f0 and switch to the standby card ens9f1

ifenslave -c bond0 ens9f1

After switching between active and standby network cards: Ping is normal, IP reorganization failure disappears, DLM cross Inst call completion does not appear, DG synchronization is normal, and the problem is solved.

Error in loading latex project compilation [How to Solve]

The following error occurs when overleaf loads the latex template online.
This compile didn’t produce a PDF. This can happen if:
There is an unrecoverable LaTeX error. If there are LaTeX errors shown below or in the raw logs, please try to fix them and compile again.The document environment contains no content. If it’s empty, please add some content and compile again.This project contains a file called output.pdf. If that file exists, please rename it and compile again.

Select “menu” – “Compiler” – “XeLatex”.

Compiled again, successfully.

Go compiles the EXE executable to remove the CMD window at run time

         Recently, I started to contact go language again and wrote an EXE executable program. However, such a console window will appear every time I execute it, and there is no content yet. It’s not elegant, of course.

We can solve this problem by setting compilation parameters in go build

go build -ldflags "-s -w -H=windowsgui"

-s Omit the symbol table and debug information
-w Omit the DWARF symbol table Omit the DWARF symbol table
-H windowsgui does not print messages to the console (On Windows, -H windowsgui writes a "GUI binary" instead of a "console binary."), so there is no cmd window

In this way, the EXE program generated by go compilation can run in the background.

Error: the solution of “241” and “343” in program

When encountering the error: strain ‘,’241’,’343 ‘in program

An error is a compilation error in C/C + +.

The error refers to the source program has illegal characters, illegal characters need to be removed. Generally, the coder uses Chinese input method or directly copies and pastes the code from other places. Chinese space, Chinese quotation marks and various Chinese punctuation marks appear in the code.

Solution:
open the error file in Notepad and save it in ANSI format. You can see that some unrecognized characters become garbled.

1. Delete the space in the wrong line and the space before and after it, and try again
2. Change the color that is obviously different from other punctuation marks( Most compilers have color recognition)
3. If the code is relatively short, it is recommended to re type it all

Resolution of problems with rviz global status displayed as error

Resolution of problems with rviz global status displayed as error

Run the program rosrun rviz rviz – D in the official tutorial of ROS

rospack find turtle_ tf/rviz/turtle_ rviz.rviz

No coordinates are displayed for the results found

Later, I run rosrun rviz when I write TF transformation, and find that the status of TF is warning when add joins TF

In fact, first of all, change the map on the left side of the fixed frame under global options in the upper left corner to world, and then the global status will become OK.

Then select add in the lower left corner to add a TF display. So you can see that there are three coordinates in the middle three-dimensional diagram. Two of them are the coordinates of the two turtles, and the one in the middle is the world coordinates. Next, move the tortoise, and the two coordinates will follow.

Understand the nature of the displays on the left in rviz. In fact, those things are the same as a model. When a tortoise is represented by a coordinate system, the coordinate system will change accordingly. How do you know about this change?In frame, it will receive the corresponding message and then make corresponding actions according to the message.

The exact reason is still unknown