Tracking down RPC error 1726

Recently I ran into a headache in a customer's environment. One node connected to another node over RPC successfully, but got error 1726 as soon as it sent an RPC message.
The error message
First, take a look at the MSDN explanation: "The remote procedure call failed." That sentence carries almost no information. If there is an error, of course I already know the remote call failed.

RPC_S_CALL_FAILED
1726 (0x6BE)
The remote procedure call failed.

Google did not turn up a definitive way to track down RPC 1726 errors either. So what to do? Meanwhile the customer was pushing again for a quick fix!! I searched our company's internal cases on this issue. Most of them were caused by firewalls, and many of those were not the built-in Windows firewall but other products such as IPS, some IBM suites, and so on. On the customer's machines, however, the Windows firewall was turned off and there was no suspicious third-party firewall software. Was there a firewall between the two nodes? The customer said the firewall on the intermediate node would not block any RPC communication... Well, always be suspicious of what customers say; as it turned out, they were wrong.

So I reviewed our code while continuing to look for ways to track the problem down. With the customer still pressing, I found a Microsoft technical article that describes the error as follows:

Explanation
A server connection was lost while the server was attempting to perform a remote procedure call. It is unknown whether the remote procedure call executed, or how much of it executed. The connection might have broken because of a problem with the network hardware or because a process terminated.

   
User Action
Wait a few minutes and then try the operation again. If this message reappears, check the server or try to connect to another server. You might also have to check the integrity of the network. If the problem persists, contact the supplier of the running application.

According to Microsoft, then, this problem usually has one of two causes: a network hardware problem (a network firewall should count here too) or termination of the RPC server process. We were sure our RPC server process had not terminated, so the network had to be the suspect!

Tracking the problem
With the direction clear, we captured packets in non-promiscuous mode on both the RPC client and the RPC server in the customer's environment. In non-promiscuous mode, the capture contains only the packets sent to or from the selected local network card. Turn off promiscuous mode in Wireshark's capture options and start capturing.

Then reproduce the RPC 1726 error and stop the capture on both the client and the server. To begin the analysis, enter the following display filter in Wireshark on the client side:

((ip.src_host==x.x.x.98 && ip.dst_host==x.x.x.207) || (ip.src_host==x.x.x.207 && ip.dst_host==x.x.x.98)) && dcerpc

x.x.x.98 is the IP address of the client, and x.x.x.207 is the address of the server. The filter shows only the RPC packets exchanged between the client and the server (by default, our program uses RPC over TCP/IP). The result: we never received a Response packet from the server!
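
The filter above can be generated for any pair of hosts. The following is only a minimal sketch: the function name `conversation_filter` and the sample addresses are made up for illustration, while `ip.src_host`, `ip.dst_host` and `dcerpc` are the Wireshark display filter names already used in this post. The two directions are wrapped in explicit parentheses so that the protocol test applies to both of them.

```python
def conversation_filter(client_ip, server_ip, protocol=None):
    """Build a Wireshark display filter for traffic between two hosts.

    Matches packets in either direction. If `protocol` is given (for
    example "dcerpc"), the result is further restricted to that protocol.
    """
    forward = f"(ip.src_host=={client_ip} && ip.dst_host=={server_ip})"
    reverse = f"(ip.src_host=={server_ip} && ip.dst_host=={client_ip})"
    # Parenthesize the two directions so a trailing protocol test
    # applies to both of them, not only to the second one.
    both = f"({forward} || {reverse})"
    return f"{both} && {protocol}" if protocol else both


# Placeholder addresses from the documentation range, not the real hosts.
print(conversation_filter("192.0.2.98", "192.0.2.207", "dcerpc"))
```

Leaving out the protocol argument gives the plain two-way host filter, which is useful when you want to see every TCP packet of the conversation rather than only the RPC ones.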


Was the RPC server failing to handle the request? On the server side we applied the same Wireshark filter to the RPC packets exchanged between the client and the server, and found that the server never received the RPC Request. The packet had been swallowed by a network firewall. At this point it is more or less proven that our product itself is not at fault, but is there stronger evidence?
TCP connection Reset
Now take a look at the TCP packets. First, apply the following filter on the client side:

(ip.src_host==x.x.x.98 && ip.dst_host==x.x.x.207) || (ip.src_host==x.x.x.207 && ip.dst_host==x.x.x.98)

As the capture shows, right after sending Request No. 3372 the client received a TCP Reset packet!! That is what firewalls do all the time. In other words, the connection between the client and the server was torn down.


Did the connection reset really go from the server to the client? Analyzing the server capture with the same Wireshark filter showed that the server never sent a TCP Reset packet; instead, at the same moment, it received a TCP Reset packet whose source IP was the client's. The client capture, in turn, shows that the client never sent a TCP Reset packet to the server. So we can be fairly sure that a firewall on some node between the client and the server shut this connection down!!
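
The reasoning above can be written down as a small check. This is only a sketch under assumptions: the `Packet` record, the function names and the sample addresses are hypothetical, and each capture is assumed to have been reduced to source IP, destination IP and the RST flag (for example by exporting the `ip.src`, `ip.dst` and `tcp.flags.reset` fields). A real analysis would also match ports and sequence numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """One captured TCP packet, reduced to the fields this check needs."""
    src: str   # source IP address
    dst: str   # destination IP address
    rst: bool  # True if the TCP RST flag is set


def has_rst(capture, src, dst):
    """Return True if the capture contains a RST going from src to dst."""
    return any(p.rst and p.src == src and p.dst == dst for p in capture)


def reset_origin(client_capture, server_capture, client_ip, server_ip):
    """Classify who reset the connection, given both endpoint captures.

    "middlebox": an endpoint received a RST that its peer never sent.
    "endpoint":  RSTs were seen, and each received one was also sent.
    "none":      no RST between the two hosts in either capture.
    """
    # RST that appears to travel server -> client.
    s2c_received = has_rst(client_capture, server_ip, client_ip)
    s2c_sent = has_rst(server_capture, server_ip, client_ip)
    # RST that appears to travel client -> server.
    c2s_received = has_rst(server_capture, client_ip, server_ip)
    c2s_sent = has_rst(client_capture, client_ip, server_ip)

    if (s2c_received and not s2c_sent) or (c2s_received and not c2s_sent):
        return "middlebox"
    if s2c_received or s2c_sent or c2s_received or c2s_sent:
        return "endpoint"
    return "none"


# Placeholder data shaped like the case in this post.
CLIENT, SERVER = "192.0.2.98", "192.0.2.207"
client_capture = [Packet(CLIENT, SERVER, False), Packet(SERVER, CLIENT, True)]
server_capture = [Packet(CLIENT, SERVER, True)]
print(reset_origin(client_capture, server_capture, CLIENT, SERVER))
```

The check only works because both captures were taken on the endpoints themselves: each capture is then a trustworthy record of what that host actually sent.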
It looked like we were almost done: since the root cause was not our product, it was time for their network administrators to investigate. But the customer kept insisting that their firewall was fine and just wanted our product to work properly... Since the customer is in the United States, every session was over Webex. In a network environment like that, it is better for the customer's own network administrator to capture packets and investigate at their own pace.
We hoped the administrators would find the cause, but they would not cooperate with us, and we still had to get our product running, so we had to look for another workable solution. In the end, the workable solution was to switch to RPC over named pipes for communication, which was a last resort.
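
To make the switch concrete: in Windows RPC the transport is chosen by the protocol sequence in the binding, `ncacn_ip_tcp` for TCP/IP and `ncacn_np` for named pipes. The sketch below only composes the textual string binding in the `protseq:address[endpoint]` layout. The helper name, server name, port and pipe name are all made up for illustration and are not the values our product uses.

```python
def string_binding(protseq, network_address, endpoint):
    """Compose an RPC string binding of the form protseq:address[endpoint]."""
    return f"{protseq}:{network_address}[{endpoint}]"


# Hypothetical server name and endpoints, for illustration only.
tcp_binding = string_binding("ncacn_ip_tcp", "server01", "5000")
pipe_binding = string_binding("ncacn_np", "server01", r"\pipe\myservice")

print(tcp_binding)   # RPC directly over TCP/IP
print(pipe_binding)  # RPC over a named pipe
```

With `ncacn_np` the RPC traffic is carried inside SMB rather than on its own TCP connection, which is presumably why the intermediate firewall treated it differently.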

Reference article
RPC Error:
http://www.microsoft.com/technet/support/ee/transform.aspx?ProdName=Windows+Operating+System&ProdVer=5.0&EvtID=1726&EvtSrc=RPC
