[HBase Error] “java.lang.OutOfMemoryError: Requested array size exceeds VM limit”

Versions in use: CDH 5.4.5, HBase 1.0.0.

Shortly after I joined my new company, a RegionServer went down. The exception reported was as follows:

2017-05-12 21:15:26,396 FATAL [B.defaultRpcServer.handler=123,queue=6,port=60020] regionserver.RSRpcServices: Run out of memory; RSRpcServices will abort itself immediately
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
    at org.apache.hadoop.hbase.io.ByteBufferOutputStream.checkSizeAndGrow(ByteBufferOutputStream.java:77)
    at org.apache.hadoop.hbase.io.ByteBufferOutputStream.write(ByteBufferOutputStream.java:116)
    at org.apache.hadoop.hbase.KeyValue.oswrite(KeyValue.java:2532)
    at org.apache.hadoop.hbase.KeyValueUtil.oswrite(KeyValueUtil.java:548)
    at org.apache.hadoop.hbase.codec.KeyValueCodec$KeyValueEncoder.write(KeyValueCodec.java:58)
    at org.apache.hadoop.hbase.ipc.IPCUtil.buildCellBlock(IPCUtil.java:122)
    at org.apache.hadoop.hbase.ipc.RpcServer$Call.setResponse(RpcServer.java:376)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)

The part to focus on is “Requested array size exceeds VM limit”.

OpenJDK caps the length of an array just below 2^31 elements (about 2^31 - 2; the exact value depends on the JVM build). A request for a larger array fails with this error, no matter how much heap is free.
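The limit is easy to reproduce outside HBase. The following is a minimal sketch, assuming a HotSpot-based JVM; the class and method names are made up for illustration. It asks for a byte array of the requested length and reports the error message instead of letting the process die.

```java
public class ArrayLimitDemo {

    // Try to allocate a byte[] of the given length and describe what happened.
    // Catching OutOfMemoryError is only acceptable here because this is a demo.
    static String tryAllocate(int length) {
        try {
            byte[] data = new byte[length];
            return "allocated " + data.length;
        } catch (OutOfMemoryError e) {
            return "OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Integer.MAX_VALUE is above the array length cap on HotSpot,
        // so this request is rejected before the heap is even consulted.
        System.out.println(tryAllocate(Integer.MAX_VALUE));
    }
}
```

On other JVMs, or with lengths slightly below the cap, the message may instead be the ordinary "Java heap space".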

This is a bug in HBase IPC: in some cases the array it tries to create is longer than the JVM allows. Searching turned up a patch that fixes it, HBASE-14598. The patch mainly changes how the array length is handled: if the limit would be exceeded, an exception is sent directly to the client instead. That deals with the direct cause, but for an operations team it is more important to know which table's requests trigger the problem.
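To make the idea of "check the length and fail with an exception" concrete, here is a small sketch of a growable buffer with a hard cap. It is not the code from HBASE-14598; the class name BoundedBuffer, the cap passed to the constructor, and the choice of BufferOverflowException are all assumptions made for illustration.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

public class BoundedBuffer {
    private final int maxSize;   // hypothetical hard cap, chosen by the caller
    private ByteBuffer buf;

    BoundedBuffer(int initialCapacity, int maxSize) {
        this.maxSize = maxSize;
        this.buf = ByteBuffer.allocate(Math.min(initialCapacity, maxSize));
    }

    void write(byte[] data) {
        ensureCapacity(data.length);
        buf.put(data);
    }

    int size() {
        return buf.position();
    }

    private void ensureCapacity(int extra) {
        // Do the arithmetic in long so the sum cannot overflow int.
        long needed = (long) buf.position() + extra;
        if (needed > maxSize) {
            // Fail this one request instead of asking the JVM for an impossible array.
            throw new BufferOverflowException();
        }
        if (needed > buf.capacity()) {
            // Double the capacity, but never past the cap.
            long grown = Math.max((long) buf.capacity() * 2, needed);
            ByteBuffer bigger = ByteBuffer.allocate((int) Math.min(grown, maxSize));
            buf.flip();
            bigger.put(buf);
            buf = bigger;
        }
    }
}
```

With a guard like this, an oversized response becomes an exception on a single call that the server can report to the client, rather than an OutOfMemoryError that aborts the whole RegionServer.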

Another patch, HBASE-16033, adds more logging. With it, we eventually found log entries like the following:

[B.defaultRpcServer.handler=90,queue=12,port=60020] ipc.RpcServer: (responseTooLarge): {"processingtimems":2822,"call":"Multi(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest)","client":"","param":"region= ., for 1 actions and 1st row key=A","starttimems":1494609020832,"queuetimems":0,"class":"HRegionServer","responsesize":31697082,"method":"Multi"}

The key field here is responsesize: the amount of data returned in a single response is too large, and that is what leads to the problem.
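When there are many such entries, it helps to pull the size and the request description out of each line automatically. Below is a minimal sketch that assumes the log format shown above; the class name and the regular expressions are illustrative and not part of HBase.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResponseTooLargeScan {
    // Field names are taken from the (responseTooLarge) log entry shown above.
    private static final Pattern SIZE = Pattern.compile("\"responsesize\":(\\d+)");
    private static final Pattern PARAM = Pattern.compile("\"param\":\"([^\"]*)\"");

    // Returns the response size in bytes, or -1 if the line has no such field.
    static long responseSize(String line) {
        Matcher m = SIZE.matcher(line);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    // Returns the "param" text (region and first row key), or null if absent.
    static String param(String line) {
        Matcher m = PARAM.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened copy of the log entry above.
        String line = "ipc.RpcServer: (responseTooLarge): {\"processingtimems\":2822,"
                + "\"param\":\"region= ., for 1 actions and 1st row key=A\","
                + "\"responsesize\":31697082,\"method\":\"Multi\"}";
        System.out.println(responseSize(line) + " bytes, " + param(line));
    }
}
```

Sorting the extracted sizes and grouping by the region in "param" is one way to see which table the oversized requests are hitting.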

While searching, we also found that someone else had reported a similar problem, essentially the same as ours. Two patches are worth noting there: HBASE-14946 and HBASE-14978, which address batch reads and writes exceeding the limit. The patches mentioned earlier deal with how the error is reported; these two are the fundamental fix.

We need to find time to upgrade. I hope this helps.
