[Solved] Hive Error: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

1. Cluster environment

CDH cluster; Hive's execution engine is MR (MapReduce).

2. Origin of error

Today I ran a Hive task on the test environment cluster and it failed with: Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.

3. Error reason

This error occurs because Hive's automatic map join conversion is enabled by default:

hive.auto.convert.join=true

With this setting on, Hive converts eligible joins into map joins, and this error is reported if the node does not have enough memory.

4. Error analysis

A map join is a join performed on the map side. It works as a broadcast join: the small table is loaded in full and joined against the large table. In a common join, the rows of each table are processed by different map tasks, so rows with the same key may sit in different mappers and can only be brought together and joined in the reduce phase. For a map join to work, the following condition must hold: the data of one table may be split across mappers, but every other table in the join must be available as a complete copy in each mapper. A map join reads the small tables entirely into memory and, during the map phase, matches the rows of the other table directly against the in-memory data (the distributed cache can be used to ship the small tables to each node for the mappers to load). Because the join is done in the map phase, the reduce phase is skipped and the join is much more efficient.

When the machine does not have enough memory to hold the small table, the join cannot be performed on the map side and the error is reported.
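
The map-side join described in this section can be sketched in plain Python. This is only an illustration of the idea, not Hive's actual implementation: the function names (`build_hash_table`, `map_side_join`) and the sample tables are hypothetical. It shows why memory matters: the whole small table has to fit in an in-memory hash table before any row of the big table is processed.

```python
from collections import defaultdict


def build_hash_table(small_table, key):
    """Load the entire small table into memory, indexed by the join key."""
    table = defaultdict(list)
    for row in small_table:
        table[row[key]].append(row)
    return table


def map_side_join(big_table_split, small_hash, key):
    """Inner-join one split of the big table against the in-memory small table.

    Each mapper only needs its own split plus a full copy of the small
    table, so no shuffle or reduce step is required.
    """
    for row in big_table_split:
        for match in small_hash.get(row[key], []):
            yield {**match, **row}


# Hypothetical sample data: a small dimension table and a big table
# that is processed in two splits (one per "mapper").
departments = [
    {"dept_id": 1, "dept_name": "eng"},
    {"dept_id": 2, "dept_name": "ops"},
]
employee_splits = [
    [{"emp": "ann", "dept_id": 1}, {"emp": "bob", "dept_id": 2}],
    [{"emp": "cy", "dept_id": 1}, {"emp": "dee", "dept_id": 3}],
]

small_hash = build_hash_table(departments, "dept_id")
joined = [
    out
    for split in employee_splits
    for out in map_side_join(split, small_hash, "dept_id")
]

for row in joined:
    print(row)
```

In this sketch the row for `dee` is dropped because `dept_id` 3 has no match in the small table, which is the behavior of an inner join.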

5. Solution

1. Disable map join for the current session so that Hive falls back to a common join. In the Hive session, run:
set hive.auto.convert.join=false;
2. Disable map join in the configuration file so that a common join is always used. In hive-site.xml:

<property>
<name>hive.auto.convert.join</name>
<value>false</value> <!-- changed from true to false -->
<description>Enables the optimization about converting common join into mapjoin</description>
</property>
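
To double-check that the edited file really carries the new value, a small Python sketch like the one below can read the property back. It assumes the usual Hadoop-style layout with `<property>` elements under a `<configuration>` root; the helper name `get_hive_property` and the inline sample are hypothetical, and in practice you would read the text from your own configuration file.

```python
import xml.etree.ElementTree as ET


def get_hive_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical sample mirroring the property shown above.
sample = """<configuration>
  <property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
    <description>Enables the optimization about converting common join into mapjoin</description>
  </property>
</configuration>"""

auto_convert = get_hive_property(sample, "hive.auto.convert.join")
print(auto_convert)
```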