Group By Operator: grouped aggregation. Common attributes:
aggregations: the aggregation function applied to each group
mode: the aggregation mode, generally hash, which computes the hash of the keys
keys: the grouping columns. When there is no keys attribute, there is only one group.
outputColumnNames: temporary column names for the output
For example
explain select sum(sal) from tb_emp;
Look at its Group By Operator
+---------------------------------------------------------------------------------------------+
|Explain |
+---------------------------------------------------------------------------------------------+
| Group By Operator |
| aggregations: sum(sal) |
| mode: hash |
| outputColumnNames: _col0 |
| Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE|
+---------------------------------------------------------------------------------------------+
Another example:
explain select deptno,sum(sal) from tb_emp group by deptno;
Look at its Group By Operator
+------------------------------------------------------------------------------------------------+
|Explain |
+------------------------------------------------------------------------------------------------+
| Group By Operator |
| aggregations: sum(sal) |
| keys: deptno (type: int) |
| mode: hash |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 89 Data size: 718 Basic stats: COMPLETE Column stats: NONE|
+------------------------------------------------------------------------------------------------+
The group by implementation principle
A GROUP BY query is transformed into a MapReduce (MR) task as follows:
Map: generate key-value pairs, using the columns in the GROUP BY clause as the Key and the result of the aggregation function as the Value
Shuffle: hash the Key and send each key-value pair to a Reducer chosen by the hash value
Reduce: aggregate the Values for each Key with the aggregation function and output the columns of the SELECT clause
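The three phases can be sketched in plain Python for the query select deptno,sum(sal) from tb_emp group by deptno. This is only an illustration, not Hive's actual implementation: the sample rows, the split into two map inputs, the reducer count, and all function names are assumptions made for the example. The map phase here pre-aggregates in a hash table, which is one reading of mode: hash.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for tb_emp: (deptno, sal)
ROWS = [(10, 100), (20, 200), (10, 300), (30, 50), (20, 25)]
NUM_REDUCERS = 3  # assumed reducer count, for illustration only


def map_phase(rows):
    """Emit (deptno, partial_sum) pairs, pre-aggregated in a hash table."""
    partial = defaultdict(int)
    for deptno, sal in rows:
        partial[deptno] += sal
    return list(partial.items())


def shuffle_phase(pairs, num_reducers):
    """Route each pair to the reducer chosen by hashing its key."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[hash(key) % num_reducers].append((key, value))
    return buckets


def reduce_phase(bucket):
    """Merge the partial sums that arrived for each key."""
    totals = defaultdict(int)
    for key, value in bucket:
        totals[key] += value
    return dict(totals)


def group_by_sum(splits, num_reducers=NUM_REDUCERS):
    """Run map over every input split, then shuffle, then reduce."""
    pairs = []
    for split in splits:
        pairs.extend(map_phase(split))
    result = {}
    for bucket in shuffle_phase(pairs, num_reducers):
        result.update(reduce_phase(bucket))
    return result


# Two assumed input splits, each handled by its own map task
SPLITS = [ROWS[:3], ROWS[3:]]
RESULT = group_by_sum(SPLITS)
print(RESULT)
```

Because every pair with the same key hashes to the same reducer, each reducer can finish the sum for its keys without seeing the others' data. Department 20 appears in both splits, so its two partial sums meet only in the reduce phase.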
conclusion
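The Group By Operator has four common attributes: aggregations, keys, mode, and outputColumnNames.
A query that uses an aggregation function without a group by clause can also have a Group By Operator in its plan, as the first example shows.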
reference
Group by Execution Plan Analysis (Hive