Hive versus relational databases
Hive looks so much like a relational database that people learning it often fall under the illusion that Hive is a database. It is not. Hive is a client of Hadoop: the data sits in HDFS at the bottom, and queries are executed by the MapReduce engine running on Hadoop. In other words, Hive is a client-side wrapper layer over Hadoop.
1. Data update
- Hive is read-heavy and write-light: data is read often and rarely modified
- MySQL data usually needs to be modified frequently
2. Data delay
- MySQL queries usually complete within seconds
- Hive queries take longer:
  - Hive has no index to use, so a query has to scan the whole table, which makes the latency high
  - when Hive runs on MapReduce there is a shuffle phase, and the shuffle writes intermediate data to disk, which adds further latency
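The cost of a full table scan compared with an indexed lookup can be sketched with a toy model in Python. This is only an illustration: the names `scan_lookup`, `build_index` and `index_lookup` are hypothetical, the table is an in-memory list, and the hash index is a stand-in for a real database index, not how MySQL or Hive are actually implemented.

```python
# Toy model: count how many rows each access path has to examine.
# All names here are hypothetical and exist only for this illustration.

def scan_lookup(rows, key):
    """Full scan: check every row, as a query without an index must."""
    examined = 0
    matches = []
    for row in rows:
        examined += 1
        if row["id"] == key:
            matches.append(row)
    return matches, examined


def build_index(rows):
    """Build a hash index from id to row positions (done once, before querying)."""
    index = {}
    for pos, row in enumerate(rows):
        index.setdefault(row["id"], []).append(pos)
    return index


def index_lookup(rows, index, key):
    """Indexed lookup: touch only the rows the index points to."""
    positions = index.get(key, [])
    return [rows[p] for p in positions], len(positions)


rows = [{"id": i, "name": f"user{i}"} for i in range(100_000)]
index = build_index(rows)

scan_result, scan_examined = scan_lookup(rows, 99_999)
index_result, index_examined = index_lookup(rows, index, 99_999)

print(scan_examined)   # 100000 rows examined by the full scan
print(index_examined)  # 1 row examined via the index
```

Both paths return the same row, but the scan examines every row in the table while the index goes straight to the match. The gap grows with the size of the table, which is why scanning a large table is slow.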
3. Data size
- Hive handles large data volumes
  - Hive data is stored in HDFS, which is built on a cluster. Capacity can be expanded by adding machines (horizontal scaling)
- MySQL has storage bottlenecks
  - MySQL stores its data on the server's own disk
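Why adding machines helps can be sketched with some simple arithmetic. This is a simplified model under stated assumptions: a block size of 128 MB (configurable in real clusters), round-robin placement, and no replication. Real HDFS block placement is more involved, and `blocks_per_node` is a hypothetical helper, not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128  # assumed block size; configurable in real clusters


def blocks_per_node(file_size_mb, num_nodes, block_size_mb=BLOCK_SIZE_MB):
    """Split a file into fixed-size blocks and spread them round-robin over nodes.

    Returns a list with the number of blocks each node ends up holding.
    Replication is ignored to keep the arithmetic simple.
    """
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    counts = [0] * num_nodes
    for block_id in range(num_blocks):
        counts[block_id % num_nodes] += 1
    return counts


file_size_mb = 1024 * 1024  # a hypothetical 1 TB table
for nodes in (4, 8, 16):
    # Largest number of blocks any single node has to store
    print(nodes, max(blocks_per_node(file_size_mb, nodes)))
```

In this model, each doubling of the node count halves the number of blocks any one machine has to hold, so storage grows with the cluster. A single server is limited to the disk it has.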