What are the dfs.namenode.name.dir and dfs.datanode.data.dir directories? What role do they play? Can we find the location of a specific HDFS file or directory inside these two directories of the local file system? Is there a one-to-one mapping between them?
dfs.namenode.name.dir is the directory where the fsimage is saved; it stores the metadata kept by the NameNode of Hadoop. dfs.datanode.data.dir is the directory where the HDFS file system's data files are stored; it holds the data blocks managed by the DataNode of Hadoop.
According to hdfs-site.xml, in the local file system dfs.namenode.name.dir corresponds to the directory file:/usr/local/hadoop/tmp/dfs/name, and dfs.datanode.data.dir corresponds to the directory file:/usr/local/hadoop/tmp/dfs/data.
There is no one-to-one mapping between files or directories in the HDFS file system and files or directories in the local Linux file system: an HDFS file is split into blocks, and each block is stored as a separate file under the data directory.
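For reference, the hdfs-site.xml fragment behind the paths above would look roughly like this (a sketch assuming the pseudo-distributed setup described here; adjust the paths to your own installation):

```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>
```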
dfs.name.dir
Determines where on the local filesystem the DFS name node should store the name table(fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
This parameter determines the directory (or directories) where the metadata of the HDFS file system is stored.
If it is set to multiple directories, a copy of the metadata is kept in each of them.
For example:
<property>
<name>dfs.name.dir</name>
<value>/pvdata/hadoopdata/name/,/opt/hadoopdata/name/</value>
</property>
dfs.data.dir
Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
This parameter determines the directory (or directories) where the data of the HDFS file system is stored.
It can be set to directories on multiple partitions, which lets HDFS data be spread across different disks.
For example:
<property>
<name>dfs.data.dir</name>
<value>/dev/sda3/hadoopdata/,/dev/sda1/hadoopdata/</value>
</property>
How to deal with the DataNode failing to start after formatting the file system multiple times
1. Problem description
When I format the file system repeatedly, for example:
root@localhost:/usr/local/hadoop-1.0.2# bin/hadoop namenode -format
the DataNode cannot be started. Checking the log shows the error:
2012-04-20 20:39:46,501 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/gqy/hadoop/data: namenode namespaceID = 155319143; datanode namespaceID = 1036135033
2. Cause of the problem
When the file system is formatted, Hadoop writes a current/VERSION file under the NameNode data directory (the dfs.name.dir path in the configuration file). This file records the namespaceID, which identifies the version of the newly formatted NameNode. If we format the NameNode repeatedly, the current/VERSION file under the DataNode data directory (the dfs.data.dir path in the configuration file) still holds the namespaceID saved the first time we formatted, so the IDs on the DataNode and the NameNode no longer match.
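The mismatch is easy to see by comparing the two VERSION files. The snippet below rehearses this on mock files under /tmp (the paths are illustrative and the IDs are taken from the error log above; on a real cluster the files live at <dfs.name.dir>/current/VERSION and <dfs.data.dir>/current/VERSION):

```shell
# Create mock copies of the NameNode and DataNode VERSION files
# (illustrative paths; namespaceIDs taken from the error log above).
mkdir -p /tmp/mockdfs/name/current /tmp/mockdfs/data/current
printf 'namespaceID=155319143\ncTime=0\n' > /tmp/mockdfs/name/current/VERSION
printf 'namespaceID=1036135033\ncTime=0\n' > /tmp/mockdfs/data/current/VERSION

# Compare the two namespaceIDs: they differ, which is exactly the
# condition that makes the DataNode refuse to start.
grep namespaceID /tmp/mockdfs/name/current/VERSION /tmp/mockdfs/data/current/VERSION
```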
3. Solution
In the current/VERSION file under the dfs.data.dir path configured in the configuration file, change the namespaceID to match the NameNode's.
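As a sketch, the fix amounts to editing one line in the DataNode's VERSION file. The snippet below rehearses it on a mock file (the path and IDs are illustrative, taken from the error log above; on a real cluster, stop the DataNode first and edit <dfs.data.dir>/current/VERSION):

```shell
# Mock DataNode VERSION file holding the stale namespaceID
# (illustrative path; IDs from the error log above).
mkdir -p /tmp/fixdemo/data/current
printf 'namespaceID=1036135033\n' > /tmp/fixdemo/data/current/VERSION

# Overwrite the stale DataNode namespaceID with the NameNode's current one.
sed -i 's/^namespaceID=.*/namespaceID=155319143/' /tmp/fixdemo/data/current/VERSION

# Confirm the ID now matches the NameNode's.
grep namespaceID /tmp/fixdemo/data/current/VERSION
```

After the edit, restarting the DataNode lets it register with the NameNode again, since the two namespaceIDs now agree.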