Compiling the Hadoop source in IDEA and running it
JDK 1.7 is required; 1.8 is not supported. 1.6 should work in theory, but check the official documentation.
protobuf must be installed, and it must be version 2.5.0.
1. tar -zxf protobuf-2.5.0.tar.gz
2. Run ./configure --prefix=/usr/local/protobuf-2.5.0 ; if the output ends without any "error" messages, it succeeded
3. Run make && make install
4. Add the environment variables: run vi ~/.bashrc and append the following
export PROTOBUF_HOME=/usr/local/protobuf-2.5.0
export PATH=.:$PROTOBUF_HOME/bin:$PATH
5. Run source ~/.bashrc so the environment takes effect immediately
6. Verify by running protoc --version
Import the hadoop-2.6.0 source into IDEA: simply Open the Hadoop source root directory, then configure a Maven run configuration in IDEA with the command line
package -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
Run the Maven build; once it succeeds, continue with the steps below.
Copy the hadoop-hdfs project's src/test/resources/log4j.properties to src/main/resources/
Start the NameNode process:
hadoop-hdfs [change the scope of the following dependencies to Compile]
hadoop-common
commons-collections:commons-collections:3.2.1
commons-configuration:commons-configuration:1.6
hadoop-auth
org.slf4j:slf4j-api:1.7.5
org.apache.httpcomponents:httpclient:4.2.5
org.apache.httpcomponents:httpcore:4.2.5
Exception:
java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
Fix the configuration:
Based on the hadoop-hdfs project's hdfs-default.xml, create an hdfs-site.xml covering the properties whose descriptions follow:
dfs.namenode.name.dir — where the name node should store the name table (fsimage). If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.
dfs.datanode.data.dir — where a data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
dfs.replication — default block replication. The actual number of replications can be specified when the file is created; the default is used if replication is not specified at create time.
dfs.permissions.enabled — if "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.
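A minimal hdfs-site.xml covering these settings might look like the following sketch; the local paths here are placeholders (any writable directories work), and single replication plus disabled permissions suit a one-machine debug setup:

```xml
<configuration>
  <!-- placeholder path: where the NameNode keeps the fsimage -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/tmp/hadoop/dfs/name</value>
  </property>
  <!-- placeholder path: where the DataNode stores blocks -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/tmp/hadoop/dfs/data</value>
  </property>
  <!-- one replica is enough on a single machine -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- turn off permission checking for local debugging -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
```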
Based on the hadoop-common project's core-default.xml, create a core-site.xml and set fs.defaultFS (the old fs.default.name key is deprecated; use fs.defaultFS instead).
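A core-site.xml sketch that fixes the "file:/// has no authority" exception above; the host and port are assumptions for a local setup, so adjust them to taste:

```xml
<configuration>
  <!-- point the default filesystem at a local HDFS NameNode;
       localhost:9000 is an assumed address for single-machine debugging -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```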
In hadoop-hdfs, copy the webapps directory to hadoop-hdfs/src/main/resources/
When running the NameNode for the first time, pass -format to format the HDFS filesystem.
On subsequent NameNode starts, do not pass the -format argument.
Start the DataNode process.
Start the ResourceManager process:
Copy the resourcemanager project's src/test/resources/log4j.properties to src/main/resources/ (the target directory may need to be created).
hadoop-yarn-server-resourcemanager [change the scope of the following dependencies to Compile]
hadoop-common
commons-configuration:commons-configuration:1.6
hadoop-auth (there are two entries; change both)
org.apache.curator:curator-framework:2.6.0
org.apache.curator:curator-client:2.6.0
After starting the ResourceManager, visiting http://localhost:8088 reports an error;
also change org.apache.httpcomponents:httpclient:4.2.5 to Compile.
Exception:
java.lang.IllegalStateException: Queue configuration missing child queue names for root
Copy resourcemanager/conf/capacity-scheduler.xml to resourcemanager/src/main/resources/
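Copying the shipped capacity-scheduler.xml is the actual fix; as a sketch of why it works, the part of that file which resolves the "missing child queue names for root" error is the queue definition under root (values below match the stock defaults, to the best of my knowledge):

```xml
<configuration>
  <!-- root must declare at least one child queue, or the
       CapacityScheduler refuses to start -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default</value>
  </property>
  <!-- give the single "default" queue all of the cluster capacity -->
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>100</value>
  </property>
</configuration>
```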
Start the NodeManager process:
Copy the nodemanager project's src/test/resources/log4j.properties to src/main/resources/
hadoop-yarn-server-nodemanager [change the scope of the following dependencies to Compile]
hadoop-common
commons-configuration:commons-configuration:1.6
commons-collections:commons-collections:3.2.1
hadoop-auth
org.mortbay.jetty:jetty:6.1.26
org.htrace:htrace-core:3.0.4
To run MapReduce on YARN, configuration based on mapred-default.xml is needed: set mapreduce.framework.name, which can be one of local, classic or yarn, to yarn.
On the yarn-default.xml side, the NodeManager's auxiliary shuffle service likewise needs to be configured.
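The mapred-default.xml change above amounts to creating a mapred-site.xml with a single property:

```xml
<configuration>
  <!-- submit MapReduce jobs to YARN instead of running them
       locally or on the classic JobTracker -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```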
Exception:
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.mapred.ShuffleHandler not found
Add the following dependencies to the nodemanager project:
Module dependency:
hadoop-mapreduce-client-core
External dependency (jar):
hadoop-mapreduce-client-shuffle-2.6.0.jar
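Alongside those dependencies, a yarn-site.xml sketch that registers the shuffle service the exception above complains about (the class name comes straight from the error message):

```xml
<configuration>
  <!-- enable the MapReduce shuffle auxiliary service in the NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- the class implementing that service; matches the
       ClassNotFoundException reported above -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```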