Compiling and running the Hadoop source in IDEA

JDK 1.7 is required; 1.8 is not supported, and 1.6 should work in theory — check the official documentation for details.
protobuf must be installed, and it must be version 2.5.0:
1. tar -zxf protobuf-2.5.0.tar.gz
2. Run ./configure --prefix=/usr/local/protobuf-2.5.0; if the output ends with no "error" lines, it succeeded
3. Run make && make install
4. Add the environment variables: run vi ~/.bashrc and append
export PROTOBUF_HOME=/usr/local/protobuf-2.5.0
export PATH=.:$PROTOBUF_HOME/bin:$PATH
5. Run source ~/.bashrc to apply the changes to the current shell
6. Verify with protoc --version
Import the hadoop-2.6.0 source into IDEA: just use Open on the source root directory, then set up a Maven run configuration with the command line
package -Pdist -DskipTests -Dtar -Dmaven.javadoc.skip=true
Run the Maven build; once it succeeds, continue with the steps below.
Copy the hadoop-hdfs project's src/test/resources/log4j.properties to src/main/resources/
Starting the NameNode process:
In hadoop-hdfs, change the following dependencies to Compile scope:
hadoop-common
commons-collections:commons-collections:3.2.1
commons-configuration:commons-configuration:1.6
hadoop-auth
org.slf4j:slf4j-api:1.7.5
org.apache.httpcomponents:httpclient:4.2.5
org.apache.httpcomponents:httpcore:4.2.5
Exception:
java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
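The root cause: fs.defaultFS defaults to file:///, and that URI carries no authority (host:port) component, so the NameNode has no address to derive. A minimal sketch of the distinction using plain java.net.URI — this is not Hadoop code, just the URI parsing its check relies on:

```java
import java.net.URI;

public class AuthorityCheck {
    public static void main(String[] args) {
        // file:/// parses with a null authority, which is exactly what the
        // NameNode complains about when fs.defaultFS is left at its default.
        URI bad = URI.create("file:///");
        System.out.println("file:/// authority = " + bad.getAuthority());

        // hdfs://localhost:9000 carries an authority the NameNode can use.
        URI good = URI.create("hdfs://localhost:9000");
        System.out.println("hdfs authority = " + good.getAuthority());
    }
}
```

Setting fs.defaultFS to an hdfs:// URI, as in the core-site.xml below, is what makes the check pass.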
Fix the configuration:
In the hadoop-hdfs project, edit hdfs-default.xml — or better, create a new hdfs-site.xml:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///Users/RandyChan/IdeaProjects/work/hadoop-2.6.0-src/tmp/dfs/name</value>
  <description>Determines where on the local filesystem the DFS name node
  should store the name table(fsimage). If this is a comma-delimited list
  of directories then the name table is replicated in all of the
  directories, for redundancy.</description>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///Users/RandyChan/IdeaProjects/work/hadoop-2.6.0-src/tmp/dfs/data</value>
  <description>Determines where on the local filesystem an DFS data node
  should store its blocks. If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices.
  Directories that do not exist are ignored.</description>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.</description>
</property>

<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
  <description>If "true", enable permission checking in HDFS.
  If "false", permission checking is turned off,
  but all other behavior is unchanged.
  Switching from one parameter value to the other does not change the mode,
  owner or group of files or directories.</description>
</property>


In the hadoop-common project, edit core-default.xml — or create a new core-site.xml:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/RandyChan/IdeaProjects/work/hadoop-2.6.0-src/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
  <description>Deprecated. Use (fs.defaultFS) property
  instead</description>
</property>


Copy hadoop-hdfs's webapps directory into hadoop-hdfs/src/main/resources/
On the first run, start the NameNode with the -format argument to format the HDFS filesystem;
on subsequent starts, omit -format.
Start the DataNode process.
Starting the ResourceManager process:
Copy the resourcemanager project's src/test/resources/log4j.properties to src/main/resources/ (the directory has to be created first)
In hadoop-yarn-server-resourcemanager, change the following dependencies to Compile scope:
hadoop-common
commons-configuration:commons-configuration:1.6
hadoop-auth (there are two; change both)
org.apache.curator:curator-framework:2.6.0
org.apache.curator:curator-client:2.6.0
After starting the ResourceManager, http://localhost:8088 returns an error;
also change org.apache.httpcomponents:httpclient:4.2.5 to Compile.
Exception:
java.lang.IllegalStateException: Queue configuration missing child queue names for root
Copy resourcemanager/conf/capacity-scheduler.xml to resourcemanager/src/main/resources/
Starting the NodeManager process:
Copy the nodemanager project's src/test/resources/log4j.properties to src/main/resources/
In hadoop-yarn-server-nodemanager, change the following dependencies to Compile scope:
hadoop-common
commons-configuration:commons-configuration:1.6
commons-collections:commons-collections:3.2.1
hadoop-auth
org.mortbay.jetty:jetty:6.1.26
org.htrace:htrace-core:3.0.4
To run MapReduce on YARN, configure mapred-default.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
  <description>The runtime framework for executing MapReduce jobs.
  Can be one of local, classic or yarn.</description>
</property>


And in yarn-default.xml:

<property>
  <description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
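The naming rule quoted in that description can be expressed as a regular expression. A small sketch; the pattern below is my paraphrase of the stated rule, not the actual check from the YARN source tree:

```java
import java.util.regex.Pattern;

public class AuxServiceName {
    // Paraphrase of the rule yarn-default.xml states for aux-service names:
    // only a-z, A-Z, 0-9 and _, and the first character may not be a digit.
    static final Pattern VALID = Pattern.compile("^[A-Za-z_][A-Za-z0-9_]*$");

    public static void main(String[] args) {
        System.out.println("mapreduce_shuffle -> " + VALID.matcher("mapreduce_shuffle").matches());
        System.out.println("9shuffle -> " + VALID.matcher("9shuffle").matches());
        System.out.println("map-reduce -> " + VALID.matcher("map-reduce").matches());
    }
}
```

This is why the value must be mapreduce_shuffle with an underscore, not mapreduce.shuffle or a name starting with a digit.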

Exception:
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.mapred.ShuffleHandler not found
Add the dependencies to the nodemanager project:
Module dependency:
hadoop-mapreduce-client-core
External dependency:
hadoop-mapreduce-client-shuffle-2.6.0.jar
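To confirm the fix took effect, a quick probe run from inside the nodemanager module can try to load the class named in the stack trace. The snippet below only illustrates the failure mode; run outside the module (without the shuffle jar on the classpath) it reports the class missing:

```java
public class ShuffleHandlerCheck {
    public static void main(String[] args) {
        String cls = "org.apache.hadoop.mapred.ShuffleHandler";
        try {
            Class.forName(cls);
            System.out.println("found: " + cls);
        } catch (ClassNotFoundException e) {
            // Same failure mode the NodeManager hits when the
            // hadoop-mapreduce-client-shuffle jar is not on the classpath.
            System.out.println("missing: " + cls);
        }
    }
}
```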
