
Overview

Single-node deployment of Hadoop and Hive

This article installs the Hadoop-2.6.0-cdh5.7.0 release of Hadoop and the hive-1.1.0-cdh5.7.0 release of Hive.

1. Hadoop installation

1.1 JDK installation

The JDK package can be downloaded from the Oracle website; pick the version you need.

# Extract the JDK to /usr/java (create the directory first if it does not exist)
[root@hadoop001 ~] tar -xzvf jdk-8u181-linux-x64.tar.gz -C /usr/java/
# After extracting, check the ownership of the directory; if it is not owned by root, fix it
[root@hadoop001 ~] chown -R root:root /usr/java/
# Configure environment variables
[root@hadoop001 ~] vim ~/.bash_profile

export JAVA_HOME=/usr/java/jdk1.8.0_181
PATH=$JAVA_HOME/bin:$PATH:$HOME/bin
export PATH

# Reload the environment variables so they take effect
[root@hadoop001 ~] source ~/.bash_profile
# Check the version to confirm the configuration worked
[root@hadoop001 ~] java -version
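If the configuration worked, the output should look roughly like the following (the exact build string depends on the JDK you downloaded):

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)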

1.2 Configure passwordless SSH

Although there is only one machine, the SSH trust relationship still needs to be configured (set it up for the user that will actually run Hadoop; the hadoop user is created in section 1.3).

[root@hadoop001 ~] ssh-keygen
# Press Enter at every prompt
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
df:71:f6:3e:bb:bb:6c:38:91:f4:bc:70:a1:dd:86:a9 root@flower1
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|                 |
|                 |
|             . . |
|        S   o Ooo|
|         . . Oo*o|
|          . ..=.o|
|            Eo.= |
|              o*B|
+-----------------+
# Append the public key under the .ssh directory to the trust file authorized_keys
[hadoop@hadoop001 ~]$ cd .ssh
[hadoop@hadoop001 .ssh]$ cat id_rsa.pub >> authorized_keys
# Edit the hosts file so the machine can be reached by hostname
[root@hadoop001 ~] vim /etc/hosts
172.19.35.154 hadoop001
# After this, test the SSH login; if a password is still required, check the permissions on authorized_keys
[hadoop@hadoop001 .ssh]$ chmod 600 authorized_keys

# If a password is still required after configuring SSH trust, the cause is usually incorrect permissions on the user's directories: SSH refuses to use the keys when the home directory or ~/.ssh is group-writable
chmod g-w /home/hadoop 
chmod 700 /home/hadoop/.ssh
chmod 600 /home/hadoop/.ssh/authorized_keys
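After the permissions are fixed, a quick way to confirm that passwordless login works is to run a command over SSH (the first connection may ask you to confirm the host key, but it should not ask for a password):

[hadoop@hadoop001 ~]$ ssh hadoop001 date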

Tips: if a machine's public key changes later, delete the line for that host from the known_hosts file and reconfigure the trust relationship, otherwise the connection will fail.

1.3 Install Hadoop

# Create the hadoop user
[root@hadoop001 ~] useradd hadoop
# Switch to the hadoop user
[root@hadoop001 ~] su - hadoop
# Extract the Hadoop package into ~/app (create the directory first if it does not exist)
[hadoop@hadoop001 ~] tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz -C ~/app/
# Create a symlink for hadoop
ln -s /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/ /home/hadoop/app/hadoop
# Configure the Hadoop environment variables
[hadoop@hadoop001 ~] vim ~/.bash_profile

export HADOOP_HOME=/home/hadoop/app/hadoop
PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export PATH

# Make the environment variables take effect
[hadoop@hadoop001 ~] source ~/.bash_profile
Edit the Hadoop configuration files, which live in the directory below:
[hadoop@hadoop001 hadoop]$ pwd
/home/hadoop/app/hadoop/etc/hadoop

1. Configure core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:9000</value>
        <description>HDFS internal communication address</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <!-- The temporary directory must be created manually -->
        <value>/home/hadoop/app/hadoop/tmp</value>
    </property>
</configuration>
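As the comment notes, the temporary directory has to be created by hand, for example:

[hadoop@hadoop001 ~]$ mkdir -p /home/hadoop/app/hadoop/tmp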

2. Configure hdfs-site.xml

<configuration>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/home/hadoop/app/hadoop/data/dfs/name</value>
            <description>Local directory where the NameNode stores the name table (fsimage); create the directory if it does not exist</description>
        </property>

        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/home/hadoop/app/hadoop/data/dfs/data</value>
            <description>Local directory where the DataNode stores blocks; create the directory if it does not exist</description>
        </property>

        <property>
            <!-- With only one machine, the HDFS replication factor is set to 1 -->
                <name>dfs.replication</name>
                <value>1</value>
         </property>
</configuration>
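Create the NameNode and DataNode directories referenced above if they do not already exist:

[hadoop@hadoop001 ~]$ mkdir -p /home/hadoop/app/hadoop/data/dfs/name /home/hadoop/app/hadoop/data/dfs/data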

3. Configure mapred-site.xml
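Note that the distribution normally ships only mapred-site.xml.template, so copy it first if mapred-site.xml is not present:

[hadoop@hadoop001 hadoop]$ cp mapred-site.xml.template mapred-site.xml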

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

4. Configure yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>               

5. Configure hadoop-env.sh

# Point to the local JDK
export JAVA_HOME=/usr/java/jdk1.8.0_181

Format HDFS:
# Format the NameNode; the message "INFO common.Storage: Storage directory /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/name has been successfully formatted." indicates success
[hadoop@hadoop001 hadoop]$ hadoop namenode -format

Start Hadoop:
[hadoop@hadoop001 hadoop] start-all.sh
# After starting, use jps to check whether the daemons came up
[hadoop@hadoop001 hadoop] jps
2241 NameNode
2599 NodeManager
2987 Jps
2350 DataNode
2927 ResourceManager
# Test whether HDFS is working
[hadoop@hadoop001 hadoop]$ hdfs dfs -ls
19/04/09 00:05:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: `.': No such file or directory
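The "ls: `.': No such file or directory" message just means that the hadoop user's home directory on HDFS has not been created yet; create it and the listing will work:

[hadoop@hadoop001 hadoop]$ hdfs dfs -mkdir -p /user/hadoop
[hadoop@hadoop001 hadoop]$ hdfs dfs -ls /

You can also open the NameNode web UI (http://hadoop001:50070 by default) to check HDFS status.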

2. Hive installation

1. Install MySQL

Hive stores its metadata in MySQL, so install MySQL first; the version used here is MySQL 5.7.11.

#1. Extract the package and create directories
[root@hadoop001 local] tar -xzvf mysql-5.7.11-linux-glibc2.5-x86_64.tar.gz -C /usr/local
[root@hadoop001 local] mv mysql-5.7.11-linux-glibc2.5-x86_64 mysql

[root@hadoop001 local] mkdir mysql/arch mysql/data mysql/tmp

#2. Edit the my.cnf configuration
[root@hadoop001 local] vim /etc/my.cnf
# Replace the whole file with the following configuration
[client]
port            = 3306
socket          = /usr/local/mysql/data/mysql.sock
default-character-set=utf8mb4

[mysqld]
port            = 3306
socket          = /usr/local/mysql/data/mysql.sock

skip-slave-start

skip-external-locking
key_buffer_size = 256M
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 4M
query_cache_size= 32M
max_allowed_packet = 16M
myisam_sort_buffer_size=128M
tmp_table_size=32M

table_open_cache = 512
thread_cache_size = 8
wait_timeout = 86400
interactive_timeout = 86400
max_connections = 600

# Try number of CPU's*2 for thread_concurrency
#thread_concurrency = 32 

#isolation level and default engine 
default-storage-engine = INNODB
transaction-isolation = READ-COMMITTED

server-id  = 1739
basedir     = /usr/local/mysql
datadir     = /usr/local/mysql/data
pid-file     = /usr/local/mysql/data/hostname.pid

#open performance schema
log-warnings
sysdate-is-now

binlog_format = ROW
log_bin_trust_function_creators=1
log-error  = /usr/local/mysql/data/hostname.err
log-bin = /usr/local/mysql/arch/mysql-bin
expire_logs_days = 7

innodb_write_io_threads=16

relay-log  = /usr/local/mysql/relay_log/relay-log
relay-log-index = /usr/local/mysql/relay_log/relay-log.index
relay_log_info_file= /usr/local/mysql/relay_log/relay-log.info

log_slave_updates=1
gtid_mode=OFF
enforce_gtid_consistency=OFF

# slave
slave-parallel-type=LOGICAL_CLOCK
slave-parallel-workers=4
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay_log_recovery=ON

#other logs
#general_log =1
#general_log_file  = /usr/local/mysql/data/general_log.err
#slow_query_log=1
#slow_query_log_file=/usr/local/mysql/data/slow_log.err

#for replication slave
sync_binlog = 500


#for innodb options 
innodb_data_home_dir = /usr/local/mysql/data/
innodb_data_file_path = ibdata1:1G;ibdata2:1G:autoextend

innodb_log_group_home_dir = /usr/local/mysql/arch
innodb_log_files_in_group = 4
innodb_log_file_size = 1G
innodb_log_buffer_size = 200M

#Adjust the buffer pool size according to production needs
innodb_buffer_pool_size = 2G
#innodb_additional_mem_pool_size = 50M #deprecated in 5.6
tmpdir = /usr/local/mysql/tmp

innodb_lock_wait_timeout = 1000
#innodb_thread_concurrency = 0
innodb_flush_log_at_trx_commit = 2

innodb_locks_unsafe_for_binlog=1

#innodb io features: add for mysql5.5.8
performance_schema
innodb_read_io_threads=4
innodb-write-io-threads=4
innodb-io-capacity=200
#purge threads change default(0) to 1 for purge
innodb_purge_threads=1
innodb_use_native_aio=on

#case-sensitive file names and separate tablespace
innodb_file_per_table = 1
lower_case_table_names=1

[mysqldump]
quick
max_allowed_packet = 128M

[mysql]
no-auto-rehash
default-character-set=utf8mb4

[mysqlhotcopy]
interactive-timeout

[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

#3. Create the group and user
[root@hadoop001 local] groupadd -g 101 dba
[root@hadoop001 local] useradd -u 514 -g dba -G root -d /usr/local/mysql mysqladmin

#4. Copy the skeleton profile files into the mysqladmin user's home directory, to be used for the per-user environment variables in the next step
[root@hadoop001 local] cp /etc/skel/.* /usr/local/mysql

#5. Configure environment variables
[root@hadoop001 local] vim mysql/.bash_profile
# .bash_profile
# Get the aliases and functions

if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
export MYSQL_BASE=/usr/local/mysql
export PATH=${MYSQL_BASE}/bin:$PATH


unset USERNAME

#stty erase ^H
# set umask to 022
umask 022
PS1=`uname -n`":"'$USER'":"'$PWD'":>"; export PS1

## end

#6. Assign ownership and permissions, then switch to the mysqladmin user to install
# Config file permissions
[root@hadoop001 local] chown  mysqladmin:dba /etc/my.cnf 
[root@hadoop001 local] chmod  640 /etc/my.cnf 
# MySQL directory permissions
[root@hadoop001 local] chown -R mysqladmin:dba /usr/local/mysql
[root@hadoop001 local] chmod -R 755 /usr/local/mysql 

#7. Configure the service and enable it at boot
[root@hadoop001 local] cd /usr/local/mysql
# Copy the service script to init.d and rename it to mysql
[root@hadoop001 mysql] cp support-files/mysql.server /etc/rc.d/init.d/mysql 
# Make it executable
[root@hadoop001 mysql] chmod +x /etc/rc.d/init.d/mysql
# Remove any previously registered service
[root@hadoop001 mysql] chkconfig --del mysql
# Add the service
[root@hadoop001 mysql] chkconfig --add mysql
[root@hadoop001 mysql] chkconfig --level 345 mysql on
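To double-check the registration, list the service; run levels 3, 4 and 5 should show as on:

[root@hadoop001 mysql] chkconfig --list mysql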

#8. Install libaio and initialize the MySQL data directory
[root@hadoop001 mysql] yum -y install libaio
[root@hadoop001 mysql] sudo su - mysqladmin

hadoop001:mysqladmin:/usr/local/mysql:> bin/mysqld \
--defaults-file=/etc/my.cnf \
--user=mysqladmin \
--basedir=/usr/local/mysql/ \
--datadir=/usr/local/mysql/data/ \
--initialize

# If --initialize-insecure is used instead of --initialize, a root@localhost account with an empty password is created; with --initialize, root@localhost gets a random password that is written to the log-error file (in 5.6 the password was placed in ~/.mysql_secret instead, which is easy to miss if you are not familiar with it)

#9. Look up the temporary password
hadoop001:mysqladmin:/usr/local/mysql/data:>cat hostname.err |grep password 
2017-07-22T02:15:29.439671Z 1 [Note] A temporary password is generated for root@localhost: kFCqrXeh2y(0

#10. Start MySQL
/usr/local/mysql/bin/mysqld_safe --defaults-file=/etc/my.cnf &
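Before logging in, you can make sure the mysqld process actually started, for example:

hadoop001:mysqladmin:/usr/local/mysql:>ps -ef | grep mysqld | grep -v grep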

#11. Log in and change the user password
hadoop001:mysqladmin:/usr/local/mysql/data:>mysql -uroot -p'kFCqrXeh2y(0'
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.11-log

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> alter user root@localhost identified by '123456';
Query OK, 0 rows affected (0.05 sec)

mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' ;
Query OK, 0 rows affected, 1 warning (0.02 sec)


mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> exit;
Bye

The difference between su and su - is that su only switches to the root identity while keeping the original user's shell environment, whereas su - switches both the user and the shell environment. Only when the shell environment is switched as well will the PATH variable be correct; otherwise you may hit "command not found" errors.

After switching with su, running pwd shows that the working directory is still the original user's; after su -, the working directory becomes root's home directory.
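A quick illustration of the difference (prompts and paths are only examples):

[hadoop@hadoop001 ~]$ su            # root identity, but hadoop's environment and working directory
[root@hadoop001 hadoop]# pwd
/home/hadoop
[root@hadoop001 hadoop]# exit
[hadoop@hadoop001 ~]$ su -          # switches user, environment and working directory
[root@hadoop001 ~]# pwd
/root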

chkconfig checks and configures system services.

Syntax: chkconfig [--add][--del][--list] [service]: add, remove, or list system services (/etc/rc.d/init.d/...)
chkconfig [--level <levels>] [service] [on/off/reset]: enable, disable, or reset a service at the given run levels
 --level <levels> specifies the run levels at which the service is turned on or off.
Example output of chkconfig --list:
abrt-ccpp  0:off 1:off 2:off 3:on 4:off 5:on 6:off
abrt-oops  0:off 1:off 2:off 3:on 4:off 5:on 6:off
abrtd      0:off 1:off 2:off 3:on 4:off 5:on 6:off
Run level 0: halt
Run level 1: single-user mode
Run level 2: multi-user command-line mode without networking
Run level 3: multi-user command-line mode with networking
Run level 4: unused
Run level 5: multi-user mode with a graphical interface
Run level 6: reboot

2. Install Hive

# Extract the Hive package into /home/hadoop/app
[hadoop@hadoop001 ~] tar -xzvf hive-1.1.0-cdh5.7.0.tar.gz -C ~/app/
# Create a symlink
[hadoop@hadoop001 ~] ln -s /home/hadoop/app/hive-1.1.0-cdh5.7.0/ /home/hadoop/app/hive
# Edit the environment variables
[hadoop@hadoop001 ~] vim ~/.bash_profile 

export HIVE_HOME=/home/hadoop/app/hive
PATH=$HIVE_HOME/bin:$PATH
export PATH
# Make the environment variables take effect
[hadoop@hadoop001 ~] source ~/.bash_profile 

Edit the Hive configuration files

1. Edit hive-env.sh; the file does not exist by default, so simply copy it from the template first.
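For example, from $HIVE_HOME/conf:

[hadoop@hadoop001 conf]$ cp hive-env.sh.template hive-env.sh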

# Point to the Hadoop installation
HADOOP_HOME=/home/hadoop/app/hadoop

2. Edit hive-site.xml (create the file under $HIVE_HOME/conf if it does not already exist)

<configuration>
<property>
        <name>javax.jdo.option.ConnectionURL</name>

        <!-- hive_basic is the name of the metastore database to create; note the character-set settings -->
        <value>jdbc:mysql://localhost:3306/hive_basic?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false</value>
</property>
<property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
</property>
<property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <!-- MySQL login user -->
        <value>root</value>
</property>

<property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <!-- MySQL login password -->
        <value>123456</value>
</property>
  <property>
    <!-- Location of Hive tables on HDFS -->
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>              

3. Copy the MySQL connector JAR into the $HIVE_HOME/lib directory; Hive will fail with an error if the JAR is missing

cp mysql-connector-java-5.1.42-bin.jar ../app/hive/lib/

4. Initialize the Hive metastore database

schematool -dbType mysql -initSchema
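If initialization succeeds, the metastore tables are created in the hive_basic database; a quick way to confirm (using the password configured above):

mysql -uroot -p123456 -e "use hive_basic; show tables;"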

5. Start Hive

[hadoop@hadoop001 hadoop]$ hive
Logging initialized using configuration in jar:file:/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> show databases;
OK
default
test
Time taken: 0.593 seconds, Fetched: 2 row(s)
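As a simple smoke test, you can create a table (the name here is just an example) and confirm that it appears under the warehouse directory on HDFS:

hive> create table test_tbl(id int);
hive> show tables;
[hadoop@hadoop001 ~]$ hdfs dfs -ls /user/hive/warehouse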
