1.8.5 Big Data - Spark: Integrating Spark SQL with Hive (spark-shell / spark-sql / beeline)
Overview
I. What needs to be configured
1. Copy Hive's hive-site.xml into Spark's conf directory, and double-check the metastore settings inside it (a copy-command sketch follows the config snippet below):
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://bigdata-pro01.kfk.com/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
</property>
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://bigdata-pro03.kfk.com:9083</value>
</property>
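The copy itself is a one-liner. A minimal sketch, assuming Hive is installed under /opt/modules/hive-0.13.1-bin and Spark under /opt/modules/spark-2.2.0-bin (both paths are assumptions; adjust to your layout):

# paths below are assumptions based on this cluster's layout
cp /opt/modules/hive-0.13.1-bin/conf/hive-site.xml /opt/modules/spark-2.2.0-bin/conf/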
2. Copy the MySQL JDBC driver jar used by Hive into Spark's jars directory. Don't forget this MySQL driver jar.
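A sketch of that copy, assuming the connector jar lives in Hive's lib directory (the jar name and version are assumptions and vary by install):

# jar name is an assumption; use whatever mysql-connector jar your Hive install ships
cp /opt/modules/hive-0.13.1-bin/lib/mysql-connector-java-*.jar /opt/modules/spark-2.2.0-bin/jars/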
3. Check the Hadoop configuration entry in spark-env.sh:
HADOOP_CONF_DIR=/opt/modules/hadoop-2.5.0/etc/hadoop
II. Services that need to be started
1. sudo service mysqld start (MySQL backs the Hive metastore)
2. bin/hive --service metastore
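Run the metastore command from the Hive home directory. If you want it to survive the terminal closing, one common option is to background it with nohup (the log path here is just an example):

cd /opt/modules/hive-0.13.1-bin
nohup bin/hive --service metastore > /tmp/metastore.log 2>&1 &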
III. Verification
Prepare the environment: create a test table in Hive.
[kfk@bigdata-pro03 datas]$ touch kfk.txt
[kfk@bigdata-pro03 datas]$ vi kfk.txt
001 spark
002 hive
003 hbase
004 hadoop
[kfk@bigdata-pro03 hive-0.13.1-bin]$ bin/hive
hive (default)> show databases;
OK
database_name
default
Time taken: 0.124 seconds, Fetched: 1 row(s)
hive (default)> create database kfk;
OK
Time taken: 0.169 seconds
hive (default)> use kfk;
hive (kfk)> CREATE TABLE IF NOT EXISTS test(
          > userid string,
          > username string)
          > ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
          > STORED AS textfile;
hive (kfk)> load data local inpath "/opt/datas/kfk.txt" into table test;
Copying data from file:/opt/datas/kfk.txt
Copying file: file:/opt/datas/kfk.txt
Loading data to table kfk.test
Table kfk.test stats: [numFiles=1, numRows=0, totalSize=40, rawDataSize=0]
OK
Time taken: 0.388 seconds
hive (kfk)> select * from test;
OK
test.userid    test.username
001    spark
002    hive
003    hbase
004    hadoop
Time taken: 0.153 seconds, Fetched: 4 row(s)
Verifying from spark-shell
[kfk@bigdata-pro03 spark-2.2.0-bin]$ bin/spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
20/06/23 17:09:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://192.168.0.153:4040
Spark context available as 'sc' (master = local[*], app id = local-1592946597423).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
      /_/
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_11)
Type in expressions to have them evaluated.
Type :help for more information.
scala> spark.sql("select * from kfk.test").show
+------+--------+
|userid|username|
+------+--------+
|   001|   spark|
|   002|    hive|
|   003|   hbase|
|   004|  hadoop|
+------+--------+
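The same check can be run non-interactively by piping the statement into spark-shell, which executes its stdin and exits; a quick sketch, not the only way to script this:

echo 'spark.sql("select * from kfk.test").show' | bin/spark-shell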
Verifying from spark-sql
[kfk@bigdata-pro03 spark-2.2.0-bin]$ bin/spark-sql
spark-sql (default)> show databases;
databaseName
default
kfk
spark-sql (default)> use kfk;
spark-sql (default)> show tables;
database    tableName    isTemporary
kfk    test    false
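spark-sql can also take a query straight from the command line with -e (or a script file with -f), which makes this verification scriptable:

bin/spark-sql -e "select * from kfk.test"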
Verifying via the Thrift server / beeline
[kfk@bigdata-pro03 spark-2.2.0-bin]$ ./sbin/start-thriftserver.sh
[kfk@bigdata-pro03 spark-2.2.0-bin]$ bin/beeline
!connect jdbc:hive2://bigdata-pro03.kfk.com:10000
or: !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://bigdata-pro03.kfk.com:10000
Enter username for jdbc:hive2://bigdata-pro03.kfk.com:10000: kfk
Enter password for jdbc:hive2://bigdata-pro03.kfk.com:10000: ***
20/06/23 21:09:19 INFO Utils: Supplied authorities: bigdata-pro03.kfk.com:10000
20/06/23 21:09:19 INFO Utils: Resolved authority: bigdata-pro03.kfk.com:10000
20/06/23 21:09:19 INFO HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://bigdata-pro03.kfk.com:10000
Connected to: Spark SQL (version 2.2.0)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
2: jdbc:hive2://bigdata-pro03.kfk.com:10000> select * from kfk.test;
+---------+-----------+--+
| userid  | username  |
+---------+-----------+--+
| 001     | spark     |
| 002     | hive      |
| 003     | hbase     |
| 004     | hadoop    |
+---------+-----------+--+
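Rather than typing !connect interactively, beeline accepts the JDBC URL and user name as flags (how the password is prompted depends on your authentication setup), and the Thrift server is stopped with the matching sbin script:

bin/beeline -u jdbc:hive2://bigdata-pro03.kfk.com:10000 -n kfk
sbin/stop-thriftserver.sh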