【Spark】 Spark beeline简单使用操作说明可能遇到的问题配置文件

62 阅读 0 评论 41 点赞

我是靠谱客的博主优美便当，最近开发中收集的这篇文章主要介绍【Spark】 Spark beeline简单使用操作说明可能遇到的问题配置文件，觉得挺不错的，现在分享给大家，希望可以做个参考。

概述

操作

Spark Thrift Server 是 Spark 社区基于 HiveServer2 实现的一个 Thrift 服务。旨在无缝兼容
HiveServer2。因为 Spark Thrift Server 的接口和协议都和 HiveServer2 完全一致，因此我们部
署好 Spark Thrift Server 后，可以直接使用 hive 的 beeline 访问 Spark Thrift Server 执行相关
语句。Spark Thrift Server 的目的也只是取代 HiveServer2，因此它依旧可以和 Hive Metastore
进行交互，获取到 hive 的元数据。
如果想连接 Thrift Server，需要通过以下几个步骤：
➢ Spark 要接管 Hive 需要把 hive-site.xml 拷贝到 conf/目录下
➢ 把 Mysql 的驱动 copy 到 jars/目录下
➢ 如果访问不到 hdfs，则需要把 core-site.xml 和 hdfs-site.xml 拷贝到 conf/目录下
➢ 启动 Thrift Server

sbin/start-thriftserver.sh

➢ 使用 beeline 连接 Thrift Server

bin/beeline -u jdbc:hive2://linux1:10000 -n root

说明

使用了Spark Thrift Server之后，我们无需再启动HiveServer2 的服务

可能遇到的问题

如果运行select语句碰到Class com.hadoop.compression.lzo.LzoCodec not found的问题可以参考这里

配置文件

这里附上我的配置文件

hive-site.xml

	<!--显示当前数据库以及查询表的头信息-->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<!--关闭元数据的检查-->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!--使用tez作为引擎-->
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<!-- 指定HDFS中NameNode的地址 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop102:9000</value>
</property>
<!-- 指定Hadoop运行时产生文件的存储目录 -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.SnappyCodec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<!-- 指定Hadoop辅助名称节点主机配置 -->
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop104:50090</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>