Hive Data Definition: Creating Databases and Tables


I'm 如意小白菜, a blogger on 靠谱客. This article covers Hive data definition, specifically creating databases and tables; I'm sharing it here as a reference.

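1. Database operations

　　create database hive_db;　　// create the database hive_db

　　create table hive_db.test(columns);　　// create the table test in the database hive_db; "columns" stands for the column definitions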

　　create database student_db location '/user/hive/student.db';　　// create the database student_db; because a location is given, the directory on HDFS is named student.db, while the Hive console lists the database as student_db

　　create database if not exists hive_db;　　// create hive_db only if it does not already exist

　　show databases like 'hive*';　　// returns hive_db

　　drop database hive_db;　　// this form can only drop an empty database

　　drop database student_db cascade;　　// force-drop a non-empty database

　　describe database hive_db;　　// show information about the database

　　create database teacherdb comment "notes on the database teacherdb";
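As a rough end-to-end sketch, the database statements above might be combined as follows. The names demo_db and t1, the comment text, and the location /user/hive/demo.db are hypothetical and chosen only for illustration.

```sql
-- Hypothetical names throughout: demo_db, t1 and the location path are assumptions.
create database if not exists demo_db
comment 'scratch database for this walkthrough'
location '/user/hive/demo.db';

-- Show the comment, location and owner recorded for the database.
describe database demo_db;

-- Put one table into the database so that it is no longer empty.
create table if not exists demo_db.t1 (id int, name string);

-- A plain drop (RESTRICT is the default) fails because demo_db still contains t1:
-- drop database demo_db;

-- CASCADE drops the contained tables first, then the database itself.
drop database if exists demo_db cascade;
```

Leaving the failing statement commented out keeps the script runnable from top to bottom; uncomment it to see the error Hive reports for a non-empty database.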

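2. Table operations

　　create table if not exists hive_db.t1(columns);　　// create the table t1 in the database hive_db

　　show tables in hive_db like "t*";　　// list the tables in hive_db whose names start with t

　　create table student1 as select * from stu;　　// copy a table together with its data

　　describe extended records;　　// show table information

　　describe formatted records;　　// show detailed table information in a formatted layout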

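2.1 Converting between internal (managed) and external tables:

　　alter table student set tblproperties("EXTERNAL"="TRUE");　　// convert an internal table to an external table

　　alter table student set tblproperties("EXTERNAL"="FALSE");　　// convert an external table to an internal table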
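One way to confirm that a conversion took effect is to check the "Table Type" row printed by describe formatted. The sketch below assumes a hypothetical table student_ext and a hypothetical HDFS path; some newer Hive deployments restrict these conversions, so treat it as an illustration rather than a guaranteed recipe.

```sql
-- Hypothetical table and path, used only to illustrate the conversion.
create external table if not exists student_ext (id int, name string)
row format delimited fields terminated by '\t'
location '/user/hive/external/student_ext';

-- The "Table Type" row should read EXTERNAL_TABLE here.
describe formatted student_ext;

-- Convert to an internal (managed) table; dropping it would now delete the data files too.
alter table student_ext set tblproperties ('EXTERNAL'='FALSE');

-- The "Table Type" row should now read MANAGED_TABLE.
describe formatted student_ext;
```

The practical difference shows up on drop table: for an external table only the metadata is removed, while for an internal table the data directory is removed as well.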

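2.2 Partitioned tables (on HDFS a partition is really a directory; the partition column is not one of the columns in the table's column list but is declared separately when the table and its partitions are created):

　　create table stu_partition(id int,name string)

　　partitioned by (month string)

　　row format delimited fields terminated by '\t';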

　　This table, stu_partition, is partitioned by month.

　　Load data into the partitioned table:

　　load data local inpath '/home/hdc/Document/student1.txt' into table stu_partition partition(month="201906");

　　Query the partitioned table:

　　select * from stu_partition;　　// return all rows in the partitioned table

　　select * from stu_partition where month="201906";　　// return all rows in the partition 201906

　　List partitions:

　　show partitions stu_partition;

　　Add partitions:

　　alter table stu_partition add partition (month="201908");

　　alter table stu_partition add partition (month="201909") partition (month="201910");

　　Drop partitions:

　　alter table stu_partition drop partition(month="201908");

　　alter table stu_partition drop partition(month="201909"),partition (month="201910");

　　Note: a two-level partition uses two partition columns, and the nesting follows the order in which the columns are declared. For example, partition(month="201909",day="01") is a two-level partition whose day directory is a subdirectory of the month directory.
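To make the two-level case concrete, here is a minimal sketch. The table name stu_partition2 is hypothetical; the input file is the same local file used in the load example earlier.

```sql
-- Hypothetical table stu_partition2 with two partition columns: month, then day.
create table if not exists stu_partition2 (id int, name string)
partitioned by (month string, day string)
row format delimited fields terminated by '\t';

-- Load one file into a specific (month, day) partition.
load data local inpath '/home/hdc/Document/student1.txt'
into table stu_partition2 partition (month='201909', day='01');

-- Resulting directory layout under the table directory:
--   .../stu_partition2/month=201909/day=01/student1.txt

-- Filter on both partition columns so that only that directory is read.
select * from stu_partition2 where month = '201909' and day = '01';
```

Because month is declared before day, the month directory is always the outer level, regardless of the order in which the two keys are written in the partition clause.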

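　Difference between creating a partition with Hadoop commands and with Hive commands:

　　Creating a partition with a Hadoop command merely creates a directory under the table's directory in the data warehouse. If you load data into a partition created this way and then query it with a Hive select statement, no rows are returned, because Hive holds no metadata for that partition.

　　A partition created with a Hive command, by contrast, not only creates the corresponding directory in the Hive warehouse on HDFS but also records that directory's information (metadata) in Hive's metastore database.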

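　Fixes (any one of the following):

　　(1) msck repair table stu_partition;

　　(2) alter table stu_partition add partition(month="201911");

　　　　// This statement normally creates both the partition directory on HDFS and its metadata in Hive; because the directory was already created by hand with a Hadoop command, only the metadata needs to be created now.

　　(3) Load the data the normal way, i.e. load data local inpath '/home/hdc/Document/student1.txt' into table stu_partition partition(month="201911");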
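The problem and fix (1) can be reproduced from the Hive CLI, which accepts HDFS commands through its dfs command. This is a sketch under two assumptions: that stu_partition lives under the default warehouse path /user/hive/warehouse (adjust it to your installation), and that the hand-made directory follows the key=value naming that msck repair looks for.

```sql
-- Assumption: the table lives in the default warehouse directory; adjust the path to your setup.
-- Create the partition directory and upload a file using HDFS commands only.
dfs -mkdir -p /user/hive/warehouse/stu_partition/month=201911;
dfs -put /home/hdc/Document/student1.txt /user/hive/warehouse/stu_partition/month=201911;

-- The metastore does not know this partition yet, so it is not listed and its rows are not returned.
show partitions stu_partition;
select * from stu_partition where month = '201911';

-- Scan the table directory and register partition directories missing from the metastore.
msck repair table stu_partition;

-- The partition is now listed and the query returns the uploaded rows.
show partitions stu_partition;
select * from stu_partition where month = '201911';
```

Fix (2) achieves the same result for a single known partition, whereas msck repair is convenient when several directories were created outside Hive.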

Drop a table:

　　drop table if exists stu_partition;

Alter a table:

　　Rename a table: alter table stu_partition rename to student_partition;

　　Change a column definition: alter table student_partition change column id student_id int;

　　Add columns: alter table student_partition add columns(

　　　　　　　　ClassId int comment "notes",

　　　　　　　　ClassName string comment "notes"

　　　　　　);

　　Drop or replace columns: alter table student_partition replace columns(

　　　　　　　　　　　　id string comment "notes",

　　　　　　　　　　　　name string comment "notes"

　　　　　　　　　　);　　// this replacement drops all existing columns and then creates the two columns listed above

PS: alter statements change a table's metadata, not the actual data.
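A small sketch of what "metadata only" means in practice. The table alter_demo is hypothetical, and the example assumes the local file used earlier holds tab-separated id and name fields.

```sql
-- Hypothetical table, stored as tab-delimited text.
create table if not exists alter_demo (id int, name string)
row format delimited fields terminated by '\t';

load data local inpath '/home/hdc/Document/student1.txt' into table alter_demo;

-- Rename the column id to student_id; only the metastore entry changes.
alter table alter_demo change column id student_id int;

-- The files on HDFS are untouched, so the same rows come back under the new column name.
select student_id, name from alter_demo;
```

The same reasoning is why changing a column's type with alter does not convert values already stored in the files; Hive simply interprets the existing bytes according to the new definition at read time.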

Reposted from: https://www.cnblogs.com/hdc520/p/11094215.html


Category: Databases
Published: 2023-11-17 03:45:08
Link: https://www.kaopuke.com/article/k-p-k_13_u_23_o_14_f2_13__7__26_4.html