Ordering in Hive: ordered concat, ordered collections, function management commands, and the built-in function list

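Posted on 靠谱客 (kaopuke) by 故意海燕. These are notes collected during recent development. They cover ordered concatenation and ordered collections in Hive, the commands for managing Hive functions, and the list of built-in functions.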

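Overview

Preface

I remember using a function for this before, but this time I could not find it anywhere. I rarely use it, and I had not kept good notes.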

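Method 1

GROUP_CONCAT(distinct id ORDER BY id DESC SEPARATOR '_')

I thought I had used this before, but it fails with: Invalid function GROUP_CONCAT
It might be a version issue. The current Hive version is hive-common-2.1.1-cdh6.2.0.
apache-hive-1.2.1 does not have this function either, and neither does 1.2.2.
I probably misremembered: GROUP_CONCAT with this syntax is a MySQL function.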

Other

CONCAT('My', NULL, 'QL')
CONCAT_WS(',','First name',NULL,'Last Name')
CONCAT_WS(SEPARATOR, collect_set(column))

Method 2

concat_ws(',',sort_array(collect_set(concat(content_id,'#&',SCORE))))

Drawback: sort_array only sorts in ascending order, so descending order is not supported.

Method 3

Write a custom UDF.

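Method 4: a workaround

concat_ws(',',sort_array(collect_set(concat(1-score,'#&',content_id,'#&',SCORE)))) item_score

Prefixing each element with 1-score means the ascending sort yields the elements in descending order of score.
Alternatively, use a sequence number from row_number as the prefix.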
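
Methods 2 and 4 sort strings built by concat, so the elements are compared as text rather than as numbers. The sketch below is only a shell analogy (it uses the sort utility, not Hive) showing how text order differs from numeric order and how zero-padding the key makes the two agree. The sample keys and the padding width of 3 are made up for illustration. In Hive the padding could presumably be done with lpad, which appears in the function list.

```shell
#!/bin/sh
# Analogy only: this uses the POSIX sort utility, not Hive.
# Sample sort keys (made up for illustration).
keys='9
10
2'

# Text comparison: "10" comes before "2" because the character '1' < '2'.
text_order=$(printf '%s\n' "$keys" | LC_ALL=C sort | paste -sd' ' -)

# Zero-pad every key to a fixed width (3 is an arbitrary choice) so that
# text order and numeric order agree.
padded_order=$(printf '%s\n' "$keys" |
  while read -r n; do printf '%03d\n' "$n"; done |
  LC_ALL=C sort | paste -sd' ' -)

echo "text order:   $text_order"
echo "padded order: $padded_order"
```

If the sort key in a concatenated element can vary in width, padding it (or using a fixed-width sequence number) keeps the string order consistent with the intended numeric order.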

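Follow-up

Viewing, adding, and dropping Hive functions

Full list of Hive functions (current version)
One problem I found: a UDF that I registered can no longer be dropped.

show functions;
desc function !;
show functions like '*concat*';
drop temporary function ****;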

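Three ways to read a file line by line

The one-liner used to generate the desc function statements:

for line in `cat functions.txt`; do echo "desc function '${line}';"  >> asdf.txt; done

1. for loop over the output of cat (the word list is split on whitespace and undergoes pathname expansion)
for line in `cat functions.txt`
do
  echo "${line}"
done

2. cat piped into while read
cat functions.txt | while read -r line
do
  echo "$line"
done

3. while read with input redirection
while read -r line
do
  echo "$line"
done < functions.txt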
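
As a sketch of how the redirection form can generate the desc function statements, the example below reads a sample file and prints one statement per line. The file contents and the temporary directory are made up for illustration; the statement format follows the one-liner in this article.

```shell
#!/bin/sh
# Sample input standing in for functions.txt (hypothetical contents).
tmpdir=$(mktemp -d)
printf '%s\n' '!' '*' 'concat_ws' > "$tmpdir/functions.txt"

# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal,
# and quoting "$line" keeps '*' from being expanded to file names.
statements=$(while IFS= read -r line; do
  printf "desc function '%s';\n" "$line"
done < "$tmpdir/functions.txt")

printf '%s\n' "$statements"
rm -rf "$tmpdir"
```

Reading with while read and quoting the variable means each line is passed through unchanged, whatever characters it contains.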

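There is a problem when a line is *: the shell expands the unquoted * to the names of the files in the current directory, so the generated output contained: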

desc function 'app3.0.log'
desc function 'application_1577181410627_109940.log'
desc function 'asdf.txt'
desc function 'data'
desc function 'dealer.sh'
desc function 'flume'
desc function 'functions.txt'
desc function 'qwer.txt'
desc function 'showallfuncs.sh'
desc function 'sqoop'
desc function 'test'
desc function 'test.sh'
desc function 'wxapp_kafka.log'
desc function 'wxapp_sqlserver_open.sh'
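
The expansion can be reproduced in a scratch directory. The sketch below runs the same for loop twice, first with pathname expansion active and then with it switched off via set -f. The directory and file names are made up for illustration.

```shell
#!/bin/sh
# Reproduce the problem in a scratch directory (file names are made up).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch data test.sh
printf '%s\n' 'abs' '*' > functions.txt

# Unquoted command substitution: '*' is expanded to the directory listing.
expanded=$(for line in $(cat functions.txt); do echo "$line"; done | paste -sd' ' -)

# With pathname expansion switched off (set -f), '*' stays literal.
set -f
literal=$(for line in $(cat functions.txt); do echo "$line"; done | paste -sd' ' -)
set +f

echo "with globbing:    $expanded"
echo "without globbing: $literal"
cd / && rm -rf "$workdir"
```

Either set -f around the for loop or a while read loop with a quoted variable avoids the unwanted file names.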

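Hive built-in functions

The current version lists 250 functions. Three of them have descriptions that span multiple lines in the desc function output.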

Function	Description
!	! a - Logical not
!=	a != b - Returns TRUE if a is not equal to b
$sum0	$sum0(x) - Returns the sum of a set of numbers, zero if empty
%	a % b - Returns the remainder when dividing a by b
&	a & b - Bitwise and
*	a * b - Multiplies a by b
+	a + b - Returns a+b
-	a - b - Returns the difference a-b
/	a / b - Divide a by b
<	a < b - Returns TRUE if a is less than b
<=	a <= b - Returns TRUE if a is not greater than b
<=>	a <=> b - Returns same result with EQUAL(=) operator for non-null operands, but returns TRUE if both are NULL, FALSE if one of them is NULL
<>	a <> b - Returns TRUE if a is not equal to b
=	a = b - Returns TRUE if a equals b and false otherwise
==	a == b - Returns TRUE if a equals b and false otherwise
>	a > b - Returns TRUE if a is greater than b
>=	a >= b - Returns TRUE if a is not smaller than b
^	a ^ b - Bitwise exclusive or
abs	abs(x) - returns the absolute value of x
acos	acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise
add_months	add_months(start_date, num_months, output_date_format) - Returns the date that is num_months after start_date.
and	a1 and a2 and ... and an - Logical and
array	array(n0, n1...) - Creates an array with the given elements
array_contains	array_contains(array, value) - Returns TRUE if the array contains value.
ascii	ascii(str) - returns the numeric value of the first character of str
asin	asin(x) - returns the arc sine of x if -1<=x<=1 or NULL otherwise
assert_true	assert_true(condition) - Throw an exception if 'condition' is not true.
atan	atan(x) - returns the atan (arctan) of x (x is in radians)
avg	avg(x) - Returns the mean of a set of numbers
base64	base64(bin) - Convert the argument from binary to a base 64 string
between	between a [NOT] BETWEEN b AND c - evaluate if a is [not] in between b and c
bin	bin(n) - returns n in binary
bround	bround(x[, d]) - round x to d decimal places using HALF_EVEN rounding mode.
case	CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END - When a = b, returns c; when a = d, return e; else return f
cbrt	cbrt(double) - Returns the cube root of a double value.
ceil	ceil(x) - Find the smallest integer not smaller than x
ceiling	ceiling(x) - Find the smallest integer not smaller than x
chr	chr(str) - convert n where n : [0, 256) into the ascii equivalent as a varchar. If n is less than 0 return the empty string. If n > 256, return chr(n % 256).
coalesce	coalesce(a1, a2, ...) - Returns the first non-null argument
collect_list	collect_list(x) - Returns a list of objects with duplicates
collect_set	collect_set(x) - Returns a set of objects with duplicate elements eliminated
compute_stats	compute_stats(x) - Returns the statistical summary of a set of primitive type values.
concat	concat(str1, str2, ... strN) - returns the concatenation of str1, str2, ... strN or concat(bin1, bin2, ... binN) - returns the concatenation of bytes in binary data bin1, bin2, ... binN
concat_ws	concat_ws(separator, [string | array(string)]+) - returns the concatenation of the strings separated by the separator.
context_ngrams	context_ngrams(expr, array<string1, string2, ...>, k, pf) estimates the top-k most frequent n-grams that fit into the specified context. The second parameter specifies a string of words that specify the positions of the n-gram elements, with a null value standing in for a 'blank' that must be filled by an n-gram element.
conv	conv(num, from_base, to_base) - convert num from from_base to to_base
corr	corr(x,y) - Returns the Pearson coefficient of correlation
	between a set of number pairs
cos	cos(x) - returns the cosine of x (x is in radians)
count	count(*) - Returns the total number of retrieved rows, including rows containing NULL values.
	count(expr) - Returns the number of rows for which the supplied expression is non-NULL.
	count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL.
covar_pop	covar_pop(x,y) - Returns the population covariance of a set of number pairs
covar_samp	covar_samp(x,y) - Returns the sample covariance of a set of number pairs
crc32	crc32(str or bin) - Computes a cyclic redundancy check value for string or binary argument and returns bigint value.
create_union	create_union(tag, obj1, obj2, obj3, ...) - Creates a union with the object for given tag
cume_dist	There is no documentation for function 'cume_dist'
current_database	current_database() - returns currently using database name
current_date	current_date() - Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value.
current_timestamp	current_timestamp() - Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value.
current_user	current_user() - Returns current user name
date_add	date_add(start_date, num_days) - Returns the date that is num_days after start_date.
date_format	date_format(date/timestamp/string, fmt) - converts a date/timestamp/string to a value of string in the format specified by the date format fmt.
date_sub	date_sub(start_date, num_days) - Returns the date that is num_days before start_date.
datediff	datediff(date1, date2) - Returns the number of days between date1 and date2
day	day(param) - Returns the day of the month of date/timestamp, or day component of interval
dayofmonth	dayofmonth(param) - Returns the day of the month of date/timestamp, or day component of interval
dayofweek	dayofweek(param) - Returns the day of the week of date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday)
decode	decode(bin, str) - Decode the first argument using the second argument character set
default.produdfone	Function 'default.produdfone' does not exist.
degrees	degrees(x) - Converts radians to degrees
dense_rank	There is no documentation for function 'dense_rank'
div	a div b - Divide a by b rounded to the long integer
e	e() - returns E
elt	elt(n, str1, str2, ...) - returns the n-th string
encode	encode(str, str) - Encode the first argument using the second argument character set
ewah_bitmap	ewah_bitmap(expr) - Returns an EWAH-compressed bitmap representation of a column.
ewah_bitmap_and	ewah_bitmap_and(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise AND of two bitmaps.
ewah_bitmap_empty	ewah_bitmap_empty(bitmap) - Predicate that tests whether an EWAH-compressed bitmap is all zeros
ewah_bitmap_or	ewah_bitmap_or(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise OR of two bitmaps.
exp	exp(x) - Returns e to the power of x
explode	explode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns
factorial	factorial(int) - Returns n factorial. Valid n is [0...20].
field	field(str, str1, str2, ...) - returns the index of str in the str1,str2,... list or 0 if not found
find_in_set	find_in_set(str,str_array) - Returns the first occurrence of str in str_array where str_array is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument has any commas.
first_value	There is no documentation for function 'first_value'
floor	floor(x) - Find the largest integer not greater than x
floor_day	floor_day(param) - Returns the timestamp at a day granularity
floor_hour	floor_hour(param) - Returns the timestamp at a hour granularity
floor_minute	floor_minute(param) - Returns the timestamp at a minute granularity
floor_month	floor_month(param) - Returns the timestamp at a month granularity
floor_quarter	floor_quarter(param) - Returns the timestamp at a quarter granularity
floor_second	floor_second(param) - Returns the timestamp at a second granularity
floor_week	floor_week(param) - Returns the timestamp at a week granularity
floor_year	floor_year(param) - Returns the timestamp at a year granularity
format_number	format_number(X, D or F) - Formats the number X to a format like '#,###,###.##', rounded to D decimal places, Or Uses the format specified F to format, and returns the result as a string. If D is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL's FORMAT
from_unixtime	from_unixtime(unix_time, format) - returns unix_time in the specified format
from_utc_timestamp	from_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is UTC and converts to given timezone (as of Hive 0.8.0)
get_json_object	get_json_object(json_txt, path) - Extract a json object from path
get_splits	get_splits(string,int) - Returns an array of length int serialized splits for the referenced tables string.
greatest	greatest(v1, v2, ...) - Returns the greatest value in a list of values
grouping	grouping(a, b) - Indicates whether a specified column expression is aggregated or not. Returns 1 for aggregated or 0 for not aggregated.
hash	hash(a1, a2, ...) - Returns a hash value of the arguments
hex	hex(n, bin, or str) - Convert the argument to hexadecimal
histogram_numeric	histogram_numeric(expr, nb) - Computes a histogram on numeric 'expr' using nb bins.
hour	hour(param) - Returns the hour component of the string/timestamp/interval
if	IF(expr1,expr2,expr3) - If expr1 is TRUE (expr1 <> 0 and expr1 <> NULL) then IF() returns expr2; otherwise it returns expr3. IF() returns a numeric or string value, depending on the context in which it is used.
in	test in(val1, val2...) - returns true if test equals any valN
in_file	in_file(str, filename) - Returns true if str appears in the file
index	index(a, n) - Returns the n-th element of a
initcap	initcap(str) - Returns str, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space.
inline	inline(ARRAY(STRUCT()[, STRUCT()])) - explodes an array of structs into a table
instr	instr(str, substr) - Returns the index of the first occurrence of substr in str
internal_interval	internal_interval(intervalType,intervalArg)
isnotnull	isnotnull a - Returns true if a is not NULL and false otherwise
isnull	isnull a - Returns true if a is NULL and false otherwise
java_method	java_method(class,method[,arg1[,arg2...]]) calls method with reflection
json_tuple	json_tuple(jsonStr, p1, p2, ..., pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string.
lable.produdfone	Function 'lable.produdfone' does not exist.
lag	LAG (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LAG function is used to access data from a previous row.
last_day	last_day(date) - Returns the last day of the month which the date belongs to.
last_value	There is no documentation for function 'last_value'
lcase	lcase(str) - Returns str with all characters changed to lowercase
lead	LEAD (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LEAD function is used to return data from the next row.
least	least(v1, v2, ...) - Returns the least value in a list of values
length	length(str | binary) - Returns the length of str or number of bytes in binary data
levenshtein	levenshtein(str1, str2) - This function calculates the Levenshtein distance between two strings.
like	like(str, pattern) - Checks if str matches pattern
ln	ln(x) - Returns the natural logarithm of x
locate	locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos
log	log([b], x) - Returns the logarithm of x with base b
log10	log10(x) - Returns the logarithm of x with base 10
log2	log2(x) - Returns the logarithm of x with base 2
logged_in_user	logged_in_user() - Returns logged in user name
lower	lower(str) - Returns str with all characters changed to lowercase
lpad	lpad(str, len, pad) - Returns str, left-padded with pad to a length of len
ltrim	ltrim(str) - Removes the leading space characters from str
map	map(key0, value0, key1, value1...) - Creates a map with the given key/value pairs
map_keys	map_keys(map) - Returns an unordered array containing the keys of the input map.
map_values	map_values(map) - Returns an unordered array containing the values of the input map.
mask	masks the given value
mask_first_n	masks the first n characters of the value
mask_hash	returns hash of the given value
mask_last_n	masks the last n characters of the value
mask_show_first_n	masks all but first n characters of the value
mask_show_last_n	masks all but last n characters of the value
matchpath	There is no documentation for function 'matchpath'
max	max(expr) - Returns the maximum value of expr
md5	md5(str or bin) - Calculates an MD5 128-bit checksum for the string or binary.
min	min(expr) - Returns the minimum value of expr
minute	minute(param) - Returns the minute component of the string/timestamp/interval
month	month(param) - Returns the month component of the date/timestamp/interval
months_between	months_between(date1, date2) - returns number of months between dates date1 and date2
named_struct	named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values
negative	negative a - Returns -a
next_day	next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated.
ngrams	ngrams(expr, n, k, pf) - Estimates the top-k n-grams in rows that consist of sequences of strings, represented as arrays of strings, or arrays of arrays of strings. 'pf' is an optional precision factor that controls memory usage.
noop	There is no documentation for function 'noop'
noopstreaming	There is no documentation for function 'noopstreaming'
noopwithmap	There is no documentation for function 'noopwithmap'
noopwithmapstreaming	There is no documentation for function 'noopwithmapstreaming'
not	not a - Logical not
ntile	There is no documentation for function 'ntile'
nvl	nvl(value,default_value) - Returns default value if value is null else returns value
or	a1 or a2 or ... or an - Logical or
parse_url	parse_url(url, partToExtract[, key]) - extracts a part from a URL
parse_url_tuple	parse_url_tuple(url, partname1, partname2, ..., partnameN) - extracts N (N>=1) parts from a URL.
	It takes a URL and one or multiple partnames, and returns a tuple. All the input parameters and output column types are string.
percent_rank	There is no documentation for function 'percent_rank'
percentile	percentile(expr, pc) - Returns the percentile(s) of expr at pc (range: [0,1]). pc can be a double or double array
percentile_approx	percentile_approx(expr, pc, [nb]) - For very large data, computes an approximate percentile value from a histogram, using the optional argument [nb] as the number of histogram bins to use. A higher value of nb results in a more accurate approximation, at the cost of higher memory usage.
pi	pi() - returns pi
pmod	a pmod b - Compute the positive modulo
posexplode	posexplode(a) - behaves like explode for arrays, but includes the position of items in the original array
positive	positive a - Returns a
pow	pow(x1, x2) - raise x1 to the power of x2
power	power(x1, x2) - raise x1 to the power of x2
printf	printf(String format, Obj... args) - function that can format strings according to printf-style format strings
quarter	quarter(date/timestamp/string) - Returns the quarter of the year for date, in the range 1 to 4.
radians	radians(x) - Converts degrees to radians
rand	rand([seed]) - Returns a pseudorandom number between 0 and 1
rank	There is no documentation for function 'rank'
reflect	reflect(class,method[,arg1[,arg2...]]) calls method with reflection
reflect2	reflect2(arg0,method[,arg1[,arg2...]]) calls method of arg0 with reflection
regexp	str regexp regexp - Returns true if str matches regexp and false otherwise
regexp_extract	regexp_extract(str, regexp[, idx]) - extracts a group that matches regexp
regexp_replace	regexp_replace(str, regexp, rep) - replace all substrings of str that match regexp with rep
repeat	repeat(str, n) - repeat str n times
replace	replace(str, search, rep) - replace all substrings of 'str' that match 'search' with 'rep'
reverse	reverse(str) - reverse str
rlike	str rlike regexp - Returns true if str matches regexp and false otherwise
round	round(x[, d]) - round x to d decimal places
row_number	There is no documentation for function 'row_number'
rpad	rpad(str, len, pad) - Returns str, right-padded with pad to a length of len
rtrim	rtrim(str) - Removes the trailing space characters from str
second	second(date) - Returns the second component of the string/timestamp/interval
sentences	sentences(str, lang, country) - Splits str into arrays of sentences, where each sentence is an array of words. The 'lang' and 'country' arguments are optional, and if omitted, the default locale is used.
sha	sha(str or bin) - Calculates the SHA-1 digest for string or binary and returns the value as a hex string.
sha1	sha1(str or bin) - Calculates the SHA-1 digest for string or binary and returns the value as a hex string.
sha2	sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512).
shiftleft	shiftleft(a, b) - Bitwise left shift
shiftright	shiftright(a, b) - Bitwise right shift
shiftrightunsigned	shiftrightunsigned(a, b) - Bitwise unsigned right shift
sign	sign(x) - returns the sign of x
sin	sin(x) - returns the sine of x (x is in radians)
size	size(a) - Returns the size of a
sort_array	sort_array(array(obj1, obj2,...)) - Sorts the input array in ascending order according to the natural ordering of the array elements.
soundex	soundex(string) - Returns soundex code of the string.
space	space(n) - returns n spaces
split	split(str, regex) - Splits str around occurrences that match regex
sqrt	sqrt(x) - returns the square root of x
stack	stack(n, cols...) - turns k columns into n rows of size k/n each
std	std(x) - Returns the standard deviation of a set of numbers
stddev	stddev(x) - Returns the standard deviation of a set of numbers
stddev_pop	stddev_pop(x) - Returns the standard deviation of a set of numbers
stddev_samp	stddev_samp(x) - Returns the sample standard deviation of a set of numbers
str_to_map	str_to_map(text, delimiter1, delimiter2) - Creates a map by parsing text
struct	struct(col1, col2, col3, ...) - Creates a struct with the given field values
substr	substr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len or substr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
substring	substring(str, pos[, len]) - returns the substring of str that starts at pos and is of length len or substring(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len
substring_index	substring_index(str, delim, count) - Returns the substring from string str before count occurrences of the delimiter delim.
sum	sum(x) - Returns the sum of a set of numbers
tan	tan(x) - returns the tangent of x (x is in radians)
to_date	to_date(expr) - Extracts the date part of the date or datetime expression expr
to_unix_timestamp	to_unix_timestamp(date[, pattern]) - Returns the UNIX timestamp
to_utc_timestamp	to_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is in given timezone and converts to UTC (as of Hive 0.8.0)
translate	translate(input, from, to) - translates the input string by replacing the characters present in the from string with the corresponding characters in the to string
trim	trim(str) - Removes the leading and trailing space characters from str
trunc	trunc(date, fmt) - Returns date with the time portion of the day truncated to the unit specified by the format model fmt. If you omit fmt, then date is truncated to the nearest day. It now only supports 'MONTH'/'MON'/'MM' and 'YEAR'/'YYYY'/'YY' as format.
ucase	ucase(str) - Returns str with all characters changed to uppercase
unbase64	unbase64(str) - Convert the argument from a base 64 string to binary
unhex	unhex(str) - Converts hexadecimal argument to binary
unix_timestamp	unix_timestamp(date[, pattern]) - Converts the time to a number
upper	upper(str) - Returns str with all characters changed to uppercase
uuid	uuid() - Returns a universally unique identifier (UUID) string.
var_pop	var_pop(x) - Returns the variance of a set of numbers
var_samp	var_samp(x) - Returns the sample variance of a set of numbers
variance	variance(x) - Returns the variance of a set of numbers
version	version() - Returns the Hive build version string - includes base version and revision.
weekofyear	weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days.
when	CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e
windowingtablefunction	There is no documentation for function 'windowingtablefunction'
xpath	xpath(xml, xpath) - Returns a string array of values within xml nodes that match the xpath expression
xpath_boolean	xpath_boolean(xml, xpath) - Evaluates a boolean xpath expression
xpath_double	xpath_double(xml, xpath) - Returns a double value that matches the xpath expression
xpath_float	xpath_float(xml, xpath) - Returns a float value that matches the xpath expression
xpath_int	xpath_int(xml, xpath) - Returns an integer value that matches the xpath expression
xpath_long	xpath_long(xml, xpath) - Returns a long value that matches the xpath expression
xpath_number	xpath_number(xml, xpath) - Returns a double value that matches the xpath expression
xpath_short	xpath_short(xml, xpath) - Returns a short value that matches the xpath expression
xpath_string	xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression
year	year(param) - Returns the year component of the date/timestamp/interval
|	a | b - Bitwise or
~	~ n - Bitwise not

Category: Big Data
Published: 2023-12-24 12:30:10
Link: https://www.kaopuke.com/article/k-p-k_13_u_23_o_2_f4_13_zgx.html
