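Overview
Preface
I remember using this function before, but I could not find it during this round of development. I rarely use it, and I did not take good notes at the time.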
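Method 1
- GROUP_CONCAT(distinct id ORDER BY id DESC SEPARATOR '_')
I thought I had used this before.
It fails with: Invalid function GROUP_CONCAT
It might be a version issue. Current Hive version: hive-common-2.1.1-cdh6.2.0
apache-hive-1.2.1 does not have this function either,
and neither does 1.2.2.
I probably remembered it wrong: this GROUP_CONCAT syntax is MySQL's.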
Other
- CONCAT('My', NULL, 'QL')
- CONCAT_WS(',', 'First name', NULL, 'Last Name')
- CONCAT_WS(SEPARATOR, collect_set(column))
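Method 2
- concat_ws(',', sort_array(collect_set(concat(content_id, '#&', SCORE))))
The drawback is that sort_array only sorts in ascending order, so descending order is not supported.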
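Method 3
- Write a custom UDF
Method 4: workaround
- concat_ws(',', sort_array(collect_set(concat(1-score, '#&', content_id, '#&', SCORE)))) item_score
  (each element is prefixed with 1-score, so the ascending sort comes out in descending score order)
- Use a sequence number from row_number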
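The ordering trick in Method 4 can be tried outside Hive. What follows is only a shell sketch of the same idea, not HiveQL: `ordered_concat` is a hypothetical helper name, the input format (content_id and score separated by whitespace) is assumed, and scores are assumed to lie in [0, 1]. The fixed-width `%.4f` key is an addition that the Hive expression above does not have; it is there because the sort compares strings, and string order only matches numeric order when all keys have the same width.

```shell
#!/bin/sh
# Hypothetical helper: reads "content_id score" lines on stdin and prints one
# comma-separated string ordered by score, highest first.
# Assumes every score lies in [0, 1].
ordered_concat() {
    # Build "key#&content_id#&score", where key = 1 - score in fixed width,
    # so that plain string order of the keys equals descending score order.
    awk '{ printf "%.4f#&%s#&%s\n", 1 - $2, $1, $2 }' |
        LC_ALL=C sort -u |   # ascending byte order, duplicates dropped
        paste -sd, -         # join the sorted lines with commas
}

# Demo input: three items with scores 0.9, 0.25 and 0.5.
printf '%s\n' 'a 0.9' 'b 0.25' 'c 0.5' | ordered_concat
```

In Hive, the 1-score prefix is turned into a string without any padding, so it may be worth checking how very small or out-of-range scores sort before relying on the result.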
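Follow-up
Hive functions: list, add, delete
- Full function list for the current Hive version
- One problem found: a registered UDF can no longer be dropped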
show functions;
desc function !;
show functions like '*concat*';
drop temporary function ****;
Four ways to read a file line by line
1.for line in `cat functions.txt`; do echo "desc function '${line}';" >> asdf.txt; done
2.
for line in `cat functions.txt`
do
echo ${line}
done
3.
cat functions.txt | while read line
do
echo $line
done
4.
while read line
do
echo $line
done < functions.txt
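The loops above can be combined into one reusable step. This is a minimal sketch, assuming the list holds one function name per line; `gen_desc_statements` is a hypothetical helper name, and the statement format is copied from the first loop.

```shell
#!/bin/sh
# Hypothetical helper: turns a list of function names on stdin into
# "desc function '<name>';" statements on stdout.
gen_desc_statements() {
    # IFS= keeps surrounding whitespace and -r keeps backslashes literal.
    # "$name" is only ever used inside quotes, so the shell never applies
    # word splitting or filename expansion to it.
    # The extra [ -n "$name" ] check picks up a last line without a newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue          # skip empty lines
        printf "desc function '%s';\n" "$name"
    done
}

# Demo input stands in for functions.txt; note the literal * entry.
printf '%s\n' 'abs' '*' 'concat_ws' | gen_desc_statements
```

With a real list the call would be `gen_desc_statements < functions.txt > asdf.txt`.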
There is a problem when an entry is *: the shell replaces it with the names of the files in the current directory
desc function 'app3.0.log'
desc function 'application_1577181410627_109940.log'
desc function 'asdf.txt'
desc function 'data'
desc function 'dealer.sh'
desc function 'flume'
desc function 'functions.txt'
desc function 'qwer.txt'
desc function 'showallfuncs.sh'
desc function 'sqoop'
desc function 'test'
desc function 'test.sh'
desc function 'wxapp_kafka.log'
desc function 'wxapp_sqlserver_open.sh'
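The listing above is what filename expansion produces when a loop iterates over unquoted command output. The sketch below reproduces the effect in a scratch directory and shows one possible way around it, turning globbing off with `set -f`. The helper names `words_with_glob` and `words_without_glob` and the demo files are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers that print every word of the file named in $1.

# Same pattern as the for loop over `cat functions.txt`: the unquoted command
# substitution is split into words and each word is then glob-expanded.
words_with_glob() {
    for word in $(cat "$1"); do
        printf '%s\n' "$word"
    done
}

# Identical loop, but with filename expansion switched off while it runs.
words_without_glob() {
    set -f                      # disable globbing
    for word in $(cat "$1"); do
        printf '%s\n' "$word"
    done
    set +f                      # re-enable globbing (assumes it was on before)
}

# Demo in a scratch directory that holds the list plus two unrelated files.
demo_dir=$(mktemp -d) && cd "$demo_dir" || exit 1
touch data test.sh
printf '%s\n' 'abs' '*' > functions.txt

echo 'with globbing:';    words_with_glob functions.txt
echo 'without globbing:'; words_without_glob functions.txt
```

Reading with `while read` and quoting the variable avoids the issue as well, since the expansion only happens on unquoted words.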
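Hive built-in functions
- The current version has 250 functions; 3 of them have descriptions that span multiple lines in the desc function output
Function | Description (Chinese) | Description (English) |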
---|---|---|
! | ! a-逻辑非 | ! a - Logical not |
!= | a!=b-如果a不等于b,则返回TRUE | a != b - Returns TRUE if a is not equal to b |
$sum0 | $sum0(x)-返回一组数字的总和,如果为空,则返回零 | $sum0(x) - Returns the sum of a set of numbers, zero if empty |
% | a%b-返回a除以b时的余数 | a % b - Returns the remainder when dividing a by b |
& | a&b-按位与 | a & b - Bitwise and |
* | a*b-将a乘以b | a * b - Multiplies a by b |
+ | a+b-返回a+b | a + b - Returns a+b |
- | a-b-返回差分a-b | a - b - Returns the difference a-b |
/ | a/b-将a除以b | a / b - Divide a by b |
< | a<b-如果a小于b,则返回TRUE | a < b - Returns TRUE if a is less than b |
<= | a<=b-如果a不大于b,则返回TRUE | a <= b - Returns TRUE if a is not greater than b |
<=> | 对于非空操作数,a<=>b-返回相同的结果,如果两个操作数都为null,则返回TRUE;如果其中一个操作数为null,则返回FALSE | a <=> b - Returns same result with EQUAL(=) operator for non-null operands, but returns TRUE if both are NULL, FALSE if one of the them is NULL |
<> | a<>b-如果a不等于b,则返回TRUE | a <> b - Returns TRUE if a is not equal to b |
= | a=b-如果a等于b,则返回TRUE,否则返回false | a = b - Returns TRUE if a equals b and false otherwise |
== | a==b-如果a等于b,则返回TRUE,否则返回false | a == b - Returns TRUE if a equals b and false otherwise |
> | a>b-如果a大于b,则返回TRUE | a > b - Returns TRUE if a is greater than b |
>= | a>=b-如果a不小于b,则返回TRUE | a >= b - Returns TRUE if a is not smaller than b |
^ | a^b—按位异或 | a ^ b - Bitwise exclusive or |
abs | abs(x)-返回x的绝对值 | abs(x) - returns the absolute value of x |
acos | acos(x)-如果-1<=x<=1,则返回x的反余弦;否则返回NULL | acos(x) - returns the arc cosine of x if -1<=x<=1 or NULL otherwise |
add_months | add_months(start_date,num_months,output_date_format)-返回开始日期后num_months的日期。 | add_months(start_date, num_months, output_date_format) - Returns the date that is num_months after start_date. |
and | a1 and a2 and … and an-逻辑与 | a1 and a2 and … and an - Logical and |
array | array(n0,n1…)-用给定的元素创建一个数组 | array(n0, n1…) - Creates an array with the given elements |
array_contains | array_contains(array,value)-如果数组包含值,则返回TRUE。 | array_contains(array, value) - Returns TRUE if the array contains value. |
ascii | ascii(str)-返回str的第一个字符的数值 | ascii(str) - returns the numeric value of the first character of str |
asin | asin(x)-如果-1<=x<=1,则返回x的弧正弦;否则返回NULL | asin(x) - returns the arc sine of x if -1<=x<=1 or NULL otherwise |
assert_true | assert_true(condition)-如果“condition”不为true,则引发异常。 | assert_true(condition) - Throw an exception if ‘condition’ is not true. |
atan | atan(x)-返回x的反正切(arctan)(x以弧度表示) | atan(x) - returns the atan (arctan) of x (x is in radians) |
avg | avg(x)-返回一组数字的平均值 | avg(x) - Returns the mean of a set of numbers |
base64 | base64(bin)-将参数从二进制转换为base64字符串 | base64(bin) - Convert the argument from binary to a base 64 string |
between | a [NOT] BETWEEN b AND c-判断a是否[不]在b和c之间 | between a [NOT] BETWEEN b AND c - evaluate if a is [not] in between b and c |
bin | bin(n)-以二进制形式返回n | bin(n) - returns n in binary |
bround | bround(x[,d])—使用半偶数舍入模式将x舍入到d个小数位。 | bround(x[, d]) - round x to d decimal places using HALF_EVEN rounding mode. |
case | CASE a WHEN b THEN c[WHEN d THEN e]*[ELSE f]END-当a=b时,返回c;当a=d时,返回e;否则返回f | CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END - When a = b, returns c; when a = d, return e; else return f |
cbrt | cbrt(double)-返回double值的立方根。 | cbrt(double) - Returns the cube root of a double value. |
ceil | ceil(x)-找到不小于x的最小整数 | ceil(x) - Find the smallest integer not smaller than x |
ceiling | ceiling(x)-找到不小于x的最小整数 | ceiling(x) - Find the smallest integer not smaller than x |
chr | chr(str)-将n(其中n:[0,256)转换为ascii等效值,作为varchar如果n小于0返回空字符串。如果n>256,返回chr(n%256)。 | chr(str) - convert n where n : [0, 256) into the ascii equivalent as a varchar.If n is less than 0 return the empty string. If n > 256, return chr(n % 256). |
coalesce | coalesce(a1,a2,…)—返回第一个非空参数 | coalesce(a1, a2, …) - Returns the first non-null argument |
collect_list | collect_list(x)-返回具有重复项的对象列表 | collect_list(x) - Returns a list of objects with duplicates |
collect_set | collect_set(x)-返回一组消除了重复元素的对象 | collect_set(x) - Returns a set of objects with duplicate elements eliminated |
compute_stats | compute_stats(x)-返回一组基元类型值的统计摘要。 | compute_stats(x) - Returns the statistical summary of a set of primitive type values. |
concat | concat(str1,str2,…strN)-返回str1、str2、…strN的串联;或concat(bin1,bin2,…binN)-返回二进制数据bin1、bin2、…binN中字节的串联 | concat(str1, str2, … strN) - returns the concatenation of str1, str2, … strN or concat(bin1, bin2, … binN) - returns the concatenation of bytes in binary data bin1, bin2, … binN |
concat_ws | concat_ws(separator,[string \| array(string)]+)-返回由分隔符分隔的字符串的串联。 | concat_ws(separator, [string \| array(string)]+) - returns the concatenation of the strings separated by the separator. |
context_ngrams | context_ngrams(expr,array<string1,string2,…>,k,pf)-估计符合指定上下文的前k个最频繁的n-gram。第二个参数是一个单词字符串数组,用于指定n-gram元素的位置,其中空值代表必须由n-gram元素填充的“空位”。 | context_ngrams(expr, array<string1, string2, …>, k, pf) estimates the top-k most frequent n-grams that fit into the specified context. The second parameter specifies a string of words that specify the positions of the n-gram elements, with a null value standing in for a ‘blank’ that must be filled by an n-gram element. |
conv | conv(num,from_base,to_base)-将num from_base转换为_base | conv(num, from_base, to_base) - convert num from from_base to to_base |
corr | corr(x,y)-返回一组数对之间的皮尔逊相关系数 | corr(x,y) - Returns the Pearson coefficient of correlation between a set of number pairs |
cos | cos(x)-返回x的余弦(x以弧度表示) | cos(x) - returns the cosine of x (x is in radians) |
count | count(*)-返回已检索行的总数,包括包含空值的行。count(expr)-返回提供的表达式为非NULL的行数。count(DISTINCT expr[,expr…])-返回所提供表达式唯一且非空的行数。 | count(*) - Returns the total number of retrieved rows, including rows containing NULL values. count(expr) - Returns the number of rows for which the supplied expression is non-NULL. count(DISTINCT expr[, expr…]) - Returns the number of rows for which the supplied expression(s) are unique and non-NULL. |
covar_pop | covar_pop(x,y)-返回一组数对的总体协方差 | covar_pop(x,y) - Returns the population covariance of a set of number pairs |
covar_samp | covar_samp(x,y)-返回一组数对的样本协方差 | covar_samp(x,y) - Returns the sample covariance of a set of number pairs |
crc32 | crc32(str或bin)-计算字符串或二进制参数的循环冗余校验值,并返回bigint值。 | crc32(str or bin) - Computes a cyclic redundancy check value for string or binary argument and returns bigint value. |
create_union | create_union(tag,obj1,obj2,obj3,…)—为给定的标记创建一个与对象的联合 | create_union(tag, obj1, obj2, obj3, …) - Creates a union with the object for given tag |
cume_dist | 函数“cume_dist”没有文档 | There is no documentation for function ‘cume_dist’ |
current_database | current_database()-返回当前使用的数据库名称 | current_database() - returns currently using database name |
current_date | current_date()—返回查询计算开始时的当前日期。同一查询中所有当前日期的调用都返回相同的值。 | current_date() - Returns the current date at the start of query evaluation. All calls of current_date within the same query return the same value. |
current_timestamp | current_timestamp()—返回查询计算开始时的当前时间戳。在同一个查询中对当前时间戳的所有调用都返回相同的值。 | current_timestamp() - Returns the current timestamp at the start of query evaluation. All calls of current_timestamp within the same query return the same value. |
current_user | current_user()—返回当前用户名 | current_user() - Returns current user name |
date_add | date_add(start_date,num_days)-返回开始日期后num_days的日期。 | date_add(start_date, num_days) - Returns the date that is num_days after start_date. |
date_format | date_format(date/timestamp/string,fmt)-将日期/时间戳/string转换为日期格式fmt指定格式的字符串值。 | date_format(date/timestamp/string, fmt) - converts a date/timestamp/string to a value of string in the format specified by the date format fmt. |
date_sub | date_sub(start_date,num_days)-返回开始日期之前num_days的日期。 | date_sub(start_date, num_days) - Returns the date that is num_days before start_date. |
datediff | datediff(date1,date2)-返回date1和date2之间的天数 | datediff(date1, date2) - Returns the number of days between date1 and date2 |
day | day(param)-返回日期/时间戳所在月份的日期,或interval的day组件 | day(param) - Returns the day of the month of date/timestamp, or day component of interval |
dayofmonth | dayofmonth(param)-返回日期/时间戳所在月份的日期,或间隔的日组件 | dayofmonth(param) - Returns the day of the month of date/timestamp, or day component of interval |
dayofweek | dayofweek(param)-返回日期/时间戳的星期几(1=星期日,2=星期一,…,7=星期六) | dayofweek(param) - Returns the day of the week of date/timestamp (1 = Sunday, 2 = Monday, …, 7 = Saturday) |
decode | decode(bin,str)-使用第二个参数字符集解码第一个参数 | decode(bin, str) - Decode the first argument using the second argument character set |
default.produdfone | 功能’默认值.produdfone’不存在。 | Function ‘default.produdfone’ does not exist. |
degrees | degrees(x)-将弧度转换为度 | degrees(x) - Converts radians to degrees |
dense_rank | 函数“dense_rank”没有文档 | There is no documentation for function ‘dense_rank’ |
div | a div b-将a除以b四舍五入到长整数 | a div b - Divide a by b rounded to the long integer |
e | e()—返回e | e() - returns E |
elt | elt(n,str1,str2,…)—返回第n个字符串 | elt(n, str1, str2, …) - returns the n-th string |
encode | encode(str,str)-使用第二个参数字符集对第一个参数进行编码 | encode(str, str) - Encode the first argument using the second argument character set |
ewah_bitmap | ewah_bitmap(expr)-返回列的ewah压缩位图表示。 | ewah_bitmap(expr) - Returns an EWAH-compressed bitmap representation of a column. |
ewah_bitmap_and | ewah_bitmap_and(b1,b2)-返回两个位图中按位“与”的ewah压缩位图。 | ewah_bitmap_and(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise AND of two bitmaps. |
ewah_bitmap_empty | ewah_bitmap_empty(bitmap)-测试ewah压缩位图是否全为零的谓词 | ewah_bitmap_empty(bitmap) - Predicate that tests whether an EWAH-compressed bitmap is all zeros |
ewah_bitmap_or | ewah_bitmap_or(b1,b2)-返回两个位图中按位或的ewah压缩位图。 | ewah_bitmap_or(b1, b2) - Return an EWAH-compressed bitmap that is the bitwise OR of two bitmaps. |
exp | exp(x)-返回e的x次幂 | exp(x) - Returns e to the power of x |
explode | explode(a)-将数组a的元素拆分为多行,或将映射的元素拆分为多行和多列 | explode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns |
factorial | factorial(int)-返回n个factorial。有效n为[0…20]。 | factorial(int) - Returns n factorial. Valid n is [0…20]. |
field | 字段(str,str1,str2,…)—返回str1,str2,….中str的索引,。。。列表或0(如果未找到) | field(str, str1, str2, …) - returns the index of str in the str1,str2,… list or 0 if not found |
find_in_set | find_in_set(str,str_array)-返回str_数组中第一个出现的str,其中str_array是逗号分隔的字符串。如果任一参数为null,则返回null。如果第一个参数有逗号,则返回0。 | find_in_set(str,str_array) - Returns the first occurrence of str in str_array where str_array is a comma-delimited string. Returns null if either argument is null. Returns 0 if the first argument has any commas. |
first_value | 函数“first_value”没有文档 | There is no documentation for function ‘first_value’ |
floor | floor(x)-查找不大于x的最大整数 | floor(x) - Find the largest integer not greater than x |
floor_day | floor_day(param)-返回一天粒度的时间戳 | floor_day(param) - Returns the timestamp at a day granularity |
floor_hour | floor_hour(param)-返回小时粒度的时间戳 | floor_hour(param) - Returns the timestamp at a hour granularity |
floor_minute | floor_minute(param)-以分钟粒度返回时间戳 | floor_minute(param) - Returns the timestamp at a minute granularity |
floor_month | floor_month(param)-返回月份粒度的时间戳 | floor_month(param) - Returns the timestamp at a month granularity |
floor_quarter | floor_quarter(param)-返回四分之一粒度的时间戳 | floor_quarter(param) - Returns the timestamp at a quarter granularity |
floor_second | floor_second(param)-返回秒粒度的时间戳 | floor_second(param) - Returns the timestamp at a second granularity |
floor_week | floor_week(param)-返回周粒度的时间戳 | floor_week(param) - Returns the timestamp at a week granularity |
floor_year | floor_year(param)-返回以年为单位的时间戳 | floor_year(param) - Returns the timestamp at a year granularity |
format_number | format_number(X,D或F)-将数字X格式化为类似“#,###,###.##”的格式,四舍五入到D个小数位,或使用F指定的格式进行格式化,并以字符串形式返回结果。如果D为0,则结果没有小数点或小数部分。其功能应类似于MySQL的FORMAT | format_number(X, D or F) - Formats the number X to a format like ‘#,###,###.##’, rounded to D decimal places, Or Uses the format specified F to format, and returns the result as a string. If D is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL’s FORMAT |
from_unixtime | from_unixtime(unix_time,format)-以指定格式返回unix_time | from_unixtime(unix_time, format) - returns unix_time in the specified format |
from_utc_timestamp | from_utc_timestamp(timestamp,string timezone)-假定给定的时间戳为UTC并转换为给定的时区(从Hive 0.8.0开始) | from_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is UTC and converts to given timezone (as of Hive 0.8.0) |
get_json_object | get_json_object(json_txt,path)-从path中提取一个json对象 | get_json_object(json_txt, path) - Extract a json object from path |
get_splits | get_splits(string,int)-返回被引用表string的长度为int的序列化splits数组。 | get_splits(string,int) - Returns an array of length int serialized splits for the referenced tables string. |
greatest | 最大值(v1,v2,…)—返回值列表中的最大值 | greatest(v1, v2, …) - Returns the greatest value in a list of values |
grouping | 分组(a,b)-指示中的指定列表达式是否聚合。返回1表示聚合,返回0表示未聚合。 | grouping(a, b) - Indicates whether a specified column expression in is aggregated or not. Returns 1 for aggregated or 0 for not aggregated. |
hash | hash(a1,a2,…)—返回参数的哈希值 | hash(a1, a2, …) - Returns a hash value of the arguments |
hex | 十六进制(n、bin或str)-将参数转换为十六进制 | hex(n, bin, or str) - Convert the argument to hexadecimal |
histogram_numeric | histogram_numeric(expr,nb)-使用nb bin计算数值“expr”的直方图。 | histogram_numeric(expr, nb) - Computes a histogram on numeric ‘expr’ using nb bins. |
hour | hour(param)-返回字符串/timestamp/interval的小时组成 | hour(param) - Returns the hour componemnt of the string/timestamp/interval |
if | IF(expr1,expr2,expr3)-如果expr1为真(expr1<>0和expr1<>NULL),则IF()返回expr2;否则返回expr3。IF()返回数值或字符串值,具体取决于使用它的上下文。 | IF(expr1,expr2,expr3) - If expr1 is TRUE (expr1 <> 0 and expr1 <> NULL) then IF() returns expr2; otherwise it returns expr3. IF() returns a numeric or string value, depending on the context in which it is used. |
in | test in(val1,val2…)-如果test等于任何valN,则返回true | test in(val1, val2…) - returns true if test equals any valN |
in_file | in_file(str,filename)-如果str出现在文件中,则返回true | in_file(str, filename) - Returns true if str appears in the file |
index | index(a,n)-返回a的第n个元素 | index(a, n) - Returns the n-th element of a |
initcap | initcap(str)-返回str,每个单词的第一个字母都是大写,所有其他字母都是小写。单词用空格分隔。 | initcap(str) - Returns str, with the first letter of each word in uppercase, all other letters in lowercase. Words are delimited by white space. |
inline | inline(ARRAY(STRUCT()[,STRUCT()])-将数组和结构分解为表 | inline( ARRAY( STRUCT()[,STRUCT()] - explodes and array and struct into a table |
instr | instr(str,substr)-返回str中第一次出现substr的索引 | instr(str, substr) - Returns the index of the first occurance of substr in str |
internal_interval | 内部间隔(intervalType,intervalArg) | internal_interval(intervalType,intervalArg) |
isnotnull | isnotnull a-如果a不为NULL,则返回true,否则返回false | isnotnull a - Returns true if a is not NULL and false otherwise |
isnull | isnull a-如果a为NULL,则返回true,否则返回false | isnull a - Returns true if a is NULL and false otherwise |
java_method | java_method(class,method[,arg1[,arg2…]])-使用反射调用方法 | java_method(class,method[,arg1[,arg2…]]) calls method with reflection |
json_tuple | json_tuple(jsonStr,p1,p2,…,pn)-类似get_json_object,但它接受多个名称并返回一个元组。所有的输入参数和输出列类型都是字符串。 | json_tuple(jsonStr, p1, p2, …, pn) - like get_json_object, but it takes multiple names and return a tuple. All the input parameters and output column types are string. |
lable.produdfone | 功能’produdfone标签’不存在。 | Function ‘lable.produdfone’ does not exist. |
lag | LAG(标量表达式[,offset][,default])OVER([query_partition_clause]order_by_子句);LAG函数用于访问前一行的数据。 | LAG (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LAG function is used to access data from a previous row. |
last_day | last_day(date)-返回日期所属月份的最后一天。 | last_day(date) - Returns the last day of the month which the date belongs to. |
last_value | 函数“last_value”没有文档 | There is no documentation for function ‘last_value’ |
lcase | lcase(str)-返回所有字符都改为小写的str | lcase(str) - Returns str with all characters changed to lowercase |
lead | LEAD(标量_expression[,offset][,default])OVER([query_partition_clause]order_by_子句);LEAD函数用于从下一行返回数据。 | LEAD (scalar_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause); The LEAD function is used to return data from the next row. |
least | least(v1,v2,…)—返回值列表中的最小值 | least(v1, v2, …) - Returns the least value in a list of values |
length | length(str \| binary)-返回str的长度或二进制数据中的字节数 | length(str \| binary) - Returns the length of str or number of bytes in binary data |
levenshtein | levenshtein(str1,str2)-此函数计算两个字符串之间的levenshtein距离。 | levenshtein(str1, str2) - This function calculates the Levenshtein distance between two strings. |
like | like(str,pattern)-检查str是否与pattern匹配 | like(str, pattern) - Checks if str matches pattern |
ln | ln(x)-返回x的自然对数 | ln(x) - Returns the natural logarithm of x |
locate | locate(substr,str[,pos])—返回str中第一个在pos位置之后出现的substr的位置 | locate(substr, str[, pos]) - Returns the position of the first occurance of substr in str after position pos |
log | log([b],x)-返回以b为底的x的对数 | log([b], x) - Returns the logarithm of x with base b |
log10 | log10(x)-返回以10为底的x的对数 | log10(x) - Returns the logarithm of x with base 10 |
log2 | log2(x)-返回以2为底的x的对数 | log2(x) - Returns the logarithm of x with base 2 |
logged_in_user | logged_in_user()-返回登录用户名 | logged_in_user() - Returns logged in user name |
lower | lower(str)-返回所有字符都改为小写的str | lower(str) - Returns str with all characters changed to lowercase |
lpad | lpad(str,len,pad)-返回str,left padded with pad的长度为len | lpad(str, len, pad) - Returns str, left-padded with pad to a length of len |
ltrim | ltrim(str)-删除str中的前导空格字符 | ltrim(str) - Removes the leading space characters from str |
map | map(key0,value0,key1,value1…)-使用给定的键/值对创建映射 | map(key0, value0, key1, value1…) - Creates a map with the given key/value pairs |
map_keys | map_keys(map)-返回包含输入映射键的无序数组。 | map_keys(map) - Returns an unordered array containing the keys of the input map. |
map_values | map_values(map)-返回包含输入映射值的无序数组。 | map_values(map) - Returns an unordered array containing the values of the input map. |
mask | 屏蔽给定值 | masks the given value |
mask_first_n | 屏蔽值的前n个字符 | masks the first n characters of the value |
mask_hash | 返回给定值的哈希值 | returns hash of the given value |
mask_last_n | 遮罩值的最后n个字符 | masks the last n characters of the value |
mask_show_first_n | 屏蔽值的前n个字符之外的所有字符 | masks all but first n characters of the value |
mask_show_last_n | 屏蔽值的最后n个字符之外的所有字符 | masks all but last n characters of the value |
matchpath | 没有函数“matchpath”的文档 | There is no documentation for function ‘matchpath’ |
max | max(expr)-返回expr的最大值 | max(expr) - Returns the maximum value of expr |
md5 | md5(str或bin)-计算字符串或二进制文件的md5128位校验和。 | md5(str or bin) - Calculates an MD5 128-bit checksum for the string or binary. |
min | min(expr)-返回expr的最小值 | min(expr) - Returns the minimum value of expr |
minute | minute(param)-返回字符串/timestamp/interval的分钟组件 | minute(param) - Returns the minute component of the string/timestamp/interval |
month | month(param)-返回日期/时间戳/间隔的月份组件 | month(param) - Returns the month component of the date/timestamp/interval |
months_between | months_between(date1,date2)-返回日期date1和date2之间的月数 | months_between(date1, date2) - returns number of months between dates date1 and date2 |
named_struct | named_struct(name1,val1,name2,val2,…)—使用给定的字段名和值创建一个结构 | named_struct(name1, val1, name2, val2, …) - Creates a struct with the given field names and values |
negative | 负a-返回-a | negative a - Returns -a |
next_day | next_day(start_date,week的day)-返回晚于start_date并按指示命名的第一个日期。 | next_day(start_date, day_of_week) - Returns the first date which is later than start_date and named as indicated. |
ngrams | ngrams(expr,n,k,pf)-估计由字符串序列组成的行中出现频率最高的前k个n-gram,这些序列表示为字符串数组或字符串数组的数组。“pf”是控制内存使用的可选精度因子。 | ngrams(expr, n, k, pf) - Estimates the top-k n-grams in rows that consist of sequences of strings, represented as arrays of strings, or arrays of arrays of strings. ‘pf’ is an optional precision factor that controls memory usage. |
noop | 没有函数“noop”的文档 | There is no documentation for function ‘noop’ |
noopstreaming | 没有“noopstreaming”函数的文档 | There is no documentation for function ‘noopstreaming’ |
noopwithmap | 没有函数“noopwithmap”的文档 | There is no documentation for function ‘noopwithmap’ |
noopwithmapstreaming | 函数“noopwithmapstreaming”没有文档 | There is no documentation for function ‘noopwithmapstreaming’ |
not | not a-逻辑非 | not a - Logical not |
ntile | 没有函数“ntile”的文档 | There is no documentation for function ‘ntile’ |
nvl | nvl(value,default_value)-如果value为null,则返回默认值,否则返回value | nvl(value,default_value) - Returns default value if value is null else returns value |
or | a1 or a2 or … or an-逻辑或 | a1 or a2 or … or an - Logical or |
parse_url | parse_url(url,partToExtract[,key])-从url中提取部分 | parse_url(url, partToExtract[, key]) - extracts a part from a URL |
parse_url_tuple | parse_url_tuple(url,partname1,partname2,…,partnameN)-从url中提取N(N>=1)个部分。它接受一个URL和一个或多个partname,并返回一个tuple。所有的输入参数和输出列类型都是字符串。 | parse_url_tuple(url, partname1, partname2, …, partnameN) - extracts N (N>=1) parts from a URL. It takes a URL and one or multiple partnames, and returns a tuple. All the input parameters and output column types are string. |
percent_rank | 没有“percent_rank”函数的文档 | There is no documentation for function ‘percent_rank’ |
percentile | percentile(expr,pc)-返回pc(范围:[0,1])处expr的百分比。pc可以是双数组或双数组 | percentile(expr, pc) - Returns the percentile(s) of expr at pc (range: [0,1]).pc can be a double or double array |
percentile_approx | percentile_approx(expr,pc,[nb])-对于非常大的数据,使用可选参数[nb]作为要使用的直方图箱数,从直方图计算近似百分位值。nb值越大,近似值就越精确,代价是内存使用率越高。 | percentile_approx(expr, pc, [nb]) - For very large data, computes an approximate percentile value from a histogram, using the optional argument [nb] as the number of histogram bins to use. A higher value of nb results in a more accurate approximation, at the cost of higher memory usage. |
pi | pi()—返回pi | pi() - returns pi |
pmod | a pmod b-计算正模 | a pmod b - Compute the positive modulo |
posexplode | posexplode(a)-行为类似于数组的explode,但包含了原始数组中项的位置 | posexplode(a) - behaves like explode for arrays, but includes the position of items in the original array |
positive | 正a-返回a | positive a - Returns a |
pow | pow(x1,x2)-将x1提升到x2的幂 | pow(x1, x2) - raise x1 to the power of x2 |
power | power(x1,x2)-将x1提升到x2的幂 | power(x1, x2) - raise x1 to the power of x2 |
printf | printf(字符串格式,对象。。。args)-可以根据printf样式格式字符串格式化字符串的函数 | printf(String format, Obj… args) - function that can format strings according to printf-style format strings |
quarter | quarter(date/timestamp/string)-返回日期所在的季度,范围为1到4。 | quarter(date/timestamp/string) - Returns the quarter of the year for date, in the range 1 to 4. |
radians | 弧度(x)-将度数转换为弧度 | radians(x) - Converts degrees to radians |
rand | rand([seed])—返回介于0和1之间的伪随机数 | rand([seed]) - Returns a pseudorandom number between 0 and 1 |
rank | 没有函数“rank”的文档 | There is no documentation for function ‘rank’ |
reflect | reflect(class,method[,arg1[,arg2…]])使用反射调用方法 | reflect(class,method[,arg1[,arg2…]]) calls method with reflection |
reflect2 | reflect2(arg0,method[,arg1[,arg2…]])使用反射调用arg0的方法 | reflect2(arg0,method[,arg1[,arg2…]]) calls method of arg0 with reflection |
regexp | str regexp regexp-如果str与regexp匹配,则返回true,否则返回false | str regexp regexp - Returns true if str matches regexp and false otherwise |
regexp_extract | regexp_extract(str,regexp[,idx])-提取与regexp匹配的组 | regexp_extract(str, regexp[, idx]) - extracts a group that matches regexp |
regexp_replace | regexp_replace(str,regexp,rep)-用rep替换匹配regexp的str的所有子字符串 | regexp_replace(str, regexp, rep) - replace all substrings of str that match regexp with rep |
repeat | 重复(str,n)-重复str n次 | repeat(str, n) - repeat str n times |
replace | replace(str,search,rep)-将“search”与“rep”匹配的所有子字符串替换为“str” | replace(str, search, rep) - replace all substrings of ‘str’ that match ‘search’ with ‘rep’ |
reverse | 反向(str)-反向str | reverse(str) - reverse str |
rlike | str rlike regexp-如果str与regexp匹配,则返回true,否则返回false | str rlike regexp - Returns true if str matches regexp and false otherwise |
round | 舍入(x[,d])—将x舍入到d个小数位 | round(x[, d]) - round x to d decimal places |
row_number | 没有“row_number”函数的文档 | There is no documentation for function ‘row_number’ |
rpad | rpad(str,len,pad)-返回str,右填充pad到len的长度 | rpad(str, len, pad) - Returns str, right-padded with pad to a length of len |
rtrim | rtrim(str)-删除str中的尾随空格字符 | rtrim(str) - Removes the trailing space characters from str |
second | second(date)-返回字符串/timestamp/interval的第二个组件 | second(date) - Returns the second component of the string/timestamp/interval |
sentences | 句子(str,lang,country)-将str拆分为句子数组,其中每个句子都是一个单词数组。“lang”和“country”参数是可选的,如果省略,则使用默认的区域设置。 | sentences(str, lang, country) - Splits str into arrays of sentences, where each sentence is an array of words. The ‘lang’ and’country’ arguments are optional, and if omitted, the default locale is used. |
sha | sha(str或bin)-计算字符串或二进制的sha-1摘要,并以十六进制字符串的形式返回值。 | sha(str or bin) - Calculates the SHA-1 digest for string or binary and returns the value as a hex string. |
sha1 | sha1(str或bin)-计算字符串或二进制的SHA-1摘要,并以十六进制字符串的形式返回值。 | sha1(str or bin) - Calculates the SHA-1 digest for string or binary and returns the value as a hex string. |
sha2 | sha2(string/binary,len)-计算SHA-2哈希函数族(SHA-224、SHA-256、SHA-384和SHA-512)。 | sha2(string/binary, len) - Calculates the SHA-2 family of hash functions (SHA-224, SHA-256, SHA-384, and SHA-512). |
shiftleft | 左移(a,b)-按位左移 | shiftleft(a, b) - Bitwise left shift |
shiftright | shiftright(a,b)-按位右移 | shiftright(a, b) - Bitwise right shift |
shiftrightunsigned | shiftrightunsigned(a,b)-位无符号右移 | shiftrightunsigned(a, b) - Bitwise unsigned right shift |
sign | sign(x)-返回x的符号 | sign(x) - returns the sign of x ) |
sin | sin(x)-返回x的正弦值(x以弧度为单位) | sin(x) - returns the sine of x (x is in radians) |
size | size(a)-返回a的大小 | size(a) - Returns the size of a |
sort_array | sort_array(array(obj1,obj2,…)—根据数组元素的自然顺序对输入数组进行升序排序。 | sort_array(array(obj1, obj2,…)) - Sorts the input array in ascending order according to the natural ordering of the array elements. |
soundex | soundex(string)-返回字符串的soundex代码。 | soundex(string) - Returns soundex code of the string. |
space | space(n)-返回n个空格 | space(n) - returns n spaces |
split | split(str,regex)-围绕匹配regex的事件拆分str | split(str, regex) - Splits str around occurances that match regex |
sqrt | sqrt(x)-返回x的平方根 | sqrt(x) - returns the square root of x |
stack | 堆栈(n,cols…)-将k列转换为n行,每行大小为k/n | stack(n, cols…) - turns k columns into n rows of size k/n each |
std | std(x)-返回一组数字的标准偏差 | std(x) - Returns the standard deviation of a set of numbers |
stddev | stddev(x)-返回一组数字的标准偏差 | stddev(x) - Returns the standard deviation of a set of numbers |
stddev_pop | stddev_pop(x)-返回一组数字的标准偏差 | stddev_pop(x) - Returns the standard deviation of a set of numbers |
stddev_samp | stddev_samp(x)-返回一组数字的样本标准偏差 | stddev_samp(x) - Returns the sample standard deviation of a set of numbers |
str_to_map | str_to_map(text,delimiter1,delimiter2)-通过解析文本创建映射 | str_to_map(text, delimiter1, delimiter2) - Creates a map by parsing text |
struct | struct(col1,col2,col3,…)—用给定的字段值创建一个结构 | struct(col1, col2, col3, …) - Creates a struct with the given field values |
substr | substr(str,pos[,len])-返回从pos开始的长度为len的str的子字符串或substr(bin,pos[,len])-返回从pos开始,长度为len的字节数组的片段 | substr(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstr(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len |
substring | substring(str,pos[,len])-返回从pos开始的长度为len或substring(bin,pos[,len])的str子字符串-返回从pos开始、长度为len的字节数组的片段 | substring(str, pos[, len]) - returns the substring of str that starts at pos and is of length len orsubstring(bin, pos[, len]) - returns the slice of byte array that starts at pos and is of length len |
substring_index | substring_index(str,delim,count)-返回string str中分隔符delim出现count次之前的子字符串。 | substring_index(str, delim, count) - Returns the substring from string str before count occurrences of the delimiter delim. |
sum | sum(x)-返回一组数字的总和 | sum(x) - Returns the sum of a set of numbers |
tan | tan(x)-返回x的正切值(x以弧度表示) | tan(x) - returns the tangent of x (x is in radians) |
to_date | to_date(expr)-提取date或datetime表达式expr的日期部分 | to_date(expr) - Extracts the date part of the date or datetime expression expr |
to_unix_timestamp | to_unix_timestamp(date[,pattern])-返回unix时间戳 | to_unix_timestamp(date[, pattern]) - Returns the UNIX timestamp |
to_utc_timestamp | to_utc_timestamp(timestamp,string timezone)-假设给定的时间戳在给定的时区中并转换为utc(从配置单元0.8.0开始) | to_utc_timestamp(timestamp, string timezone) - Assumes given timestamp is in given timezone and converts to UTC (as of Hive 0.8.0) |
translate | translate(input,from,to)-通过将from字符串中的字符替换为to字符串中的相应字符来转换输入字符串 | translate(input, from, to) - translates the input string by replacing the characters present in the from string with the corresponding characters in the to string |
trim | trim(str)-删除str中的前导和尾随空格字符 | trim(str) - Removes the leading and trailing space characters from str |
trunc | trunc(date,fmt)-返回日期,其中一天的时间部分被截断为格式模型fmt指定的单位。如果省略fmt,则日期将被截断为最近的一天。它现在只支持’MONTH’/‘MON’/‘MM’和’YEAR’/‘YYYY’/'YY’作为格式。 | trunc(date, fmt) - Returns returns date with the time portion of the day truncated to the unit specified by the format model fmt. If you omit fmt, then date is truncated to the nearest day. It now only supports ‘MONTH’/‘MON’/‘MM’ and ‘YEAR’/‘YYYY’/‘YY’ as format. |
ucase | ucase(str)-返回str,所有字符都改为大写 | ucase(str) - Returns str with all characters changed to uppercase |
unbase64 | unbase64(str)-将参数从base64字符串转换为二进制 | unbase64(str) - Convert the argument from a base 64 string to binary |
unhex | unhex(str)-将十六进制参数转换为二进制 | unhex(str) - Converts hexadecimal argument to binary |
unix_timestamp | unix_timestamp(date[,pattern])-将时间转换为数字 | unix_timestamp(date[, pattern]) - Converts the time to a number |
upper | upper(str)-返回所有字符都改为大写的str | upper(str) - Returns str with all characters changed to uppercase |
uuid | uuid()—返回通用唯一标识符(uuid)字符串。 | uuid() - Returns a universally unique identifier (UUID) string. |
var_pop | var_pop(x)-返回一组数字的方差 | var_pop(x) - Returns the variance of a set of numbers |
var_samp | var_samp(x)-返回一组数字的样本方差 | var_samp(x) - Returns the sample variance of a set of numbers |
variance | variance(x)-返回一组数字的方差 | variance(x) - Returns the variance of a set of numbers |
version | version()—返回配置单元内部版本字符串—包括基本版本和修订。 | version() - Returns the Hive build version string - includes base version and revision. |
weekofyear | weekofyear(date)-返回给定日期所在的一年中的某一周。一周从星期一开始,第1周是第一周,超过3天。 | weekofyear(date) - Returns the week of the year of the given date. A week is considered to start on a Monday and week 1 is the first week with >3 days. |
when | CASE WHEN a THEN b[WHEN c THEN d]*[ELSE e]END-当a=true时,返回b;当c=true时,返回d;ELSE返回e | CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END - When a = true, returns b; when c = true, return d; else return e |
windowingtablefunction | 没有函数“windowingtablefunction”的文档 | There is no documentation for function ‘windowingtablefunction’ |
xpath | xpath(xml,xpath)-返回xml节点中与xpath表达式匹配的值的字符串数组 | xpath(xml, xpath) - Returns a string array of values within xml nodes that match the xpath expression |
xpath_boolean | xpath_boolean(xml,xpath)-计算布尔xpath表达式 | xpath_boolean(xml, xpath) - Evaluates a boolean xpath expression |
xpath_double | xpath_double(xml,xpath)-返回与xpath表达式匹配的双精度值 | xpath_double(xml, xpath) - Returns a double value that matches the xpath expression |
xpath_float | xpath_float(xml,xpath)-返回与xpath表达式匹配的浮点值 | xpath_float(xml, xpath) - Returns a float value that matches the xpath expression |
xpath_int | xpath_int(xml,xpath)-返回与xpath表达式匹配的整数值 | xpath_int(xml, xpath) - Returns an integer value that matches the xpath expression |
xpath_long | xpath_long(xml,xpath)-返回与xpath表达式匹配的long值 | xpath_long(xml, xpath) - Returns a long value that matches the xpath expression |
xpath_number | xpath_number(xml,xpath)-返回与xpath表达式匹配的双精度值 | xpath_number(xml, xpath) - Returns a double value that matches the xpath expression |
xpath_short | xpath_short(xml,xpath)-返回与xpath表达式匹配的短值 | xpath_short(xml, xpath) - Returns a short value that matches the xpath expression |
xpath_string | xpath_string(xml,xpath)-返回与xpath表达式匹配的第一个xml节点的文本内容 | xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the xpath expression |
year | year(param)-返回日期/时间戳/间隔的年份组件 | year(param) - Returns the year component of the date/timestamp/interval |
\| | a\|b-按位或 | a \| b - Bitwise or |
~ | ~n-按位不是 | ~ n - Bitwise not |