HBase Study Notes, Part 1: HFile Parsing

This post, by 靠谱客 blogger 彪壮刺猬, walks through how HBase parses an HFile. It is shared here as a reference.

Tracing the HFile read path:

As most HBase users know, running hbase hfile -p -f <path to the HFile> prints the key/value pairs stored in an HFile. Let's find out which class actually does the reading.

1. Trace the hbase shell script, which contains this branch:

elif [ "$COMMAND" = "hfile" ] ; then
  CLASS='org.apache.hadoop.hbase.io.hfile.HFile'

So the hfile command is handled by the class org.apache.hadoop.hbase.io.hfile.HFile.

2. Open the main method of org.apache.hadoop.hbase.io.hfile.HFile:

public static void main(String[] args) throws IOException {
  HFilePrettyPrinter prettyPrinter = new HFilePrettyPrinter();
  System.exit(prettyPrinter.run(args));
}

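The actual work is delegated to HFilePrettyPrinter, and the value returned by its run method becomes the process exit code.

Open org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter and look at run: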

public int run(String[] args) {
  conf = HBaseConfiguration.create();
  try {
    FSUtils.setFsDefault(conf, FSUtils.getRootDir(conf));
    if (!parseOptions(args))
      return 1;
  } catch (IOException ex) {
    LOG.error("Error parsing command-line options", ex);
    return 1;
  } catch (ParseException ex) {
    LOG.error("Error parsing command-line options", ex);
    return 1;
  }
  // iterate over all files found
  for (Path fileName : files) {
    try {
      processFile(fileName);
    } catch (IOException ex) {
      LOG.error("Error reading " + fileName, ex);
    }
  }
  if (verbose || printKey) {
    System.out.println("Scanned kv count -> " + count);
  }
  return 0;
}
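
One detail of run is easy to miss: a failure while parsing options returns 1, but an IOException from a single file is only logged, so the loop moves on and the method still returns 0. The sketch below reproduces just that control flow without any HBase dependency. Everything in it is hypothetical and for illustration only: RunLoopSketch, the processFile stand-in, the failed list, and the "empty args means bad options" rule are not HBase code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RunLoopSketch {
    /** Hypothetical stand-in for processFile: fails for names starting with "bad". */
    static void processFile(String name) throws IOException {
        if (name.startsWith("bad")) {
            throw new IOException("cannot read " + name);
        }
    }

    /**
     * Mirrors the shape of run(): returns 1 when the arguments are unusable,
     * otherwise 0, even if some of the files could not be read.
     */
    static int run(String[] args, List<String> failed) {
        if (args.length == 0) {
            return 1; // assumption: stands in for parseOptions(args) returning false
        }
        for (String name : args) {
            try {
                processFile(name);
            } catch (IOException ex) {
                failed.add(name); // the real code logs the error and keeps going
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        int code = run(new String[] {"a.hfile", "bad.hfile", "c.hfile"}, failed);
        System.out.println("exit code = " + code + ", failed = " + failed);
    }
}
```

In practice this means a zero exit code alone does not prove that every file was read; the log output has to be checked as well.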

Next, trace into the processFile method. We are getting close.

private void processFile(Path file) throws IOException {
  if (verbose)
    System.out.println("Scanning -> " + file);
  FileSystem fs = file.getFileSystem(conf);
  if (!fs.exists(file)) {
    System.err.println("ERROR, file doesnt exist: " + file);
  }

  HFile.Reader reader = HFile.createReader(fs, file, new CacheConfig(conf));
  Map<byte[], byte[]> fileInfo = reader.loadFileInfo();
  KeyValueStatsCollector fileStats = null;
  if (verbose || printKey || checkRow || checkFamily || printStats) {
    // scan over file and read key/value's and check if requested
    HFileScanner scanner = reader.getScanner(false, false, false);
    fileStats = new KeyValueStatsCollector();
    boolean shouldScanKeysValues = false;
    if (this.isSeekToRow) {
      // seek to the first kv on this row
      shouldScanKeysValues =
        (scanner.seekTo(KeyValue.createFirstOnRow(this.row).getKey()) != -1);
    } else {
      shouldScanKeysValues = scanner.seekTo();
    }
    if (shouldScanKeysValues)
      scanKeysValues(file, fileStats, scanner, row);
  }
  .......
  reader.close();
}
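
The two seekTo calls above behave differently, which is why one result is compared with -1 and the other is used directly as a boolean. As documented for HFileScanner, seekTo() with no argument reports whether there is anything to scan, while seekTo(key) returns -1 when the key sorts before the first entry (scanner not positioned), 0 on an exact match, and 1 when the scanner is left on the last entry smaller than the key. The following sketch models those return values over an in-memory sorted list. SeekSketch and ListScanner are hypothetical names, and plain String keys with a linear search replace the byte[] keys and block index of a real HFile.

```java
import java.util.List;

public class SeekSketch {
    /** Hypothetical in-memory scanner over sorted String keys. */
    static final class ListScanner {
        private final List<String> keys;
        private int pos = -1;

        ListScanner(List<String> sortedKeys) {
            this.keys = sortedKeys;
        }

        /** Positions at the first key; returns false if there is nothing to scan. */
        boolean seekTo() {
            if (keys.isEmpty()) {
                return false;
            }
            pos = 0;
            return true;
        }

        /**
         * Positions at the last key that is <= target.
         * Returns -1 if target sorts before the first key (position unchanged),
         * 0 on an exact match, and 1 otherwise.
         */
        int seekTo(String target) {
            int found = -1;
            for (int i = 0; i < keys.size() && keys.get(i).compareTo(target) <= 0; i++) {
                found = i;
            }
            if (found < 0) {
                return -1;
            }
            pos = found;
            return keys.get(found).equals(target) ? 0 : 1;
        }

        /** Key at the current position; only valid after a successful seek. */
        String current() {
            return keys.get(pos);
        }

        /** Advances by one entry; returns false at the end of the list. */
        boolean next() {
            if (pos + 1 >= keys.size()) {
                return false;
            }
            pos++;
            return true;
        }
    }

    public static void main(String[] args) {
        ListScanner scanner = new ListScanner(List.of("row1/a", "row3/a", "row5/a"));
        System.out.println(scanner.seekTo("row0"));   // before the first key
        System.out.println(scanner.seekTo("row3/a")); // exact match
        System.out.println(scanner.seekTo("row4"));   // lands on the previous key
        System.out.println(scanner.current());
    }
}
```

Note that a return value of 1 leaves the scanner on an entry that sorts before the requested key, so a caller that wants only one row still has to compare rows while scanning.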

This is where the key/value pairs get printed. Continue into the scanKeysValues method:

private void scanKeysValues(Path file, KeyValueStatsCollector fileStats,
    HFileScanner scanner,
    byte[] row) throws IOException {
  KeyValue pkv = null;
  do {
    KeyValue kv = scanner.getKeyValue();
    .........
    // dump key value
    if (printKey) {
      System.out.print("K: " + kv);
      if (printValue) {
        System.out.print(" V: " + Bytes.toStringBinary(kv.getValue()));
      }
      System.out.println();
    }
    ..............
  } while (scanner.next());
}
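
Two things in this loop are worth spelling out. First, it is a do/while because the scanner was already positioned by seekTo, so the current entry must be handled before next() is called. Second, values are arbitrary bytes, so they are printed through Bytes.toStringBinary, which keeps printable characters and escapes the rest as \xNN. The sketch below imitates both without HBase. PrintSketch and formatEntry are hypothetical names, the keys are plain strings rather than KeyValue objects, and this toStringBinary is only an approximation (printable ASCII except the backslash is kept; the real method's set of unescaped characters may differ slightly).

```java
import java.nio.charset.StandardCharsets;

public class PrintSketch {
    /** Approximate binary-safe rendering: printable ASCII as-is, everything else as \xNN. */
    static String toStringBinary(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int ch = b & 0xFF; // treat the byte as unsigned
            if (ch >= 0x20 && ch <= 0x7E && ch != '\\') {
                sb.append((char) ch);
            } else {
                sb.append(String.format("\\x%02X", ch));
            }
        }
        return sb.toString();
    }

    /** Builds one output line in the "K: ... V: ..." shape used by the loop above. */
    static String formatEntry(String key, byte[] value, boolean printValue) {
        StringBuilder line = new StringBuilder("K: ").append(key);
        if (printValue) {
            line.append(" V: ").append(toStringBinary(value));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Hypothetical entries; a real scan would read them from the HFile.
        String[] keys = {"row1/cf:q", "row2/cf:q"};
        byte[][] values = {
            "hello".getBytes(StandardCharsets.US_ASCII),
            {0x00, 0x01, (byte) 0xFF}
        };
        int i = 0; // already positioned on the first entry, as after seekTo()
        do {
            System.out.println(formatEntry(keys[i], values[i], true));
        } while (++i < keys.length); // plays the role of scanner.next()
    }
}
```

With a plain while loop that called next() first, the entry the scanner was positioned on would be skipped.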