Posted by 壮观纸鹤. This article walks through a classic MapReduce example: given a child-parent table, produce the corresponding grandchild-grandparent table.

Overview

 

    1. Problem description:

 

The task is to find the data we actually care about within the given data, i.e., to mine the relationships hidden in the raw records. Let's walk through the example.

The example provides a child-parent table; the job must output the corresponding grandchild-grandparent table.

=================Sample input===================

child     parent
Tom       Lucy
Tom       Jack
Jone      Lucy
Jone      Jack
Lucy      Mary
Lucy      Ben
Jack      Alice
Jack      Jesse
Terry     Alice
Terry     Jesse
Philip    Terry
Philip    Alma
Mark      Terry
Mark      Alma

 

  Family relationship tree: (diagram from the original post, not reproduced)

 

=================Sample output===================

grandchild    grandparent
Tom           Alice
Tom           Jesse
Jone          Alice
Jone          Jesse
Tom           Mary
Tom           Ben
Jone          Mary
Jone          Ben
Philip        Alice
Philip        Jesse
Mark          Alice
Mark          Jesse

 

    2. Design approach:

    Take one pair of sample rows as an example:

        child   parent

        Tom     Lucy

        Lucy    Mary

 

      Mapper code fragment:

        context.write(new Text(values[0]), new Text(values[1]+"_1")); // the key is the value's child   key: Tom   value: Lucy_1

        context.write(new Text(values[1]), new Text(values[0]+"_2")); // the key is the value's parent  key: Lucy  value: Tom_2

     That is, for every line it reads, the mapper emits the pair in both directions and tags each value: "_1" marks the value as the key's parent, "_2" marks it as the key's child. When these tagged records are grouped by key in the reducer, a key's children ("_2") meet that key's parents ("_1"), and their cross product is exactly the set of grandchild-grandparent pairs.
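The two writes above are the heart of the design, so here is a plain-Java sketch (no Hadoop required) of what gets emitted for a single input line; `MapSketch` and `mapLine` are illustrative names, not part of the job code below:

```java
import java.util.ArrayList;
import java.util.List;

public class MapSketch {
    // Mirrors the mapper's two context.write calls for one input line:
    // emit (child -> parent_1) and (parent -> child_2).
    static List<String[]> mapLine(String line) {
        String[] values = line.split("\t"); // values[0] = child, values[1] = parent
        List<String[]> out = new ArrayList<>();
        out.add(new String[]{values[0], values[1] + "_1"}); // key is the value's child
        out.add(new String[]{values[1], values[0] + "_2"}); // key is the value's parent
        return out;
    }

    public static void main(String[] args) {
        for (String[] kv : mapLine("Tom\tLucy")) {
            System.out.println(kv[0] + "\t" + kv[1]); // Tom  Lucy_1  /  Lucy  Tom_2
        }
    }
}
```

Both records carry the same information, but under different keys, so the shuffle phase will deliver "Lucy is Tom's parent" to both the Tom group and the Lucy group.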

 

 

 

 

    3. Program code:

    mapper.java

package com.company.family;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FamilyMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // value: "Tom\tLucy"
        String line = value.toString();
        String[] values = line.split("\t");
        context.write(new Text(values[0]), new Text(values[1] + "_1")); // key is the value's child   key: Tom   value: Lucy_1
        context.write(new Text(values[1]), new Text(values[0] + "_2")); // key is the value's parent  key: Lucy  value: Tom_2
    }
}

    reducer.java

package com.company.family;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class FamilyReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // key: Lucy  values: {Tom_2, Jone_2, Mary_1, Ben_1}
        List<String> grandparents = new ArrayList<String>();  // "_1": the key's parents
        List<String> grandchildren = new ArrayList<String>(); // "_2": the key's children
        for (Text val : values) {
            if (val.toString().endsWith("_1")) {
                grandparents.add(val.toString());
            } else if (val.toString().endsWith("_2")) {
                grandchildren.add(val.toString());
            }
        }
        // The cross join for key Lucy produces:
        // Tom   Mary
        // Tom   Ben
        // Jone  Mary
        // Jone  Ben
        for (String child : grandchildren) {
            for (String grandparent : grandparents) {
                // strip the two-character tag before writing
                context.write(new Text(child.substring(0, child.length() - 2)),
                        new Text(grandparent.substring(0, grandparent.length() - 2)));
            }
        }
    }
}
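The reducer's grouping and cross join can be traced by hand for key Lucy with a plain-Java sketch; `ReduceSketch.join` is a hypothetical stand-in for the reduce method, taking the already-grouped value list:

```java
import java.util.ArrayList;
import java.util.List;

public class ReduceSketch {
    // Reproduces the reducer's logic for one key: split tagged values into
    // grandparents ("_1") and grandchildren ("_2"), then cross-join them.
    static List<String> join(List<String> values) {
        List<String> grandparents = new ArrayList<>();
        List<String> grandchildren = new ArrayList<>();
        for (String val : values) {
            if (val.endsWith("_1")) grandparents.add(val);
            else if (val.endsWith("_2")) grandchildren.add(val);
        }
        List<String> out = new ArrayList<>();
        for (String child : grandchildren)
            for (String gp : grandparents)
                // strip the two-character tag before emitting
                out.add(child.substring(0, child.length() - 2) + "\t"
                        + gp.substring(0, gp.length() - 2));
        return out;
    }

    public static void main(String[] args) {
        // key: Lucy  values: {Tom_2, Jone_2, Mary_1, Ben_1}
        List<String> out = join(List.of("Tom_2", "Jone_2", "Mary_1", "Ben_1"));
        out.forEach(System.out::println); // Tom-Mary, Tom-Ben, Jone-Mary, Jone-Ben
    }
}
```

Keys that carry only one kind of tag (for example Tom, who has parents but no children in the data) produce an empty cross product and contribute nothing to the output.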

    runner.java

package com.company.family;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FamilyRunner {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        // Describe the job
        // jar containing the job classes
        job.setJarByClass(FamilyRunner.class);
        // the job's Mapper
        job.setMapperClass(FamilyMapper.class);
        // the job's Reducer
        job.setReducerClass(FamilyReducer.class);
        // Mapper output types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        // Reducer output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        // input path for the job
        FileInputFormat.setInputPaths(job, new Path("/Users/xuran/Desktop/week"));
        // output path (must not already exist)
        FileOutputFormat.setOutputPath(job, new Path("/Users/xuran/Desktop/week/result"));
        // submit the job and wait for completion
        boolean waitForCompletion = job.waitForCompletion(true);
        System.exit(waitForCompletion ? 0 : 1);
    }
}
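As a sanity check without a Hadoop cluster, both phases can be chained in ordinary Java over a small slice of the sample input, with a TreeMap standing in for the shuffle/sort step; `PipelineSketch` is an illustrative harness, not part of the submitted job:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PipelineSketch {
    public static List<String> run(List<String> lines) {
        // Map phase: emit both directions with tags, grouped by key
        // (the TreeMap plays the role of Hadoop's shuffle/sort).
        Map<String, List<String>> groups = new TreeMap<>();
        for (String line : lines) {
            String[] v = line.split("\t");
            groups.computeIfAbsent(v[0], k -> new ArrayList<>()).add(v[1] + "_1");
            groups.computeIfAbsent(v[1], k -> new ArrayList<>()).add(v[0] + "_2");
        }
        // Reduce phase: cross-join grandchildren with grandparents per key.
        List<String> out = new ArrayList<>();
        for (List<String> vals : groups.values()) {
            List<String> gps = new ArrayList<>(), kids = new ArrayList<>();
            for (String s : vals) {
                if (s.endsWith("_1")) gps.add(s);
                else kids.add(s);
            }
            for (String c : kids)
                for (String g : gps)
                    out.add(c.substring(0, c.length() - 2) + "\t"
                            + g.substring(0, g.length() - 2));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = List.of("Tom\tLucy", "Jone\tLucy", "Lucy\tMary", "Lucy\tBen");
        run(sample).forEach(System.out::println);
    }
}
```

On this four-line slice the only key with both tags is Lucy, so the result is the four pairs Tom-Mary, Tom-Ben, Jone-Mary, Jone-Ben, matching the corresponding rows of the sample output.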

 


 

 
