Tag Archives: Secondary Sort

[Repost] Hadoop MapReduce Secondary Sort: Principles and Applications

Reposted from: "Hadoop MapReduce Secondary Sort: Principles and Applications"

Secondary sort mainly involves the following pieces.

Before 0.20.0, the following were used:

setPartitionerClass
setOutputKeyComparatorClass
setOutputValueGroupingComparator

From 0.20.0 onward, the following are used:

job.setPartitionerClass(Partitioner p);
job.setSortComparatorClass([......]

[Repost] Hadoop - How to do a secondary sort on values?

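A very good article on how to make the values under a single key arrive in sorted order during the reduce phase in Hadoop. It explains the topic more clearly than "Hadoop: The Definitive Guide"!

Reposted from: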

http://www.bigdataspeak.com/2013/02/hadoop-how-to-do-secondary-sort-on_25.html

The problem at hand here is that you need to work on a sorted set of values in your reducer.[......]
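
To make the roles of the three settings listed earlier (partitioner, sort comparator, grouping comparator) concrete, here is a minimal sketch in plain Java that simulates them without any Hadoop classes. Everything in it (`SecondarySortSketch`, `CompositeKey`, `partition`, `reducePartition`) is a hypothetical name chosen for illustration, not part of the Hadoop API. It assumes the usual composite-key approach: the value to be sorted is folded into the key, partitioning and grouping look only at the natural key, and sorting looks at both parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    // Hypothetical composite key: the natural key plus the value to sort by.
    public static final class CompositeKey {
        public final String naturalKey;
        public final int value;

        public CompositeKey(String naturalKey, int value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    // Partitioner role: route by natural key only, so every record of one
    // natural key ends up in the same (simulated) reduce partition.
    public static int partition(CompositeKey key, int numPartitions) {
        return (key.naturalKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Sort comparator role: order by natural key first, then by value.
    public static final Comparator<CompositeKey> SORT =
            Comparator.comparing((CompositeKey k) -> k.naturalKey)
                      .thenComparingInt(k -> k.value);

    // Grouping comparator role: keys with the same natural key count as
    // equal, so their values are handed to one reduce call.
    public static final Comparator<CompositeKey> GROUP =
            Comparator.comparing((CompositeKey k) -> k.naturalKey);

    // Simulates what one reducer sees: records sorted with SORT, then cut
    // into groups wherever GROUP reports a change of key.
    public static Map<String, List<Integer>> reducePartition(List<CompositeKey> records) {
        List<CompositeKey> sorted = new ArrayList<>(records);
        sorted.sort(SORT);

        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        CompositeKey previous = null;
        List<Integer> current = null;
        for (CompositeKey key : sorted) {
            if (previous == null || GROUP.compare(previous, key) != 0) {
                current = new ArrayList<>();
                groups.put(key.naturalKey, current);
            }
            current.add(key.value);
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> input = Arrays.asList(
                new CompositeKey("b", 2), new CompositeKey("a", 3),
                new CompositeKey("a", 1), new CompositeKey("b", 1),
                new CompositeKey("a", 2));
        // prints {a=[1, 2, 3], b=[1, 2]}
        System.out.println(reducePartition(input));
    }
}
```

In a real job the same three pieces of logic would live in classes registered through the setter methods named in the first excerpt. The sketch only shows why all three are needed: without the partitioner, records of one key could be split across reducers; without the grouping comparator, each distinct composite key would form its own group.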