
Java 8 Stream: Defining a collector based on other collectors


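I'm new to the Java 8 Stream API, but I'd like to use it to solve the following problem. Say I have a POJO called InputRecord with name, fieldA, and fieldB properties, where each instance represents one of the following rows: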

name | fieldA | fieldB
----------------------
A    | 100    | 1.1
A    | 150    | 2.0
B    | 200    | 1.5
A    | 120    | 1.3

InputRecord looks like this:

public class InputRecord {
    private String name;
    private Integer fieldA;
    private BigDecimal fieldB;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getFieldA() {
        return fieldA;
    }

    public void setFieldA(Integer fieldA) {
        this.fieldA = fieldA;
    }

    public BigDecimal getFieldB() {
        return fieldB;
    }

    public void setFieldB(BigDecimal fieldB) {
        this.fieldB = fieldB;
    }
}

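The four records above need to be combined into two records grouped by name, where:

  • property fieldA is summed

  • property fieldB is summed

  • the combined record includes a fieldC property, which is the result of multiplying the accumulated sums of fieldA and fieldB.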

So the result for the records above would be:

name | sumOfFieldA | sumOfFieldB | fieldC (sumOfFieldA*sumOfFieldB)
-------------------------------------------------------------------
A    | 370         | 4.4         | 1628
B    | 200         | 1.5         | 300
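
As a quick check of the arithmetic in the table, here is a minimal sketch for group A. The class and method names are illustrative and not part of the original post. It also shows that multiplying row by row and then summing gives a different number, which is why fieldC can only be derived once both sums are complete.

```java
import java.math.BigDecimal;

final class FieldCArithmetic {

    // fieldC as defined in the question: (sum of fieldA) * (sum of fieldB).
    static BigDecimal productOfSums(int[] fieldA, String[] fieldB) {
        int sumA = 0;
        BigDecimal sumB = BigDecimal.ZERO;
        for (int i = 0; i < fieldA.length; i++) {
            sumA += fieldA[i];
            sumB = sumB.add(new BigDecimal(fieldB[i]));
        }
        return sumB.multiply(BigDecimal.valueOf(sumA));
    }

    // Row-by-row products added up; this is NOT the value the question asks for.
    static BigDecimal sumOfProducts(int[] fieldA, String[] fieldB) {
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < fieldA.length; i++) {
            total = total.add(new BigDecimal(fieldB[i]).multiply(BigDecimal.valueOf(fieldA[i])));
        }
        return total;
    }

    public static void main(String[] args) {
        // The three rows named A from the input table.
        int[] a = {100, 150, 120};
        String[] b = {"1.1", "2.0", "1.3"};
        System.out.println(productOfSums(a, b)); // prints 1628.0
        System.out.println(sumOfProducts(a, b)); // prints 566.0
    }
}
```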

Another POJO called OutputRecord would represent each row of the combined records:

public class OutputRecord {
    private String name;
    private Integer sumOfFieldA;
    private BigDecimal sumOfFieldB;
    private BigDecimal fieldC;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getSumOfFieldA() {
        return sumOfFieldA;
    }

    public void setSumOfFieldA(Integer sumOfFieldA) {
        this.sumOfFieldA = sumOfFieldA;
    }

    public BigDecimal getSumOfFieldB() {
        return sumOfFieldB;
    }

    public void setSumOfFieldB(BigDecimal sumOfFieldB) {
        this.sumOfFieldB = sumOfFieldB;
    }

    public BigDecimal getFieldC() {
        return fieldC;
    }

    public void setFieldC(BigDecimal fieldC) {
        this.fieldC = fieldC;
    }
}

What are some good approaches/solutions for transforming a list of InputRecords into a list of OutputRecords?

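I looked at the following link to see whether it would help, but I got stuck trying to put the collectors for fieldA and fieldB together to form a new collector for fieldC: Java 8 Stream: groupingBy with multiple Collectors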

Collector<InputRecord, ?, Integer> fieldACollector = Collectors.summingInt(InputRecord::getFieldA);
Collector<InputRecord, ?, BigDecimal> fieldBCollector = Collectors.reducing(BigDecimal.ZERO, InputRecord::getFieldB, BigDecimal::add);

List<Collector<InputRecord, ?, ?>> collectors = Arrays.asList(fieldACollector, fieldBCollector); // need a fieldCCollector object in the list

The collectors object would then be used to create a complexCollector object (as per Tagir Valeev's accepted answer in the link above).

4 Answers

  • 3

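    For me, the cleanest way is to build a custom collector for this. There are quite a few lines of code here, but you can hide them behind a method, so your final operation would look like this: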

    Collection<OutputRecord> output = List.of(first, second, third, fourth)
                .stream()
                .parallel()
                .collect(toOutputRecords());
    

    And the actual toOutputRecords would be:

    private static Collector<InputRecord, ?, Collection<OutputRecord>> toOutputRecords() {
        class Acc {
    
            Map<String, OutputRecord> map = new HashMap<>();
    
            void add(InputRecord elem) {
                String value = elem.getName();
                // constructor without fieldC since you compute it at the end
                OutputRecord record = new OutputRecord(value, elem.getFieldA(), elem.getFieldB());
                mergeIntoMap(map, value, record);
            }
    
            Acc merge(Acc right) {
                Map<String, OutputRecord> leftMap = map;
                Map<String, OutputRecord> rightMap = right.map;
    
                for (Entry<String, OutputRecord> entry : rightMap.entrySet()) {
                    mergeIntoMap(leftMap, entry.getKey(), entry.getValue());
                }
                return this;
            }
    
            private void mergeIntoMap(Map<String, OutputRecord> map, String value, OutputRecord record) {
    
                map.merge(value, record, (left, right) -> {
                    left.setSumOfFieldA(left.getSumOfFieldA() + right.getSumOfFieldA());
                    left.setSumOfFieldB(left.getSumOfFieldB().add(right.getSumOfFieldB()));
    
                    return left;
                });
            }
    
            public Collection<OutputRecord> finisher() {
                for (Entry<String, OutputRecord> e : map.entrySet()) {
                    OutputRecord output = e.getValue();
                    output.setFieldC(output.getSumOfFieldB().multiply(BigDecimal.valueOf(output.getSumOfFieldA())));
                }
                return map.values();
            }
    
        }
        return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finisher);
    }
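
    The collector above calls a three-argument OutputRecord constructor that the POJO in the question does not declare. A minimal sketch of what such a constructor might look like, assuming it only seeds the two running sums and leaves fieldC unset until the finisher runs. The class name OutputRecordSketch is hypothetical and is used only to avoid clashing with the original class:

    ```java
    import java.math.BigDecimal;

    // Hypothetical stand-in for OutputRecord with the extra constructor the collector relies on.
    final class OutputRecordSketch {
        private final String name;
        private Integer sumOfFieldA;
        private BigDecimal sumOfFieldB;
        private BigDecimal fieldC; // stays null until the final sums are known

        // Assumed convenience constructor: no fieldC argument, it is computed at the end.
        public OutputRecordSketch(String name, Integer sumOfFieldA, BigDecimal sumOfFieldB) {
            this.name = name;
            this.sumOfFieldA = sumOfFieldA;
            this.sumOfFieldB = sumOfFieldB;
        }

        public String getName() { return name; }
        public Integer getSumOfFieldA() { return sumOfFieldA; }
        public void setSumOfFieldA(Integer sumOfFieldA) { this.sumOfFieldA = sumOfFieldA; }
        public BigDecimal getSumOfFieldB() { return sumOfFieldB; }
        public void setSumOfFieldB(BigDecimal sumOfFieldB) { this.sumOfFieldB = sumOfFieldB; }
        public BigDecimal getFieldC() { return fieldC; }
        public void setFieldC(BigDecimal fieldC) { this.fieldC = fieldC; }
    }
    ```

    With a constructor like this, each input row becomes a one-row partial result that can be merged into the map entry for its name.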
    
  • 0

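    You can use Stream.reduce(..) to merge two records into a single one. Note that this creates a bunch of temporary objects that the JVM then has to garbage collect.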

    Collection<InputRecord> input = Arrays.asList(
            // String constructor keeps the decimals exact; new BigDecimal(1.1) would not
            new InputRecord("A", 100, new BigDecimal("1.1")),
            new InputRecord("A", 150, new BigDecimal("2.0")),
            new InputRecord("B", 200, new BigDecimal("1.5")),
            new InputRecord("A", 120, new BigDecimal("1.3")));
    
    Collection<OutputRecord> output = input.stream()
            // group records for particular Name into a List
            .collect(Collectors.groupingBy(InputRecord::getName))
            .values().stream()
            // Reduce every List to a single record, summing the fields
            .map(records -> records.stream()
                    .reduce((a, b) ->
                            new InputRecord(a.getName(),
                                    a.getFieldA() + b.getFieldA(),
                                    a.getFieldB().add(b.getFieldB()))))
            .filter(Optional::isPresent)
            .map(Optional::get)
            // Finally transform the InputRecord to OutputRecord
            .map(record -> new OutputRecord(record.getName(),
                    record.getFieldA(),
                    record.getFieldB(),
                    record.getFieldB().multiply(new BigDecimal(record.getFieldA()))))
            .collect(Collectors.toList());
    
  • 1

    You can use an accumulating function and a combining function to generate the OutputRecords from the list of InputRecords.

    Map<String, OutputRecord> result = inputRecords.stream().collect(() -> new HashMap<>(),
                    (HashMap<String, OutputRecord> map, InputRecord inObj) -> {
                        OutputRecord out = map.get(inObj.getName());
                        if (out == null) {
                            out = new OutputRecord();
                            out.setName(inObj.getName());
                            out.setSumOfFieldA(inObj.getFieldA());
                            out.setSumOfFieldB(inObj.getFieldB());
                        } else {
    
                            Integer s = out.getSumOfFieldA();
                            out.setSumOfFieldA(s + inObj.getFieldA());
                            BigDecimal bd = out.getSumOfFieldB();
                            out.setSumOfFieldB(bd.add(inObj.getFieldB()));
                        }
                        out.setFieldC(out.getSumOfFieldB().multiply(new BigDecimal(out.getSumOfFieldA())));
                        map.put(out.getName(), out);
    
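                    }, (HashMap<String, OutputRecord> out1, HashMap<String, OutputRecord> out2) -> {
                        // Merge partial sums per name; a plain putAll would overwrite them in a parallel stream
                        out2.forEach((name, right) -> out1.merge(name, right, (left, r) -> {
                            left.setSumOfFieldA(left.getSumOfFieldA() + r.getSumOfFieldA());
                            left.setSumOfFieldB(left.getSumOfFieldB().add(r.getSumOfFieldB()));
                            left.setFieldC(left.getSumOfFieldB().multiply(new BigDecimal(left.getSumOfFieldA())));
                            return left;
                        }));
                    });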
    
            System.out.println(result);
    
  • 0

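    I think a general utility method for combining multiple collectors would be better than defining a customized collector (which I find complicated and hard to maintain). For example: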

    public static <T, A1, A2, R1, R2> Collector<T, Tuple2<A1, A2>, Tuple2<R1, R2>> combine(final Collector<? super T, A1, R1> collector1,
            final Collector<? super T, A2, R2> collector2) {
     ...
    }
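
    The body of combine is elided above. A minimal sketch of how such a method could be written with Collector.of, under stated assumptions: Pair is a hypothetical stand-in for Tuple2, Box and Row are helper types invented for this sketch, and none of this is taken from AbacusUtil's actual source.

    ```java
    import java.math.BigDecimal;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collector;
    import java.util.stream.Collectors;

    final class CombineSketch {

        // Hypothetical immutable pair holding the two finished results.
        static final class Pair<L, R> {
            final L left;
            final R right;
            Pair(L left, R right) { this.left = left; this.right = right; }
        }

        // Mutable holder for the two downstream accumulation containers.
        private static final class Box<A1, A2> {
            A1 first;
            A2 second;
            Box(A1 first, A2 second) { this.first = first; this.second = second; }
        }

        // Feeds every element to both collectors and pairs up their results.
        static <T, A1, A2, R1, R2> Collector<T, ?, Pair<R1, R2>> combine(
                Collector<? super T, A1, R1> c1, Collector<? super T, A2, R2> c2) {
            return Collector.<T, Box<A1, A2>, Pair<R1, R2>>of(
                    () -> new Box<>(c1.supplier().get(), c2.supplier().get()),
                    (box, t) -> {
                        c1.accumulator().accept(box.first, t);
                        c2.accumulator().accept(box.second, t);
                    },
                    (x, y) -> {
                        x.first = c1.combiner().apply(x.first, y.first);
                        x.second = c2.combiner().apply(x.second, y.second);
                        return x;
                    },
                    box -> new Pair<>(c1.finisher().apply(box.first), c2.finisher().apply(box.second)));
        }

        // Simplified stand-in for InputRecord, to keep the sketch self-contained.
        static final class Row {
            final String name;
            final int fieldA;
            final BigDecimal fieldB;
            Row(String name, int fieldA, String fieldB) {
                this.name = name;
                this.fieldA = fieldA;
                this.fieldB = new BigDecimal(fieldB);
            }
        }

        static List<Row> sampleRows() {
            return Arrays.asList(new Row("A", 100, "1.1"), new Row("A", 150, "2.0"),
                    new Row("B", 200, "1.5"), new Row("A", 120, "1.3"));
        }

        // Groups by name and computes both sums in a single pass.
        static Map<String, Pair<Integer, BigDecimal>> totals(List<Row> rows) {
            Collector<Row, ?, Integer> sumA = Collectors.summingInt(r -> r.fieldA);
            Collector<Row, ?, BigDecimal> sumB =
                    Collectors.reducing(BigDecimal.ZERO, r -> r.fieldB, BigDecimal::add);
            Collector<Row, ?, Pair<Integer, BigDecimal>> both = combine(sumA, sumB);
            return rows.stream().collect(Collectors.groupingBy(r -> r.name, both));
        }

        public static void main(String[] args) {
            totals(sampleRows()).forEach((name, p) -> System.out.println(
                    name + " " + p.left + " " + p.right + " "
                            + p.right.multiply(BigDecimal.valueOf(p.left))));
        }
    }
    ```

    The product for fieldC is taken only after both sums are finished, so it needs no collector of its own.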
    

    With the combine method, the solution would be:

    Collector<InputRecord, ?, Integer> fieldACollector = MoreCollectors.summingInt(InputRecord::getFieldA);
    Collector<InputRecord, ?, BigDecimal> fieldBCollector = MoreCollectors.reducing(BigDecimal.ZERO, InputRecord::getFieldB, BigDecimal::add);
    
    inputRecords.stream().collect(MoreCollectors.groupingBy(InputRecord::getName, 
                                MoreCollectors.combine(fieldACollector, fieldBCollector)))
            .entrySet().stream()
            .map(e -> new OutputRecord(e.getKey(), e.getValue()._1, e.getValue()._2))
            .collect(Collectors.toList());
    

    Here is an example that uses the combine implementation from AbacusUtil:

    StreamEx.of(inputRecords)
            .groupBy(InputRecord::getName, MoreCollectors.combine(fieldACollector, fieldBCollector))
            .map(e -> new OutputRecord(e.getKey(), e.getValue()._1, e.getValue()._2)).toList();
    
