Cause: MongoDB limits every document to 16MB by default. An aggregation returns its result as a single BSON document, so when the result grows too large (or the $group stage exceeds its in-memory limit), the operation fails with an error like:

exceeded memory limit for $group, but didn't allow external sort.

The memory-limit error can be addressed by allowing the aggregation to spill to disk. For example:
db.flowlog.aggregate([{$group:{_id:"$_id"}}], {allowDiskUse: true})
Java code snippet (Spring Data MongoDB):
AggregationOptions options = new AggregationOptions.Builder().allowDiskUse(true).build();
Aggregation agg = Aggregation.newAggregation(/* ...stages... */).withOptions(options);
However, if the result set itself exceeds 16MB, the error still occurs even with allowDiskUse enabled. In that case, use an aggregation of the following form, which writes the result to another collection through a $out stage:
Aggregation agg = Aggregation.newAggregation(
        Aggregation.group(field1, field2, field3)
                .sum(field4).as("sampleField1")
                .sum(field5).as("sampleField2"),
        Aggregation.project(field4, field5),
        new AggregationOperation() {
            @Override
            public DBObject toDBObject(AggregationOperationContext context) {
                // $out writes the aggregation result into the "test" collection
                return new BasicDBObject("$out", "test");
            }
        }).withOptions(options);
mongo.aggregate(agg, sourceCollection, Test.class);
The $out stage is the key part: constructing agg this way writes the aggregation result directly into the "test" collection, so the 16MB limit on the returned result document no longer applies.
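For reference, the same pipeline can be expressed directly in the mongo shell (a sketch; the collection and field names are the placeholder ones from the Java snippet above):

```javascript
// $group followed by $out into the "test" collection. Because $out streams
// documents into a collection instead of returning one result document,
// the 16MB result-document limit does not apply.
db.sourceCollection.aggregate([
    { $group: {
        _id: { f1: "$field1", f2: "$field2", f3: "$field3" },
        sampleField1: { $sum: "$field4" },
        sampleField2: { $sum: "$field5" }
    } },
    { $out: "test" }
], { allowDiskUse: true })
```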
To add a constant field during aggregation, one of the following forms can be used:
Aggregation agg = Aggregation.newAggregation(
        Aggregation.group( // note: one grouping field is missing here in the original
                OnofflineUserHistoryField.MAC,
                StalogField.UTC_CODE)
                .sum(OnofflineUserHistoryField.WIFI_UP_DOWN).as(OnofflineUserHistoryField.WIFI_UP_DOWN)
                .sum(OnofflineUserHistoryField.ACTIVE_TIME).as(OnofflineUserHistoryField.ACTIVE_TIME),
        Aggregation.project("mac", "buildingId", "utcCode",
                OnofflineUserHistoryField.ACTIVE_TIME,
                OnofflineUserHistoryField.WIFI_UP_DOWN)
                .and(new AggregationExpression() {
                    @Override
                    public DBObject toDbObject(AggregationOperationContext context) {
                        // $cond whose two branches are the same value always
                        // evaluates to the constant 20161114
                        return new BasicDBObject("$cond", new Object[] {
                                new BasicDBObject("$eq", new Object[] { "$tenantId", 0 }),
                                20161114,
                                20161114 });
                    }
                }).as("day").andExclude("_id"),
or
                .and(new AggregationExpression() {
                    @Override
                    public DBObject toDbObject(AggregationOperationContext context) {
                        // $add with a single operand simply evaluates to that operand
                        return new BasicDBObject("$add", new Object[] { 20141114 });
                    }
                }).as("day").andExclude("_id"),
        new AggregationOperation() {
            @Override
            public DBObject toDBObject(AggregationOperationContext context) {
                return new BasicDBObject("$out", "dayStaInfoTmp");
            }
        }).withOptions(options);
The two anonymous AggregationExpression blocks above ($cond with identical branches, and single-operand $add) are two ways of adding a constant during aggregation. I have not found a more convenient way to add a constant to an aggregation.
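In raw pipeline form, the two constant tricks above look like this (a sketch; field names follow the examples above):

```javascript
// Method 1: $cond whose two branches are the same constant,
// so the condition's outcome does not matter.
{ $project: { day: { $cond: [ { $eq: ["$tenantId", 0] }, 20161114, 20161114 ] } } }

// Method 2: $add with a single operand, which evaluates to that operand.
{ $project: { day: { $add: [20141114] } } }
```

Note that MongoDB 2.6+ also provides a $literal operator for embedding constant values in a projection, which may be more direct if your server and driver versions support it.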