首页 文章

LINQ按时间段聚合和分组

提问于
浏览
29

我试图理解如何使用LINQ按时间间隔对数据进行分组;然后理想地聚合每个组 .

找到具有明确日期范围的众多示例,我正在尝试按时段分组,例如5分钟,1小时,1天 .

例如,我有一个用Date包装DateTime的类:

public class Sample
{
     public DateTime timestamp;
     public double value;
}

这些观察结果包含在List集合中的一系列中:

List<Sample> series;

因此,按小时时段和平均值汇总,我正在尝试做类似的事情:

var grouped = from s in series
              group s by new TimeSpan(1, 0, 0) into g
              select new { timestamp = g.Key, value = g.Average(s => s.value };

这基本上是有缺陷的,因为它将TimeSpan本身分组 . 我无法理解如何在查询中使用TimeSpan(或表示间隔的任何数据类型) .

8 回答

  • 2

    您需要一个对时间戳进行舍入的函数 . 就像是:

    var grouped = from s in series
              group s by new DateTime(s.timestamp.Year, s.timestamp.Month,  
                    s.timestamp.Day, s.timestamp.Hour, 0, 0) into g
              select new { timestamp = g.Key, value = g.Average(s => s.value };
    

    适用于每小时的垃圾箱 . 请注意,结果中的时间戳现在将是DateTime,而不是TimeSpan .

  • 1

    您可以将时间戳四舍五入到下一个边界(即过去最接近最近的5分钟边界)并将其用作您的分组:

    var groups = series.GroupBy(x =>
    {
        var stamp = x.timestamp;
        stamp = stamp.AddMinutes(-(stamp.Minute % 5));
        stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
        return stamp;
    })
    .Select(g => new { TimeStamp = g.Key, Value = g.Average(s => s.value) })
    .ToList();
    

    以上通过在分组中使用修改的时间戳来实现,该分组将分钟设置为前5分钟边界并移除秒和毫秒 . 当然,相同的方法可以用于其他时间段,即小时和天 .

    Edit:

    基于这个组成的样本输入:

    var series = new List<Sample>();
    series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(3) });
    series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(4) });
    series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(5) });
    series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(6) });
    series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(7) });
    series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(15) });
    

    我为我制作了3组,其中一组分组时间戳为3:05,一组为3:10,另一组为下午3:20(您的结果可能因当前时间而异) .

  • 0

    我在这场比赛上的比赛已经很晚了,但是我在寻找其他东西时遇到了这个问题,我认为我有更好的方法 .

    series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
            .Select (s => new {
                series = s
                ,timestamp = s.First ().timestamp
                ,average = s.Average (x => x.value )
            }).Dump();
    

    这是一个linqpad程序示例,您可以进行验证和测试

    void Main()
    {
        List<Sample> series = new List<Sample>();
    
        Random random = new Random(DateTime.Now.Millisecond);
        for (DateTime i = DateTime.Now.AddDays(-5); i < DateTime.Now; i += TimeSpan.FromMinutes(1))
        {
            series.Add(new UserQuery.Sample(){ timestamp = i, value = random.NextDouble() * 100 });
        }
        //series.Dump();
        series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
            .Select (s => new {
                series = s
                ,timestamp = s.First ().timestamp
                ,average = s.Average (x => x.value )
            }).Dump();
    }
    
    // Define other methods and classes here
    public class Sample
    {
         public DateTime timestamp;
         public double value;
    }
    
  • 0

    对于按小时分组,您需要按时间戳的小时部分进行分组,这可以这样做:

    var groups = from s in series
      let groupKey = new DateTime(s.timestamp.Year, s.timestamp.Month, s.timestamp.Day, s.timestamp.Hour, 0, 0)
      group s by groupKey into g select new
                                          {
                                            TimeStamp = g.Key,
                                            Value = g.Average(a=>a.value)
                                          };
    
  • 4

    我建议使用 new DateTime()avoid 任何 issues with sub millisecond 差异

    var versionsGroupedByRoundedTimeAndAuthor = db.Versions.GroupBy(g => 
    new
    {
                    UserID = g.Author.ID,
                    Time = RoundUp(g.Timestamp, TimeSpan.FromMinutes(2))
    });
    

    private DateTime RoundUp(DateTime dt, TimeSpan d)
            {
                return new DateTime(((dt.Ticks + d.Ticks - 1) / d.Ticks) * d.Ticks);
            }
    

    注:我在这里按Author.ID进行分组以及舍入的TimeStamp .

    RoundUp函数取自@dtb,答案在这里https://stackoverflow.com/a/7029464/661584

    了解平等到毫秒并不总是意味着平等Why does this unit test fail when testing DateTime equality?

  • 0

    即使我真的很晚,这是我的2美分:

    我希望以5分钟为间隔向下舍入()时间值:

    10:31 --> 10:30
    10:33 --> 10:35
    10:36 --> 10:35
    

    这可以通过转换为TimeSpan.Tick并转换回DateTime并使用Math.Round()来实现:

    public DateTime GetShiftedTimeStamp(DateTime timeStamp, int minutes)
    {
        return
            new DateTime(
                Convert.ToInt64(
                    Math.Round(timeStamp.Ticks / (decimal)TimeSpan.FromMinutes(minutes).Ticks, 0, MidpointRounding.AwayFromZero)
                        * TimeSpan.FromMinutes(minutes).Ticks));
    }
    

    如上所示,shiftedTimeStamp可用于linq分组 .

  • 39

    我改进了BrokenGlass的答案,使其更具通用性并增加了保护措施 . 根据他目前的答案,如果您选择9的间隔,它将无法达到您的预期 . 任何数字60都不能被整除 . 对于这个例子,我使用9并从午夜(0:00)开始 .

    • 从0:00到0:08.999的所有内容都将按照您的预期放入0:00的组中 . 它会一直这样做,直到你进入从0:54开始的分组 .

    • 在0:54,它只会将事物从0:54分组到0:59.999而不是上升到01:03.999 .

    对我来说,这是一个大问题 .

    我不知道如何解决这个问题,但你可以添加安全措施 .
    变化:

    • 60%[interval]等于0的任何分钟都是可接受的间隔 . 以下if语句保证了这一点 .

    • 小时间隔也有效 .

    double minIntervalAsDouble = Convert.ToDouble(minInterval);
            if (minIntervalAsDouble <= 0)
            {
                string message = "minInterval must be a positive number, exiting";
                Log.getInstance().Info(message);
                throw new Exception(message);
            }
            else if (minIntervalAsDouble < 60.0 && 60.0 % minIntervalAsDouble != 0)
            {
                string message = "60 must be divisible by minInterval...exiting";
                Log.getInstance().Info(message);
                throw new Exception(message);
            }
            else if (minIntervalAsDouble >= 60.0 && (24.0 % (minIntervalAsDouble / 60.0)) != 0 && (24.0 % (minIntervalAsDouble / 60.0) != 24.0))
            {
                //hour part must be divisible...
                string message = "If minInterval is greater than 60, 24 must be divisible by minInterval/60 (hour value)...exiting";
                Log.getInstance().Info(message);
                throw new Exception(message);
            }
            var groups = datas.GroupBy(x =>
            {
                if (minInterval < 60)
                {
                    var stamp = x.Created;
                    stamp = stamp.AddMinutes(-(stamp.Minute % minInterval));
                    stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                    stamp = stamp.AddSeconds(-stamp.Second);
                    return stamp;
                }
                else
                {
                    var stamp = x.Created;
                    int hourValue = minInterval / 60;
                    stamp = stamp.AddHours(-(stamp.Hour % hourValue));
                    stamp = stamp.AddMilliseconds(-stamp.Millisecond);
                    stamp = stamp.AddSeconds(-stamp.Second);
                    stamp = stamp.AddMinutes(-stamp.Minute);
                    return stamp;
                }
            }).Select(o => new
            {
                o.Key,
                min = o.Min(f=>f.Created),
                max = o.Max(f=>f.Created),
                o
            }).ToList();
    

    在select语句中添加您想要的任何内容!我输入了min / max,因为它更容易测试 .

  • 10

    我知道这并没有直接回答这个问题,但我正在谷歌上寻找一个非常类似的解决方案来汇总股票/加密货币的蜡烛数据,从较小的分钟期到较高的分钟期(5,10,15,30) . 您不能简单地从当前分钟返回X,因为聚合时段的时间戳将不一致 . 您还必须注意列表的开头和结尾有足够的数据来填充较大时段的完整烛台 . 鉴于此,我提出的解决方案如下 . (它假设较小周期的蜡烛,如rawPeriod所示,按升序时间戳排序 . )

    public class Candle
    {
        public long Id { get; set; }
        public Period Period { get; set; }
        public DateTime Timestamp { get; set; }
        public double High { get; set; }
        public double Low { get; set; }
        public double Open { get; set; }
        public double Close { get; set; }
        public double BuyVolume { get; set; }
        public double SellVolume { get; set; }
    }
    
    public enum Period
    {
        Minute = 1,
        FiveMinutes = 5,
        QuarterOfAnHour = 15,
        HalfAnHour = 30
    }
    
        private List<Candle> AggregateCandlesIntoRequestedTimePeriod(Period rawPeriod, Period requestedPeriod, List<Candle> candles)
        {
            if (rawPeriod != requestedPeriod)
            {
                int rawPeriodDivisor = (int) requestedPeriod;
                candles = candles
                            .GroupBy(g => new { TimeBoundary = new DateTime(g.Timestamp.Year, g.Timestamp.Month, g.Timestamp.Day, g.Timestamp.Hour, (g.Timestamp.Minute / rawPeriodDivisor) * rawPeriodDivisor , 0) })
                            .Where(g => g.Count() == rawPeriodDivisor )
                            .Select(s => new Candle
                            {
                                Period = requestedPeriod,
                                Timestamp = s.Key.TimeBoundary,
                                High = s.Max(z => z.High),
                                Low = s.Min(z => z.Low),
                                Open = s.First().Open,
                                Close = s.Last().Close,
                                BuyVolume = s.Sum(z => z.BuyVolume),
                                SellVolume = s.Sum(z => z.SellVolume),
                            })
                            .OrderBy(o => o.Timestamp)
                            .ToList();
            }
    
            return candles;
        }
    

相关问题