none
Get histogram from Database RRS feed

  • Question

  • I'm quite new to LINQ, and I'm looking for info on how to efficiently create a histogram from my database of events. I would like to create a function that takes a time period as input (the length of each bin) and returns a collection with the histogram data for each bin where the number of items is determined by (maxDate-minDate)/TimePeriod. E.g.:

    Each event in the database looks like this:

    [ID][DateTime][Type][..]

    E.g.: [1][01.Dec.2010 02:10:44.232][Up]

    I would like to make a function that e.g. returns the data for how many events have occured per day. I would also like to provide the size of the histogram bin programmatically, so I can easily change the bin sizes to 1 minute, 10 minutes, per day, 5 days, per month, etc. I manage to do this now by first querying for the first date and the last date in my database, then create a collection that divides this into e.g. 10 minute bins, then do a separate LINQ query for each bin.  

    <pre lang="x-c#">public int GetDataForPeriod(DateTime fromTime, DateTime toTime, string osName)
    {
      var db = new LinqDBDataContext();
      var binSize = (from k in db.Actions
              where k.OperatorStation == osName &&
                 k.IsNavigationEvent == true &&
                 k.TimeStamp >= fromTime &&
                 k.TimeStamp <= toTime
               select k).Count(); 
      return binSize;
    }
    
    For my set of data, this takes about 30 seconds when the collection holds 250 bins. I would like to find a more efficient query that does not run 250 LINQ queries to get this data. Any hints?
    Friday, November 19, 2010 2:58 PM

All replies

  • Kristoh,

    While this approach requires you to make multiple trips to DB and assuming that this information might finally get aggregaed as a report, I would suggest you to look at some more conventional ways like http://www.devx.com/dbzone/Article/38511.

    Let us know if this doesn't fit your requirements

    Regards,

    Jay[MSFT]

     

     


    Jay Akhawri
    Wednesday, November 24, 2010 5:32 PM
  • Thanks for the input, but it doesnt quite fit my needs. We're not yet sure what we are looking for in the data, so I want to try to build something where we can explore the data. I found a method that gets me a bit closer, where the data is grouped by date, month, etc into bins like this:

    public List<HistogramData> GetHistogramDataPerHour(string osName){
        var db = new LinqDBDataContext();
        var allEvents = from k in db.Actions
                where k.OperatorStation == osName &&
                   k.IsNavigationEvent == true
                select new {k.TimeStamp, isTabNav=false};
          
        var tabEvents = from k in db.Actions
               where k.OperatorStation == osName &&
                  k.IsNavigationEvent == true &&
                 (k.EventType == Event.EventTypes.TabClicked.ToString() ||
                  k.EventType == Event.EventTypes.DetailTabClicked.ToString())
               select new {k.TimeStamp, isTabNav = true};
    
        var all = allEvents.Concat(tabEvents);
    
        var groups = from k in all
          group k by k.TimeStamp.Value.Date into dateGroup
          select new HistogramData(dateGroup.Key, dateGroup.Count(), dateGroup.Count(info=>info.isTabNav));
                 
       return groups.ToList();
    }
    

    This works quite well and a lot quicker than extracting data for each bin separately. However, I would like to be able to specify a different period, e.g. 8-hour periods starting at 7 in the morning on the first day. Is there any way to modify the above method to accomplish this?

    Friday, November 26, 2010 2:42 PM
  • Krishtoh,

    I apologize as I am not able to find an exisiting solution for you to achieve this. You may utilize an advisory support case with product support group to get a hold on it.

    Regards,


    Jay Akhawri
    Monday, December 20, 2010 8:51 PM