none
Optimization / Linq to SQL / First & Count RRS feed

  • Question

  • This is a simple question, but I can't seem to find an answer for it:

    Assume that I want the first record in a table by ID.  The following code does it of course but is brutally slow.  Is there guidance as to the correct way to do this?  Count() is iterating over every record, and First() might be as well.

    This is an extremely simplistic function.  Just pass in an ID, get the Linq object (a record) and pass it back, that's it, but it is breathtakingly slow.

    Thanks in advance for any help.


    internal static Blob GetBlobById(int id)
    {

    Blob blob = null;

    var blobs = from d in context.Blobs  where d.BlobID == id select d;
     

    if (blobs.Count() > 0)

    {

    blob = blobs.First();

    }
    return blob;
    }

     

    • Edited by Alex Barimo Thursday, April 22, 2010 6:42 PM spelling
    Thursday, April 22, 2010 6:41 PM

Answers

  • Hi Alex

    Let me explain...

    nternal static Blob GetBlobById(int id)
    {

    Blob blob = null;

    var blobs = from d in context.Blobs  where d.BlobID == id select d;  -> this is probably using the correct index, right?, otherwise that could be slowing things down! (remember nothing has happened yet in the db)

    if (blobs.Count() > 0) -> here you are creating the connection and retrieving the above statement, you are doing a full table scan and that is of course slow, but you are probably limiting this by the above id, but in theory there could be millions of rows (well ohh you get the picture *S*)

    {

    blob = blobs.First(); -> and now you are creating the connection once again :-|

    }
    return blob;
    }

     

    Use blobs.FirstOrDefault(); and forget about the counting :)

     

    Maybe shorthand:

    internal static Blob GetBlobById(int id)
    {

    return ( from d in context.Blobs  where d.BlobID == id

    select new Blob(){d.FirstOrDefault()});  //implicit casting apparantly, as in your code ? not sure about this

    }

     

    Thursday, April 22, 2010 10:46 PM
  • Or even better forget about writing a function and just do:

    Blob blob = context.Blobs.FirstOrDefault(b => b.BlobID == id);

    [)amien

    Friday, April 23, 2010 3:10 AM
    Moderator
  • Hi Alex,

    Perhaps I am misunderstanding something.  Given the below statement, would using Single / SingleOrDefault be more appropriate?  If the field is set as the primary key there should only be one instance of the given ID in the table.

     

    "Yes, the BlobID is indexed, and it is set to Primary key and Identity Specification is Yes for this field."

     

    -Rob

    • Marked as answer by Alex Barimo Friday, April 30, 2010 5:48 PM
    Wednesday, April 28, 2010 8:57 AM

All replies

  • Using ToList() appears to help somewhat.

     

    Thursday, April 22, 2010 7:18 PM
  • Hi Alex

    Let me explain...

    nternal static Blob GetBlobById(int id)
    {

    Blob blob = null;

    var blobs = from d in context.Blobs  where d.BlobID == id select d;  -> this is probably using the correct index, right?, otherwise that could be slowing things down! (remember nothing has happened yet in the db)

    if (blobs.Count() > 0) -> here you are creating the connection and retrieving the above statement, you are doing a full table scan and that is of course slow, but you are probably limiting this by the above id, but in theory there could be millions of rows (well ohh you get the picture *S*)

    {

    blob = blobs.First(); -> and now you are creating the connection once again :-|

    }
    return blob;
    }

     

    Use blobs.FirstOrDefault(); and forget about the counting :)

     

    Maybe shorthand:

    internal static Blob GetBlobById(int id)
    {

    return ( from d in context.Blobs  where d.BlobID == id

    select new Blob(){d.FirstOrDefault()});  //implicit casting apparantly, as in your code ? not sure about this

    }

     

    Thursday, April 22, 2010 10:46 PM
  • If BlobID is indexed, I don't see why the above should be slow (unless the code is simplified or has had any vital elements removed). Do you do any joins, sorts etc in your real world query or is the above code sample exactly what you're doing?

    As Janus007 says you're doing two roundtrips instead of one, so you can always eliminate one, but start by checking that BlobID is really indexed.

    Also, if using ToList helps then something is not right. ToList will execute the query and bring all records to the client. If there can be many matching records that is not very efficient.


    Kristofer - Huagati Systems Co., Ltd.
    Cool tools for Linq-to-SQL and Entity Framework:
    huagati.com/dbmltools (add-in with new features for the L2S and EF designers in VS2008 and VS2010)
    huagati.com/L2SProfiler (Query profiler for Linq-to-SQL and LLBLGen Pro)
    Friday, April 23, 2010 1:47 AM
    Answerer
  • Or even better forget about writing a function and just do:

    Blob blob = context.Blobs.FirstOrDefault(b => b.BlobID == id);

    [)amien

    Friday, April 23, 2010 3:10 AM
    Moderator
  • Thanks all very much for your replies.  I really appreciate it.

    Yes, the BlobID is indexed, and it is set to Primary key and Identity Specification is Yes for this field.

    Since the BlobID is unique it will return one and only one record.  ToList() therefore brings in one record only.

    ToList() is substantially faster than .Count() > 0, then First().

    I will try Blob "blob = context.Blobs.FirstOrDefault(b => b.BlobID == id);" and profile it as well in VS2010 and will let you know the result as compared with ToList().

    Thanks again.

     

    Friday, April 23, 2010 1:29 PM
  • OK, FYI

    For the above case, i.e. BlobID is indexed and is the primary key and where there is only one record for a particular ID, using .ToList() to get the result and then return the first record if it exists takes about as long as .FirstOrDefault().

    FirstOrDefault uses less code, so it's of course better than using .ToList() in this instance (trying to get one record, where there is one record only).

    The above are much faster than .First() and .Count().  The replies to the question seem to indicate that this is because two connections are made.  I wouldn't expect this to happen however.

    Anyway thanks very much again for taking the time to respond.

    Friday, April 23, 2010 2:17 PM
  • Hi Alex,

    Perhaps I am misunderstanding something.  Given the below statement, would using Single / SingleOrDefault be more appropriate?  If the field is set as the primary key there should only be one instance of the given ID in the table.

     

    "Yes, the BlobID is indexed, and it is set to Primary key and Identity Specification is Yes for this field."

     

    -Rob

    • Marked as answer by Alex Barimo Friday, April 30, 2010 5:48 PM
    Wednesday, April 28, 2010 8:57 AM
  • Certainly.. One should use Single if the field is pk :)

    It would then be:

    Blob blob = context.Blobs.Single(b => b.BlobID == id);

     

     

    Wednesday, April 28, 2010 12:32 PM
  • Hi Rob.

    There is one record only, so SingleOrDefault would also work.

    I'll check to see if this is any faster.

    Thanks for your reply.

     

    Friday, April 30, 2010 5:48 PM