Memory leak with linq to sql

  • Question

• I have a program which reads 1 million records from the database and, after some processing, inserts them into another table. The program calls a DAO API to get the 1 million records and loops through them in memory; the insertion also goes through a DAO API. Each DAO call is implemented as follows:

    public static void Main(string[] args)
    {
        // Load all 1 million TableA records into memory at once.
        List<TableA> tableAs = GetTableAs();

        foreach (var tableA in tableAs)
        {
            var tableB = new TableB();
            tableB.id = tableA.id;
            // ...
            // here I copy/modify records from table A
            // ...
            InsertTableB(tableB);
        }
    }

    public static List<TableA> GetTableAs()
    {
        using (var ctx = new TableDataContext())
        {
            var entities = from e in ctx.tableAs
                           select e;
            return new List<TableA>(entities);
        }
    }

    public static void InsertTableB(TableB tableB)
    {
        // A fresh DataContext (and transaction) for every single record.
        using (var ctx = new TableDataContext())
        {
            ctx.tableBs.InsertOnSubmit(tableB);
            ctx.SubmitChanges();
        }
    }

    My program consistently hits an "Out of memory" exception after about 500k inserts, and I notice that memory usage keeps increasing while looping. Even forcing garbage collection does not reclaim any memory. I have run the program with the VS2010 .NET memory profiler, and it shows a lot of TableDataContext, TableA, and TableB instances that are never garbage collected. Is there anything wrong with how I am interacting with LINQ that causes the memory to be held and not released? Any help will be much appreciated.

    Friday, May 11, 2012 1:18 AM

Answers

  • First of all, if you are copying a large number of rows from one table to another, this is a very inefficient way to do it. First you retrieve *all* records from "Table A", bringing them across the network (or across the process boundary on a local machine) and creating one entity object [with change tracking] for each record. Then you loop through and insert them into "Table B" with one db roundtrip + transaction per record.

    This kind of work can (and should) be done db-side.

    A better solution would be a stored procedure (or an ad-hoc SQL DML statement) that simply does a:

    insert into TableB (col1, col2, col3....)
    select colA, colB, colC
    from TableA
    where ....
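
    If you want to run that set-based copy from the existing C# program rather than a stored procedure, one option is DataContext.ExecuteCommand. A minimal sketch, reusing the TableDataContext from the question (the column names are placeholders, not the real schema):

    using (var ctx = new TableDataContext())
    {
        // One round trip: the copy runs entirely on the database server,
        // so no entity objects are materialized on the client.
        ctx.ExecuteCommand(
            "insert into TableB (col1, col2, col3) " +
            "select colA, colB, colC from TableA");
    }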

    That said, if there is some very good reason for doing this with 1,000,001 db roundtrips, you may get a little further if you add ctx.DeferredLoadingEnabled = false; after instantiating the data context (especially in "GetTableAs"). That turns off lazy loading, so it should make the program use [a little bit] less memory.
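
    As a sketch, GetTableAs from the question might look like this with deferred loading turned off (same names as in the question):

    public static List<TableA> GetTableAs()
    {
        using (var ctx = new TableDataContext())
        {
            // Disable lazy loading so related entities are not fetched
            // (and held in memory) whenever a TableA property is touched.
            ctx.DeferredLoadingEnabled = false;

            var entities = from e in ctx.tableAs
                           select e;
            return new List<TableA>(entities);
        }
    }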


     

    Cool tools for Linq-to-SQL and Entity Framework 4:
    huagati.com/dbmltools - Visual Studio add-in with loads of new features for the Entity Framework and Linq-to-SQL designers
    huagati.com/L2SProfiler - Runtime SQL query profiler for Linq-to-SQL and Entity Framework v4


    Monday, May 14, 2012 1:04 AM
    Answerer