Unanswered Linq to Entity is horribly slow on large database

  • 7. března 2012 11:51
     
      Obsahuje kód

    Hi,

    we are querying data from a large oracle database having 9 million records using LINQ to Entities in a silverlight project.

    we have a complex view created in the database which joins 7 tables to retrieve the records. When we run the oracle sql queries in the oracle sql developer to retrieve first 100 records it returns within 1 second but when we run the similar query using LINQ it did not return even after one hour. I am using Take(100) to retrieve the initial 100 records.

    My query looks like this

    Entities1 bookingEntities = new CMSDataService(new Uri(((App)Application.Current).m_strServiceBaseUri + ClientDefs.strsvcCMSMultitableService));
                        query = (from c in bookinEntities.V_MYVIEW select c).Take(100);       
    
    var dsQuery = (System.Data.Services.Client.DataServiceQuery<V_MYVIEW>)query;
                    dsQuery.BeginExecute(new AsyncCallback(c =>
                    {
                        IEnumerable<V_MYVIEW> result = dsQuery.EndExecute(c);
                        ((App)Application.Current).ListBookingRecords = new ObservableCollectionEx<V_MYVIEW>(result);                  
                    }), dsQuery);           

    How does Take() work in LINQ?

    The oracle query which gives instantaneous result in oracle Sql developer is

    "select * from V_MYVIEW where rownum < 101"

    Do I need to change some setting in the entity class. We are using Devart DotConnect for Oracle to generate the entity classes for oracle.


    Senior Software Developer

Všechny reakce

  • 7. března 2012 14:21
     
     

    You could try this to debug...

    Ditch the view for now.  Write a LINQ Query for exactly what you are looking for... 

    var stuff = bookinEntities.table.where(p=>p.rownum<101);

    Then execute that query and determine the time...

    From there you have just a few options which is 1) To Determine how LINQ converts the QUERY to SQL and

    2) Determining the source of the delay... 

    As far as how does TAKE work....  I think but am not 100% on this that the entire query returns only to take 100 records on client side.  I could be wrong but I think I discovered that a while back.


    JP Cowboy Coders Unite!

    • Navržen jako odpověď LimitedDeadFat 9. března 2012 3:25
    • Zrušeno navržení jako odpověď Javed937 9. března 2012 4:28
    •  
  • 9. března 2012 4:30
     
     

    Hi Javaman,

    How to see the internal SQL query, which LINQ creates?


    Senior Software Developer

  • 9. března 2012 5:37
     
      Obsahuje kód

    Hi,

    Call the method "ToString()" on your query object in order to get hold on the generated SQL statement. Here is a small example:

     using (HumanResourcesEntities context = new HumanResourcesEntities())
                {
    
                    var qry = from e in context.EMPLOYEES
                              join d in context.DEPARTMENTS
                                   on e.DEPARTMENT_ID equals d.DEPARTMENT_ID
                              select new
                               {
                                   Department = d.DEPARTMENT_NAME,
                                   ID = e.EMPLOYEE_ID,
                                   Firstname = e.FIRST_NAME,
                                   Lastname = e.LAST_NAME
                               };
    
                    // This shows the SQL generated by the query
                    Console.WriteLine(qry.ToString());
                    
                    Console.ReadLine();
    
                    // Here the query is executed
                    foreach (var d in qry)
                    {
                        Console.WriteLine("No. {0} : {1} {2}", d.ID, d.Firstname, d.Lastname);
                    }
                    System.Console.ReadLine();
                }

    Cheers,

    Stefan

  • 9. března 2012 6:49
     
     


    For my below query

    var query = (from d in bookingEntities.V_MYVIEW select d).Take(100);

    query.ToString() is giving following string because I am using a View

    "http://localhost:8080/CMSDataService_Multitable.svc/V_MYVIEW()?$top=100"

    also In silverlight client query execution is asynchronous. so we cannot examine the execution also. See my top posting.


    Senior Software Developer

  • 9. března 2012 13:44
     
     

    So go to your service and determine what the V_MYVIEW()?$top=100" is doing.

    If you want I can take you down the path of analyzing turn-around times using Wireshark...  This will tell you which end is having delays.  My guess is that given the fact the parm is passing in TOP 100 only that the problem is on that side. 

    In the instance where you just can't get to "the other side" to change to code, you can alter your client side by implementing Asychronous support so that the callback just updates OBSERVABLECOLLECTIONs.  Or, you can convert IENUMERABLEs to IOBSERVABLEs using Reactive Extensions in what is refered to as a PUSH based service.  Not that going ASYNC will speed things up, it just gives the end user the impression something is happening because they see each row updating in real time.  In addtion the GUI is never frozen.  So if you plan this right the USER can cancel at anytime.  OR stop updates. 


    JP Cowboy Coders Unite!

  • 12. března 2012 5:37
     
     

    Yes please help me determining the slow path using Wireshark. I have never used this but networking tool.

    Also if you have any sample to make the LINQ execution asynchronous using Observable collections please share that also.


    Senior Software Developer

  • 12. března 2012 13:40
     
     
    Download and install Wireshark.  Go to menu option Interface and start the trace on the interface the packets go out.  Recreate the error and stop the trace.  Find the packets to the HOST and right click on any of those packets, choose the option FOLLOW TCP STREAM.  This will be the converstation.  Then what you want to do is attempt to see where the time delay is found within the trace.  You can do this a number of ways, as there are built-in analyzers but mostly you can see the times between when the packets are sent and responded to from the client.  You can see relative times on far left hand side of trace.  If you see GAPS in time between a QUERY sent and the results returned then you know the problem is on HOST side.  Try that first.

    JP Cowboy Coders Unite!

  • 12. března 2012 16:46
     
     
     
    How many tables are on the model, if using model first approach?
     
  • 14. března 2012 11:09
     
     

    Hi Javaman

    I analyzed my query on a relatively smaller database consisting 0.1 million records. On oracle sql developer it took 0.2 secs but in my application it took 50 secs. When I saw the trace in Wireshark the time lag was around 50 secs between the client request and server response. It means LINQ is sending the query immediately to the server but oracle server took almost 50 secs to deliver the data to client (including transporting the data to client machine through network).

    Now the question is why oracle sql developer is retrieving the data so fast when a standalone SQL query is run on it? The only reason which I guess is LINQ is creating some wrong or complex query behind the seen and sending to oracle.

    Is there anyway by which I can see those LINQ created query?


    Senior Software Developer

  • 14. března 2012 11:13
     
     

    Hi darnold924,

    There are 18 tables and 10 views in my modal. But I am querying the data from a view which gets the data from 6 tables only.


    Senior Software Developer

  • 14. března 2012 19:34
     
     
    On 3/14/2012 7:09 AM, Javed937 wrote:
    > Hi Javaman
    >
    > I analyzed my query on a relatively smaller database consisting 0.1
    > million records. On oracle sql developer it took 0.2 secs but in my
    > application it took 50 secs. When I saw the trace in Wireshark the time
    > lag was around 50 secs between the client request and server response.
    > It means LINQ is sending the query immediately to the server but oracle
    > server took almost 50 secs to deliver the data to client (including
    > transporting the data to client machine through network).
    >
    More is happening than that, like the objects must be materialized using
    Linq and sent to the client. It's not a query for data and send it back
    with direct P-SQL with the resultset sent back to the client.
     
     
    > Now the question is why oracle sql developer is retrieving the data so
    > fast when a standalone SQL query is run on it? The only reason which I
    > guess is LINQ is creating some wrong or complex query behind the seen
    > and sending to oracle.
     
    You use a profiler and you go look. Maybe you are doing something that
    is making the query read data more than one time.
     
  • 15. března 2012 12:39
     
     

    One caveat on the trace, if the data is TEXT based and there is no compression, chances are that part of the delay in showing data in client can be due to too slow transfers times.  If the data is compressed you can get up to 8 times better perormance.  (Just found this same type of issue yesterday with another client)

    You are on right track to determine if LINQ formed query is adequate.  You can see the SQL Queries in LINQ if you use ILDASM on the binary exe to dissassemble the code.  I think that's the only way.  There are some plug ins for VS2010 but I haven't been able to get them to work.

    Keep up the good hunt!!!!  Kind of fun once you get on the trail isn't it?


    JP Cowboy Coders Unite!

  • 19. března 2012 7:40
     
     

    Hi Javaman,

    I am creating an exe to use ILDASM to see the linq query as my present project is silverlight web based project.

    Darnold,

    Can you suggest me a profiler which will work for me. I am using oracle. And also how to use that.


    Senior Software Developer

  • 20. března 2012 10:25
     
      Obsahuje kód

    ILDASM is not showing any concrete query. What it is showing is the query expressions.

    Below is the assembly code for the OnGridLoaded() function which is defined as follow

    private void OnGridLoaded(object sender, RoutedEventArgs e)
            {
                var context = new Entities();
                var query = (from a in context.V_TXS_BOOKING select a).Take(100);                        
                var result = query.ToList();           
            }
    ..method private hidebysig instance void  OnGridLoaded(object sender,
                                                          class [PresentationCore]System.Windows.RoutedEventArgs e) cil managed
    {
      // Code size       76 (0x4c)
      .maxstack  5
      .locals init ([0] class WpfApplication1.Entities context,
               [1] class [System.Core]System.Linq.IQueryable`1<class WpfApplication1.V_TXS_BOOKING> query,
               [2] class [mscorlib]System.Collections.Generic.List`1<class WpfApplication1.V_TXS_BOOKING> result,
               [3] class [System.Core]System.Linq.Expressions.ParameterExpression CS$0$0000,
               [4] class [System.Core]System.Linq.Expressions.ParameterExpression[] CS$0$0001)
      IL_0000:  nop
      IL_0001:  newobj     instance void WpfApplication1.Entities::.ctor()
      IL_0006:  stloc.0
      IL_0007:  ldloc.0
      IL_0008:  callvirt   instance class [System.Data.Entity]System.Data.Objects.ObjectSet`1<class WpfApplication1.V_TXS_BOOKING> WpfApplication1.Entities::get_V_TXS_BOOKING()
      IL_000d:  ldtoken    WpfApplication1.V_TXS_BOOKING
      IL_0012:  call       class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
      IL_0017:  ldstr      "a"
      IL_001c:  call       class [System.Core]System.Linq.Expressions.ParameterExpression [System.Core]System.Linq.Expressions.Expression::Parameter(class [mscorlib]System.Type,
                                                                                                                                                     string)
      IL_0021:  stloc.3
      IL_0022:  ldloc.3
      IL_0023:  ldc.i4.1
      IL_0024:  newarr     [System.Core]System.Linq.Expressions.ParameterExpression
      IL_0029:  stloc.s    CS$0$0001
      IL_002b:  ldloc.s    CS$0$0001
      IL_002d:  ldc.i4.0
      IL_002e:  ldloc.3
      IL_002f:  stelem.ref
      IL_0030:  ldloc.s    CS$0$0001
      IL_0032:  call       class [System.Core]System.Linq.Expressions.Expression`1<!!0> [System.Core]System.Linq.Expressions.Expression::Lambda<class [mscorlib]System.Func`2<class WpfApplication1.V_TXS_BOOKING,class WpfApplication1.V_TXS_BOOKING>>(class [System.Core]System.Linq.Expressions.Expression,
                                                                                                                                                                                                                                                        class [System.Core]System.Linq.Expressions.ParameterExpression[])
      IL_0037:  call       class [System.Core]System.Linq.IQueryable`1<!!1> [System.Core]System.Linq.Queryable::Select<class WpfApplication1.V_TXS_BOOKING,class WpfApplication1.V_TXS_BOOKING>(class [System.Core]System.Linq.IQueryable`1<!!0>,
                                                                                                                                                                                                class [System.Core]System.Linq.Expressions.Expression`1<class [mscorlib]System.Func`2<!!0,!!1>>)
      IL_003c:  ldc.i4.s   100
      IL_003e:  call       class [System.Core]System.Linq.IQueryable`1<!!0> [System.Core]System.Linq.Queryable::Take<class WpfApplication1.V_TXS_BOOKING>(class [System.Core]System.Linq.IQueryable`1<!!0>,
                                                                                                                                                          int32)
      IL_0043:  stloc.1
      IL_0044:  ldloc.1
      IL_0045:  call       class [mscorlib]System.Collections.Generic.List`1<!!0> [System.Core]System.Linq.Enumerable::ToList<class WpfApplication1.V_TXS_BOOKING>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>)
      IL_004a:  stloc.2
      IL_004b:  ret
    } // end of method MainWindow::OnGridLoaded



    Senior Software Developer

  • 20. března 2012 11:10
     
     

    OMG I didn't realize that:  Well the good news is that you get to learn Expressions which is really hard to learn....

    Try this from MSFT...http://msdn.microsoft.com/en-us/library/bb386961.aspx

    My apologies for steering you down wrong path.

    Seriously you must look into Reactive Extensions as it turns IENURMERALBEs into IOBSERVABLES.  It's 100% asynchronous and IT's PUSHed based so the updates are PUSHED to the user.  This then improves response persception to the user 100%.


    JP Cowboy Coders Unite!

  • 20. března 2012 11:14
     
     

    Here's a solution for you from one of my threads:

    http://social.msdn.microsoft.com/Forums/en-US/rx/thread/69326634-c23b-4bee-8a32-f3148c2b4b62


    JP Cowboy Coders Unite!

  • 21. března 2012 6:44
     
     
    Let me try this reactive extension approach. But I am worried how to do this in silverlight as in silverlight all query operation are already asynchronous.

    Senior Software Developer


    • Upravený Javed937 21. března 2012 6:49
    •  
  • 21. března 2012 9:57
     
     

    "Try this from MSFT...http://msdn.microsoft.com/en-us/library/bb386961.aspx "

     Log property is available for LINQ to SQL and not for LINQ to EF.

    Senior Software Developer

  • 22. března 2012 6:51
     
      Obsahuje kód

    I am able to make this asynchronous approach work for a WPF application but for Silverlight I could not. It is throwing following exception at the call of

    thing.Subscribe(act);

    in the following code

    private void OnGridLoaded(object sender, RoutedEventArgs e)
            {
                Entities1 ad = new Entities1(new Uri(@"http://localhost:4238/WcfDataService1.svc/"));
                var stuff = ad.V_TXS_BOOKING;
                var os = stuff.Where(p => p.LASTNAME == "LOPEZ");
    
                Task taska = Task.Factory.StartNew(() =>
                {
                    var thing = os.ToObservable();
                    thing.Subscribe(act);
                });
            }
    
            private void _act(V_TXS_BOOKING tpd)
            {
                XTBRowCount.Dispatcher.BeginInvoke((Action)delegate()
                {
    
                    XTBRowCount.Text = col.Count().ToString();
                    col.Add(tpd);
                }, null);
            }

    "Silverlight does not enable you to directly enumerate over a data service query. This is because enumeration automatically sends a synchronous request to the data service. Because Silverlight only supports asynchronous operations, you must instead call the BeginExecute and EndExecute methods to obtain a query result that supports enumeration."

    Is it possible for you to test for silverlight also as you did for WPF?

    Also I am on vacation for 2 weeks so will follow on after returning.


    Senior Software Developer

  • 22. března 2012 9:25
     
     

    Javed;

      Post that entire last post over on Reactive Extensions forum.  A person by name of Dave Sexton may be able to help you.  I have never used RX in Silvelight but have read many posts over there that they are getting it working.  Does SL, RX and EF all work together? I don't know...


    JP Cowboy Coders Unite!


  • 22. března 2012 9:27
     
     
    BTW did Rx appear to be faster from user perspective?

    JP Cowboy Coders Unite!

  • 17. května 2012 23:30
     
     
    Sorry this may have already been mentioned, but you can use linqpad ( http://www.linqpad.net/) to not only test your linq but to see the underlying SQL generated.
  • 18. května 2012 11:00
     
      Obsahuje kód

    Intellitrace can show you the generated SQL or call ToTraceString to a query casted to a ObjectQuery (look this) in Linq2Entities.

    NB! i'm hoping Take is not using rownum because in Oracle rownum came before order by so a query like 

    bookinEntities.V_MYVIEW.OrderBy(mv => mv.SomeProperty).Take(100)

    would result in an incorrect resultset back from dB.

    Max


  • 18. května 2012 13:22
     
     
    On 3/7/2012 6:51 AM, Javed937 wrote:
    > Hi,
    >
    > we are querying data from a large oracle database having 9 million
    > records using LINQ to Entities in a silverlight project.
    >
    > we have a complex view created in the database which joins 7 tables to
    > retrieve the records. When we run the oracle sql queries in the oracle
    > sql developer to retrieve first 100 records it returns within 1 second
    > but when we run the similar query using LINQ it did not return even
    > after one hour. I am using Take(100) to retrieve the initial 100 records.
    >
    > My query looks like this
    >
    > Entities1 bookingEntities = new CMSDataService(new Uri(((App)Application.Current).m_strServiceBaseUri+  ClientDefs.strsvcCMSMultitableService));
    >                      query = (from c in bookinEntities.V_MYVIEW select c).Take(100);
    >
    > var dsQuery = (System.Data.Services.Client.DataServiceQuery<V_MYVIEW>)query;
    >                  dsQuery.BeginExecute(new AsyncCallback(c =>
    >                  {
    >                      IEnumerable<V_MYVIEW>  result = dsQuery.EndExecute(c);
    >                      ((App)Application.Current).ListBookingRecords = new ObservableCollectionEx<V_MYVIEW>(result);
    >                  }), dsQuery);
    >
    > How does Take() work in LINQ?
    >
    > The oracle query which gives instantaneous result in oracle Sql developer is
    >
    > "select * from V_MYVIEW where rownum < 101"
    >
    > Do I need to change some setting in the entity class. We are using
    > Devart DotConnect for Oracle to generate the entity classes for oracle.
    >
     
    What you shouldn't be doing is using 7 joins.