none
Entity Framework breaks when connecting to a mirrored SQL Server environment? RRS feed

  • Question

  • Dear all,

    Recently one of our clients switched to a mirrored SQL Server setup, which caused our application to fail. We have narrowed it down to instances where Entity Framework is used. It seems multiple datareaders are opened when retrieving or iterating through some entities. This has always worked in a none-mirrored environment, but now fails.

    We have tried setting the Multipleactiveresultsets (MARS) setting to TRUE but this does not help. The only workaround we have found so far is to force the application to create a new SQL connection for each iteration by slightly changing the connection string. This is obviously not a sturdy solution: it might spawn an unwantedly high number of SQL connections, moreover we would have to adapt the applications logic in each instance where multiple readers are created by EF, and there's really no way of knowing beforehand when this occurs.

    Is there a setting (either server- or client-side) that either forces EF to only open one reader at a time, or allow multiple
    datareaders simultaneously without throwing exceptions?

    Any suggestions are highly appreciated

    Sven Eric



    • Edited by SvenEric Friday, January 31, 2014 7:07 AM weird xml tags
    Friday, January 31, 2014 7:06 AM

Answers

  • Cool.  That's a repro.

    And congratulations, I think you have found a bug.

    Here's a simplified repro:

    using System;
    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    
    namespace connectionFailureRepro
    {
      class Program
      {
        static void Main(string[] args)
        {      
          var cb = new SqlConnectionStringBuilder();
          cb.DataSource = "sql.dbrowne.lab";
          cb.InitialCatalog = "AdventureWorks2012";
    
          //rConnectionString.IntegratedSecurity = True;
          cb.UserID = "sa";
          cb.Password = "xxxxx";
    
          cb.ConnectTimeout = 0;
          cb.AsynchronousProcessing = true;
          cb.ApplicationName = "AdventureWorks";
          cb.WorkstationID = "TestStation";
    
          var constr = cb.ConnectionString;
    
          var con1 = new SqlConnection(constr);
          con1.Open();
            
          var con2 = new SqlConnection(constr);
          con2.Open();
    
        }
      }
    }

    The issue is somewhere in the SqlClient connection logic for mirrored databases, and only repros with ConnectTimeout = 0 (infinite).

    I'll submit this repro to the team that owns SqlClient, and in the mean time, just change your ConnectTimeout to any non-zero value.  120 (2 minutes) is about as log as you would ever want anyway. 

    And thank you very much for sticking with this and getting it to a repro.  It's people like you that make the platform better.

    David


    David http://blogs.msdn.com/b/dbrowne/




    Monday, March 10, 2014 6:05 PM

All replies

  • Hello,

    What is the error message? Could you please share it with us?

    >> Is there a setting (either server- or client-side) that either forces EF to only open one reader at a time, or allow multiple datareaders simultaneously without throwing exceptions?

    Setting the Multipleactiveresultsets (MARS) setting to true should be ok if there are multiple data retrieval commands executed on single connection (or single command with multiple selects) while next DataReader is executed before first one has completed the reading.

    Regards.


    We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time. Thanks for helping make community forums a great place.
    Click HERE to participate the survey.


    Monday, February 3, 2014 2:48 AM
    Moderator
  • Dear Fred,

    Apologies for my late reply, the alert I set on this query ended up in my spam folder for some reason. Unfortunately there is no error message, the call just hangs without ever returning (or even timing out).

    We could try to setup a code example using the adventureworks database, would this help?

     

    Thanks for your time.

    Best regards,

     

    Sven 

    Thursday, February 20, 2014 3:00 PM
  • >We could try to setup a code example using the adventureworks database, would this help?

    Yes.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Thursday, February 20, 2014 3:47 PM
  • Dear David,

    To reproduce the error based on the Adventure Works database I have done the following:

    1. Installed a second instance of SQL Server 2012

    1. Downloaded the MDF file "AdventureWorks2012" from: http://msftdbprodsamples.codeplex.com/releases/view/55330

    2. Restored the database as "AdventureWorks2012" on Instance1

    3. Changed the "Recovery model" from "Simple" to "Full" for this database

    4. Created a Full backup of this database

    5. Created a Transaction Log backup of this database

    6. Restored the Full backup on Instance2 as "AdventureWorks2012" with the option Recovery State "RESTORE WITH NORECOVERY"

    7. Restored the Transaction Log backup (also with the option Recovery State "RESTORE WITH NORECOVERY")

    8. Created a Mirror setup between Instance1 and Instance2 using the Wizard (select the database: right click Tasks > Mirror...)

    9. Created a Windows form application Project. You can download the project here: https://onedrive.live.com/redir?resid=47417E863479811F%21107

    This project selects some data and dumps it in a text file. If you start the project you see a simple form with 3 textboxes:

    - Server: the name of the SQL Server for Instance1

    - Database: the name of the database

    - Export location: the location and filename for the exportfile to be created (e.g. c:\temp\ExportFile.txt)

    In a mirrored setup the problem occurs on the code line "d = aw.Department.Where(Function(dep) dep.D_DepartmentID = 12).Single".

    The call will block without ever returning. If I remove the mirror and try again the code returns immediately (with a valid result). We have tried to rule out machine specific issues by testing this on our production SQL servers and on local instances of SQLServer on one of our developers workstation. The results were the same, (un)fortunately.

    Any help would be much appreciated.

    Kind regards,

    Sven

    Monday, February 24, 2014 2:28 PM
  • coded the connection string correctly in the App.Config and for m_sConnection and m_oConnection and it worked fine.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Tuesday, February 25, 2014 6:21 PM
  • Dear David,

    Thank you very much for all your efforts up to this point. Could you please explain what you mean by "coded the connection string correctly"? Is there a property we missed in our EfSqlClientConnectionString function? Could you perhaps post your connection string and App.Config file (without any usercredential information, obviously). I'm wondering what you changed in the app.config as the connection string there does not (AFAIK) get used by the application as we use a function to get the connection properties from the textboxes on the form.

    I'm interested to see the differences from our file and connection string. If there are no differences that would mean that we AND the client independently from each other made the same mistake(s) when configuring the mirror setup, because that's the only variable that might be different between your and our attempts.

    Thanks again!

    Sven

    Thursday, February 27, 2014 7:35 AM
  • My post somehow got truncated.  Here's the app.config entry that worked for me

      <connectionStrings>
        <add name="AdventureWorks2012Entities" connectionString="metadata=res://*/AdventureWorks.csdl|res://*/AdventureWorks.ssdl|res://*/AdventureWorks.msl;provider=System.Data.SqlClient;provider connection string=&quot;data source=10.0.0.20;initial catalog=AdventureWorks2012;integrated security=False;User Id=xxxx;Password=xxxxxxx;MultipleActiveResultSets=True;App=EntityFramework&quot;" providerName="System.Data.EntityClient" />
      </connectionStrings>

    David


    David http://blogs.msdn.com/b/dbrowne/

    Thursday, February 27, 2014 2:30 PM
  • David,

    Thanks for your reply. I'am a colleague of Sven and I tested your solution.

    The first time the code runs it works fine, but if you hit the button "Get the data" for the second time we still have the same problem...

    Pieter.


    SSIS developer

    Tuesday, March 4, 2014 6:33 AM
  • Well, the code you posted is a mess. 

    Try this:

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Try
    
          m_sConnection = "Server=10.0.0.20;Database=AdventureWorks2012;Integrated Security=false;User Id=xxx;Password=xxxxx"
          m_rConnection = New SqlConnection(m_sConnection)
    
          Dim aw As New AdventureWorks2012Entities(EfSqlClientConnectionString("AdventureWorks"))
          Dim em As Employee = aw.Employee.Where(Function(emp) emp.E_BusinessEntityID = 9).Single
    
          Dim FileWriter As StreamWriter = IO.File.CreateText(txtFile.Text)
    
          Dim rParams As New List(Of SqlParameter)
          rParams.Add(New SqlClient.SqlParameter("@BusinessEntityID", 9))
          Dim rData As SqlDataReader = OpenReader("[dbo].[uspGetEmployeeManagers]", rParams)
    
          'Get first row for validations
          rData.Read()
    
          Dim d As Department = aw.Department.Where(Function(dep) dep.D_DepartmentID = 12).Single
    
          Do
            FileWriter.WriteLine(rData(2).ToString)
          Loop While rData.Read
    
          rData.Close()
          FileWriter.Close()
    
    
        Catch ex As Exception
          MessageBox.Show(ex.Message.ToString)
        Finally
    
        End Try
      End Sub

    David


    David http://blogs.msdn.com/b/dbrowne/

    Tuesday, March 4, 2014 3:24 PM
  • Hi David,

    Thanks for all your effort. I know this isn't a pricewinning code project. I just copied and pasted some code from our main project to point out our problem.

    I think our problem is if we use the same connectionstring for the OpenReader function (without adding the timestamp to make the connectionstring unique in our example), everytime we hit the button and we want to retrieve data from our Enitity Framework model within the Reader loop (based on values from the Reader) the code will stil block without ever returning...


    SSIS developer

    Tuesday, March 4, 2014 7:17 PM
  • That sounds like you are leaking connections in your code and exhausting the connection pool.  You need to scope your SqlConnections and ObjectContext instances either in a USING block or tied to the lifetime of a form (or other IDisposable).

    David


    David http://blogs.msdn.com/b/dbrowne/

    Tuesday, March 4, 2014 7:59 PM
  • Hi David,

    As Pieter (one of our developers) pointed out, this is just some copy and pasted code from around the web. You are correct in stating it needs to be more efficient in handling its resources. However, I don't think exhausting the connection pool is the issue here, as the code freezes on the first iteration of the loop. Moreover, if it were an issue with the connection pool being exhausted, I would expect an error message ("System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.") after a certain time-out period was reached. In this case the code just hangs indefinitely.

    I still feel this is a configuration issue on the server side with regards to how the mirror is set up. Did you use a machine that already had the mirror pre-configured or did you follow the exact same  steps we used in my/our 3rd post in setting up the mirror?

    Again, thanks for all your help.

    Sven

    Friday, March 7, 2014 6:47 AM
  • I followed those steps.  Perhaps if you simplified the code to repro the issue more cleanly the root cause would be easier to see.

    If there is an issue in the server, it shouldn't affect just EF code.  Can you run uspGetEmployeeManagers in SSMS?

    David

     

    David http://blogs.msdn.com/b/dbrowne/

    Friday, March 7, 2014 12:04 PM
  • I've read these posts and don't mean to butt in; however, if we read this description of SQL Mirroring:

    http://technet.microsoft.com/en-us/library/ms189852.aspx

    We can see the discussion revolve around two physical servers.  Then we can see that each (The primary and the mirror) can actually swap roles.  What I don't understand is when do they swap roles?

    This problem boils down to just a few possible root causes:

    • 1 connection string but two ip addresses if role swapping is happening
    • Substantial change in DB server performance (after all the application didn't change).

    If it were my problem I would run a wireshark trace to see what's happening on the hang.  From that trace you'll be able to see the data flow and which side is mis-behaving.  I'm pretty sure you'll find the server side is rending the application useless. 

    So try to run a trace and tell us the last thing that happened data wise prior to the hang.  I suspect you will say "The client sent a outbound request, and we didn't get any data, however, the session went into keep alive mode" 


    JP Cowboy Coders Unite!

    Friday, March 7, 2014 10:41 PM
  • Dear Mr. Javaman,

    In our test setup both instances of SQL server run on the same machine, but I guess that should not be an issue. I've ran our test application and was able to determine that the application only appears to hangs: there is still network traffic going on. I have attached the grab from wireshark, which I hope you can make more sense of than I. The packets/lines in blue are when the application is making the first connection and still receiving data. As soon as the application appears to hang when trying to get data from the reader that called the uspGetEmployeeManagers stored procedure, the yellow lines 'Negotiate Protocol Request' pop up, and continue on ad infinitum.

    I hope this will shead enough light on our situation for you to come up with a solution!

    Thank you very much for your efforts thusfar.

    Sven


    • Edited by SvenEric Monday, March 10, 2014 12:30 PM
    Monday, March 10, 2014 10:23 AM
  • Ok here's a clue...

      The client is not supporting SMB2 protocol and RST (resets) the session.  You mentioned earlier there was some work on the server.  You may want to ask them if they upgraded to Server 2008 or above because MSFT turned on SMB2 in 2008 as the default protocol.  The problem with that is that clients that don't support it can't communicate.

    You only have two options here 1) Figure out a way to support SMB2 from the client or 2) Have the server people turn on SMB.  In the communications world multiple protocols are allowed and here's how it works:

     
    Client (Start   Session)--->
    <--Host responds OK
    Client (Acks the session response from   Host)-->
    <---Host says this ip address/port (Socket)   uses SMB2
    Client either doesn't respond or says NO to SMB2--->
    <---Host says OK how about SMB
    Client says Yes I speak SMB--->

    <----Both sides are in session now   ---->

    That's how multiple protocol support is negotiated.

    To me the hang is a bug in MSFT code but one of the things MSFT does not do well on is retroactively testing things like what happens when SMB2 tries to negotiate with SMB.  Their thoughts are it's the developer and system's people jobs to figure out backward compatibility.

    Now that you know that, there could be something going on in the client code where after it sends the RST it's not bubbling up the exception or is in the wrong state still attempting to get the session up and running which will NEVER happen.  The code most likely creating this situation is here:

    Do
      FileWriter.WriteLine(rData(2).ToString)
    Loop While rData.Read



    JP Cowboy Coders Unite!



    Monday, March 10, 2014 1:31 PM
  • You might be asking what is SMB and SMB2?  It is the protocol MSFT uses to do things like get file data from shared folders etc.  It is also used sometimes for sending Printer streams from a client to a print server.  It is MSFT's preferred protocol to support all things related to file and print sharing amongst who knows what else.  I've dug into it in the past and find it to be poorly documented and somewhat of a cryptic protocol; however, if you look at it enough you kind-of-sort-of can figure it out.  I personally don't like or see SMB as a "world class protocol", but then again who am I?

    JP Cowboy Coders Unite!

    Monday, March 10, 2014 1:42 PM
  • Verify that you can connect to the SQL instance using SSMS from the same client computer.

    Look at the SQL Server log on the server and see if it contains login failures corresponding to these messages.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Monday, March 10, 2014 2:18 PM
  • Hi Guys,

    I feel we're really close! However, I fear your suggestions about upgrading the server will not do the trick. We're testing this on one of our developers machines: a Windows 8 laptop (so OS version wise newer that W2K8) with two instances of SQL2012 installed (also tested it on a W7 workstation, with similar results) so I don't think OS version is the issue. BTW in both instances (W8 en W7) we are running the client application and the SQL server instances on the same machine (to rule out firewall stuff and what not)

    Also, in our case the code that hangs is this line : d = aw.Department.Where(Function(dep) dep.DepartmentID = 12).Single

    So:

    1. get data using EF, fine;

    2.  open second connection, use datareader, call read() once, fine;

    3. get data using EF, hangs

    4. does not even get to looping through the reader.

    The thing that's strange is, that the codes works fine as soon as we break the mirror. Does breaking the mirror mean the OS switches back from SMB2 to SMB, that seems highly unlikely.

    thanks again.

    Sven


    • Edited by SvenEric Monday, March 10, 2014 3:37 PM
    Monday, March 10, 2014 3:37 PM
  • Can you post the code that repros the issue?  Are all these steps using the same connection?

    David


    David http://blogs.msdn.com/b/dbrowne/

    Monday, March 10, 2014 4:01 PM
  • Hi David,

    You can find the code that reproduces the issue here: https://onedrive.live.com/redir?resid=47417E863479811F%21107

    Also, upon further investigation the code does not hang indefinitely, the time-out is just extremely long (~20minutes). Below is the exception that is eventually raised.

    Does this help?

    Sven 

    • Edited by SvenEric Monday, March 10, 2014 4:27 PM
    Monday, March 10, 2014 4:15 PM
  • Sven;

     I reexamined the trace above and can only see one side of the conversation!  Did you filter it out to show only the Source side?  We should see the traffic in both directions.  Sometimes we see this if the client has two nic cards as the inbound comes in on one adapter the outbound on the other.  Still, if you find out why the RST occurs you solve the problem.  There's only 7 seconds in the trace to the first reset, this typically proves it is not a time-out situation.  Rather it's either something in the protocol that is not acceptable OR it's an application layer issue. 

    My gut tells me to work from problem backward, which tells me that the SMB2 protocol or something in it is rejected.  Why?  Because the SMB2 was the last thing done before the reset.


    JP Cowboy Coders Unite!

    Monday, March 10, 2014 4:41 PM
  • Cool.  That's a repro.

    And congratulations, I think you have found a bug.

    Here's a simplified repro:

    using System;
    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    
    namespace connectionFailureRepro
    {
      class Program
      {
        static void Main(string[] args)
        {      
          var cb = new SqlConnectionStringBuilder();
          cb.DataSource = "sql.dbrowne.lab";
          cb.InitialCatalog = "AdventureWorks2012";
    
          //rConnectionString.IntegratedSecurity = True;
          cb.UserID = "sa";
          cb.Password = "xxxxx";
    
          cb.ConnectTimeout = 0;
          cb.AsynchronousProcessing = true;
          cb.ApplicationName = "AdventureWorks";
          cb.WorkstationID = "TestStation";
    
          var constr = cb.ConnectionString;
    
          var con1 = new SqlConnection(constr);
          con1.Open();
            
          var con2 = new SqlConnection(constr);
          con2.Open();
    
        }
      }
    }

    The issue is somewhere in the SqlClient connection logic for mirrored databases, and only repros with ConnectTimeout = 0 (infinite).

    I'll submit this repro to the team that owns SqlClient, and in the mean time, just change your ConnectTimeout to any non-zero value.  120 (2 minutes) is about as log as you would ever want anyway. 

    And thank you very much for sticking with this and getting it to a repro.  It's people like you that make the platform better.

    David


    David http://blogs.msdn.com/b/dbrowne/




    Monday, March 10, 2014 6:05 PM
  • @David,

    Thanks for helping us out, setting the time-out to anything other than 0 works on our end aswell. I'll keep an eye out for the release notes for the next .NET framework and/or patch for an 'honourable mention' ;-)

    @Mr JavaMan,

    I pursued the network angle a little further and notices that registering the Service Principal Name (SPN) for the mirrored instance failed. The SPN for the principal instance (confusing name that) on the other hand was registered correctly. I tried to manually add an SPN for the mirror but did not have enough rights to do so. Apparently only domain admins can do this, which confused me even more, because how did the principal get registered correctly then? Our domain admin did not install SQL Server on my workstation, I did... Anyway, from what I understand from reading about this, is that the SPN is only needed if you use KERBEROS, and both my instances of SQLServer showed they used NTLM, so this probably was not going to solve our problems.

    Again thank you very much for all your efforts!

    Sven


    • Edited by SvenEric Wednesday, March 12, 2014 9:46 AM
    Tuesday, March 11, 2014 6:34 AM