Data Sync: Server failed to authenticate the request

  • Question

  • Hi there,

    I had issues trying to re-provision an on-premises database. After the re-provisioning succeeded, I am now getting the following error:

    Sync failed with the exception "Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
    Inner exception: The remote server returned an error: (403) Forbidden."

    For more information, provide tracing id ‘0d0121e0-6407-4d64-a74b-c823f7eaed3d’ to customer support.

    Sync Group Id: 8ded819b-8456-4a4a-bcba-995dbf58b639
    Server Id: e4e38c20-9487-435f-94ae-1ded32731eaa
    Region: West Europe

    Trace output: http://sdrv.ms/P03BtM

    I also ran a trace on the agent server, and it seems there is an issue uploading data to cloud storage.

    Any ideas on what could be wrong? The only answers I found were about time settings, etc.

    Wednesday, September 5, 2012 7:54 PM

Answers

  • Hi Marais,

    After some investigation, I suspect your problem might be caused by the blob SharedAccessKey expiring.

    That, in turn, might be caused by network latency or slow file reading.

    I need your help to find out the root cause.

    Can you do the following?

    1. Download Process Monitor (http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx)

    2. Set a filter to monitor only the LocalAgentHost process

    3. Make sure both File and Network activity are monitored

    4. Start a sync and wait until it fails

    5. Open the agent log and search for the event "id:SyncController_BatchSpooledEvent" that occurred during this sync

    e.g.

    SyncController 9/8/2012 2:01:47 PM  id:SyncController_BatchSpooledEvent, rId:, sId:2a0c4b47-3c2b-4e0e-9fb4-69c36de6a4c5, tracingId:6aeb14b8-3917-4d76-a5aa-f0e2bd57c999, agentId:6b28a3e2-9e3c-4904-b99e-ba735021a242, agentInstanceId:812b1345-6e0a-4226-9c1e-48b253465dee, syncGroupId:dc4bd576-f058-41b5-9366-3f6e33096601, memberDbId:72882f3d-e668-4837-bc27-0810b4727bd0, hubDbId:a359ab16-843b-40d1-8021-ff3044236a42, currentBatchNumber:1, totalBatchesSpooled:1, batchFileName:0463ce7b-be40-4d12-9174-42895577e4a8.batch

    6. Find the corresponding file-write event in Process Monitor and capture all following events until the next agent log entry for "id:ClientRequestHandler_ExceptionProcessingRequest" is written (a small script for locating both log entries is sketched below)

    e.g.  (id:SyncController_BatchSpooledEvent written at 9/8/2012 2:01:47 PM UTC

            id:ClientRequestHandler_ExceptionProcessingRequest written at 9/8/2012 2:01:50 PM UTC )

    Then copy the Process Monitor log and the related local agent log and send them to me.
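
    To pull those two log entries out quickly, a minimal Python sketch such as the following can help; the log file name and path are assumptions based on the trace-listener configuration mentioned later in this thread, so adjust them to your setup.

        # Find the two agent-log events referenced above and print them, so the
        # timestamps can be matched against the Process Monitor capture.
        # The default log path below is an assumption; adjust as needed.
        import sys

        EVENTS = (
            "id:SyncController_BatchSpooledEvent",
            "id:ClientRequestHandler_ExceptionProcessingRequest",
        )

        log_path = sys.argv[1] if len(sys.argv) > 1 else r"..\logs\DSSAgentOutput.log"

        with open(log_path, encoding="utf-8", errors="replace") as log:
            for line in log:
                if any(event in line for event in EVENTS):
                    print(line.rstrip())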

    Thanks,

    Jarod Zhu

    • Marked as answer by Iric Wen Wednesday, September 19, 2012 2:16 AM
    Saturday, September 8, 2012 3:26 PM

All replies

  • Hi Marais,

    Thanks for providing the trace. We will investigate further and get back to you.

    I also noticed that you previously hit a re-provisioning issue on the same sync group (http://social.msdn.microsoft.com/Forums/en-US/ssdsgetstarted/thread/bf692b83-0d5d-4b3f-893d-ea95ce167667/#95278d30-204e-4583-88fb-4c3193fad52f). Did the re-provisioning succeed only after you created the 'DataSync.provision_marker_dss' table yourself?

    Regards,

    Jarod Zhu


    Thursday, September 6, 2012 4:29 PM
  • Hi Jarod,

    Thank you. Yes, I had to manually create the provision_marker_dss table in order to get the sync to start up.

    I don't know if that is related to my current problem, but I suppose you should be able to figure that out on your side.

    Regards,

    Marais

    Thursday, September 6, 2012 5:37 PM
  • On the machine where you installed the Data Sync Agent, can you check the time zone settings? What is the time zone set to?

    Friday, September 7, 2012 1:43 AM
    Answerer
  • Hi June,

    It was originally set to GMT+2 (Jhb, Pretoria), but I changed it to Dublin time this morning and ran the sync again; the exception is still there.

    Friday, September 7, 2012 5:14 AM
  • Hi Jarod,

    We are experiencing the exact same problem:

    Sync failed with the exception "Server failed to authenticate the request.
    Make sure the value of Authorization header is formed correctly including the signature.
    Inner exception:
    The remote server returned an error: (403) Forbidden.
    For more information, provide tracing id ‘537b28ad-619c-48c8-a679-67ec6ccee748’ to customer support.

    This happens while syncing a table with about 200k records from on-premise to the cloud.

    I was trying to collect the data you suggested, but could not find the agent log or a way to enable logging.
    The cloud log does not provide the information you requested (id:SyncController_BatchSpooledEvent), and the local log folder (C:\Program Files (x86)\Microsoft SQL Data Sync\logs) is empty.

    The version of the SQL Data Sync Agent is 4.0.61.0, which should be the most recent one.

    Best regards,
    Oskar Mayr

    EDIT: Some more detail

    The amount of data that should be synced is about 11 MB. After starting a sync, we see an upload of roughly this amount of data on our firewall. It seems as if the error occurs immediately after the upload finishes. It takes approximately 3-4 minutes to upload 11 MB of data over our ADSL connection.

    NB: If we reduce the amount of data (say, 200 records instead of 200k), the sync process completes successfully!

    • Edited by oaksboard Friday, September 21, 2012 9:17 AM additional information
    Thursday, September 20, 2012 4:12 PM
  • Hi oaksboard,

    Please use the following steps to enable logging in the local agent:

    1. Open LocalAgentHost.exe.config under [Installation location]\bin. You can use Notepad or any other text editor.

    2. Uncomment the section that is currently commented out:
        <!-- 
        <switches>
          <add name="SyncAgentTracer" value="4" />
        </switches>
        <trace autoflush="true" indentsize="4">
          <listeners>
            <add name="myListener" type="Microsoft.SqlAzureDataSync.ClientLogging.DSSClientTraceListener, Microsoft.SqlAzureDataSync.ClientLogging, Version=4.0.0.0" initializeData="..\logs\DSSAgentOutput.log" />
            <remove name="Default" />
          </listeners>
        </trace>
        -->

    3. You will be able to find log files under [Installation location]\logs

    Tuesday, September 25, 2012 2:27 AM
  • Thank you, Yide Tu!

    With your hint, I was able to collect the info requested by Jarod.

    You can find the SyncAgent and ProcMon logfiles here:

    https://www.box.com/s/o49bens3k3riz06f8ddl

    Best regards,

    Oskar Mayr

    Wednesday, September 26, 2012 9:13 PM
  • Thanks, Oskar.

    We will further investigate this issue.

    Regards,

    Jarod Zhu

    Thursday, September 27, 2012 5:55 AM
  • Hello Jarod,

    Is there any news?

    Is there any more data you need, or can I help in some other way?

    Best regards,

    Oskar Mayr

    Tuesday, October 2, 2012 10:35 AM
  • Hi Oskar,

    From the log, we found that your agent spent about 10 minutes continuously uploading data to the storage blob with no response, until the upload finally failed because the blob access key had expired. We are still working with the storage team to figure out the root cause, although the issue might also be related to your machine's network environment.

    One further thing you can try is to reboot your machine and see whether the issue still happens.

    Also, please note that the SQL Data Sync product team is currently on holiday until the end of this week.

    We will get back to you as soon as we know more after the holiday.

    Regards

    Jarod Zhu

    Tuesday, October 2, 2012 2:44 PM
  • Hi Jarod,

    We ran the tests from two different machines against two different Azure subscriptions, with the exact same error. We have already restarted the machines multiple times. Both machines are in the same network infrastructure, and our upload capacity is limited to 512 kbit/s.

    In previous tests the elapsed time until the error occurred was around 3 to 4 minutes. Monitoring network traffic, it seems that all of the data (~11 MB) is uploaded. The error occurs immediately after the upload has finished.

    Best regards,

    Oskar Mayr

    Tuesday, October 2, 2012 4:31 PM
  • Hi oaksboard,

    I checked the Process Monitor log you provided; the upload speed is about 300 kbit/s. The upload failed because it did not finish within the connection lifetime. I would suggest installing the SQL Data Sync Agent on a machine with a faster network connection. If it still keeps failing, we will investigate further. Sorry for the issue.
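
    A rough back-of-envelope check ties these numbers together (a sketch only; the exact key/connection lifetime is not given in this thread):

        # Rough arithmetic behind the explanation above: how long does the spooled
        # batch take to upload at the rates mentioned in this thread?
        payload_mb = 11                  # data synced per attempt (from oaksboard)
        for rate_kbit_s in (512, 300):   # nominal line capacity vs. observed upload rate
            seconds = payload_mb * 8 * 1000 / rate_kbit_s
            print(f"{rate_kbit_s} kbit/s -> {seconds / 60:.1f} min to upload {payload_mb} MB")
        # If the transfer outlives the shared access key the agent was handed, the
        # blob service rejects the request with 403 "Server failed to authenticate".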

    Best Regards.

    Chao


    panchao

    Friday, October 12, 2012 6:50 AM
  • Hello Chao,

    I did not respond to your post earlier because we were hoping that the new Data Sync Agent might solve the problem. That was not the case.

    Coming back to your reply: does that mean that whether a sync of a large amount of data succeeds depends on the network speed? Is SQL Data Sync only suitable for small amounts of data? I cannot believe that this is the case.

    What is the connection lifetime? Is there an initial-sync option to upload all the data, or a manual way to upload it? Our problem seems to be the first sync of the table, with approximately 300,000 records. Subsequent sync operations would never be that heavy.

    We could try a faster network connection, but since we are developing a solution for our customers, whose network connections are at best on par with our test environment, we have to make sure SQL Data Sync will work on slow connections, too.

    Best regards,

    Oskar Mayr

    • Edited by oaksboard Tuesday, November 13, 2012 10:07 AM additional information
    Monday, November 12, 2012 9:15 AM
  • Hello again,

    I did some additional testing and found out something interesting:

    I deleted all 300,000 records from the table that is causing the sync issues with "DELETE FROM TABLE". Doing so did not delete the corresponding records from the "DataSync.TABLE_dss_tracking" table. I know this is not a recommended way of doing things, but it brought me one step forward:

    Sync still uploaded for ~10 minutes and then quit with the "server failed to authenticate" error (even though there is now no data left to sync).

    This seems to be because DataSync.TABLE_dss_tracking still contains the "sync timestamps" of the 300,000 records.

    It is as if Data Sync were trying to upload all 300,000 tracking timestamps/records and hitting the timeout limit in doing so (due to our limited upload bandwidth). When this happens, Sync returns the "server failed to authenticate" error (maybe because a timeout while syncing the tracking tables isn't supposed to happen?).
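
    To confirm the orphaned tracking rows, a quick row-count comparison of the base table and its _dss_tracking companion does the job. A minimal sketch, assuming Python with pyodbc and placeholder server, database, and table names:

        # Sanity check for orphaned change-tracking rows after a bulk DELETE.
        # The connection string and table names below are placeholders.
        import pyodbc

        conn = pyodbc.connect(
            "DRIVER={SQL Server};SERVER=localhost;DATABASE=MemberDb;Trusted_Connection=yes"
        )
        cur = conn.cursor()

        base_rows = cur.execute("SELECT COUNT(*) FROM dbo.MyTable").fetchone()[0]
        tracking_rows = cur.execute(
            "SELECT COUNT(*) FROM DataSync.MyTable_dss_tracking"
        ).fetchone()[0]

        print(f"base table rows:     {base_rows}")
        print(f"tracking table rows: {tracking_rows}")
        # A large tracking count alongside an empty base table means the agent still
        # has hundreds of thousands of change records to push on the next sync.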

    We have to make sure SQL Data Sync will work on slower network connections, too, because, as I said before, our customers do not have fast internet. Most of them are on 4 Mbit/256 kbit/s ADSL lines.

    Best regards,

    Oskar Mayr

    Tuesday, November 13, 2012 11:37 AM
  • Thanks for the feedback, oaksboard. We keep improving SQL Data Sync to provide a better experience for our customers.

    Per your questions:

    SQL Data Sync can support very large data sets; there are customers who use it to sync tens of GB of data. Manually uploading data and other options to help accelerate the initial sync are all in our plan, but we currently do not have a timeline for when this will be available.

    For slow networks, we have made some changes in the service update currently under development which should help address this issue. They are not in the latest local agent you upgraded to, but will be coming in the near future. As for the network speeds SQL Data Sync needs to support, we are still in the planning phase, so your feedback is quite important to us. Really appreciated.


    panchao

    Wednesday, November 14, 2012 2:10 AM
  • Hello Panchao,

    We are now using the latest Sync Agent and the problem seems to be gone!

    We were able to sync the entire data set without getting the error anymore (on all of our test machines).

    Thank you very much and best regards,

    Oskar Mayr

    Thursday, January 31, 2013 4:54 PM