Functions failing with managed identity (Production)

  • Question

  • Functions Version: 2.0.12590.0
    Region: UK South
    Plan: Consumption

    Hi,

    Our Functions app has been running without problems in production for over six months. We use a managed identity to grant the app 'Get Secret' permission on a key vault.

    A few weeks ago, over the weekend of July 13th, every function started failing. I initially put it down to a glitch, since simply restarting the Functions app resolved the issue.

    Unfortunately this occurred again last weekend (Aug 3rd) so I've been investigating today.  The error we are seeing is:

    Exception while executing function: xxxx The type initializer for 'Functions.Common' threw an exception. One or more errors occurred. (Parameters: Connection String: [No connection string specified], Resource: https://vault.azure.net, Authority: https://login.windows.net/xxxxx. Exception Message: Tried the following 3 methods to get an access token, but none of them worked.
    Parameters: Connection String: [No connection string specified], Resource: https://vault.azure.net, Authority: https://login.windows.net/xxxxx. Exception Message: Tried to get token using Managed Service Identity. Unable to connect to the Managed Service Identity (MSI) endpoint. Please check that you are running on an Azure resource that has MSI setup.
    Parameters: Connection String: [No connection string specified], Resource: https://vault.azure.net, Authority: https://login.windows.net/xxxxx. Exception Message: Tried to get token using Visual Studio. Access token could not be acquired. Visual Studio Token provider file not found at "D:\local\LocalAppData\.IdentityService\AzureServiceAuth\tokenprovider.json"
    Parameters: Connection String: [No connection string specified], Resource: https://vault.azure.net, Authority: https://login.windows.net/xxxxx. Exception Message: Tried to get token using Azure CLI. Access token could not be acquired. 'az' is not recognized as an internal or external command,
    operable program or batch file.
    .....

    This time I checked the environment variables, and MSI_ENDPOINT and MSI_SECRET both looked correct, so I'm not sure how to investigate further. Again, a restart fixed the issue.

    Other info that may be relevant:

    1. The same managed identity code runs in several web APIs on App Service, and we don't see any issues there.
    2. Production code currently uses Microsoft.Azure.Services.AppAuthentication v1.0.3 (I see 1.3.0 is out, so we could try upgrading to that).
    3. Some invocation ids from the failures this weekend: a194ffab-4b5a-430e-a5e5-aace9efa1cfe, bc8c078f-b8d1-48d3-bf4c-6b390edf37e3

    Code is straightforward:

    var azureServiceTokenProvider = new AzureServiceTokenProvider();
    var keyVaultClient = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(azureServiceTokenProvider.KeyVaultTokenCallback));
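
    For illustration only, and not part of our production code: the error above is raised from a type initializer, and in .NET a type whose static initializer throws stays unusable for the lifetime of the app domain, which may be part of why only a restart clears it. A minimal sketch of a retry wrapper that could be called from the function body instead, assuming the token failure is transient (RetryingInitializer and GetWithRetryAsync are hypothetical names, not part of any Microsoft library):

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: retries an async factory with a doubling delay.
public static class RetryingInitializer
{
    public static async Task<T> GetWithRetryAsync<T>(
        Func<Task<T>> factory,
        int maxAttempts = 3,
        TimeSpan? initialDelay = null)
    {
        if (factory == null) throw new ArgumentNullException(nameof(factory));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        var delay = initialDelay ?? TimeSpan.FromSeconds(1);
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await factory().ConfigureAwait(false);
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Not the last attempt: wait, double the delay, try again.
                // On the last attempt the filter is false and the exception propagates.
                await Task.Delay(delay).ConfigureAwait(false);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}

// Possible usage inside a function, assuming the token provider from the snippet above:
// var token = await RetryingInitializer.GetWithRetryAsync(
//     () => azureServiceTokenProvider.GetAccessTokenAsync("https://vault.azure.net"));
```

    Whether retrying would actually have helped with these particular failures is unknown; it only avoids one failed attempt permanently poisoning a static type.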
    

    Thanks for any help, much appreciated.

    Monday, August 5, 2019 3:41 PM

All replies

  • This is likely a known, rare race condition that we have recently been fixing (the fix might not be deployed publicly yet). We will post an update on fix availability. Unfortunately, a restart is the only mitigation until then (we understand it is not ideal).

    Internal Ref:  Function App (dn**uth), UTC: 2019-08-03 20:03, Scale Unit: LN1-007.



    Suwatch

    Monday, August 5, 2019 4:35 PM
  • To update: this is confirmed to be caused by a known race condition. We have fixed the issue, and the fix will be available worldwide in four weeks. Apologies for the inconvenience.

    Suwatch

    Monday, August 5, 2019 5:46 PM
  • That's great news.  Thank you very much for confirming.
    Monday, August 5, 2019 7:38 PM
  • Hi,

    This issue occurred again last weekend. When will the fix be applied to the UK South region? Is there any way I can check?

    Thanks.

    Friday, September 6, 2019 4:56 PM
  • Hi,

    Any updates please? Has the fix been applied?

    Thanks

    Friday, September 13, 2019 3:40 PM
  • Please provide the latest UTC time and site name, and we can take a look.

    Suwatch

    Friday, September 13, 2019 6:52 PM
  • Thank you

    We had a huge number of failures on 4th September.

    One such error was at 21:32:57, invocation id be4043a9-73a4-4280-bcfc-670191f15548. 

    Site name is dnafunctions-uksouth

    Friday, September 13, 2019 8:14 PM
  • The issue you ran into on the 4th of September was a different issue, which we have fixed. However, the scale unit that hosts your function was running a few days behind and had not yet received the fix. It should be fixed now.

    Suwatch

    Tuesday, September 17, 2019 5:40 PM
  • Thank you for investigating.
    Tuesday, September 17, 2019 7:55 PM