Dealing with PII or Sensitive Data Captured by Application Insights

  • Question

  • Hi, I'm wondering what others do about unintentionally capturing PHI/PII or other sensitive data in Application Insights.

    Here is an example showing how a SQL error message can contain potential PII (a social security number or other unique identifier).

    CREATE TABLE ConstraintViolation
    (SSN int, PatientNM VARCHAR(255), CONSTRAINT pkConstraintViolation PRIMARY KEY CLUSTERED (SSN))
    INSERT INTO ConstraintViolation VALUES ('252238923', 'Patient Paul')
    INSERT INTO ConstraintViolation VALUES ('252238923', 'Patient Pascal')

    Msg 2627, Level 14, State 1, Line 5 Violation of PRIMARY KEY constraint 'pkConstraintViolation'. Cannot insert duplicate key in object 'dbo.ConstraintViolation'. The duplicate key value is (252238923).

    If this identifier gets captured in an exception message, what have people done to secure this information?

    • Edited by Cat-Man-Do Friday, March 23, 2018 10:51 PM
    Friday, March 23, 2018 7:19 AM

All replies

  • There are currently a couple of ways to address unintended PII:

    - Once you find out that something like this is happening, you can write a Telemetry Initializer or Telemetry Processor that either modifies the field containing PII when a certain condition is met, or drops the telemetry item before it is sent.

    For instance, you can run a regex over the fields you suspect contain PII and replace each match with ***** before the item is sent to AI from the application. Since a regex may be heavy and Telemetry Initializers run synchronously with the application code, you may prefer the Telemetry Processor approach: instead of dropping the item (the main use case for Processors), just modify it as you would in a Telemetry Initializer.
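
    A minimal sketch of the regex idea, in plain Python rather than the Application Insights SDK. The telemetry item is modeled as an ordinary dict, and the names `SSN_PATTERN`, `mask_pii`, and `scrub_item` are hypothetical; the pattern only covers nine-digit, SSN-like values and would need extending for other kinds of PII.

```python
import re

# Hypothetical pattern: nine digits, optionally grouped 3-2-4 with dashes,
# as in a US social security number. Real data may need other patterns.
SSN_PATTERN = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")


def mask_pii(text: str) -> str:
    """Replace anything that looks like an SSN with asterisks."""
    return SSN_PATTERN.sub("*****", text)


def scrub_item(item: dict, fields=("message",)) -> dict:
    """Return a copy of a telemetry-like dict with the suspect fields masked."""
    cleaned = dict(item)
    for name in fields:
        value = cleaned.get(name)
        if isinstance(value, str):
            cleaned[name] = mask_pii(value)
    return cleaned


item = {
    "type": "exception",
    "message": "Cannot insert duplicate key in object 'dbo.ConstraintViolation'. "
               "The duplicate key value is (252238923).",
}
print(scrub_item(item)["message"])
```

    In a real application the same masking function would be called from inside whatever initializer or processor hook the SDK in use provides.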

    Dmitry Matveev

    Friday, March 23, 2018 8:25 PM
  • @Dmitry - I am interested in learning more about how to address unintended PII. In your experience, what are the pros and cons of the options you mentioned above? Thanks in advance. Tiffani
    Friday, June 15, 2018 7:29 PM
  • Hello Tiffani,

    I'd say the main con of both the Telemetry Initializer and Telemetry Processor approaches is that you may not be able to anticipate every content match and regex up front. A new exception with new PII details may pop up later on, especially when new code around exception handling or PII data handling is committed to the monitored application.

    That means the first few telemetry items sent before the issue is recognized will still be saved in the AI backend. You'd need to execute a purge request to get that data removed; new items will stop arriving once you add the new field to the exclusion list in your Telemetry Processor or Initializer.
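
    A purge request of this kind could be assembled roughly as follows. This is only a sketch: the URL shape, the api-version, and the body fields (`table`, `filters`, `column`, `operator`, `value`) are written from memory of the Application Insights purge REST operation and should be checked against the official reference. The subscription, resource group, component name, and timestamps are placeholders, and `build_purge_request` is a hypothetical helper. Nothing is sent; the code only builds the request.

```python
import json

# Assumed endpoint shape for the Application Insights purge operation.
PURGE_URL = (
    "https://management.azure.com/subscriptions/{subscription_id}"
    "/resourceGroups/{resource_group}/providers/Microsoft.Insights"
    "/components/{component}/purge?api-version=2015-05-01"
)


def build_purge_request(subscription_id, resource_group, component,
                        table, start, end):
    """Return (url, json_body) for purging rows of `table` in a time window."""
    url = PURGE_URL.format(
        subscription_id=subscription_id,
        resource_group=resource_group,
        component=component,
    )
    body = {
        "table": table,
        "filters": [
            {"column": "timestamp", "operator": ">=", "value": start},
            {"column": "timestamp", "operator": "<=", "value": end},
        ],
    }
    return url, json.dumps(body, indent=2)


url, body = build_purge_request(
    "<subscription-id>", "<resource-group>", "<component-name>",
    table="exceptions",
    start="2018-03-23T00:00:00Z",  # placeholder: when the leak began
    end="2018-03-24T00:00:00Z",    # placeholder: when the fix was deployed
)
print(url)
print(body)
```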

    Speaking of the pros/cons of Initializer vs. Processor: I personally think a Processor is more suitable for heavy-lifting operations like matching multiple string patterns or regexes, as it is executed on a separate thread (at least it should be) and won't block the execution of the application code (at least not directly). Processors were meant to be filters, so it's a bit counter-intuitive to use them as an enrichment/modification mechanism; Initializers were meant for that. However, once the modification becomes "heavy enough", I'd vote to do it in Processors.
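
    The difference between using a processor as a filter and using it to modify an item can be sketched as below. This is plain Python, not the SDK's processor interface: a "processor" here is just a function that returns the item or `None` to drop it, and the names `drop_if_pii`, `mask_if_pii`, and `run_chain` are hypothetical. Threading is left out entirely.

```python
import re
from typing import Callable, Iterable, Optional

Item = dict
Processor = Callable[[Item], Optional[Item]]  # return None to drop the item

DIGITS = re.compile(r"\b\d{9}\b")  # hypothetical pattern for a 9-digit identifier


def drop_if_pii(item: Item) -> Optional[Item]:
    """Filter style: discard the whole item when the message looks sensitive."""
    return None if DIGITS.search(item.get("message", "")) else item


def mask_if_pii(item: Item) -> Optional[Item]:
    """Modification style: keep the item but mask the sensitive part."""
    return {**item, "message": DIGITS.sub("*****", item.get("message", ""))}


def run_chain(item: Item, processors: Iterable[Processor]) -> Optional[Item]:
    """Pass the item through each processor; stop as soon as one drops it."""
    for process in processors:
        item = process(item)
        if item is None:
            return None
    return item


leaky = {"message": "The duplicate key value is (252238923)."}
print(run_chain(leaky, [drop_if_pii]))  # None: the item is never sent
print(run_chain(leaky, [mask_if_pii]))  # item survives with the number masked
```

    Dropping loses the whole exception record, while masking keeps it available for troubleshooting, which is the reason for modifying in the processor instead of filtering.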

    Hope this helps.

    Dmitry Matveev

    Friday, June 15, 2018 11:25 PM
  • @Dmitry-Matveev Hello Dmitry, we have a PII masking requirement and I happened to come across your post here. My question is how to do this for API Management (APIM), as I am not sure how to associate custom telemetry with API Management. Right now it is all configured through the Azure portal, with no custom telemetry. Our backend services (APIs) do use custom telemetry, but in the Azure portal the PII data is marked as coming from APIM and not from the APIs themselves. Any help?


    Friday, September 20, 2019 12:30 PM
  • Thanks Ketaanh Shah for reaching out. 

    With the help of Dmitry and other internal teams for APIM, I gathered the details below.

    Application Insights cannot control what telemetry APIM instances send to it; this needs to be controlled from the APIM side.

    Presumably you have request/response body logging enabled in APIM. Please check the number of body bytes to log in the APIM settings and make sure it is set to 0 (zero).


    Additionally, you can check out the purge functionality, which can be used to purge data already residing in Application Insights based on user-defined filters.

    Hope the above information helps. Please revert back if you have any further queries. 

    Friday, September 20, 2019 9:29 PM
  • Aah ok, I re-read it and what you are saying is to set the diagnostic logs settings for request and response to zero (0), so that nothing from the request/response body is logged to Application Insights. But an issue may arise when someone wants to debug the APIM calls: the logs will have nothing in the request/response body, which may cause problems for the support team, as they will not see what data came through and was passed on from APIM. Please correct me if I am not thinking about this the right way.


    Friday, September 20, 2019 9:36 PM
  • The request telemetry (along with the dependency telemetry, where applicable) will be there regardless of the body/header logging settings. If your support team needs to look into the request/response bodies to investigate something, there are currently no options to control the request body data flowing from APIM to Application Insights (apart from specifying the number of bytes to log).

    The APIM team is currently working on a feature, an extension to the existing trace policy, that will allow customers to emit trace telemetry containing any data they wish. The feature should roll out in the next 3-4 weeks.

    If none of the options listed above are helping your scenario, you can always open a feature request with our team. 

    Hope the above information helps. Thank you

    • Proposed as answer by Ketaanh Shah Friday, September 20, 2019 11:02 PM
    Friday, September 20, 2019 10:27 PM
  • Thank you Bharat, I really appreciate your complete follow up on this request. I will try and talk with my home team and check if that is feasible and if not we will look into opening a new Feature request.

    And I think that if the trace policy helps with debugging, it should be fine even if App Insights is missing the request/response body, since the team does not need to debug every request, only those that hit an exception.


    Friday, September 20, 2019 11:04 PM
  • Thank you Ketaanh, glad to hear the information was helpful.
    Monday, September 23, 2019 3:25 AM
  • Hello @BharathN,

    Is there any update on whether the tracing feature has been released? You mentioned it around 5 months back, so I thought I'd check whether we can now use the TRACES table to debug further and get more details:

    "APIM Team is currently working on a feature , which is an extension to existing trace policy and will allow the customers to emit trace telemetry, where customers can put any data they wish and the feature would be rolling out in next 3-4 weeks."



    Thursday, February 13, 2020 1:53 PM
  • Thank you Ketaanh Shah for the follow-up; the feature is out already.

    The trace policy adds a custom trace into the API Inspector output, Application Insights telemetries, and/or Diagnostic Logs.

    • The policy adds a custom trace to the API Inspector output when tracing is triggered, i.e. the Ocp-Apim-Trace request header is present and set to true and the Ocp-Apim-Subscription-Key request header is present and holds a valid key that allows tracing.
    • The policy creates a Trace telemetry in Application Insights when Application Insights integration is enabled and the severity level specified in the policy is at or higher than the verbosity level specified in the diagnostic setting.
    • The policy adds a property in the log entry when Diagnostic Logs is enabled and the severity level specified in the policy is at or higher than the verbosity level specified in the diagnostic setting.
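
    The three conditions above can be modeled as a small decision function. This is an illustration of the rules as stated in this reply, not APIM code: the function name, the parameter names, and the assumed level ordering (verbose < information < error) are all hypothetical.

```python
# Assumed ordering of levels, lowest to highest.
LEVELS = {"verbose": 0, "information": 1, "error": 2}


def trace_destinations(severity, *,
                       trace_header=False,
                       key_allows_tracing=False,
                       app_insights_enabled=False,
                       app_insights_verbosity="information",
                       diagnostic_logs_enabled=False,
                       diagnostic_logs_verbosity="information"):
    """Return the list of places a trace with this severity would show up."""
    destinations = []
    # API Inspector: needs the trace header set to true and a key that allows tracing.
    if trace_header and key_allows_tracing:
        destinations.append("api_inspector")
    # Application Insights: severity must be at or above the configured verbosity.
    if app_insights_enabled and LEVELS[severity] >= LEVELS[app_insights_verbosity]:
        destinations.append("application_insights")
    # Diagnostic Logs: same severity-versus-verbosity comparison.
    if diagnostic_logs_enabled and LEVELS[severity] >= LEVELS[diagnostic_logs_verbosity]:
        destinations.append("diagnostic_logs")
    return destinations


print(trace_destinations("information",
                         app_insights_enabled=True,
                         app_insights_verbosity="error"))
print(trace_destinations("error",
                         trace_header=True, key_allows_tracing=True,
                         app_insights_enabled=True,
                         diagnostic_logs_enabled=True))
```

    The first call yields no destinations because an information-level trace falls below an error-level verbosity setting; the second reaches all three.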

    Please refer to the document for additional information and feel free to revert back if you have further queries. 

    Wednesday, February 19, 2020 6:21 PM