locked
Able to post malformed JSON from REST API, can't retrieve it RRS feed

  • Question

  • So, in Data Explorer the other day, I was surprised to find I couldn't retrieve a specific field in one of my documents. Back-tracing it, I noticed that the JSON wasn't entirely well formed when it was loaded. Somehow however, it was good enough to get into the database, but I can't get it out.

    This causes me a problem, because PowerBI is failing to refresh its data as a result and aborts the load. I was able to clear the single error by paging my way (via offset/limit) through the dataset in DataExplorer, and drilling down to the required record, and deleting it.

    However, that's not ideal and it turns out I have a few records like that. And in different fields. This manual cleanup will take some time and of course it'll be expensive.

    What are my options here? Is there any way I can have CosmosDB "validate documents" globally perhaps, or generate some sort of useful error in terms of identifying the id/_rid along with the field of any documents that do have issues?


    • Edited by andreww Tuesday, March 3, 2020 10:21 AM
    Tuesday, March 3, 2020 10:20 AM

All replies

  • Hi Andrew,

    Are you using Mongo API or SQL API? If you are using Cosmos DB for Mongo API, you can specify a schema requirement via $jsonSchema. It looks like you already made a post to the feedback item: Support for document validation, using JSON Schema

    Let me raise this to the product group to get an update.

    Regards,

    Mike

    Wednesday, March 4, 2020 2:13 AM
  • Using SQL API thanks Mike, but any suggestions on how I validate existing records - or a better way to identify ones that Data Explorer (and PowerBI) won't touch?
    Wednesday, March 4, 2020 9:35 AM
  • Hi Andrew,

    Do you have an example of this behavior? In some cases, the system properties are not created and these records will not behave nice with the Data Explorer. In your case, is it that the json is created with all the system properties but the document properties are incomplete, which makes the query being used by PowerBI or in the Data Explorer not include all records? I escalated this to the product group and there is the json validator being rolled out for the MongoDB API but that is awhile out. It would be supper helpful to have a good example of your issue. 

    ~Mike

     

    Tuesday, March 10, 2020 12:26 AM
  • Hi Mike

    The system properties were complete - if my query excluded the particular document property, then I could retrieve the document. But if I included it by name, of via "Select *", then it failed. Unfortunately, this was on a trial subscription which expired the other day so I destroyed the datasaet. It's possible I might be able to find the original document that was uploaded however, will have a hunt about.

    My problem is I have so many clients, and so many systems, I can't just reach to a central place - I'd need to go looking...

    A


    • Edited by andreww Friday, March 13, 2020 9:17 AM
    Thursday, March 12, 2020 12:54 PM
  • Hi Andrew,

    Yes, there exists the opportunity to write document instances to a collection that do not meet a specific data model or query syntax. It is important to model the data accordingly as there is 

    Data modeling in Azure Cosmos DB

    Azure Cosmos DB: Designing your data structure

    "Modeling data in a document database is equally important as a relational database. It's slightly different than a relational data model."

    "At first, model your data and run your queries, see what will be the cost for your queries, It it's expensive remodel your data until it comes to a certain level. When your application grows with data, you may want to restructure your data again."

    "In a document database, schema enforcement needs to handle by the developers."

    In a traditional SQL database, the schema is determined on WRITE. In a NoSQL database, the schema is determined on READ. To maintain consistency, the client will need to ensure the document is compliant. MongoDB has a document validator, as mentioned. In the case of the other Cosmos DB APIs, the use of Azure Logic Apps can validate JSON: Validating Json Schema in Azure Logic Apps

    Regards,

    Mike

    PS - I am going to mark this as the answer, as at this time there is no automatic means to ensure JSON validation is enforced directly in the Cosmos DB API. 

    Monday, March 23, 2020 4:07 AM