Azure Search SDK Index Document Search Format

  • Question

  • Hello Azure Search Team,

    I am using the Azure Search SDK 3.0.  Quick snippet below:

    public DocumentSearchResult Search(string q)
    {
        string searchServiceName = ConfigurationManager.AppSettings["SearchServiceName"];
        string apiKey = ConfigurationManager.AppSettings["SearchServiceApiKey"];
        string indexName = ConfigurationManager.AppSettings["SearchServiceIndex"];

        _searchClient = new SearchServiceClient(searchServiceName, new SearchCredentials(apiKey));
        _indexClient = _searchClient.Indexes.GetClient(indexName);

        SearchParameters sp = new SearchParameters()
        {
            SearchMode = SearchMode.All
            // some other parameters left out
        };
        return _indexClient.Documents.Search(q, sp);
    }

    One of the fields defined in the index is a Collection(Edm.String).  I have an indexer populating this field.  The source data in SQL Server looks like this: [{"c":"White","v":"#FFFFFF"},{"c":"Black","v":"#000000"}]

    The SDK returns the data in this format when viewed in Chrome:


    1. "{ ↵ "c": "White", ↵ "v": "#FFFFFF" ↵}"
    2. "{ ↵ "c": "Black", ↵ "v": "#000000" ↵}"

    In Postman:

          [
            "{\r\n  \"c\": \"White\",\r\n  \"v\": \"#FFFFFF\"\r\n}",
            "{\r\n  \"c\": \"Black\",\r\n  \"v\": \"#000000\"\r\n}"
          ],

    A couple of questions:

    - Why are there extra carriage return characters?

    - How come each array element is wrapped in double quotes?  With the double quotes, it is an array of strings rather than an array of JSON objects.

    Thanks,

    Ken

    Sunday, January 15, 2017 3:26 AM

Answers

  • Hi Kenny,

    This behavior is (mostly) expected, if a bit counterintuitive. It happens because when the index field has a type of Collection(Edm.String), we try to automatically coerce your input to a string array. In particular, for SQL varchar columns, if we see that a string looks like a JSON array, we convert each element of the array to a string. That's where those enclosing quotes come from. During the conversion to string, we use a pretty-printed representation of the JSON - that's where those \r\n characters come from.

    If this behavior is not desired, one way to suppress it is to change the type of the index field to Edm.String instead of Collection(Edm.String). Then the indexer will just use your SQL data as-is.

    Let me know if this helps in your scenario, or if you have any follow up questions.


    Thanks! Eugene Shvets Azure Search

    Monday, January 16, 2017 3:36 AM
    Moderator
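
    As a minimal sketch of working with the output described above: each element of the Collection(Edm.String) field is itself a JSON document stored as a string, so a client could parse the elements back into objects after the search returns. The ColorEntry and ColorMapParser names are hypothetical, and System.Text.Json is an assumption here (it ships with newer .NET versions; Json.NET's JsonConvert.DeserializeObject would play the same role). The extra \r\n and indentation are insignificant whitespace to a JSON parser.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical shape of one element of the "colormap" collection.
public sealed class ColorEntry
{
    public string c { get; set; }   // color name, e.g. "White"
    public string v { get; set; }   // color value, e.g. "#FFFFFF"
}

public static class ColorMapParser
{
    // Each input string is one pretty-printed JSON object, as returned
    // by the search call for a Collection(Edm.String) field.
    public static List<ColorEntry> Parse(IEnumerable<string> elements)
    {
        return elements
            .Select(e => JsonSerializer.Deserialize<ColorEntry>(e))
            .ToList();
    }
}
```

    The property names are lowercase on purpose so that they match the "c" and "v" keys without any extra serializer configuration.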

All replies

  • Hi Eugene,

    If I were to set up the field as an Edm.String, how could I get back a JSON object with the _indexClient.Documents.Search method?

    Thanks,

    Ken

    Monday, January 16, 2017 3:27 PM
  • Hi Ken,

    Azure Search doesn't currently support storing and searching arbitrary JSON - we currently only support primitive data types (strings, numbers, dates, Booleans, geopoints) and arrays of strings. Adding more expressiveness is on our roadmap, as it has been commonly requested by customers (https://feedback.azure.com/forums/263029-azure-search/suggestions/6670910-modelling-complex-types-in-indexes). For the time being, you can use the "parallel arrays" approach to represent your JSON data: create a collection field Colors and a collection field Values, and populate those as appropriate. Hope this helps!


    Thanks! Eugene Shvets Azure Search

    Monday, January 16, 2017 11:00 PM
    Moderator
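
    The "parallel arrays" approach above could be sketched as follows. This is an illustration under stated assumptions, not the SDK's own mechanism: CatalogDocument and ParallelArrays are hypothetical names, the Colors and Values fields follow the suggestion in the reply, and System.Text.Json is assumed for parsing the SQL text. Entry i of Colors corresponds to entry i of Values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical document shape with two parallel string collections.
public sealed class CatalogDocument
{
    public string id { get; set; }
    public string[] Colors { get; set; }   // e.g. ["White", "Black"]
    public string[] Values { get; set; }   // e.g. ["#FFFFFF", "#000000"]
}

public static class ParallelArrays
{
    // Splits [{"c":...,"v":...}, ...] into two arrays that share indexes.
    public static CatalogDocument FromColorMap(string id, string colorMapJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(colorMapJson))
        {
            List<JsonElement> entries = doc.RootElement.EnumerateArray().ToList();
            return new CatalogDocument
            {
                id = id,
                Colors = entries.Select(e => e.GetProperty("c").GetString()).ToArray(),
                Values = entries.Select(e => e.GetProperty("v").GetString()).ToArray()
            };
        }
    }
}
```

    One trade-off of this layout is that a filter on Colors and a filter on Values are evaluated independently, so the pairing between a color and its value has to be re-established on the client by index.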
  • Hi Eugene, thanks for the feedback.  I have another question related to the same method posted initially.


                    string searchServiceName = ConfigurationManager.AppSettings["SearchServiceName"];
                    string apiKey = ConfigurationManager.AppSettings["SearchServiceApiKeyAdm"];
                    string indexName = ConfigurationManager.AppSettings["SearchServiceIndex"];

                   SearchServiceClient _searchClient = new SearchServiceClient(searchServiceName, new SearchCredentials(apiKey));
                   ISearchIndexClient _indexClient = _searchClient.Indexes.GetClient(indexName);


                     var batch = IndexBatch.Merge(doc);
                    _indexClient.Documents.Index(batch);

    The last line (_indexClient.Documents.Index(batch) is always giving the below screenshot message; however, I am not getting this issue with the Github example (https://github.com/Azure-Samples/search-dotnet-getting-started/blob/master/DotNetHowTo/DotNetHowTo/Program.cs).  

    Any pointers?

    Thanks,

    Ken

    Thursday, January 19, 2017 4:23 AM
  • Hi Kenny, can you show how you're constructing the doc variable? Is it typed as dynamic? That would explain the compiler error you're seeing.

    Thanks! Eugene Shvets Azure Search

    Thursday, January 19, 2017 7:40 AM
    Moderator
  • Hi Eugene, I am using this class:

        public partial class CatalogClass
        {
            public string id { get; set; }
            public string title { get; set; }
            public string colormap { get; set; }
        }

    I have a SQL data reader that creates the doc like this and passes it into a method as "dynamic doc":

                catalogclasses = new CatalogClass[]
                {
                    new CatalogClass
                    {
                        id = dr[0].ToString(),
                        title = dr[1].ToString(),
                        colormap = dr[12].ToString()
                    }
                };

    Thanks,

    Ken

    Thursday, January 19, 2017 2:17 PM
  • Don't type the docs parameter as dynamic. Use IEnumerable<CatalogClass> or CatalogClass[] instead.

    Thanks! Eugene Shvets Azure Search

    Thursday, January 19, 2017 7:24 PM
    Moderator
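
    A sketch of why a dynamic parameter can break this kind of call. The exact compiler message from the thread is not shown, so this is a likely explanation rather than a confirmed one: when the argument is dynamic, the result of a generic factory call is also dynamic, and C# cannot dispatch an extension method on a dynamic argument (error CS1973). The Batch, Uploader, and UploaderExtensions types below are hypothetical stand-ins and are not the Azure Search SDK types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a typed batch of documents.
public sealed class Batch<T>
{
    public IReadOnlyList<T> Items { get; }
    public Batch(IEnumerable<T> items) { Items = items.ToList(); }
}

public static class Batch
{
    // Generic factory: T is inferred from the static type of the argument.
    public static Batch<T> Merge<T>(IEnumerable<T> docs) { return new Batch<T>(docs); }
}

// Hypothetical stand-in for the object that sends a batch.
public sealed class Uploader { }

public static class UploaderExtensions
{
    // Extension methods are bound at compile time only.
    public static int Index<T>(this Uploader uploader, Batch<T> batch) { return batch.Items.Count; }
}

public sealed class CatalogClass
{
    public string id { get; set; }
    public string title { get; set; }
}

public static class DynamicDemo
{
    // Works: the static type IEnumerable<CatalogClass> lets the compiler infer T.
    public static int IndexTyped(IEnumerable<CatalogClass> docs)
    {
        Batch<CatalogClass> batch = Batch.Merge(docs);
        return new Uploader().Index(batch);
    }

    // Does not compile if uncommented:
    // public static int IndexDynamic(dynamic docs)
    // {
    //     var batch = Batch.Merge(docs);        // batch is typed dynamic
    //     return new Uploader().Index(batch);   // error CS1973: extension methods
    //                                           // cannot be dynamically dispatched
    // }
}
```

    Typing the parameter as IEnumerable<CatalogClass> or CatalogClass[], as suggested above, keeps the whole call chain statically typed.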
  • Thanks, that was it!  I could've sworn I did that but I think I must have missed the IEnumerable part :-)

    A couple more questions.

    - In the Azure Search index, "colormap" is defined as a "Collection(Edm.String)". How would I set up the .NET class for "colormap"?  In the above sample, it's a "string" and the "doc" object for "colormap" has this value: "[{"c":"Fiesta","v":"#f53300"},{"c":"Black","v":"#000000"}]".  I receive this error when I do the merge: "The request is invalid. Details: parameters : An unexpected 'PrimitiveValue' node was found when reading from the JSON reader. A 'StartArray' node was expected."

    * Just to clarify, I had tried public string[] colormap { get; set; } but the data comes back from search as a string instead of a JSON object.   The Azure Search indexer is able to map the data correctly into the index.

    - Using the Azure Search SDK, is there a way to conditionally select columns?  Currently in my SearchParameters, I have a fixed set of fields: Select = new[] { "id", "title", "colormap" }.  Is there a way to say if "colormap" is blank, don't return "colormap"?

    Thanks,

    Ken


    • Edited by Kenny Guen Friday, January 20, 2017 1:42 PM
    Thursday, January 19, 2017 9:47 PM
  • Make ColorMap a string array, just as you originally tried. Parsing a string that looks like a JSON array is done only by indexers, because there are situations where the data source cannot represent arrays (for example SQL, or Azure Table storage).

    And no, there's no way to conditionally select columns.


    Thanks! Eugene Shvets Azure Search

    Sunday, January 22, 2017 12:57 AM
    Moderator
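
    A minimal sketch of the string-array suggestion above, assuming the SQL column holds a JSON array as text: since the SDK does not parse that text the way an indexer does, the client can split it into one string per element before building the document. ColorMapSplitter is a hypothetical helper, and System.Text.Json is an assumption. GetRawText returns each element exactly as written in the source, so no pretty printing is introduced.

```csharp
using System;
using System.Linq;
using System.Text.Json;

public static class ColorMapSplitter
{
    // Turns the text of a JSON array into one string per element,
    // suitable for a string[] property mapped to Collection(Edm.String).
    public static string[] ToElementStrings(string jsonArray)
    {
        using (JsonDocument doc = JsonDocument.Parse(jsonArray))
        {
            return doc.RootElement
                .EnumerateArray()
                .Select(e => e.GetRawText())
                .ToArray();
        }
    }
}
```

    With this in place, the class property would be declared as public string[] colormap { get; set; } and populated from ColorMapSplitter.ToElementStrings(dr[12].ToString()).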
  • There is no other way... really?

    So you're saying that your default process is to scramble the JSON string we give you for a collection, and the only fix is to "not use a collection"???

    I'm using a collection because you (still) have no apparent plans to support hierarchical facets, even though it's been on the roadmap since the beginning. I'm mashing my content into a JSON string because it's pretty straightforward to control the serialization to and from a database varchar column.

    I can then use the fact that it's a facet and do client-side filtering to strip out the facets that are not appropriate (it's a mess, but you don't support "only bring me back facets that match this criteria").

    ... but this is still broken, because when I then take that returned facet and add it to my filter with an Any clause, no matches are found, since you guys have "handily" reformatted the whole string to add newlines and spacing...

    And I can't turn this off???

    Wednesday, March 15, 2017 12:16 PM
  • Hi Graham, the "pretty printing" of the JSON string is unintentional - since this is breaking your scenario, we'll fix this logic to avoid pretty printing. Can you please send me your input string example at eugenesh at the usual Microsoft domain?

    Thanks! Eugene Shvets Azure Search

    Wednesday, March 15, 2017 7:18 PM
    Moderator
  • Done, email sent with comments and feedback
    Saturday, March 25, 2017 10:18 AM