Score in Azure Search

  • Question

  • Can someone explain how default scoring works in Azure Search, and how scores are calculated once a scoring profile is applied? I have read the Lucene practical scoring formula and the TF-IDF documentation, but I still cannot figure out how the score is calculated. Below are two documents. I expected the second one to score higher because it has more downloads and the rest of the fields are exactly the same, but Azure Search ranked it second. I have boosted the download count using linear interpolation with a boost value of 5 (boost start = 0, boost end = 20000000). Can someone help me out?

    "value": [
        {
            "@search.score": 14.642677,
            "extensionId": "5bce8247-44da-4371-a08a-e5a56c4d209b",
            "extensionName": "codeshot",
            "displayName": "codeshot",
            "ftDisplayName": "codeshot",
            "shortDescription": "Your codeshot",
            "ftShortDescription": "Your codeshot",
            "tags": [
                "codeshot",
                "shot"
            ],
            "categories": [
                "Other"
            ],
            "flags": [
                "Validated",
                "Public"
            ],
            "lastUpdated": "2018-06-27T06:15:49.233Z",
            "publishedDate": "2018-06-22T16:02:59.15Z",
            "downloadCount": 26,
            "averageRating": 0,
            "trending": 0
        },
        {
            "@search.score": 14.567907,
            "extensionId": "81dd2def-6449-40f2-9ddf-3fe26fd0960a",
            "extensionName": "codeshot11",
            "displayName": "codeshot",
            "ftDisplayName": "codeshot",
            "shortDescription": "Your codeshot",
            "ftShortDescription": "Your codeshot",
            "tags": [
                "codeshot",
                "shot"
            ],
            "categories": [
                "Other"
            ],
            "flags": [
                "Public",
                "Validated"
            ],
            "lastUpdated": "0001-01-01T00:00:00Z",
            "publishedDate": "0001-01-01T00:00:00Z",
            "downloadCount": 27,
            "averageRating": 0,
            "trending": 0
        }
    ]


    Sunday, July 1, 2018 12:16 PM

All replies

  • As you mentioned, our default scoring uses the TF-IDF algorithm (https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html). When a document has multiple fields, TF-IDF is applied to each text field in the document, and the final score is the sum of those per-field scores. Similarly, the boost you added on the download count field raises that field's contribution, which is then included in the total score. Here is a worked example of a scoring profile calculation: https://stackoverflow.com/questions/41427940/how-do-scoring-profiles-generate-scores-in-azure-search/41454570#41454570
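
    To get a feel for how much a linear-interpolation boost can move these two documents, here is a rough sketch in Python. The function `magnitude_boost` and its formula (a factor growing linearly from 1 at the start of the range to the boost value at the end) are assumptions for illustration, not the service's confirmed implementation. Only the range (0 to 20000000), the boost value (5), the download counts, and the two scores come from the question.

```python
def magnitude_boost(value, start, end, boost):
    """Hypothetical linear-interpolation boost factor (assumed formula)."""
    # Normalize the field value into [0, 1] over the configured range.
    fraction = (value - start) / (end - start)
    fraction = min(max(fraction, 0.0), 1.0)
    # Assumption: the factor grows linearly from 1 (at start) to `boost` (at end).
    return 1.0 + (boost - 1.0) * fraction


# Values taken from the question: range 0..20000000, boost 5.
factor_26 = magnitude_boost(26, 0, 20_000_000, 5)
factor_27 = magnitude_boost(27, 0, 20_000_000, 5)
factor_gap = factor_27 - factor_26

# Gap between the two @search.score values shown in the question.
observed_gap = 14.642677 - 14.567907

print(f"boost factor for 26 downloads: {factor_26:.7f}")
print(f"boost factor for 27 downloads: {factor_27:.7f}")
print(f"difference in boost factor:    {factor_gap:.2e}")
print(f"observed score difference:     {observed_gap:.6f}")
```

    Whatever the exact formula, one extra download in a range of 20000000 changes the normalized value by only 1/20000000. Under this sketch the boost cannot account for, or overcome, a score gap of about 0.07.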

    The lower score for the second item, despite its download boost and similar field values, could be related to the way your index is sharded. More details here: https://stackoverflow.com/questions/29814079/azure-search-scoring
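
    As a simplified illustration of the sharding effect, the sketch below uses the tf and idf definitions from Lucene's classic TF-IDF similarity (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1))) and ignores norms, coord, and query normalization. The shard sizes and document frequencies are made-up numbers, and the sketch assumes term statistics are computed per shard. It shows how two documents with identical text can get different scores when they live on shards with different statistics.

```python
import math


def classic_idf(num_docs, doc_freq):
    # Lucene classic similarity: idf = 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def classic_tf(freq):
    # Lucene classic similarity: tf = sqrt(term frequency)
    return math.sqrt(freq)


def term_score(freq, num_docs, doc_freq):
    # Simplified per-term score: tf * idf^2 (norms, coord, queryNorm omitted).
    return classic_tf(freq) * classic_idf(num_docs, doc_freq) ** 2


# Hypothetical shard statistics: same term, same term frequency in the
# document, but the term appears in a different number of documents per shard.
shard_a = term_score(freq=1, num_docs=1000, doc_freq=3)
shard_b = term_score(freq=1, num_docs=1000, doc_freq=4)

print(f"score on shard A: {shard_a:.4f}")
print(f"score on shard B: {shard_b:.4f}")
```

    A single extra matching document on one shard is enough to shift the score, which would easily outweigh a tiny difference in a download count boost.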

    Let me know if the above resources help. It could also be beneficial for you to share the query.

    Tuesday, July 3, 2018 12:32 AM