[MS-OXORSS] PostRssItemHash interoperability
- Hi,
MS-OXORSS Section 2.2.1.3 "PidLidPostRssItemHash" states
Type: PtypInteger32.
Contains a hash of the feed XML computed by using an implementation-dependent algorithm; used to quickly determine whether two items are different.
I'm assuming that this is used to verify whether an item in the feed is different to the items in the current store. I'm not sure how to interoperate with Outlook without the implementation-dependent algorithm.
The example here is that a user has Outlook (e.g. on the laptop) and another application (e.g. on a phone) and is using the same store for both. So either application can update the list of articles (items) from the feed source. Without agreement on the PostRssItemHash calculation, presumably both applications will add each item to the store (since even if it already exists, it is "new" if the PostRssItemHash doesn't correspond), leading to the store containing two copies of every article. Adding a third application (e.g. desktop machine) will produce yet another copy.
Can I get more information on the algorithm and how the hash is calculated?
Brad
All Replies
- Hi Brad:
Thanks for your inquiry.
A member of the Protocol Documentation Team will be in touch soon.
Regards, Obaid Farooqi
- Brad,
I am the engineer who has taken ownership of your inquiry. I am investigating this and will update you as things progress.
Dominic Michael Salemno
Senior Support Escalation Engineer
- Brad,
I am still investigating this issue. I will update you as things progress.
Dominic Michael Salemno
Senior Support Escalation Engineer
US-CSS DSC Protocols Team
- Brad,
I am still investigating this but will have an answer for you shortly.
Dominic Michael Salemno
Senior Support Escalation Engineer
US-CSS DSC Protocols Team
- Brad,
PidLidPostRssItemHash contains a *hash* of the following elements contained within the feed:
RSS
    Title
    description
    encoded
    enclosure
Atom
    title
    summary
    content
The values contained in the XML elements of the feed are converted to Unicode strings upon which their hash values are generated.
The hash function in question is the DJB Hash Function shown below:
//
// http://msdn.microsoft.com/en-us/library/bb821100.aspx
//
ULONG CalcWzHash(const WCHAR *wz)
{
    ULONG ulHash = 0;
    while (*wz)
        ulHash = (ulHash << 5) + ulHash + *wz++;
    return ulHash;
}
Please note that the enclosure element usually does not contain a value. Although enclosure may have some attributes, we do not generate a hash from the attributes.
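For anyone reading along on a non-Windows platform, the function above can be sketched with fixed-width types. This is only an illustration, not Outlook's code: it assumes ULONG is a 32-bit unsigned integer and WCHAR a 16-bit UTF-16 code unit, as they are on Windows, and the name calc_utf16_hash is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable variant of CalcWzHash (name and types are assumptions).
   Hashes a NUL-terminated sequence of UTF-16 code units. */
uint32_t calc_utf16_hash(const uint16_t *s)
{
    assert(s != NULL);
    uint32_t hash = 0;                      /* starts at 0, as in the posted code */
    while (*s)
        hash = (hash << 5) + hash + *s++;   /* hash * 33 + code unit, modulo 2^32 */
    return hash;
}
```

As a quick check by hand, the two code units 0x0061 and 0x0062 ("ab") give 97 * 33 + 98 = 3299, and an empty string gives 0.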
Does this information answer your question?
Dominic Michael Salemno
Senior Support Escalation Engineer
US-CSS DSC Protocols Team
- Marked As Answer by Dominic Salemno MSFT, Moderator, Wednesday, November 11, 2009 8:04 PM
- Unmarked As Answer by Brad Hards, Saturday, November 14, 2009 10:08 AM
- Hi Dominic.
I appreciate this information, but have some followup questions:
1. Once the values are converted to Unicode, how are they encoded (e.g. UTF-16, UTF-8)?
2. Does RSS use three values (Title, description, "encoded enclosure") or are "encoded" and "enclosure" meant to refer to different things? I'm assuming the former, but the layout in your table is a bit confusing.
3. I'm not sure how the link to "Algorithm to Encode Entry IDs and Attachment IDs" relates to this algorithm. Can you explain?
4. Are the Unicode strings just concatenated? That is, for the example given in http://msdn.microsoft.com/en-us/library/ee179505.aspx, is the input "Learn to narrow your search criteria for better searches in ContosoInstant Search can help you find information in a flash."?
Thanks for the work on this.
Brad
- Brad,
I am investigating your questions.
Dominic Michael Salemno
Senior Support Escalation Engineer
US-CSS DSC Protocols Team
- Brad,
1. UTF-16
2. Yes, “encoded” and “enclosure” refer to two different things. RSS uses four values: title, description, encoded and enclosure.
3. The algorithm is similar, and was only provided as an example.
4. No. The hash value is calculated before concatenating the values.
Does this information answer your questions?
Dominic Michael Salemno
Senior Support Escalation Engineer
US-CSS DSC Protocols Team
- Proposed As Answer by Dominic Michael Salemno, Tuesday, November 17, 2009 7:37 PM
- Unproposed As Answer by Brad Hards, Friday, November 20, 2009 9:08 PM
- Sorry, I'm going to need even more guidance.
1. I'm OK with this.
2. I'm not OK with this. Let's take another example, this one from MSNBC:
<item>
<title>Cruise operator: Young adults bring parents</title>
<description>An Australian cruise company said it would continue to insist that young adult passengers be accompanied by parents even though the policy has been branded discriminatory.<br clear="both" style="clear: both;"/> <br clear="both" style="clear: both;"/> <a style='font-size: 10px; color: maroon;' href='http://www.pheedcontent.com/hostedMorselClick.php?hfmm=v3:bf2dd797167c981b9054ed36e0a12603:FYdQiR6e6rZuaUrN6JJoo4e0GZspAgKoq%2B864OsSNPmDuLDTmL2fCvGXITeMmxokDOqDTRAfT44D'><img border='0' title='Email this Article' alt='Email this Article' src='http://images.pheedo.com/images/mm/emailthis.png'/></a> <a style='font-size: 10px; color: maroon;' href='http://www.pheedcontent.com/hostedMorselClick.php?hfmm=v3:57d3dfb5af786e3e42aeb370dcffea05:omtDW7RyAh%2Fa5FS0yZukC3JIsu0yCaIsq3sujdk7mU8312sIkMH7hLpGAj6jHBLhuxEvZisa%2F0RZsg%3D%3D'><img border='0' title='Add to Newsvine' alt='Add to Newsvine' src='http://images.pheedo.com/images/mm/newsvine.png'/></a> <br clear="both" style="clear: both;"/><br clear="all"/> <img src="http://ads.pheedo.com/img.phdo?kw=" align="absmiddle" /> <a href='http://ads.pheedo.com/click.phdo?s=fe632391ffcbbf57deca3887f454772f&p=64&kw=Australia'>Australia</a> - <a href='http://ads.pheedo.com/click.phdo?s=fe632391ffcbbf57deca3887f454772f&p=64&kw=Recreation'>Recreation</a> - <a href='http://ads.pheedo.com/click.phdo?s=fe632391ffcbbf57deca3887f454772f&p=64&kw=Travel'>Travel</a> - <a href='http://ads.pheedo.com/click.phdo?s=fe632391ffcbbf57deca3887f454772f&p=64&kw=Specialty+Travel'>Specialty Travel</a> - <a href='http://ads.pheedo.com/click.phdo?s=fe632391ffcbbf57deca3887f454772f&p=64&kw=Cruises'>Cruises</a> </description>
<link>http://www.msnbc.msn.com/id/34033957/ns/world_news-asiapacific/</link>
<pubDate>Thu, 19 Nov 2009 09:36:51 GMT</pubDate>
<category>News</category>
<guid isPermaLink="false">http://www.msnbc.msn.com/id/34033957/ns/world_news-asiapacific/</guid>
</item>
So it is easy to see the title value and the description. I note your comment about enclosure. But what is "encoded"? I don't see that in the RSS spec (http://cyber.law.harvard.edu/rss/rss.html#hrelementsOfLtitemgt).
3. I'm OK with this.
4. How do we hash the four values if we don't concatenate them? We could avoid resetting the hash value, and just call
    while (*wz) ulHash = (ulHash << 5) + ulHash + *wz++;
on each string, but that would just be the same as concatenating them.
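To make that last point concrete, here is a minimal sketch in C. It says nothing about how Outlook actually combines the per-element values, which is still the open question; it only illustrates that carrying the running hash across several strings gives the same result as hashing their concatenation. The helper names hash_continue and hash_chained are made up for this example, and uint32_t and uint16_t stand in for ULONG and WCHAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: feed one NUL-terminated UTF-16 string into a running hash. */
uint32_t hash_continue(uint32_t hash, const uint16_t *s)
{
    assert(s != NULL);
    while (*s)
        hash = (hash << 5) + hash + *s++;   /* same step as the posted loop */
    return hash;
}

/* Hash several strings in order without resetting the running value between them. */
uint32_t hash_chained(const uint16_t *const *parts, size_t count)
{
    uint32_t hash = 0;
    for (size_t i = 0; i < count; i++)
        hash = hash_continue(hash, parts[i]);
    return hash;
}
```

Calling hash_chained on the strings "Ti" and "de" returns the same value as hash_continue(0, ...) on "Tide", since each step depends only on the previous hash and the next code unit. So "hash before concatenating" must mean something other than this chaining.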
Brad


