The Microsoft Azure support engineers seem to think the issue is that the files are not being cached because of their file type. I use an arbitrary file type, "ptdb", which is an encrypted binary file, and I cannot change the file format. They gave me a list of file extensions and claim that if a file is not one of these, it cannot be cached in the CDN. I have not been able to find any official documentation about this, and it seems sketchy. Can anyone illuminate this issue?
Here are several emails from support:
will cache all file types. There are multiple levels of caching in the CDN service. The file types below will be cached at the parent level, and all other file formats will be cached at the next level. Keeping files that are large in size or frequently accessed at the parent level gives better performance for customers.
Because these files were not cached at the parent level, and there was a fairly high number of requests to this resource, a few requests would have failed. This is because the traffic reached the scalability limit of the CDN at this level (nodes saturated). When a resource is cached at the parent level, the customer gets high scalability and all requests would have succeeded. To get better scalability, please ask the customer to use the following file types.
I would be highly surprised if there was a magic list of file types that the CDN caches.
What I would wonder, though, is whether your web app (or storage account?) is emitting an appropriate Cache-Control header on this type of file. If this is a web app, it could be that your web server (most likely IIS) isn't setting the right cache headers because it doesn't recognize the file type.
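If IIS is the origin, a web.config along these lines would both register the unknown extension (IIS refuses to serve extensions it doesn't know) and emit a Cache-Control max-age header on static content. This is a sketch using standard IIS staticContent settings; verify the MIME type and max-age value against your own app:

```xml
<!-- Sketch only: adjust the max-age and MIME type to your needs. -->
<configuration>
  <system.webServer>
    <staticContent>
      <!-- Teach IIS about the custom extension so it will serve the file at all -->
      <mimeMap fileExtension=".ptdb" mimeType="application/octet-stream" />
      <!-- Emit Cache-Control: max-age on static content (here, 1 day) -->
      <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="1.00:00:00" />
    </staticContent>
  </system.webServer>
</configuration>
```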
If you fetch the file directly (not through the CDN), what does the cache-control header show?
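One quick way to check is a HEAD request against the origin URL. Here is a minimal Python sketch; note that the is_publicly_cacheable helper and its directive list are my own rough heuristic for "would a shared cache keep this", not an Azure rule:

```python
import urllib.request

def cache_control_header(url):
    """Issue a HEAD request and return the Cache-Control header (or None)."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return resp.headers.get("Cache-Control")

def is_publicly_cacheable(cache_control):
    """Rough heuristic: a shared cache generally won't store responses
    marked private, no-store, or no-cache, or with no header at all."""
    if cache_control is None:
        return False
    directives = {d.strip().lower() for d in cache_control.split(",")}
    return not directives & {"private", "no-store", "no-cache"}
```

For example, `is_publicly_cacheable(cache_control_header("https://yourorigin.example/file.ptdb"))` (hypothetical URL) tells you whether the origin is even inviting the CDN to cache the file.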
Here is another response from Microsoft when I asked them to clarify "parent" level.
CDN hierarchy and how best you can utilize the CDN service.
The Azure CDN has a two-level hierarchy in each physical location:
1. Edge servers
2. Parent cache servers
For example, in London, we have several hundred edge servers and a few dozen parent cache servers.
The edge servers are tuned for speedy customer delivery; the parent servers are designed to store large amounts of cached data, so that each edge-server request does not have to go all the way back to origin. However, the parent cache servers only store certain files which are efficient to store, hence the limitation on file extensions. Files that are 1 MB or greater in size are a good fit for parent caching.
The edge servers which serve customers are the first tier of caching. If they do not have the file the customer needs, they decide where to go to get a copy. First, they check the list of approved extensions for the parent caches; if the extension matches, they ask the appropriate parent cache server for a copy of the file. This is very fast, since the parent is in the same location as the edge servers, and if the file is already downloaded, the edge servers get it almost immediately and serve it to the customer.
If the file is not on the parent cache list, the edge server instead asks the origin for a copy, which takes much longer. Additionally, depending on the number of customers on the edge server and the popularity of the file in question, the file might get evicted if it's not downloaded often enough. Only files that are regularly downloaded are kept in cache.
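The lookup chain described above can be sketched roughly as follows. The real extension list is not public, so the values here are placeholders:

```python
# Placeholder values only; the real approved-extension list is not public.
PARENT_CACHE_EXTENSIONS = {".zip", ".mp4"}

def next_hop(path, cached_on_edge):
    """Where a request for `path` is served from, per the support email's
    description: edge hit, else parent tier if the extension is approved,
    else all the way back to origin."""
    if cached_on_edge:
        return "edge"
    dot = path.rfind(".")
    ext = path[dot:].lower() if dot != -1 else ""
    return "parent" if ext in PARENT_CACHE_EXTENSIONS else "origin"
```

On this model, a miss for a ".ptdb" file always falls through to origin, which would explain the failures under load.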
The basic considerations for the customer are the following:
1. Is the file greater than 1 MB in size? If so, using the parent is always a good idea.
   a. A file extension that matches our approved extensions is required to make this work.
   b. MIME type does not matter; only normal cache-control header rules apply.
2. If the file is smaller than 1 MB, then parent caches will not help performance.
   a. jpg, css, and png files (for example) all go directly to origin.
3. Is the file downloaded often enough to keep it in cache?
   a. If the file is downloaded only a couple of times per day, it is not a good fit and should not use the CDN in the first place.
   b. Somewhere in the neighborhood of 50 to 100 requests per hour per file is a good number to shoot for. Anything less than that is probably best served from origin.
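Those rules of thumb can be summed up in a small helper. The thresholds come straight from the email above; the approved-extension set is a caller-supplied placeholder, since the real list isn't public:

```python
def parent_cache_fit(size_bytes, ext, requests_per_hour, approved_exts):
    """Apply the support email's rules of thumb (heuristics, not guarantees)."""
    # Rule 3: infrequently requested files get evicted and shouldn't use the CDN.
    if requests_per_hour < 50:
        return "serve from origin (too few requests to stay cached)"
    # Rules 1 and 2: parent tier only helps for large files with approved extensions.
    if size_bytes >= 1_000_000 and ext.lower() in approved_exts:
        return "good fit for parent caching"
    return "edge caching only (parent tier will not help)"
```

For example, `parent_cache_fit(5_000_000, ".ptdb", 200, {".zip"})` lands in the last bucket, which matches the behavior the support engineers described.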
Just to emphasize: the implementation of Azure CDN is not publicly documented. The "parent level" mentioned by MSFT support was used to troubleshoot your specific case. It's not recommended for others to rely on the "how best you can utilize the CDN service" guidance you posted, because it is subject to change.
As for the specific issue you mentioned, since you've already opened a ticket with our support, please follow up with them. I'll close this thread for now. Thanks for your understanding, and you're welcome to post the solution here if you finally get this issue resolved.