Write-behind cache, coalescing, and batching
Thursday, 12 February 2009, 00:04
I recall a write-behind feature being mentioned for CTP 3. Is there any more info on how this is planned to be implemented? How configurable will it be? I'd imagine each object would need to implement some sort of interface that would include a Store method to serialize it to the database, along with a Store(ICacheable[] objects) overload to support custom batch storage code, or something like that?
I am interested in a write-behind cache for two objectives:
A) Coalescing multiple updates to the same object into a single update
B) Batching updates for multiple objects to reduce round-trips
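To make the two objectives concrete, here is a minimal sketch in Python. It is not the Velocity/AppFabric API: the class name `WriteBehindBuffer` and the `store_batch` callback are hypothetical, and the buffer is single-threaded and in-memory only. It shows repeated updates to one key collapsing into a single pending write (coalescing) and all pending writes going to the store in one call (batching).

```python
from typing import Any, Callable, Dict, List, Tuple

Item = Tuple[str, Any]


class WriteBehindBuffer:
    """Hypothetical in-memory write-behind buffer (illustration only)."""

    def __init__(self, store_batch: Callable[[List[Item]], None]) -> None:
        # store_batch is an assumed callback that persists a list of (key, value) pairs.
        self._store_batch = store_batch
        self._dirty: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Coalescing: a later update to the same key replaces the pending one.
        self._dirty[key] = value

    def flush(self) -> int:
        # Batching: every pending item is handed to the store in a single call.
        if not self._dirty:
            return 0
        batch = list(self._dirty.items())
        self._store_batch(batch)  # if this raises, the pending items are kept
        self._dirty.clear()
        return len(batch)


# Usage: three updates, two carts, one round-trip.
calls: List[List[Item]] = []
buffer = WriteBehindBuffer(calls.append)
buffer.put("cart:1", {"apples": 1})
buffer.put("cart:1", {"apples": 3})  # overwrites the pending cart:1 update
buffer.put("cart:2", {"pears": 2})
flushed = buffer.flush()
```

In this sketch the database would see one call carrying two items, rather than three separate writes.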
Here's a scenario that I would like to implement. I'll use the usual shopping cart example. I'd like to store a shopping cart on the cache grid and update it as needed with new items, quantity changes, etc. I don't want to pester the database with storing all of these changes immediately so having the object in the grid will solve this need and multiple updates will naturally be coalesced until it's time to write the cart to the database.
Now I would like the write-behind cache to be configurable to allow me to decide how quickly my data needs to reach the database. I'd always like the option to simply exit the callback without doing my write; then, the next time the write-behind interval arrives, I'd once again get the callback for this object and have the option to write or skip. I'd need cache metadata such as the date the object was first added to the cache, the date it was last updated, a dirty bit to tell if changes have been made since it was last written to the database, and some sort of incrementing version identifier (CacheItemVersion in CTP2 seems fine for this).
I'm using the ASP.Net Health Monitoring System in my site and I really like the flexibility offered by its buffering system, with normal flush and urgent flush counts and intervals. I think something similar would be nice for the caching system, where I could designate which objects would qualify as urgent and which as normal. Shopping carts for submitted orders I would qualify as urgent and would want written to the database within 1 minute, so they appear in a user's order history and enter processing ASAP. Shopping carts that are not yet completed I'd only want written to the database every 10 minutes or so, just in case the grid outright died (highly available or not, someone can kick the power plug to a whole rack, etc.).
For certain objects it would be beneficial to specify that they should be flushed to database when evicted for whatever reason (TTL expired, memory pressure, etc).
Also, it would be beneficial to optionally be able to remove the item from the cache in the write-behind callback to Store after it's been persisted to the database. For instance, we wouldn't need completed order carts hanging out in the grid any more. Without that feature objects would sit until their TTL expired which could be a long time for a cart (3 hours maybe?).
As for batching, there are a lot of options here. There would be a need to set a batch size property somewhere, since we wouldn't want to batch every possible object at once; that could cause other issues, like escalating locking problems in the DB if too many rows get locked by one update. Only objects of the same type should appear in a batch. In addition, it might be interesting to support batching by tag templates, for instance to batch items for the same department at once. The actual writing would be accomplished either by the callback calling the Store(ICacheable[] objects) method or by SQL TDS protocol-level batching, like the SqlDataAdapter supports, to cut down on per-call database overhead and round-trips.
Our application makes very many tiny calls to the DB to read/write (hundreds of thousands a minute). Each call on its own uses minuscule resources, but together they add up to 50% of our overall load. We are really looking forward to batching and coalescing support.
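The batching rules described above (same type only, capped by a batch size) could look roughly like the following Python sketch. The function name `batches_by_type` and the `batch_size` parameter are made up for illustration; nothing here reflects how the caching product actually forms its batches.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, List


def batches_by_type(items: List[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield batches that hold at most batch_size items of a single type."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Group items by their concrete type so a batch never mixes types.
    groups: Dict[type, List[Any]] = defaultdict(list)
    for item in items:
        groups[type(item)].append(item)

    # Slice each group into chunks no larger than batch_size.
    for group in groups.values():
        for start in range(0, len(group), batch_size):
            yield group[start:start + batch_size]


# Usage: ints and strs stand in for two object types (say, carts and orders).
pending = [1, "a", 2, "b", 3]
result = list(batches_by_type(pending, batch_size=2))
```

Grouping by a tag instead of by type would only change the grouping key, so the same shape would cover the "same department" case.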
- Edited by Sideout, Thursday, 12 February 2009, 00:53: updated with more details
All Replies
Tuesday, 28 February 2012, 13:14
I recall a write-behind feature being mentioned for CTP 3. Is there any more info on how this is planned to be implemented? How configurable will it be? I'd imagine each object would need to implement some sort of interface that would include a Store method to serialize it to the database, along with a Store(ICacheable[] objects) overload to support custom batch storage code, or something like that?
[Rohit: It's already available in the v1.1 release of Caching. You can play with the bits to get a better idea. In a nutshell, you need to add a custom provider at a named cache level, which will take care of translating the key/value semantics of the cache into the Db. You don't have to do it on a per object level.]
I am interested in a write-behind cache for two objectives:
A) Coalescing multiple updates to the same object into a single update
B) Batching updates for multiple objects to reduce round-trips
[Rohit: Yes, both are supported in the Write Behind design, and the batch interval is configurable.]
[Rohit: Went through your scenario. Thanks for providing such a detailed explanation of what you are trying to accomplish. I am mostly able to follow the scenario and have given some guidance below. Feel free to get back for further queries. There is one high level thing - AFCache supports WriteBehind configurations on a per named cache level and not a per object level. So, you need to look at the answers with that mindset. It's always possible to handle object specific semantics in the provider (which you would write)]
Here's a scenario that I would like to implement. I'll use the usual shopping cart example. I'd like to store a shopping cart on the cache grid and update it as needed with new items, quantity changes, etc. I don't want to pester the database with storing all of these changes immediately so having the object in the grid will solve this need and multiple updates will naturally be coalesced until it's time to write the cart to the database.
Now I would like the write-behind cache to be configurable to allow me to decide how quickly my data needs to reach the database. I'd always like the option to simply exit the callback without doing my write; then, the next time the write-behind interval arrives, I'd once again get the callback for this object and have the option to write or skip. I'd need cache metadata such as the date the object was first added to the cache, the date it was last updated, a dirty bit to tell if changes have been made since it was last written to the database, and some sort of incrementing version identifier (CacheItemVersion in CTP2 seems fine for this).
[Rohit: You can configure the WritebehindInterval (the minimum interval for which the item will stay in the cache before it is handed to the Db). You can skip objects by returning the object back; the server treats this as a case wherein the object could not be written to the Db and will retry the next time. You can also configure how many times the server should retry for the same object before it gives up. By default, the server tries forever. You have access to the CacheItem properties, such as tags, version, etc., but we don't expose a property such as the first add time of the object. I think you can avoid it. The object is given to the provider only if there were changes made to it since the last time it was written to the Db. So, you won't get spurious items that haven't been changed.]
I'm using the ASP.Net Health Monitoring System in my site and I really like the flexibility offered by its buffering system, with normal flush and urgent flush counts and intervals. I think something similar would be nice for the caching system, where I could designate which objects would qualify as urgent and which as normal. Shopping carts for submitted orders I would qualify as urgent and would want written to the database within 1 minute, so they appear in a user's order history and enter processing ASAP. Shopping carts that are not yet completed I'd only want written to the database every 10 minutes or so, just in case the grid outright died (highly available or not, someone can kick the power plug to a whole rack, etc.).
[Rohit: The way to achieve this would be to put the normal and urgent items in different caches which have a different writebehindInterval configuration. You can store the submittedRequests in a different Named cache.]
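As a rough illustration of this two-cache arrangement, combined with the skip-and-retry behaviour described earlier in the reply, here is a Python sketch. Everything in it is an assumption made for illustration: `NamedCacheSketch`, the cache names, the provider signature (return the keys that were not written) and the choice to measure the interval from the first unwritten change are hypothetical and are not the AppFabric provider API. Time is passed in explicitly so the example is deterministic.

```python
from typing import Any, Callable, Dict, List, Tuple

Item = Tuple[str, Any]
# Hypothetical provider signature: takes the due items, returns keys it did NOT write.
Provider = Callable[[List[Item]], List[str]]


class NamedCacheSketch:
    """Toy named cache with its own write-behind interval (illustration only)."""

    def __init__(self, name: str, interval_s: float, provider: Provider) -> None:
        self.name = name
        self.interval_s = interval_s
        self.provider = provider
        self._pending: Dict[str, Tuple[float, Any]] = {}  # key -> (dirty_since, value)

    def put(self, key: str, value: Any, now: float) -> None:
        # Keep the time of the first unwritten change so that repeated
        # updates cannot postpone the write indefinitely (an assumption).
        dirty_since = self._pending[key][0] if key in self._pending else now
        self._pending[key] = (dirty_since, value)

    def tick(self, now: float) -> List[str]:
        # Hand over only items that have been dirty for at least the interval.
        due = [(k, v) for k, (since, v) in self._pending.items()
               if now - since >= self.interval_s]
        if not due:
            return []
        skipped = set(self.provider(due))
        written = [k for k, _ in due if k not in skipped]
        for k in written:
            del self._pending[k]  # skipped keys stay pending and are retried
        return written


def write_all(items: List[Item]) -> List[str]:
    return []  # pretend every item reached the database


# Urgent and normal carts live in separate caches with different intervals.
submitted = NamedCacheSketch("SubmittedCarts", 60, write_all)
open_carts = NamedCacheSketch("OpenCarts", 600, write_all)
submitted.put("cart:1", {"status": "submitted"}, now=0)
open_carts.put("cart:2", {"status": "open"}, now=0)
first = (submitted.tick(now=60), open_carts.tick(now=60))
second = open_carts.tick(now=600)
```

After one minute only the submitted cart is handed to the provider; the open cart follows at the ten-minute mark. A provider that returns a key instead of writing it would see that key again on a later tick.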
For certain objects it would be beneficial to specify that they should be flushed to database when evicted for whatever reason (TTL expired, memory pressure, etc).
Also, it would be beneficial to optionally be able to remove the item from the cache in the write-behind callback to Store after it's been persisted to the database. For instance, we wouldn't need completed order carts hanging out in the grid any more. Without that feature objects would sit until their TTL expired which could be a long time for a cart (3 hours maybe?).
[Rohit: Remove will cause a callback to the provider and you can choose to not remove the data from the Db. The default expiry is 10 mins, so it shouldn't ideally be a concern and you can configure this as well. ]
As for batching, there are a lot of options here. There would be a need to set a batch size property somewhere, since we wouldn't want to batch every possible object at once; that could cause other issues, like escalating locking problems in the DB if too many rows get locked by one update. Only objects of the same type should appear in a batch. In addition, it might be interesting to support batching by tag templates, for instance to batch items for the same department at once. The actual writing would be accomplished either by the callback calling the Store(ICacheable[] objects) method or by SQL TDS protocol-level batching, like the SqlDataAdapter supports, to cut down on per-call database overhead and round-trips.
[Rohit: You don't have to worry about batch sizes. Each object will be given to the provider only after it has stayed in the cache for a minimum of the WbInterval. The keys would be given in batches (determined by the server, usually with a 1 min gap since the last batch). The writing to SQL is taken care of by the provider code which you write, and you are free to do it any way you like.]
Our application makes very many tiny calls to the DB to read/write (hundreds of thousands a minute). Each call on its own uses minuscule resources, but together they add up to 50% of our overall load. We are really looking forward to batching and coalescing support.
[Rohit: Yeah, Writebehind-Read through should be useful for your scenario. Let us know your feedback on the same. I would suggest that you should take a look at the 1.1 release and the sample provider documentation to get a feel for the knobs available and how to play with them. Feel free to get back for further queries]
- Proposed as answer by rohit_msfte, Tuesday, 28 February 2012, 13:15
- Marked as answer by Sideout, Tuesday, 28 February 2012, 13:59