The first thing I notice is the sliding window aggregate requirement. Let's tackle that first.
//I want to calculate sum of this val per group (Id1+Id2)
// per second
// for the last 3 seconds.
//Break it into parts
// 1) per second --> Open a new window every second
// 2) for the last 3 seconds --> close the window after 3 seconds
var interval = TimeSpan.FromSeconds(1);
var window = TimeSpan.FromSeconds(3);
Observable.Timer(TimeSpan.Zero, interval), // Open a new window every 1 second
input, // Collect all the values
left => Observable.Timer(window), // Close the left window after 3 seconds
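The fragment above only shows the first few arguments, so here is a minimal, self-contained sketch of how those pieces could fit together. It assumes the arguments belong to Rx.NET's `GroupJoin` operator (from the System.Reactive package), that values have no duration of their own, and that each window is reduced with `Sum`. The `input` sequence here is made-up dummy data, not the real source, and the per-group (Id1+Id2) part is left out.

```csharp
using System;
using System.Reactive;
using System.Reactive.Linq;

public static class Program
{
    public static void Main()
    {
        // Hypothetical dummy input: 0, 1, 2, ... one value every 250 ms.
        IObservable<int> input = Observable
            .Interval(TimeSpan.FromMilliseconds(250))
            .Select(i => (int)i);

        var interval = TimeSpan.FromSeconds(1);
        var window = TimeSpan.FromSeconds(3);

        IObservable<int> sums = Observable
            .Timer(TimeSpan.Zero, interval)           // Open a new window every second
            .GroupJoin(
                input,                                // Collect all the values
                left => Observable.Timer(window),     // Close each window after 3 seconds
                right => Observable.Empty<Unit>(),    // Assumption: values have no duration
                (left, values) => values.Sum())       // One sum per window, emitted when it closes
            .Merge();                                 // Flatten the per-window sums into one stream

        using (sums.Subscribe(Console.WriteLine))
        {
            Console.ReadLine();                       // Keep the process alive while windows fire
        }
    }
}
```

Because each window stays open for 3 seconds and a new one opens every second, up to three windows overlap at any moment. After the first 3 seconds a sum is emitted roughly once per second, covering the values seen in the 3 seconds before it.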
Running that against some dummy data, it seems to work fine. I then pull that out into a custom operator.