Failovers Between Primary and Secondary and Back Again

    Question

  • I need to fail over between a primary and a secondary of anything using Rx. For example, one may have a primary cluster to connect to and, if it's down, fall back to a local cluster. However, as soon as the primary is back up, go back to it.

    In my case, the cluster is messaging middleware, but you can apply the same technique to web services, databases, etc. If UPS is down, use FedEx. As soon as it's back up, switch back. So assuming you have a list of addresses of any remote services, failing over seems easy, but I'm more curious about different techniques for going back to the primary when it's up. The only way to know that it's up, though, is the original way to connect to it. There's no special API to query whether it's up. If an exception is thrown, you know it's down.

    Appreciate any samples using Rx.



    • Edited by TimJohnson1 Tuesday, October 22, 2013 5:16 AM
    Monday, October 21, 2013 4:45 AM

Answers

  • This is an interesting problem. You can't just use Concat+Repeat, as you want the secondary sequence to terminate when the primary publishes new values.

    Perhaps something like this would be useful?

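    // LINQPad snippet. Needs the System.Reactive package and the namespaces
    // System.IO, System.Reactive.Disposables and System.Reactive.Linq.
    void Main()
    {
        // On failure, concurrently reconnect to 1 (primary) and 2 (secondary). Take 2 until 1 produces a value.
        // However, 1 is actually a recursive call to the Failover operator.

        PrimarySource().Failover(SecondarySource()).Dump();
    }

    // Define other methods and classes here

    // Produces 0, 1, 2 at two-second intervals, then fails.
    public IObservable<long> PrimarySource()
    {
        return Observable.Interval(TimeSpan.FromSeconds(2))
                         .Take(3)
                         .Concat(Observable.Throw<long>(new IOException("System is unavailable?!")));
    }

    // Produces 0, -1, -2, ... every 250 ms.
    public IObservable<long> SecondarySource()
    {
        return Observable.Interval(TimeSpan.FromMilliseconds(250)).Select(i => i * -1);
    }

    public static class ObsEx
    {
        public static IObservable<T> Failover<T>(this IObservable<T> primary, IObservable<T> secondary)
        {
            IObservable<T> recurse = Observable.Create<T>(o =>
            {
                // Re-subscribe to the primary (with failover) and share that one subscription.
                var loop = primary.Failover(secondary).Publish();
                // Relay the secondary until the primary yields a value, then relay the primary.
                var continuation = secondary.TakeUntil(loop)
                                            .Merge(loop)
                                            .Subscribe(o.OnNext);
                return new CompositeDisposable(continuation, loop.Connect());
            });

            // Note: OnErrorResumeNext also continues when primary completes normally;
            // primary.Catch(recurse) would fail over on errors only.
            return primary.OnErrorResumeNext(recurse);
        }
    }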

    In this LINQPad snippet, I have two sequences: the primary produces the values [0,1,2] and then calls OnError; the secondary produces decrementing values starting from 0.

    This still leaves unanswered the question of what should happen when the secondary goes down.

    Regards

    Lee 


    Lee Campbell http://LeeCampbell.blogspot.com

    • Proposed as answer by LeeCampbell Monday, October 21, 2013 1:34 PM
    • Unproposed as answer by TimJohnson1 Monday, October 21, 2013 4:07 PM
    • Marked as answer by TimJohnson1 Monday, October 21, 2013 4:39 PM
    Monday, October 21, 2013 1:32 PM

All replies

  • How would you refactor the above when the primary and secondary are not streams? The primary will be called once per request, but you may have thousands of requests per second.
    Monday, October 21, 2013 8:12 PM
  • I would see this as a separate problem to what I thought the original problem was.

    It seems that you cannot know whether the primary is back up, so you would always have to try the primary first and then the secondary. This means that you would just call

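    // Catch falls back only on error; OnErrorResumeNext would also call the
    // secondary after the primary completes successfully.
    primary.MakeWebRequest()
           .Catch(secondary.MakeWebRequest())
           .Subscribe(....);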

    However, you can see that this would scale awfully if the primary was down often or for extended periods of time, since every request pays for a failed call to the primary first. If scale is important, then you will need better infrastructure. Systems like my-channels' Nirvana offer connection status, push notifications and clustering failover built in at a software level (it all just runs on a JVM). You could potentially look to hardware to help too.
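
    A minimal sketch of the per-request approach, under some assumptions: CallPrimary and CallSecondary are hypothetical stand-ins for the real service calls, and each returns a single-value sequence. It uses Catch so that the secondary is called only when the primary fails; OnErrorResumeNext would also continue with the secondary after the primary completes normally.

```csharp
using System;
using System.IO;
using System.Reactive.Linq;

public static class PerRequestFailover
{
    // Hypothetical primary call: always fails here, to simulate an outage.
    static IObservable<string> CallPrimary(string request)
    {
        return Observable.Throw<string>(new IOException("Primary is unavailable"));
    }

    // Hypothetical secondary call: always succeeds.
    static IObservable<string> CallSecondary(string request)
    {
        return Observable.Return("secondary handled " + request);
    }

    // Try the primary for every request; fall back to the secondary only on error.
    // Defer ensures the primary is called once per subscription.
    public static IObservable<string> Send(string request)
    {
        return Observable.Defer(() => CallPrimary(request))
                         .Catch<string, Exception>(ex => CallSecondary(request));
    }

    public static void Main()
    {
        Send("order-42").Subscribe(
            response => Console.WriteLine(response),
            ex => Console.WriteLine("Both failed: " + ex.Message));
    }
}
```

    Every request still pays for the failed primary call before the fallback happens, which is the scaling cost described above.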

    If you can think of a way to solve this with the normal observer pattern and your current infrastructure, then you will also be able to create a (probably nicer) solution with Rx. However, if you can't, then Rx won't be a magic potion for you.


    Lee Campbell http://LeeCampbell.blogspot.com

    Monday, October 21, 2013 10:14 PM
  • What if, while you are on the secondary, you had the primary try connecting once a second and then publish a notification to switch back? Would that be possible within Rx?
    Tuesday, October 22, 2013 5:14 AM
  • In your original question you specified that

    The only way you know that it's up though, is the original way to connect to it. There's no special api to query if it's up. If an exception is thrown, you know it's down.

    So are you suggesting now that there is a hook on the API that allows the consumer to know when the service is available? If not then maybe you are going to apply some sort of policy e.g.

    • If the primary throws and forces a failover to the secondary more than 2 times in a row, then consider the primary down. Re-evaluate the primary's status every second by hitting some sort of health check endpoint. If the health check passes, publish that the primary is healthy.
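
    The "re-evaluate the status periodically" half of such a policy could be sketched roughly as below. This is only an illustration under assumptions: probe is a hypothetical function that performs one health check and yields a single value on success or an error on failure, and the PrimaryHealth and Status names are made up for this example. Counting consecutive failures is left out.

```csharp
using System;
using System.Reactive;
using System.Reactive.Linq;

public static class PrimaryHealth
{
    // Runs the (hypothetical) probe once per period, maps success to true and
    // failure to false, and reports only changes in status.
    public static IObservable<bool> Status(Func<IObservable<Unit>> probe, TimeSpan period)
    {
        return Observable.Interval(period)
            .Select(tick => Observable.Defer(probe)
                                      .Select(ok => true)
                                      .Catch<bool, Exception>(ex => Observable.Return(false)))
            .Concat()                 // run the probes one after another
            .DistinctUntilChanged();  // suppress repeated "still down" / "still up"
    }

    public static void Main()
    {
        // Simulated probe: fails on the first two attempts, then succeeds.
        var attempts = 0;
        Func<IObservable<Unit>> probe = () =>
            ++attempts < 3
                ? Observable.Throw<Unit>(new InvalidOperationException("still down"))
                : Observable.Return(Unit.Default);

        Status(probe, TimeSpan.FromMilliseconds(100))
            .Do(up => Console.WriteLine(up ? "primary is up" : "primary is down"))
            .Take(2)   // one "down" notification, then one "up" notification
            .Wait();   // block until the sequence completes
    }
}
```

    Concat keeps the probes sequential; if a slow probe should be abandoned when the next tick arrives, Switch could be used instead.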

    The key part of this question is still not related to Rx, but to your infrastructure and your policy for determining when to use the primary and when to use the secondary.

    In this scenario, I would suggest having one (outer) observable sequence that returns the connection to use (primary or secondary), which would then provide you with an (inner) observable sequence that is your web response.

    Something like

    from connection in connectionRepository.CurrentConnection
    from response in connection.GetWebResponse()
    select response;

    The connection repository would then deal with identifying which connection/serviceClient is the most appropriate to use. Here, CurrentConnection would probably be a BehaviorSubject<T> internally.
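
    One possible shape for such a repository is sketched below. Everything here is an assumption made for illustration: the IConnection interface, the FakeConnection test double, the Send helper, and the idea that a primaryIsUp sequence is supplied by whatever health policy is chosen. Take(1) is used so that each request reads the current connection once rather than re-issuing the request whenever the connection changes.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical abstraction over a primary or secondary endpoint.
public interface IConnection
{
    IObservable<string> GetWebResponse(string request);
}

public sealed class FakeConnection : IConnection
{
    private readonly string _name;
    public FakeConnection(string name) { _name = name; }

    public IObservable<string> GetWebResponse(string request)
    {
        return Observable.Return(_name + " answered " + request);
    }
}

public sealed class ConnectionRepository : IDisposable
{
    private readonly BehaviorSubject<IConnection> _current;
    private readonly IDisposable _subscription;

    // primaryIsUp is assumed to come from the chosen health policy.
    public ConnectionRepository(IConnection primary, IConnection secondary, IObservable<bool> primaryIsUp)
    {
        _current = new BehaviorSubject<IConnection>(primary); // start on the primary
        _subscription = primaryIsUp
            .DistinctUntilChanged()
            .Select(up => up ? primary : secondary)
            .Subscribe(_current.OnNext);
    }

    // Outer sequence: the connection to use right now, and every later change.
    public IObservable<IConnection> CurrentConnection
    {
        get { return _current.AsObservable(); }
    }

    // Inner sequence: the response from whichever connection is current.
    public IObservable<string> Send(string request)
    {
        return from connection in CurrentConnection.Take(1)
               from response in connection.GetWebResponse(request)
               select response;
    }

    public void Dispose()
    {
        _subscription.Dispose();
        _current.Dispose();
    }
}

public static class Demo
{
    public static void Main()
    {
        var health = new Subject<bool>();
        using (var repo = new ConnectionRepository(
            new FakeConnection("primary"), new FakeConnection("secondary"), health))
        {
            repo.Send("A").Subscribe(Console.WriteLine); // served by the primary
            health.OnNext(false);                        // primary reported down
            repo.Send("B").Subscribe(Console.WriteLine); // served by the secondary
            health.OnNext(true);                         // primary reported healthy again
            repo.Send("C").Subscribe(Console.WriteLine); // served by the primary
        }
    }
}
```

    The repository does not decide when the primary is healthy; it only reacts to the status sequence it is given, so the policy stays a separate concern.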

    But you do still have to figure out how you would solve this independently of the fact you want to use Rx as the tool to solve it.


    Lee Campbell http://LeeCampbell.blogspot.com

    Tuesday, October 22, 2013 7:34 AM
  • Thanks Lee. Assuming you would go with your strategy, as it seems logical, how would the complete code look? I'm just curious at this point about the outer and inner observables with BehaviorSubject, and would love to try it out. If Rx is not the right solution, what would be an alternative? I would still love to see the outer/inner observable code, though.
    Wednesday, October 23, 2013 5:12 AM