SQL 2008 Cluster: the Cluster service on one node cannot start

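  • Question

  • Environment: Windows Server 2003 R2 x64 + SQL Server 2008 R2

    This morning I found that both nodes had a lot of KB updates waiting to be installed. After installing them, I found that once the Cluster service has started on either node, the Cluster service on the other node can no longer start normally.

    Error messages: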

    Cluster service is shutting down because the membership engine detected a membership event while trying to join the server cluster. Shutting down is the normal response to this type of event. Cluster service will restart per the Service Manager's recovery actions.

    Event IDs: 1073, 1173

    Cluster service was halted to prevent an inconsistency within the server cluster. The error code was 5890.


    If you haven't all the things you want,be grateful for the things you don't have that you didn't want.

    • Edited by Wison-Ho, May 30, 2011 6:27
    May 30, 2011 1:37

All replies

  • I read this Microsoft article:

    http://support.microsoft.com/kb/890761

    Among its steps, it says:

    1. Click to clear the Require NTLMv2 session security check box, and then click OK.

    However, I checked, and this option is not selected on either node.

    May 30, 2011 1:42
  • Also, the hotfix links in that article offer no x64 build, only x86 and IA64.
    May 30, 2011 2:22
  • Can anyone help with this?
    May 30, 2011 6:26
  • Any help would be appreciated.
    May 31, 2011 3:14
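  • I rebuilt the cluster today.

    The previous time, I configured the cluster right after installing the operating system and applied the patches afterward.

    That is when the Cluster service on the second machine failed to start.

    This time I installed all the patches first and then configured the cluster. Testing looks OK so far.

    However, I do not know whether the problem will come back in the future.

    May 31, 2011 10:37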
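  • Did your updates include any that target the Cluster service? If so, patching one node will certainly leave the other node unable to start, because the two nodes end up on different versions.

    When patching, it is best to stop all services on the cluster, patch all nodes at the same time, and then bring everything back.


    If you think my suggestion is useful, please rate it as helpful.
    If it has helped you to resolve the problem, please Mark it as Answer.
    http://twitter.com/7Kn1ghts

    May 31, 2011 13:40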
  • I do not know whether any of them targeted the Cluster service. There were 71 patches to install at the time.
    June 1, 2011 0:24
  • Do I really have to stop the Cluster service on both nodes every time I install Windows patches?

    Wouldn't that have too much impact?

    June 1, 2011 5:57
  • When patching, it is best to request downtime for SQL Server, because patching really is hard to control.

    Also, 71 patches in one go is too many. It is better to control which KBs get applied yourself, using WSUS/SMS, rather than pushing everything at once. Otherwise, when a problem like the one above occurs, you have to reconfigure the cluster, and reconfiguring a SQL Server cluster is not that simple.

    June 1, 2011 8:01
  • Do you mean that when patching, I should stop the Cluster service on both nodes and then apply the patches?

    And after patching, restart both node servers?

    Our environment simply does not allow for that.

    June 1, 2011 8:23
  • Then could I do it this way: install the patches on both nodes first, then restart the inactive node, and once it is back up, restart the other one?

    June 2, 2011 1:33
  • It comes to the same thing. As long as the patches on both nodes are identical, the Cluster service should not have a problem.

    June 2, 2011 9:24
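
The condition in the reply above (both nodes carrying the same patches) can be checked before restarting anything. What follows is only a minimal sketch in Python: it assumes you have already collected the list of installed KB IDs from each node by some means of your own, and the function name and the KB numbers are hypothetical placeholders, not real updates.

```python
def compare_patch_levels(node_a_kbs, node_b_kbs):
    """Report which KB IDs are present on only one of two nodes."""
    # Normalize so that "kb0000003 " and "KB0000003" count as the same update.
    a = {kb.strip().upper() for kb in node_a_kbs}
    b = {kb.strip().upper() for kb in node_b_kbs}
    return {
        "only_on_a": sorted(a - b),
        "only_on_b": sorted(b - a),
        "in_sync": a == b,
    }


# Hypothetical KB lists, for illustration only.
node1 = ["KB0000001", "KB0000002", "KB0000003"]
node2 = ["KB0000001", "kb0000003"]

report = compare_patch_levels(node1, node2)
print(report)
```

If `in_sync` is false, the `only_on_a` and `only_on_b` lists show which updates still need to be applied to the other node before the two are at the same level.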
  • But the patches only take effect after a restart, right?

    So after I restart either node, won't that node's patch level differ from the other node's?

    June 2, 2011 12:39
  • Yes, that is exactly my point. If you patch the OS (anything related to the Cluster service), every node has to be patched, and downtime is unavoidable. You cannot keep one side up and then fail over.

    For SQL Server patches, though, you can fail over. That said, those patches are generally cluster-wide, so you do not need to fail over.

    June 3, 2011 13:01
  • For Windows patching, it is better to patch the passive node first. Fail over all resources to the patched node and make sure everything works properly, then patch the remaining node.

    June 6, 2011 16:41
  • Let me describe what I did the last time I patched:

    I first installed the Windows patches on the passive node. It prompted for a restart, so I restarted it right away. After the restart, however, the Cluster service would not start normally. With no other option, I rebuilt the cluster.

    June 8, 2011 3:03
  • You should be able to find the reason it failed to start in the Windows event logs and the cluster log file.
    June 8, 2011 3:27
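
As a rough illustration of the suggestion above, the sketch below filters an exported event log for the two event IDs reported in the question (1073 and 1173). It assumes the log has been exported to CSV with an `EventID` column. That layout, the other column names, and the sample rows are all assumptions made up for this example, so adjust them to whatever your export actually contains.

```python
import csv
import io

# Event IDs reported in the original question.
WATCHED_EVENT_IDS = {1073, 1173}


def find_cluster_events(csv_text, watched=WATCHED_EVENT_IDS):
    """Return the rows of a CSV event log export whose EventID is in `watched`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        try:
            event_id = int(row["EventID"])
        except (KeyError, ValueError, TypeError):
            # Skip rows with a missing or non-numeric event ID.
            continue
        if event_id in watched:
            hits.append(row)
    return hits


# Made-up sample export, for illustration only.
sample = """TimeGenerated,EventID,Source,Message
2011-05-30 01:30:00,1173,ClusSvc,placeholder message
2011-05-30 01:30:01,1073,ClusSvc,placeholder message
2011-05-30 01:31:00,7036,Service Control Manager,placeholder message
"""

hits = find_cluster_events(sample)
for row in hits:
    print(row["TimeGenerated"], row["EventID"], row["Message"])
```

The matching entries only narrow down when the failure happened. The cluster log file for the same time window is where the detailed reason is more likely to appear.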
  • So you mean:

    When patching the OS of the cluster, I stop the Cluster service on both machines, apply the OS patches to both servers, shut both machines down, and then power them on one at a time. Is that right?

    And does SQL Server need to be stopped as well?

    I have been afraid of running into problems, so I have not dared to install any patches since the cluster was set up.

    August 1, 2011 6:35
  • That is not the case. We always patch the OS on the passive node first and reboot it, fail over all resources to the patched node, and then patch and reboot the other node.
    August 1, 2011 14:16
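
The ordering described in the reply above can be written down as a checklist. This is a minimal sketch in Python that only builds and prints the steps; it does not touch a cluster. The node names and the function name are hypothetical, and the verification steps are an added assumption based on the earlier advice to make sure everything works before moving on.

```python
def rolling_patch_plan(nodes, active_node):
    """Build an ordered checklist: patch passive nodes, fail over, patch the former active node."""
    if active_node not in nodes:
        raise ValueError(f"{active_node!r} is not one of {nodes!r}")
    passive = [n for n in nodes if n != active_node]
    if not passive:
        raise ValueError("need at least one passive node to fail over to")

    steps = []
    for node in passive:
        steps.append(f"patch and reboot {node}")
        steps.append(f"verify Cluster service is running on {node}")

    # Move the workload to an already patched node before touching the active one.
    target = passive[0]
    steps.append(f"fail over all resources from {active_node} to {target}")
    steps.append(f"verify resources are online on {target}")
    steps.append(f"patch and reboot {active_node}")
    steps.append(f"verify Cluster service is running on {active_node}")
    return steps


# Hypothetical two-node cluster where NODE1 currently owns the resources.
plan = rolling_patch_plan(["NODE1", "NODE2"], active_node="NODE1")
for number, step in enumerate(plan, 1):
    print(number, step)
```

Stopping at the first verification step that fails keeps the unpatched node serving the workload, which is the point of patching the passive node first.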
  • Follow this guide:

    http://support.microsoft.com/kb/958734

    August 1, 2011 14:59