Azure Stack DevKit Install - AzS-Sql01 Network Issues

  • Question

  • When running the installation for the latest DevKit version, AzS-SQL01 loses its ability to communicate with any other machine in the stack, including the host. It cannot ping its default gateway, nor is it reachable from any VM on the SDNSwitch.

    All other VMs seem accessible.

    Reran the installation 3 times using different options, same result.

    Looking for ideas on how to troubleshoot this one.


    Casper Pieterse - Snr. Solution Architect - Dimension Data

    Wednesday, July 19, 2017 12:57 PM

Answers

  • "5) Tried adding a new vNIC on the SDN network, no luck."

    For everyone's understanding: adding an adapter to the SDN network won't help. The installer uses the network controller to set the properties needed to reach the VXLAN network, and you can't set them manually. Removing the adapter leaves the VM dead in the water, so you have to redeploy.


    Cheers,

    Ruud
    Blog: AzureStack.Blog


    Thursday, August 3, 2017 7:17 AM

All replies

  • Hi Casper,

    We are investigating your issue and require some logs in order to continue troubleshooting.

    If you could, please email ascustfeedback@microsoft.com to get a workspace setup to upload your logs.  

     

    Make sure to use a work, organizational, or student address when contacting ascustfeedback@microsoft.com, and include the thread URL in the subject.

    https://aka.ms/GetAzureStackLogs :)

     

    We apologize for any inconvenience and appreciate your time and interest in Azure Stack.

    If you continue to experience any issues with the TP3 release, feel free to contact us.

     

     Thanks,


    Gary Gallanes

    Wednesday, July 19, 2017 3:29 PM
  • Hello Casper,

    We have set up a workspace for you to upload your logs and sent you an email with a link and detailed instructions for gathering and uploading them.

     Thanks,


    Gary Gallanes

    Thursday, July 20, 2017 5:09 PM
  • Hello Casper,

    We've received your logs and are performing analysis. We'll get back to you with next steps as soon as our analysis is complete.

     Thanks,


    Gary Gallanes

    Monday, July 24, 2017 11:39 PM
  • Hello Casper,

    Question: How long did it take for the SQL VM to reboot?

     

    Also, I sent you an email requesting some additional logs.

    Look forward to hearing back from you.

     

     Thanks,

    Gary


    Gary Gallanes

    Saturday, July 29, 2017 12:52 AM
  • Hi,

    It reboots fairly quickly (to the OS at least). Login takes a bit longer, as it tries to find a logon server, obviously with no luck.

    Sequence goes something like this:

    1) When I run the installer everything goes fine and the VMs come up.

    2) I run a continuous ping to the SQL VM and it replies just fine.

    3) The VM starts its installation process and joins the domain successfully.

    4) Somewhere in the installation process it just stops responding and all network access is lost. It can't even see its default gateway, nor is the ARP table populated.

    5) Tried adding a new vNIC on the SDN network, no luck.

    6) Disabled the FW on the VM, also no luck

    7) I'll have to go back and check (if required), but I think that if I add a second vNIC on the external network it does come up with an IP from my local DHCP server. It has been a couple of weeks, though, so I'm not 100% sure.
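
    The symptoms in step 4 (no gateway reachability, empty ARP table) can be captured from inside the VM in one pass. This is only a sketch: it assumes a console session on AzS-SQL01 and Windows PowerShell 5.1, and the variable names are illustrative.

    ```powershell
    # Run inside AzS-SQL01 from the Hyper-V console, since the network is down.
    # Find the first interface that has an IPv4 default gateway configured.
    $cfg = Get-NetIPConfiguration |
        Where-Object { $_.IPv4DefaultGateway } |
        Select-Object -First 1

    if ($cfg) {
        $gateway = $cfg.IPv4DefaultGateway.NextHop
        Write-Output "Default gateway on '$($cfg.InterfaceAlias)': $gateway"

        # Two echo requests to the gateway; failures surface as errors.
        Test-Connection -ComputerName $gateway -Count 2
    } else {
        Write-Output "No interface has an IPv4 default gateway configured."
    }

    # Neighbor (ARP) cache: an empty or all-Unreachable list matches step 4.
    Get-NetNeighbor -AddressFamily IPv4 |
        Where-Object { $_.State -ne 'Permanent' } |
        Format-Table InterfaceAlias, IPAddress, LinkLayerAddress, State

    # Link state as the OS sees it.
    Get-NetAdapter | Format-Table Name, Status, LinkSpeed, MediaConnectionState
    ```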


    Casper Pieterse - Snr. Solution Architect - Dimension Data

    Saturday, July 29, 2017 1:16 PM
  • I have the same issue. I have been trying to find the bug for three weeks now, but I can't find a fault on my side. I think that after the domain join something blocks AzS-SQL01 out of the internal Azure Stack network. What puzzles me is that it can still get an IP address from somewhere, yet if I run ipconfig /renew it says the link is down.
    If I run Get-NetAdapter it says all is fine and the link is up and running.

    That is all very weird...



    Sunday, July 30, 2017 8:33 PM
  • I have submitted logs to Microsoft and will submit the additional information requested tomorrow morning. 

    Can you share the physical hardware you are using? 


    Casper Pieterse - Snr. Solution Architect - Dimension Data

    Sunday, July 30, 2017 10:29 PM
  • Customer is redeploying to mitigate the issue. We were unable to collect additional logs for RCA.

    Gary Gallanes

    Thursday, August 3, 2017 1:17 AM
  • The problem is a networking issue with the "Azure VFP Switch Extension".
    As soon as I disable the extension in the Virtual Switch settings, the AZS-SQL01 machine has a working network and I can proceed with the installation.
    I am not sure if this extension is needed later on, but I am 100% sure that it is the cause of the connectivity problems of the AZS-SQL01 machine.
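
    The same check can be done from PowerShell on the host instead of the Virtual Switch settings dialog. This is a rough sketch, assuming the switch is named "SdnSwitch" and that the extension name contains "VFP" (the exact display name may differ per build). Other replies in this thread indicate that the SDN network depends on settings applied through the network controller, so treat disabling the extension as a diagnostic step rather than a fix.

    ```powershell
    # Run in an elevated PowerShell session on the ASDK host.
    # "SdnSwitch" is the switch name used elsewhere in this thread.
    $switchName = "SdnSwitch"

    # Show every extension bound to the switch and whether it is enabled.
    Get-VMSwitchExtension -VMSwitchName $switchName |
        Format-Table Name, Enabled, Running

    # Diagnostic only: disable any extension whose name contains "VFP".
    # The wildcard is an assumption about the extension's display name.
    Get-VMSwitchExtension -VMSwitchName $switchName |
        Where-Object { $_.Name -like "*VFP*" } |
        Disable-VMSwitchExtension

    # To turn it back on afterwards:
    # Get-VMSwitchExtension -VMSwitchName $switchName |
    #     Where-Object { $_.Name -like "*VFP*" } |
    #     Enable-VMSwitchExtension
    ```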
    Saturday, August 12, 2017 11:13 PM
  • Ok, now I've got it. It is the QLogic BCM5709c network driver.

    Once I searched for new drivers via Windows Update, everything went smoothly.

    Also, the AZS-SQL01 machine is directly accessible after the driver update, even with the Azure VFP Switch Extension switched on.

    I had an API registration error at 60.140.141, "An error occurred while trying to make an API call to Resource Manager: Unable to connect to the remote server"; this also went away after the driver update.
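
    To see which driver a host is actually running before and after such an update, the physical adapters can be listed with their driver details. A small sketch, run on the host; which version counts as "fixed" is not stated in this thread, so compare against the vendor's release notes.

    ```powershell
    # List physical NICs with the driver currently bound to each.
    Get-NetAdapter -Physical |
        Format-Table Name, InterfaceDescription, DriverProvider, DriverVersion, DriverDate
    ```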

    Sunday, August 13, 2017 8:42 PM
  • The latest QLogic drivers address an issue where the Interface Description and PNP ID can differ, which is what breaks the installation. If you run the following commands and see that the InterfaceDescription values in the two outputs don't match, you need to update the driver:

    Get-NetAdapter | Format-Table -Property Name,InterfaceDescription
    Get-VMSwitch -Name "SdnSwitch" | Format-Table -Property Name,NetAdapterInterfaceDescriptions
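
    The comparison can also be scripted rather than read off by eye. A minimal sketch, assuming an elevated PowerShell session on the host and the "SdnSwitch" name used above; the message text is illustrative.

    ```powershell
    # Descriptions Hyper-V recorded for the switch's uplink adapter(s).
    $switch = Get-VMSwitch -Name "SdnSwitch"

    # Descriptions the OS currently reports for its adapters.
    $osDescriptions = (Get-NetAdapter).InterfaceDescription

    foreach ($desc in $switch.NetAdapterInterfaceDescriptions) {
        if ($osDescriptions -contains $desc) {
            Write-Output "Match: '$desc'"
        } else {
            # The switch references a description no adapter reports:
            # this is the mismatch that calls for a driver update.
            Write-Output "MISMATCH: switch references '$desc'"
        }
    }
    ```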

    Thanks,

    -Steve


    Steve Linehan | Principal PM Manager | Microsoft AzureCAT

    Sunday, August 13, 2017 11:43 PM
  • Hi All,

    Just a quick clarification -  I was able to get it to work but I had to be quite particular about the drivers I used.  This driver from Dell seemed to be the only one that would work.

    https://www.dell.com/support/home/au/en/audhs1/Drivers/DriversDetails?driverId=5TY3G

    Prior to installing that driver - the SLB Mux would only advertise the SLB Manager Endpoint and no additional VIPs.  Post update, the output from the AzS-BGPNAT01 VM was as follows:

    If you're not seeing the above - then you still have a problem with the driver.

    Monday, February 19, 2018 2:11 AM