I am fed up with Azure and their ridiculous availability

  • Question

  • My Azure instance (footydisqussql.cloudapp.net), which hosts an important SQL database, has been unavailable for half a week.
    The monitor tab says it is online and running, and I have restarted it several times, but when I connect to port 3306 (SQL) or SSH, it just times out.

    When I click on the support button, I am only allowed to ask questions related to billing and not technical questions, because that requires a paid membership. I am an MSDN and BizSpark member.

    Microsoft, tell me how I am supposed to report to you that my instance is not working? I haven't touched that instance for 2 months and suddenly it goes down? That is surely not caused by me, so isn't it your responsibility? So I have to pay to get support for issues caused by someone other than me?

    I do like Microsoft and try to stay away from Google products like Gmail and Google Docs. But if you are going to be like this, I won't hesitate to check out EC2 or Google cloud services.



    :)

    Tuesday, January 20, 2015 6:09 PM

Answers

  • Linux has fsck, which is similar to chkdsk on Windows. When Linux boots, it
    performs an fsck.

    Example of errors in the serial logs where fsck is required:

    Checking all file systems.
    [/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/sda1
    /dev/sda1 contains a file system with errors, check forced
    /dev/sda1: Inodes that were part of a corrupted orphan linked list found.
    /dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY
    EXT4-fs (sda1): INFO: recovery required on readonly filesystem
    EXT4-fs (sda1): write access will be enabled during recovery
    EXT4-fs warning (device sda1): ext4_clear_journal_err:4531: Filesystem error recorded from previous mount: IO failure
    EXT4-fs warning (device sda1): ext4_clear_journal_err:4532: Marking fs in need of filesystem check.


    If, however, the file system is “clean”, you will see entries comparable to
    the following. Note that additional data disks are also present on this VM,
    as shown by the device /dev/sde1.

    Checking all file systems.
    [/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/sda1
    /dev/sda1: clean, 65405/1905008 files, 732749/7608064 blocks

    [/sbin/fsck.ext4 (1) -- /tmp] fsck.ext4 -a /dev/sdc1
    [/sbin/fsck.ext4 (2) -- /backup] fsck.ext4 -a /dev/sde1
    /dev/sdc1: clean, 12/1048576 files, 109842/4192957 blocks
    /dev/sde1: clean, 51/67043328 files, 4259482/268173037 blocks


    If there is corruption on file systems, manual intervention will be required to fix it.
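
    As a rough illustration of telling the two cases apart, the sketch below scans a saved copy of the serial log for the message fragments quoted above. The function name fsck_state and the idea of piping a saved log into it are assumptions made for this example; they are not an Azure or fsck feature.

```shell
#!/bin/sh
# Hypothetical helper: classify a boot/serial log read from stdin.
# It only looks for the message fragments quoted in the examples above.
fsck_state() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q -e 'UNEXPECTED INCONSISTENCY' \
                                      -e 'check forced' \
                                      -e 'in need of filesystem check'; then
        echo "needs-fsck"
    elif printf '%s\n' "$log" | grep -q ': clean, '; then
        echo "clean"
    else
        echo "unknown"
    fi
}

# Example with one line taken from each kind of log.
printf '%s\n' '/dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY' | fsck_state
printf '%s\n' '/dev/sda1: clean, 65405/1905008 files, 732749/7608064 blocks' | fsck_state
```

    The error patterns are checked first, so a log that reports one clean disk and one damaged disk is still reported as needing fsck.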

    To clean up a corrupt file system, attach the OS disk from the VM
    exhibiting the problem to a working VM and repair it there.

    A = Original VM
    B = Temp VM

    1) Stop VM A via the management portal (If this is the last VM in a cloud service without a Reserved VIP you will lose the current IP address of the Cloud Service)

    2) Create a temporary VM in the same cloud service if you wish to retain the VIP.
    Alternatively, if you also want to delete the cloud service, just create a temp
    VM.

    3) Delete VM A, but select “keep the attached disks”

    4) Once the lease is cleared, “Attach disk” from VM A to VM B via the Azure Portal

    5) On VM B, locate the drive you have attached

    a. First find the name of the drive to fsck on VM B by looking in the relevant log file

    grep SCSI /var/log/kern.log (Ubuntu)
    grep SCSI /var/log/messages (CentOS, SUSE, Oracle)

    Example

    kernel: [ 9707.100572] sd 3:0:0:0: [ sdc ] Attached SCSI disk

    b. You will not be able to mount the file system, so check and
    double-check that you are going to run fsck on the correct unmounted file
    system and not on one of your mounted file systems.

    c. fdisk -l will return the attached disks; combine its output with the
    output of df -h

    fdisk -l

    Disk /dev/sdc: 32.2 GB, 32212254720 bytes
    255 heads, 63 sectors/track, 3916 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x000c23d3
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1   *           1        3789    30432256   83  Linux
    /dev/sdc2            3789        3917     1024000   82  Linux swap / Solaris

    df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1        29G  2.2G   25G   9% /
    tmpfs           776M     0  776M   0% /dev/shm
    /dev/sdb1        69G  180M   66G   1% /mnt/resource

    d. sda1 and sdb1 are mounted and sdc1 is not, so we will
    run fsck against /dev/sdc1

    fsck -yM /dev/sdc1

    fsck from util-linux-ng 2.17.2
    e2fsck 1.41.12 (17-May-2010)
    /dev/sdc1: clean, 57029/1905008 files, 672768/7608064 blocks

    6) Detach the disk from VM B via the management portal

    7) Recreate the original VM (Create VM from Gallery, select My Disks); you will see
    the disk referring to VM A

    As an alternative, download the .vhd file, fix the corruption in a local
    Hyper-V environment, and reload it into Azure.
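
    Steps 5a to 5d above can be sketched as two small helpers. This is only an illustration: the function names attached_disks and unmounted_partitions are hypothetical, and the sample input is the output quoted in this post rather than data read from a live VM.

```shell
#!/bin/sh
# Hypothetical helpers for steps 5a-5d; the names are illustrative only.

# 5a: read kernel log lines on stdin, print the names of attached SCSI disks.
attached_disks() {
    sed -n 's/.*\[ *\(sd[a-z][a-z]*\) *] Attached SCSI disk.*/\1/p'
}

# 5c/5d: print the partitions listed in file $1 that are not listed as
# mounted in file $2 (one device path per line in both files).
unmounted_partitions() {
    grep -vxF -f "$2" "$1"
}

# Example using the values from the output above.
printf '%s\n' 'kernel: [ 9707.100572] sd 3:0:0:0: [ sdc ] Attached SCSI disk' | attached_disks

tmp=$(mktemp -d)
printf '%s\n' /dev/sda1 /dev/sdb1 /dev/sdc1 > "$tmp/partitions"
printf '%s\n' /dev/sda1 /dev/sdb1 > "$tmp/mounted"
unmounted_partitions "$tmp/partitions" "$tmp/mounted"
rm -r "$tmp"
```

    On a real VM the first helper could be fed from grep SCSI /var/log/kern.log, and the two lists could be built from the device columns of fdisk -l and df. Whatever the helpers print, still confirm the device by eye before running fsck against it.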



    Monday, February 2, 2015 12:16 PM

All replies

  • I can understand the frustration when things don't work. My understanding is that port 1433 is the default for SQL. Have you changed it to 3306 for your SQL? In that case, I assume that port 3306 is open? In addition, I would suggest that you set up some kind of health monitoring on your SQL instance so that you know as soon as it goes down, especially if it is in production. It is also a pity that MSDN and BizSpark don't offer free support, but that is a business arrangement that Microsoft has created for its customers.

    Frank

    Tuesday, January 20, 2015 6:45 PM
  • It's MySQL, not MSSQL, so the default port is 3306.

    Did you read what I wrote? Everything worked for a couple of months (meaning SQL was available, ports were open, etc.). I didn't touch the server and suddenly it's down.

    What am I supposed to do now? Just wait until it fixes itself? Pay for premium support? Is this how Microsoft wants customers to be treated?


    :)

    Tuesday, January 20, 2015 6:47 PM
  • I think you just have to treat this in a business-like manner. At this point you are saying that it is Microsoft's fault that your SQL goes down, so you deserve free support. But Microsoft may not agree with you. My experience was that I created a support ticket and worked it out with Microsoft. And they would not charge me when they determined it was their problem.

    Have you tried creating a new SQL with the same data to see if it works?


    Frank

    Tuesday, January 20, 2015 7:09 PM
  • How do I even get access to the current SQL database? If I could get the dump I can create new instance and import it there.

    :)

    Tuesday, January 20, 2015 7:10 PM
  • Your SQL is in a VM. You can remote into your VM and get to your SQL data.

    Frank

    Tuesday, January 20, 2015 7:13 PM
  • Of course I know that. I can't even SSH into it. Port 22 is open but the connection times out. It's a Linux machine, so I cannot use Windows RDP to get into it.

    Anyway, this doesn't explain why the VM, without being touched, suddenly refuses to answer SSH and MySQL connections.


    :)

    Tuesday, January 20, 2015 7:24 PM
  • Is your data stored in Azure storage? If it is in the VM and you have no access whatsoever, I think you need to talk to support.

    Frank

    Tuesday, January 20, 2015 8:03 PM
  • That's the issue: I cannot reach out to support because I don't have a premium Azure account. Just crazy. I hope this thread appears in all the search engines, because this is a travesty.


    :)

    Tuesday, January 20, 2015 9:42 PM
  • Hi,

    Thank you for your question.

    I am trying to involve someone familiar with this topic to further look at this issue. There might be some time delay. Appreciate your patience. Thank you for your understanding and support.

    Best regards,

    Susie

    Please remember to mark the replies as answers if they help and unmark them if they provide no help. If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Monday, January 26, 2015 1:52 AM
    Moderator
  • Hi,

    To better analyze this issue, would you please provide the subscription ID, deployment ID, and name of the VM to us?

    Best regards,

    Susie


    Monday, January 26, 2015 4:09 AM
    Moderator
  • And how do I even pay for a support ticket? If it just told me up front "BizSpark/MSDN doesn't allow you to ask technical questions, but for xx$ you can", then I would easily pay. It feels like I have to register a new monthly subscription or something.

    :)

    Monday, January 26, 2015 10:38 AM
  • Hi,

    It seems that Azure Credits may not be used to purchase Azure support plans for BizSpark members.

    You can click "Get support" in the link below and log in, it will check the support plan of the subscription automatically:

    http://azure.microsoft.com/en-us/support/options/

    If you don't have an active subscription you will need to contact general customer support to let them create a support ticket for you: http://support.microsoft.com/gp/customer-service-phone-numbers?wa=wsignin1.0

    In addition, I have noted your deployment ID and subscription ID, so I will delete your reply to hide the information. Thanks for your understanding.

    Best regards,

    Susie

    Tuesday, January 27, 2015 3:38 AM
    Moderator
  • Hi,

    According to the support engineer's analysis, the issue was caused by the file system:


    Begin: Running /scripts/init-bottom ... done.
    [   10.503790] random: nonblocking pool is initialized
    * Starting Mount filesystems on boot [ OK ]
    * Starting Fix-up sensitive /proc filesystem entries [ OK ]
    * Starting Populate and link to /run filesystem [ OK ]
    * Stopping Fix-up sensitive /proc filesystem entries [ OK ]
    * Stopping Populate and link to /run filesystem [ OK ]
    * Stopping Track if upstart is running in a container [ OK ]
    [   14.718378] EXT4-fs (sda1): Couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead
    An error occurred while mounting /.
    keys:Press S to skip mounting or M for manual recovery
    * Starting Initialize or finalize resolvconf [ OK ]


    Please delete the VM (keep the VHD), attach the VHD to a working VM, then run e2fsck -f /dev/sd_1, replacing the underscore with the letter the drive is assigned. Typically, it should be “c”.
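
    To make the placeholder concrete, here is a minimal sketch that builds the command from a drive letter. The helper name e2fsck_command is hypothetical, and it only prints the command so the device can be double-checked before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: turn a drive letter into the e2fsck command to run.
# It prints the command instead of running it, so the device can be checked first.
e2fsck_command() {
    case "$1" in
        [a-z]) printf 'e2fsck -f /dev/sd%s1\n' "$1" ;;
        *)     echo "expected a single lowercase drive letter" >&2; return 1 ;;
    esac
}

# Example for a disk attached as sdc.
e2fsck_command c
```

    Adding -n (e2fsck -fn) opens the file system read-only and answers "no" to every prompt, which can serve as a dry run before the real repair.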

    Best regards,

    Susie


    Thursday, January 29, 2015 9:16 AM
    Moderator
  • Hi, thanks for the reply.

    So the issue was caused by a fault in the file system? How do I even unmount and mount a VHD? Look, I am not a Linux guy. In addition, I have a MySQL database with important user details, and losing it would be a massive blow to me and would most likely shut down my website entirely.

    If possible, I can pay whatever it costs to fix this.


    :)

    Thursday, January 29, 2015 9:20 AM