503 Errors with Large PDF file

  • Question

  • My Azure web sites are in the Eastern US. I am getting 503 errors when producing large PDF reports (500+ pages). Am I hitting some sort of limit within Azure Websites? I have the site configured as "basic" with 1 instance and "medium" core.

    I first thought it was my stored procedure on SQL Azure, but it runs in less than 1 minute. If I change the parameters of the report (to reduce the # of records returned), I can get the report to run. However, I need this large PDF to be created.

    Thanks,

    Mark


    • Edited by TexasPete4 Thursday, January 22, 2015 12:16 AM
    Wednesday, January 21, 2015 11:22 PM

All replies

  • Here is a long previous thread dealing with this. A change was made to allow a subset of scenarios to work on Basic/Standard, but there are still scenarios that don't.
    Thursday, January 22, 2015 12:39 AM
    Moderator
  • David,

    I ramped the web site up to Standard - 3 instances - which should have placed the web site in a non-shared environment. However, I still get the 503 error.

    According to the previous thread, the issue was with shared permissions. Basic/Standard modes should be in dedicated mode.

    The weird thing about this issue is that there are no entries in the error logs. I captured an error early today, but since then I get nothing new added to the error log files.

    Getting really frustrated with this issue. Someone at MS needs to put this issue on the front burner. SSRS is being taken away, yet we have very limited alternatives.

    Mark

    Thursday, January 22, 2015 1:49 AM
  • Unfortunately, as I mentioned above, Basic/Standard enables only a subset of those scenarios, while others remain non-functioning today.

    Could you share some test site name that has the issue? You can create some dummy site with the same bits if you don't want to share your real site name.

    Thursday, January 22, 2015 1:52 AM
    Moderator
  • Is there someplace I can send you the login info? the web site is "crmnademo.azurewebsites.net" this is my demo site - so the data is not an issue. However, I would like to send you the login credentials somewhere else.
    Thursday, January 22, 2015 1:57 AM
  • The site name is all we need to investigate, thanks. I will ask others to take a look. But to set expectations, please be aware that if you are in fact hitting one of the cases that are known to not work (even in basic/standard), then the timeline to get it fixed may be somewhat far out.
    Thursday, January 22, 2015 5:15 AM
    Moderator
  • Could you please repro the issue once on this site, and respond with the approximate time that you tried (UTC preferably)? That will help find the matching logs for the correct time period.

    When I just go to the site myself, it asks for credentials, so I don't think I can cause the issue to happen.

    thanks,
    David

    Thursday, January 22, 2015 6:31 PM
    Moderator
  • David,

    At 8:43AM EST (13:43 GMT) I started the process that throws the error. At 8:47AM EST (13:47 GMT) the process threw the error.

    Thanks,

    Mark


    --- as a follow up to this error - the next page I tried to display on the website took 2-3 minutes to display. It really looks like it is hitting a processing limit even on the non-shared sites.
    • Edited by TexasPete4 Friday, January 23, 2015 1:56 PM
    Friday, January 23, 2015 1:52 PM
  • Twice in that range, I'm seeing it launch vbc.exe, which itself launches cvtres.exe. However, I'm not seeing any errors in our log.

    Note that there are no processing limits in non-shared sites (beyond VM limits), so that cannot be the issue.

    Would it be possible for you to build a repro site that we could run ourselves, ideally free of any secret? i.e. something that is just enough to demonstrate the basic issue.

     
    Wednesday, January 28, 2015 12:12 AM
    Moderator
  • David - give me someplace to give you a login. This is my demo site. So I don't care if you get into it. I just don't want everyone to get into it :)

    BTW - I just ran the report with a large range of data and it threw an error AND locked down the site. 19:52 EST.

    • Edited by TexasPete4 Wednesday, January 28, 2015 12:52 AM
    Wednesday, January 28, 2015 12:49 AM
  • You can email me at david.ebbo (at) microsoft.com.
    Wednesday, January 28, 2015 12:55 AM
    Moderator
  • After some investigation, I see what's going on. There is a 230 second (i.e. a little less than 4 mins) timeout for requests that are not sending any data back. After that, the client gets the 500 you saw, even though in reality the request is allowed to continue server side.

    If you are expecting to do server side processing that lasts this long, I would suggest using an alternate flow that works in a more async way. i.e. let the user queue up the report generation, and then allow them to download it when it's complete. Or as an alternative, you could upload the result to blob storage, and have it be available there for pickup.
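A minimal sketch of that queue-then-download flow, written in Python only to keep it short and runnable (the site in this thread is ASP.NET, so treat it as an illustration of the pattern, not drop-in code). The names `ReportQueue`, `submit`, `status`, and `fetch` are hypothetical, and the in-memory dictionary is an assumption made for brevity; a real site would more likely write the finished file to disk or blob storage.

```python
import threading
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional


class ReportQueue:
    """Runs slow report builds off the request thread (hypothetical helper)."""

    def __init__(self, max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}
        self._lock = threading.Lock()

    def submit(self, build_report: Callable[[], bytes]) -> str:
        """Queue the work and return at once with an ID the client can poll."""
        job_id = uuid.uuid4().hex
        future = self._pool.submit(build_report)
        with self._lock:
            self._jobs[job_id] = future
        return job_id

    def status(self, job_id: str) -> str:
        """Return 'unknown', 'running' (also covers still-queued), 'failed', or 'done'."""
        with self._lock:
            future = self._jobs.get(job_id)
        if future is None:
            return "unknown"
        if not future.done():
            return "running"
        return "failed" if future.exception() is not None else "done"

    def fetch(self, job_id: str) -> Optional[bytes]:
        """Return the finished report bytes, or None if it is not ready."""
        if self.status(job_id) != "done":
            return None
        with self._lock:
            return self._jobs[job_id].result()
```

With this shape, the request that starts the report, the request that checks on it, and the request that downloads it each return quickly, so none of them comes near the 230 second limit. Note that in-memory state like this would not be shared across multiple instances of a site.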

    I would argue that even without this timeout, the majority of users would themselves give up on the request if they saw it spinning in their browser for that long. So the async pattern makes for a better user experience.

    Does that seem feasible?

    • Proposed as answer by AnhKhoa1306 Tuesday, May 15, 2018 4:12 AM
    Wednesday, January 28, 2015 12:42 PM
    Moderator
  • Again David – thanks for your help.

    I understand what is going on now. The users run these management reports once a day/week/month and know that they take a while to process. It is basically taking everything that happens with an order and giving the user a summary. So I think they are happy to wait for a few minutes to get them (it used to take 6 weeks to get the information).

    With that said, I need to formulate a plan of action to prevent these errors. So here are a few questions I need to get answered:

    1. In Azure, is there a way to temporarily bypass the 230 second timeout?
      • Is this a setting I have control over?
      • If I upgrade to another scale, is this timeout lengthened?
      • Is this the same timeout for Jobs that run? (I have the same issue with Jobs that run at night)
    2. You suggest letting the user queue up the reports. I think I understand what you are suggesting, but how would that work in a web browser? SQL actually runs the stored procedure in 23 seconds and returns control to the code behind. However, the next step is to process the data set through RDLC report viewer. We have no control during this process and no way of returning data back from the server so it does not timeout.
      • Any ideas here?
      • If we figured out how to allow the report to run – how do we alert the user when it is completed? In the code behind, how do you pass control to another page and keep the report processing?
    3. Is there a document that explains how to do the investigating that you had to do in order to figure this out? I could not get Azure to log the errors anywhere. I have tried the application error logging and sometimes the errors get logged there and sometimes not. I would like to be able to investigate things without bothering other people :)
    4. If the reports actually run after the 230 second timeout has occurred – why does Azure ignore all other requests? When I caused the 500 error last night, I could not get the website to respond to any other request. Everything on the site seemed to be locked down until the process completed. Is this the way Azure is supposed to act?

    I know there are a lot of questions, but I really need to get this resolved. There are a lot of reports in the system that take a while to process and I cannot go live until this is resolved.

    Thanks again for your help.

    Mark

    Wednesday, January 28, 2015 3:17 PM
  • Some answers:

    1. This timeout cannot be bypassed, and is the same in all the site modes. It also applies to triggered webjobs (i.e. manual or scheduled), but not to continuous webjobs. Basically, it applies anytime an http request is involved.
    2. You can use a technique like the one discussed here. Basically, your http thread just starts a background task and comes back. You can use a unique ID for each background process. This way, you can have another URL that the client can check for status on the processing. When the processing is complete, it can save the file in some folder, allowing it to be retrieved from some other request.
    3. There was actually an entry in your raw IIS log file that looked like: "2015-01-28 02:38:41 crmnademo GET /GPSheetsRpt.aspx - 0 - - - - - crmnademo 500 121 0 291 1162 230010". 230010 is how long it took (230s). 121 is the sub code, and this one means that the request was aborted due to that timeout.
    4. I actually don't understand why this would happen, and would need to investigate. Did all requests fail, even those to simple static files?
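For point 3, a small sketch of how the quoted raw IIS log entry could be checked for the timeout signature (status 500, sub code 121). It is in Python purely for illustration. The column order in `ASSUMED_FIELDS` is an assumption that happens to line up with the 19 values in that one line; a real log file declares its own order in a `#Fields:` header, which should be preferred. The function names are hypothetical.

```python
from typing import Dict, Mapping, Sequence

# Assumed column order; a real log declares its own in a "#Fields:" header line.
ASSUMED_FIELDS = (
    "date time s-sitename cs-method cs-uri-stem cs-uri-query s-port "
    "cs-username c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host "
    "sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken"
).split()

# Per the reply above: sub code 121 means the request was aborted by the timeout.
TIMEOUT_SUBSTATUS = "121"


def parse_log_line(line: str, fields: Sequence[str] = ASSUMED_FIELDS) -> Dict[str, str]:
    """Split one space-delimited log entry into a field-name -> value mapping."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} values, got {len(values)}")
    return dict(zip(fields, values))


def is_timeout_abort(entry: Mapping[str, str]) -> bool:
    """True when the entry carries the 500 / 121 pair described in the reply."""
    return entry["sc-status"] == "500" and entry["sc-substatus"] == TIMEOUT_SUBSTATUS


# The entry quoted in point 3.
line = (
    "2015-01-28 02:38:41 crmnademo GET /GPSheetsRpt.aspx - 0 - - - - - "
    "crmnademo 500 121 0 291 1162 230010"
)
entry = parse_log_line(line)
seconds = int(entry["time-taken"]) / 1000  # time-taken is in milliseconds
print(entry["cs-uri-stem"], is_timeout_abort(entry), seconds)
```

Filtering a whole log file with `is_timeout_abort` would list every request that ran into the 230 second limit, which may help identify other reports that need the background approach.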
    Wednesday, January 28, 2015 4:25 PM
    Moderator
  • David,

    Thanks for your answers. In order to get you some feedback immediately, I will answer #4 now.

    Once the 500 error hit, everything stopped working. I closed the window with the 500 error and tried to log out of my software. This process updates an activity table in SQL and then redirects the user to a “logout” page. However, nothing ever happened after I clicked on the logout option. The browser tab just spun and then eventually that window got a 500 error.

    You can probably replicate this issue by logging into my demo system again and running the report with a large date range. Once the 500 error hits, try going back to the original menu and click on logout.

    Thank you,

    Mark


    • Edited by TexasPete4 Wednesday, January 28, 2015 11:40 PM
    Wednesday, January 28, 2015 11:37 PM