Windows Store App PDF Document does not render when stream disposed after load

    Question

  • I have a working solution to load and render a PDF document from a byte array in a Windows Store App. Lately some users have reported out-of-memory errors though. As you can see in the code below there is one stream I am not disposing of. I've commented out the line. If I do dispose of that stream, then the PDF document does not render anymore. It just shows a completely white image. Could anybody explain why and how I could load and render the PDF document and dispose of all disposables?

        private static async Task<PdfDocument> LoadDocumentAsync(byte[] bytes)
        {
            using (var stream = new InMemoryRandomAccessStream())
            {
                await stream.WriteAsync(bytes.AsBuffer());
        
                stream.Seek(0);
        
                var fileStream = RandomAccessStreamReference.CreateFromStream(stream);
                var inputStream = await fileStream.OpenReadAsync();
                try
                {
                    return await PdfDocument.LoadFromStreamAsync(inputStream);
                }
                finally
                {
                    // Do not dispose here; otherwise the PDF does not load / render correctly. Not disposing, though, may cause memory issues.
                    // inputStream.Dispose();
                }
            }
        }
    

    And the code to render the PDF:

        private static async Task<ObservableCollection<BitmapImage>> RenderPagesAsync(
            PdfDocument document, 
            PdfPageRenderOptions options)
        {
            var items = new ObservableCollection<BitmapImage>();
        
            if (document != null && document.PageCount > 0)
            {
                for (var pageIndex = 0; pageIndex < document.PageCount; pageIndex++)
                {
                    using (var page = document.GetPage((uint)pageIndex))
                    {
                        using (var imageStream = new InMemoryRandomAccessStream())
                        {
                            await page.RenderToStreamAsync(imageStream, options);
                            await imageStream.FlushAsync();
        
                            var renderStream = RandomAccessStreamReference.CreateFromStream(imageStream);
                            using (var stream = await renderStream.OpenReadAsync())
                            {
                                var bitmapImage = new BitmapImage();
                                await bitmapImage.SetSourceAsync(stream);
                                items.Add(bitmapImage);
                            }
                        }
                    }
                }
            }
        
            return items;
        }
    

    As you can see I am using this RandomAccessStreamReference.CreateFromStream method in both of my methods. I've seen other examples that skip that step and use the InMemoryRandomAccessStream directly to load the PDF document or the bitmap image, but I've not managed to get the PDF to render correctly then. The images will just be completely white again. As I mentioned above, this code does actually render the PDF correctly, but does not dispose of all disposables.


    Thursday, November 06, 2014 11:11 AM

Answers

  • I assume LoadFromStreamAsync(IRandomAccessStream) does not parse the whole stream into the PdfDocument object but instead only parses the main PDF dictionaries and holds a reference to the IRandomAccessStream.

    This actually is the sane thing to do: why parse the whole PDF into separate objects (a possibly very expensive operation resource-wise) if the user eventually only wants to render one page, or even merely wants to query the number of pages?

    Later on, when other methods of the returned PdfDocument are called, e.g. GetPage, these methods try to read the additional data they need for their task, e.g. for rendering, from the stream. Unfortunately, in your case that happens after the finally { inputStream.Dispose(); } has already run.

    You have to postpone the inputStream.Dispose() until all operations on the PdfDocument are finished. That means some, hopefully minor, architectural changes to your code. Moving the LoadDocumentAsync code so that it wraps the rendering logic, either inside the RenderPagesAsync method or in its caller, probably suffices.

    • Marked as answer by Remco Blok Saturday, January 10, 2015 2:23 PM
    Friday, November 07, 2014 10:02 AM
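
    One way to postpone the disposal as described above is to let a small wrapper own the streams for as long as the document is in use. The following is only a sketch under assumptions: the class name PdfDocumentLease and its members are hypothetical and do not appear in the thread, and it keeps the RandomAccessStreamReference step because the question reports that rendering only worked with it.

    ```csharp
    using System;
    using System.Runtime.InteropServices.WindowsRuntime; // AsBuffer()
    using System.Threading.Tasks;
    using Windows.Data.Pdf;
    using Windows.Storage.Streams;

    // Hypothetical helper (not from the thread): keeps the backing streams alive
    // while the PdfDocument is in use and releases them together afterwards.
    public sealed class PdfDocumentLease : IDisposable
    {
        private readonly IRandomAccessStream _memoryStream;
        private readonly IRandomAccessStream _inputStream;

        private PdfDocumentLease(
            IRandomAccessStream memoryStream,
            IRandomAccessStream inputStream,
            PdfDocument document)
        {
            _memoryStream = memoryStream;
            _inputStream = inputStream;
            Document = document;
        }

        public PdfDocument Document { get; private set; }

        public static async Task<PdfDocumentLease> LoadAsync(byte[] bytes)
        {
            var memoryStream = new InMemoryRandomAccessStream();
            IRandomAccessStream inputStream = null;
            try
            {
                await memoryStream.WriteAsync(bytes.AsBuffer());
                memoryStream.Seek(0);

                var reference = RandomAccessStreamReference.CreateFromStream(memoryStream);
                inputStream = await reference.OpenReadAsync();

                var document = await PdfDocument.LoadFromStreamAsync(inputStream);
                return new PdfDocumentLease(memoryStream, inputStream, document);
            }
            catch
            {
                // Loading failed, so nothing will own the streams: release them now.
                if (inputStream != null) inputStream.Dispose();
                memoryStream.Dispose();
                throw;
            }
        }

        // Call only after all GetPage / RenderToStreamAsync work has finished.
        public void Dispose()
        {
            _inputStream.Dispose();
            _memoryStream.Dispose();
        }
    }
    ```

    A caller would then wrap the rendering in using (var lease = await PdfDocumentLease.LoadAsync(bytes)) { ... } and pass lease.Document to RenderPagesAsync inside that block, so the streams are released only once every page has been rendered.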

All replies

  • Thanks for your response. I think you mean something like this:

            private static async Task<ObservableCollection<BitmapImage>> LoadDocumentAndRenderPagesAsync(byte[] bytes, PdfPageRenderOptions options)
            {
                using (var memoryStream = new InMemoryRandomAccessStream())
                {
                    await memoryStream.WriteAsync(bytes.AsBuffer());
    
                    memoryStream.Seek(0);
    
                    var fileStream = RandomAccessStreamReference.CreateFromStream(memoryStream);
                    using (var inputStream = await fileStream.OpenReadAsync())
                    {
                        var document = await PdfDocument.LoadFromStreamAsync(inputStream);
    
                        var items = new ObservableCollection<BitmapImage>();
    
                        if (document != null && document.PageCount > 0)
                        {
                            for (var pageIndex = 0; pageIndex < document.PageCount; pageIndex++)
                            {
                                using (var page = document.GetPage((uint)pageIndex))
                                {
                                    using (var imageStream = new InMemoryRandomAccessStream())
                                    {
                                        await page.RenderToStreamAsync(imageStream, options);
                                        await imageStream.FlushAsync();
    
                                        var renderStream = RandomAccessStreamReference.CreateFromStream(imageStream);
                                        using (var stream = await renderStream.OpenReadAsync())
                                        {
                                            var bitmapImage = new BitmapImage();
                                            await bitmapImage.SetSourceAsync(stream);
                                            items.Add(bitmapImage);
                                        }
                                    }
                                }
                            }
                        }
    
                        return items;
                    }
                }
            }

    Unfortunately the process crashes with an unhandled win32 exception at the

    await page.RenderToStreamAsync(imageStream, options);

    line. I've not managed to catch the exception in my app or attach the debugger to the crashing process.

    I wonder if I should just save the byte array to a file instead of keeping it in memory. I would then load the PDF document from the file. Perhaps I should even render the pages to images on the hard disk instead of in memory. I was hoping to avoid that though.

    Friday, November 07, 2014 12:10 PM
  • Here is a version that uses files instead of memory streams. It still crashes on the RenderToStreamAsync line. Seems like a bug in the Windows Runtime to me.

            private static async Task<ObservableCollection<BitmapImage>> LoadDocumentAndRenderPagesAsync(byte[] bytes, PdfPageRenderOptions options)
            {
                var items = new ObservableCollection<BitmapImage>();
                var folder = ApplicationData.Current.TemporaryFolder;
                var pdfFile = await folder.CreateFileAsync(Guid.NewGuid() + ".pdf", CreationCollisionOption.ReplaceExisting);
    
                await FileIO.WriteBytesAsync(pdfFile, bytes);
    
                var document = await PdfDocument.LoadFromFileAsync(pdfFile);
    
                if (document != null && document.PageCount > 0)
                {
                    for (var pageIndex = 0u; pageIndex < document.PageCount; pageIndex++)
                    {
                        var pngFile = await folder.CreateFileAsync(
                            Guid.NewGuid() + ".png", CreationCollisionOption.ReplaceExisting);
                        
                        using (var page = document.GetPage(pageIndex))
                        {
                            using (var imageStream = await pngFile.OpenAsync(FileAccessMode.ReadWrite))
                            {
                                await page.RenderToStreamAsync(imageStream, options);
                                await imageStream.FlushAsync();
                            }
                        }
    
                        var bitmapImage = new BitmapImage();
                        // Load the rendered PNG (not the source PDF) into the bitmap.
                        using (var renderStream = await pngFile.OpenAsync(FileAccessMode.Read))
                        {
                            await bitmapImage.SetSourceAsync(renderStream);
                        }
    
                        items.Add(bitmapImage);
                    }
                }
    
                return items;
            }

    Friday, November 07, 2014 3:47 PM
  • I guess the issue is with having a using block surrounding an async method (i.e. the stream is being disposed of before the rendering finishes). You could try the following and check whether it works. That way you should be able to make sure that the stream is only disposed of once the rendering has finished.

    var bitmapImage = new BitmapImage();
    var renderStream = await pngFile.OpenAsync(FileAccessMode.Read);
    await bitmapImage.SetSourceAsync(renderStream);
    items.Add(bitmapImage);
    renderStream.Dispose();

    Monday, November 10, 2014 9:59 AM
  • Thank you for your reply. Unfortunately my app crashes on the await page.RenderToStreamAsync(imageStream, options); line before it even gets to loading the bitmap. I think this is a bug in the Windows Runtime.
    Monday, November 10, 2014 1:51 PM
  • My crash was caused by an invalid value I had specified in the PdfPageRenderOptions. Once the PdfPageRenderOptions contained valid values, I got the pattern from my first reply to Jieng Sungdsg's reply working, the one that disposes of all disposables. Thanks.
    Saturday, January 10, 2015 2:27 PM
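
    The thread does not say which PdfPageRenderOptions value was invalid, so the following is only an illustrative sketch of setting the options explicitly rather than a reconstruction of the fix. The class RenderOptionsSketch, the method CreateForPage and the targetWidth parameter are hypothetical names; the properties set on PdfPageRenderOptions are the documented ones.

    ```csharp
    using System;
    using Windows.Data.Pdf;
    using Windows.Graphics.Imaging;
    using Windows.UI;

    internal static class RenderOptionsSketch
    {
        // Hypothetical helper: builds render options for one page, scaled to a
        // requested pixel width while keeping the page's aspect ratio.
        public static PdfPageRenderOptions CreateForPage(PdfPage page, uint targetWidth)
        {
            if (page == null) throw new ArgumentNullException("page");
            if (targetWidth == 0) throw new ArgumentOutOfRangeException("targetWidth");
            if (page.Size.Width <= 0) throw new ArgumentException("Page has no width.", "page");

            var scale = targetWidth / page.Size.Width;

            return new PdfPageRenderOptions
            {
                DestinationWidth = targetWidth,
                DestinationHeight = (uint)Math.Round(page.Size.Height * scale),
                // Encode as PNG so the output stream can be used as a BitmapImage source.
                BitmapEncoderId = BitmapEncoder.PngEncoderId,
                BackgroundColor = Colors.White
            };
        }
    }
    ```

    Building the options per page inside the using (var page = ...) block, just before RenderToStreamAsync, keeps every value derived from the page actually being rendered.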