How to convert PDF to TIFF

  • Question

  • I am trying to convert a multi-page PDF file to multiple TIFF files. I tried ImageMagick; it works but is extremely slow. Their suggestion is to use Ghostscript directly, which I tried, and I ran into the issue below.

    I tried Ghostscript.NET.Rasterizer. It works very well, but the DPI is set automatically and my value is ignored. See here.

    I looked into PdfSharp, but as I understand it, it doesn't convert PDF to TIFF. See here.

    I looked into the PdfDocument class, but I couldn't work out how to use it even after looking at their sample.

    Purchasing a product (library) is not an option. Is there any solution?


    Monday, February 17, 2020 4:40 PM

All replies

  • Unfortunately you are limited, as .NET doesn't support PDFs out of the box and third-party support generally costs money. I can tell you that Ghostscript is a viable solution, as we use it in one of our backend systems, but it is going to be slow because Ghostscript is an interpreted system. In our experience you can speed it up by playing around with the settings until you find the right ones for your needs.

    Unfortunately this is the C# forums so we are not going to be any help actually using any third party libraries. You'll need to post questions in their forums.


    Michael Taylor http://www.michaeltaylorp3.net

    Monday, February 17, 2020 5:19 PM
    Moderator
  • I looked into the PdfDocument class, but I couldn't work out how to use it even after looking at their sample.

    For example, this test converts a PDF file into .PNG files (tested on Windows 10, C#/Winforms) =>

    private async void ConvertPdfDocument(string sFile, string sPathDest)
    {
        try
        {
            var storagePdfFile = await Windows.Storage.StorageFile.GetFileFromPathAsync(sFile);
            Windows.Data.Pdf.PdfDocument pdfDocument;
            pdfDocument = await Windows.Data.Pdf.PdfDocument.LoadFromFileAsync(storagePdfFile, "");
            uint nPageIndex = 0;
            while (nPageIndex < pdfDocument.PageCount)
            {
                using (Windows.Data.Pdf.PdfPage pdfPage = pdfDocument.GetPage(nPageIndex))
                {
                    Windows.Storage.StorageFolder storageFolder = await Windows.Storage.StorageFolder.GetFolderFromPathAsync(sPathDest);
                    Windows.Storage.StorageFile storageFile = await storageFolder.CreateFileAsync("Page_" + nPageIndex.ToString("00") + ".png", Windows.Storage.CreationCollisionOption.GenerateUniqueName);
                    if (storageFile != null)
                    {
                        // The using block disposes the stream even if rendering throws
                        using (Windows.Storage.Streams.IRandomAccessStream randomStream = await storageFile.OpenAsync(Windows.Storage.FileAccessMode.ReadWrite))
                        {
                            await pdfPage.RenderToStreamAsync(randomStream);
                            await randomStream.FlushAsync();
                        }
                    }
                }
                nPageIndex++;
            }
        }
        catch (System.Exception ex)
        {
            System.Windows.Forms.MessageBox.Show("Error : " + ex.Message, "Error", System.Windows.Forms.MessageBoxButtons.OK, System.Windows.Forms.MessageBoxIcon.Error);
        }
    }



    Function test :

    ConvertPdfDocument(@"E:\Welcome.pdf", @"E:\Test");





    • Edited by Castorix31 Monday, February 17, 2020 5:32 PM
    • Proposed as answer by simonb549 Monday, February 17, 2020 6:45 PM
    Monday, February 17, 2020 5:28 PM
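  • Since the question asks for TIFF at a chosen DPI, here is a minimal sketch of how the Windows.Data.Pdf sample above might be adapted. It assumes the same two references (Windows.winmd and System.Runtime.WindowsRuntime.dll) and assumes that PdfPage.Size is expressed in 1/96-inch units, so the pixel size for a target DPI is size * dpi / 96. The class name PdfToTiffSketch, the helper PixelsForDpi, and the Page_0001.tif naming are made up for illustration. It only controls the pixel dimensions; whether the TIFF also carries matching resolution metadata has not been checked.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Data.Pdf;
using Windows.Graphics.Imaging;
using Windows.Storage;
using Windows.Storage.Streams;

public static class PdfToTiffSketch
{
    // Assumption: page sizes are in 1/96-inch units, so scaling by dpi / 96
    // gives the pixel size at the requested DPI.
    public static uint PixelsForDpi(double dips, double dpi)
    {
        return (uint)Math.Round(dips * dpi / 96.0);
    }

    // Renders every page of pdfPath into its own TIFF file inside destFolder.
    public static async Task ConvertAsync(string pdfPath, string destFolder, double dpi)
    {
        StorageFile pdfFile = await StorageFile.GetFileFromPathAsync(pdfPath);
        PdfDocument document = await PdfDocument.LoadFromFileAsync(pdfFile);
        StorageFolder folder = await StorageFolder.GetFolderFromPathAsync(destFolder);

        for (uint i = 0; i < document.PageCount; i++)
        {
            using (PdfPage page = document.GetPage(i))
            {
                var options = new PdfPageRenderOptions
                {
                    // Ask the renderer for TIFF instead of the default PNG.
                    BitmapEncoderId = BitmapEncoder.TiffEncoderId,
                    DestinationWidth = PixelsForDpi(page.Size.Width, dpi),
                    DestinationHeight = PixelsForDpi(page.Size.Height, dpi)
                };

                StorageFile outFile = await folder.CreateFileAsync(
                    "Page_" + (i + 1).ToString("0000") + ".tif",
                    CreationCollisionOption.ReplaceExisting);

                using (IRandomAccessStream stream = await outFile.OpenAsync(FileAccessMode.ReadWrite))
                {
                    await page.RenderToStreamAsync(stream, options);
                    await stream.FlushAsync();
                }
            }
        }
    }
}
```

    It could be called as PdfToTiffSketch.ConvertAsync(@"E:\Welcome.pdf", @"E:\Test", 300). A US Letter page (816 x 1056 units) would then render at 2550 x 3300 pixels.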
  • Can you provide an example of how you used GhostScript.
    Monday, February 17, 2020 5:37 PM
  • Do I need to install any packages for Windows.Storage? I get "The name 'Windows' does not exist in the current context." I am using a C# ASP.NET console project. Thank you.
    Monday, February 17, 2020 5:46 PM
  • I can't because I don't have access to that code. As I said, it is partially dependent upon the documents you are converting. If you are converting PDFs that are original documents then conversion settings are different than if you're converting PDFs that were themselves images.

    However, we used SO and other links to get some ideas, so I've posted some here. They are old.
    https://www.codeproject.com/Articles/317700/Convert-a-PDF-into-a-series-of-images-using-Csharp
    https://stackoverflow.com/questions/11517659/how-to-use-ghostscript-for-converting-pdf-to-image
    https://www.codeproject.com/Questions/614695/Convert-PDF-to-TIFF-using-Csharp-NET

    You'll still need to tweak the settings until you get it as fast as possible. The conversion will still be slow, but there isn't much you can do about it. We also found that installing the latest Ghostscript release sped things up; we had been running an older version.

    Also note that some people have found that simply programmatically uploading the file to one of the many online converters and downloading it again was faster. However you are then tied to their site and will likely need to do screen scraping.


    Michael Taylor http://www.michaeltaylorp3.net

    Monday, February 17, 2020 5:54 PM
    Moderator
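  • For readers who want to try Ghostscript directly, a minimal sketch of calling the console executable from C# follows. It is not the code from either poster's system. The switches used (-dNOPAUSE, -dBATCH, -dSAFER, -sDEVICE, -r, -sOutputFile with a %04d page pattern) are standard Ghostscript options, but the device choice (tiff24nc), the class and method names, and the location of the executable are assumptions to adjust. The -r switch is where the DPI is set, and -sDEVICE is one of the settings worth experimenting with for speed (for example tiffg4 for black-and-white scans).

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class GhostscriptSketch
{
    // Builds the command line: one TIFF per page, at the requested resolution.
    public static string BuildArguments(string pdfPath, string outputPattern, int dpi)
    {
        return "-dNOPAUSE -dBATCH -dSAFER -sDEVICE=tiff24nc"
             + " -r" + dpi
             + " \"-sOutputFile=" + outputPattern + "\""
             + " \"" + pdfPath + "\"";
    }

    // gsExePath is the full path to the Ghostscript console executable
    // (commonly named gswin64c.exe on 64-bit Windows; the location is an assumption).
    public static void Convert(string gsExePath, string pdfPath, string outputDir, int dpi)
    {
        // Ghostscript expands %04d to the page number: 0001, 0002, ...
        string pattern = Path.Combine(outputDir, "page-%04d.tif");

        var startInfo = new ProcessStartInfo
        {
            FileName = gsExePath,
            Arguments = BuildArguments(pdfPath, pattern, dpi),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException(
                    "Ghostscript exited with code " + process.ExitCode);
            }
        }
    }
}
```

    Because the whole document is handled by a single Ghostscript process, this avoids loading every page into memory in the calling program, which may help with files in the 100-1000 page range.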
  • Have you tried using a virtual printer software?

    Instead of printing your pdf document on paper, it’ll print the pdf into the required format.

    OR you could use zamzar to do it.
    Monday, February 17, 2020 5:54 PM
  • It has to be done by an in-house program.
    Monday, February 17, 2020 6:00 PM
  • Do I need to install any packages for Windows.Storage? I get "The name 'Windows' does not exist in the current context." I am using a C# ASP.NET console project. Thank you.

    No, nothing to install

    Just the References to add :

     C:\Program Files (x86)\Windows Kits\10\UnionMetadata\Windows.winmd

     C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETCore\v4.5\System.Runtime.WindowsRuntime.dll

    Monday, February 17, 2020 6:42 PM
  • I can see the file when I browse to the path in Windows Explorer, but when I go to "Add references" I can't see the file. I am using ASP.NET, not Core. Does that make a difference? Thank you for your time.
    Monday, February 17, 2020 7:00 PM
  • I can see the file when I browse to the path in Windows Explorer, but when I go to "Add references" I can't see the file. I am using ASP.NET, not Core. Does that make a difference? Thank you for your time.

    I just click on "Browse..." then I add the path :

    Monday, February 17, 2020 8:06 PM
  • How do you define "extremely slow", and how large are the documents you're converting?  ImageMagick uses GhostScript to do the rendering.  With even a relatively complicated document, it takes less than a second.

    Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.

    Monday, February 17, 2020 8:07 PM
  • public void ExtractImagesFromPdf(string fileName, string fileNameResultDirectory)
    {
        MagickReadSettings settings = new MagickReadSettings();
        // Setting the density to 300 dpi will create an image with a better quality
        settings.Density = new Density(300);

        using (MagickImageCollection images = new MagickImageCollection())
        {
            // Add all the pages of the pdf file to the collection
            images.Read(fileName, settings);

            int page = 1;
            foreach (MagickImage image in images)
            {
                // Write each page as a separate PNG file: 00001.Png, 00002.Png, ...
                image.Format = MagickFormat.Png;
                image.Write(fileNameResultDirectory + "\\0000" + page + ".Png");
                page++;
            }
        }
    }

    This is the code I used, and it works fast with PDFs that have 10 or fewer pages. I am looking for something that can convert files in the range of 100-1000 pages. On their site forums they said it is better to use GS directly, which is what I am trying to do.

    • Marked as answer by Guest1993 Thursday, February 20, 2020 6:25 PM
    Monday, February 17, 2020 8:18 PM
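  • The code above reads every page into memory before writing anything, which is one plausible reason it degrades on large files. A minimal sketch of reading the PDF in batches instead, using the FrameIndex and FrameCount read settings of Magick.NET, is shown here. It writes TIFF rather than PNG. The class name, the batch helper, and the page-0001.tif naming are made up for illustration; pageCount is assumed to be known by the caller; and FrameIndex and FrameCount are assumed to accept int values, as in Magick.NET releases from around the time of this thread (newer releases may require a cast to uint). Whether batching is actually faster has not been measured; it mainly bounds memory use.

```csharp
using System;
using System.IO;
using ImageMagick;

public static class MagickBatchSketch
{
    // Number of pages to read in the batch that starts at firstPage (0-based).
    public static int BatchLength(int firstPage, int pageCount, int batchSize)
    {
        return Math.Max(0, Math.Min(batchSize, pageCount - firstPage));
    }

    // Converts the PDF to one TIFF per page, reading batchSize pages at a time.
    public static void Convert(string pdfPath, string outputDir, int pageCount, int batchSize)
    {
        for (int first = 0; first < pageCount; first += batchSize)
        {
            var settings = new MagickReadSettings
            {
                Density = new Density(300),   // render at 300 dpi
                FrameIndex = first,           // first page of this batch (0-based)
                FrameCount = BatchLength(first, pageCount, batchSize)
            };

            using (var images = new MagickImageCollection())
            {
                images.Read(pdfPath, settings);

                int page = first + 1;
                foreach (var image in images)
                {
                    image.Format = MagickFormat.Tiff;
                    image.Write(Path.Combine(outputDir, "page-" + page.ToString("0000") + ".tif"));
                    page++;
                }
            }
        }
    }
}
```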
  • I only see the file inside 10.0.18362.0

    Monday, February 17, 2020 8:21 PM
  • I only see the file inside 10.0.18362.0

    It is the same file, just another version

    You can see it has Windows.Storage.StorageFile and Windows.Data.Pdf.PdfDocument inside...

    Monday, February 17, 2020 11:39 PM
  • We use Spire.Pdf to convert and handle PDF files. It's not free, but it is cheaper than many other libraries. It also offers a free version with some page limitations, which you can check out if you're interested.



    Tuesday, February 18, 2020 1:28 AM
  • Hi Guest1993,
    You can use the sautinsoft.pdffocus assembly to convert PDF to TIFF.
    Please follow these steps:
    Right-click the project > Add > Manage NuGet Packages > Browse > enter sautinsoft.pdffocus > Install.
    Here is a code example you can refer to.

    //Convert PDF file to Multipage TIFF file
    SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
    //this property is necessary only for registered version
    //f.Serial = "XXXXXXXXXXX";
    string pdfPath = @"C:\Users\Desktop\Book6.pdf";
    string tiffPath = Path.ChangeExtension(pdfPath, ".tiff");
    f.OpenPdf(pdfPath);
    if (f.PageCount > 0)
    {
        f.ImageOptions.Dpi = 120;
        if (f.ToMultipageTiff(tiffPath) == 0)
        {
            System.Diagnostics.Process.Start(tiffPath);
        }
    }

    And about GhostScript, you can refer to these documents.
    [Converting PDF to a collection of images on the server using GhostScript]
    [How To Convert PDF to Image Using Ghostscript API]
    Hope these are helpful to you.
    Note: This response contains a reference to a third party World Wide Web site. Microsoft is providing this information as a convenience to you. Microsoft does not control these sites and has not tested any software or information found on these sites; Therefore, Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. There are inherent dangers in the use of any software found on the Internet, and Microsoft cautions you to make sure that you completely understand the risk before retrieving any software from the Internet.
    Best Regards,
    Daniel Zhang


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    • Marked as answer by Guest1993 Thursday, February 20, 2020 6:26 PM
    Tuesday, February 18, 2020 6:24 AM
  • Thank you for the suggestion, but this is a paid product. As for Ghostscript, I followed this tutorial. It works well, except that it doesn't set the DPI; it ignores my value.

    Tuesday, February 18, 2020 2:30 PM
  • Hi here is an alternative solution. For more details, see this link: https://www.e-iceblue.com/Tutorials/Spire.PDF/Spire.PDF-Program-Guide/Conversion/Save-PDF-Document-as-tiff-image.html

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using Spire.Pdf;
    namespace SavePdfAsTiff
    {
        class Program
        {
            static void Main(string[] args)
            {
                PdfDocument document = new PdfDocument();
                document.LoadFromFile(@"01.pdf");
                JoinTiffImages(SaveAsImage(document), "result.tiff", EncoderValue.CompressionLZW);
                System.Diagnostics.Process.Start("result.tiff");
            }
            private static Image[] SaveAsImage(PdfDocument document)
            {
                Image[] images = new Image[document.Pages.Count];
                for (int i = 0; i < document.Pages.Count; i++)
                {
                    images[i] = document.SaveAsImage(i);
                }
                return images;
            }
    
            private static ImageCodecInfo GetEncoderInfo(string mimeType)
            {
                ImageCodecInfo[] encoders = ImageCodecInfo.GetImageEncoders();
                for (int j = 0; j < encoders.Length; j++)
                {
                    if (encoders[j].MimeType == mimeType)
                        return encoders[j];
                }
                throw new Exception(mimeType + " mime type not found in ImageCodecInfo");
            }
    
            public static void JoinTiffImages(Image[] images, string outFile, EncoderValue compressEncoder)
            {
                //use the save encoder
                Encoder enc = Encoder.SaveFlag;
                EncoderParameters ep = new EncoderParameters(2);
                ep.Param[0] = new EncoderParameter(enc, (long)EncoderValue.MultiFrame);
                ep.Param[1] = new EncoderParameter(Encoder.Compression, (long)compressEncoder);
                Image pages = images[0];
                int frame = 0;
                ImageCodecInfo info = GetEncoderInfo("image/tiff");
                foreach (Image img in images)
                {
                    if (frame == 0)
                    {
                        pages = img;
                        //save the first frame
                        pages.Save(outFile, info, ep);
                    }
    
                    else
                    {
                        //save the intermediate frames
                        ep.Param[0] = new EncoderParameter(enc, (long)EncoderValue.FrameDimensionPage);
    
                        pages.SaveAdd(img, ep);
                    }
                    if (frame == images.Length - 1)
                    {
                        //flush and close.
                        ep.Param[0] = new EncoderParameter(enc, (long)EncoderValue.Flush);
                        pages.SaveAdd(ep);
                    }
                    frame++;
                }
            }
        }
    }

    • Marked as answer by Guest1993 Thursday, February 20, 2020 6:26 PM
    Thursday, February 20, 2020 7:17 AM
  • Thank you but Spire pdf is not free.
    Thursday, February 20, 2020 2:11 PM
  • Thank you but Spire pdf is not free.
    The given method with PdfDocument works fine and is fast
    Thursday, February 20, 2020 3:04 PM
  • It uses Spire.Pdf, which is not free. However, I found the issue with Ghostscript.NET and fixed it; it now works fine and fast.
    Thursday, February 20, 2020 6:22 PM
  • It uses Spire.Pdf, which is not free. However, I found the issue with Ghostscript.NET and fixed it; it now works fine and fast.
    the ConvertPdfDocument sample doesn't use Spire at all

    • Edited by valat Thursday, February 20, 2020 6:28 PM
    Thursday, February 20, 2020 6:28 PM
  • I made changes to the open-source Ghostscript.NET library and built a new DLL that fixed the issue. I hope they will look into it or let me upload my fix so they can use it. That way other people won't face the same issue. Refer to the first link in my question to see working code.
    • Marked as answer by Guest1993 Thursday, February 20, 2020 6:29 PM
    Thursday, February 20, 2020 6:29 PM
  • Oh sorry, you are talking about Castorix31's code. I couldn't get it to work because of this issue:

    "It is the same file, just another version

    You can see it has Windows.Storage.StorageFile and Windows.Data.Pdf.PdfDocument inside..."

    Thursday, February 20, 2020 7:02 PM
  • Oh sorry, you are talking about Castorix31's code. I couldn't get it to work because of this issue:

    "It is the same file, just another version

    You can see it has Windows.Storage.StorageFile and Windows.Data.Pdf.PdfDocument inside..."

    There is no issue.

    You add a reference to any version of Windows.winmd

    and it works perfectly

    Thursday, February 20, 2020 7:07 PM
  • It didn't for me, or I didn't know how to do it. Hopefully other people will see it in the future and make use of all those options.
    Thursday, February 20, 2020 9:11 PM
  • It didn't for me, or I didn't know how to do it. Hopefully other people will see it in the future and make use of all those options.

    Then mark it as answer, otherwise other people will never find it, as it works...

    Thursday, February 20, 2020 9:18 PM