Picture cropped while extracting from Word 2010 Document

  • Question

  • Hi,

    I have a document with 2 pictures in it, and I am extracting those pictures using Office interop and the code below. However, pictures of this kind are cropped after I extract them, on both the left and right sides only.

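    inlineShape.Select();
    wordApplication.Selection.CopyAsPicture();
    // Assumed: 'data' is the clipboard's data object (the original post omits this line).
    IDataObject data = Clipboard.GetDataObject();
    Image image = (Image)data.GetData(DataFormats.Bitmap, true);
    Bitmap currentBitmap = new Bitmap(image);
    currentBitmap.Save(Path);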

    I think the problem is in selecting the InlineShape: the code does not seem to select the whole picture.

    I cannot attach the document here, but I can send it to anyone who wants it. Please help.


    Wednesday, December 31, 2014 7:55 AM

Answers

  • If the document is in the .docx or .docm format, you can extract the image quite easily by changing the extension to .zip and extracting the image directly from the file's archive. Documents in the .doc format can be converted to .docx/.docm beforehand. An alternative is to save the document in .html format and extract the images from the associated html folder.
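
    Since the question uses C#, here is a minimal sketch of the same zip approach without renaming the file. It assumes .NET 4.5 or later (System.IO.Compression) and that the pictures sit under word/media/ inside the package, as they normally do in a docx/docm file. The class and method names (DocxMediaExtractor, ExtractMedia) are made up for illustration and are not part of any Office API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression; // on .NET Framework, also reference System.IO.Compression.FileSystem

public static class DocxMediaExtractor
{
    // Hypothetical helper (not from the thread): copies every file stored under
    // word/media/ in a .docx/.docm package into outputDir, prefixing each file
    // name with the document's base name. Returns the paths that were written.
    public static List<string> ExtractMedia(string docxPath, string outputDir)
    {
        var written = new List<string>();
        string prefix = Path.GetFileNameWithoutExtension(docxPath);
        Directory.CreateDirectory(outputDir); // does nothing if the folder already exists

        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Zip entry names use forward slashes; an empty Name marks a folder entry.
                bool isMedia = entry.FullName.StartsWith("word/media/", StringComparison.OrdinalIgnoreCase);
                if (!isMedia || entry.Name.Length == 0)
                {
                    continue;
                }

                string target = Path.Combine(outputDir, prefix + entry.Name);
                entry.ExtractToFile(target, true); // true = overwrite an existing file
                written.Add(target);
            }
        }
        return written;
    }
}
```

    A call such as DocxMediaExtractor.ExtractMedia(@"C:\Docs\Report.docx", @"C:\Docs\Media") (both paths are placeholders) writes the stored image files as they are, so nothing passes through the clipboard and nothing gets re-rendered or cropped.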

    The following macro extracts all embedded media files (not just images) from docx or docm documents. After you select the documents to process, the code extracts the media files and writes them to a new 'Media' folder in the documents' folder. Each output file's name is prefixed with the parent document's name.

    Sub ExtractDocxMedia()
    ' The following macro extracts the media objects from a docx or docm
    ' file and outputs them to a new 'Media' folder in the document's folder.
    ' The output file's name is prefixed with the parent document's name.
    '
    'Note: The macro only processes docx & docm files - doc files can't be processed this way
    ' (though they could be converted to the docx format for processing).
    '
    Application.ScreenUpdating = False
    Dim StrInFold As String, StrMediaFold As String, StrTmpFold As String
    Dim StrDocFile As String, StrZipFile As String, Obj_App As Object
    Dim FoundFile As Variant, StrTmp As String, StrMediaFile As String
    'Create FileDialog object as File Picker dialog box
    With Application.FileDialog(FileDialogType:=msoFileDialogFilePicker)
      'Use Show method to display File Picker dialog box and return user's action
      If .Show = -1 Then
        'Step through each string in the FileDialogSelectedItems collection
        For Each FoundFile In .SelectedItems
          StrDocFile = FoundFile
          StrInFold = Left(StrDocFile, InStrRev(StrDocFile, "\"))
          'Base name without extension; InStrRev copes with dots elsewhere in the path or name
          StrTmp = Mid(StrDocFile, Len(StrInFold) + 1)
          StrTmp = Left(StrTmp, InStrRev(StrTmp, ".") - 1)
          'Define the zip name
          StrZipFile = StrInFold & StrTmp & ".zip"
          'Create the zip file, by simply copying to a new file with a zip extension
          FileCopy StrDocFile, StrZipFile
          StrMediaFold = StrInFold & "Media"
          StrTmpFold = StrInFold & "Tmp"
          'Test for existing tmp & output folders, create them if they don't already exist
          If Dir(StrTmpFold, vbDirectory) = "" Then MkDir StrTmpFold
          If Dir(StrMediaFold, vbDirectory) = "" Then MkDir StrMediaFold
          'Create a Shell App for accessing the zip archives
          Set Obj_App = CreateObject("Shell.Application")
          'Next, process any media
          On Error Resume Next 'In case the file is in use or zip file has no media
          'Extract the zip archive's media files to the temporary folder
          Obj_App.NameSpace(StrTmpFold & "\").CopyHere Obj_App.NameSpace(StrZipFile & "\word\media\").Items
          On Error GoTo 0 'Restore error trapping
          'Delete the zip file - the loop takes care of timing issues
          Do While Dir(StrZipFile) <> ""
            Kill StrZipFile
          Loop
          'Get the temporary folder's file listing
          StrMediaFile = Dir(StrTmpFold & "\*.*", vbNormal)
          'Process the temporary folder's files
          While StrMediaFile <> ""
            'Copy the file to the output folder, prefixed with the source file's name
            FileCopy StrTmpFold & "\" & StrMediaFile, StrMediaFold & "\" & StrTmp & StrMediaFile
            'Delete the media file
            Kill StrTmpFold & "\" & StrMediaFile
            'Get the next media file
            StrMediaFile = Dir()
          Wend
          'Delete the temporary folder
          RmDir StrTmpFold
        Next
      End If
    End With
    Application.ScreenUpdating = True
    End Sub


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Friday, January 2, 2015 8:08 AM
  • Hi Prasnjit Nath,

    Thanks for posting in MSDN forum.

    >>I think the problem is in Selecting inlineshape, the code cannot select the whole picture<<

    No, InlineShape.Select is the correct method for selecting the inline shape. We can test the copy method with the code below:

    Sub testInlinshapeSelect()
    ActiveDocument.InlineShapes(1).Select
    
    Application.Selection.CopyAsPicture
    
    Application.Selection.EndKey Unit:=wdStory, Extend:=wdMove
    
    Application.Selection.Paste
    End Sub
    

    I couldn't find a method in the Word object model to export the picture. However, as a workaround, we can create a blank document and copy the pictures we want to export into that temporary document. Finally, we can save the temporary document as a web page, and the pictures will be saved in the corresponding folder.
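
    A rough C# interop sketch of that workaround follows. It assumes Word 2010 or later is installed and the project references Microsoft.Office.Interop.Word. The class and method names (WebPageExport, ExportPictures) are hypothetical, and whether the saved images avoid the cropping seen with CopyAsPicture has not been verified here.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class WebPageExport
{
    // Hypothetical helper: copies each inline picture from the document at
    // sourcePath into a new blank document, then saves that document as a
    // filtered web page so Word writes the pictures to its companion folder.
    public static void ExportPictures(string sourcePath, string htmlPath)
    {
        var app = new Word.Application();
        try
        {
            Word.Document source = app.Documents.Open(sourcePath, ReadOnly: true);
            Word.Document scratch = app.Documents.Add();

            foreach (Word.InlineShape shape in source.InlineShapes)
            {
                shape.Range.Copy(); // copy the picture itself rather than a rendered snapshot

                // Paste at the end of the scratch document.
                Word.Range end = scratch.Content;
                end.Collapse(Word.WdCollapseDirection.wdCollapseEnd);
                end.Paste();
            }

            scratch.SaveAs2(htmlPath, Word.WdSaveFormat.wdFormatFilteredHTML);
            scratch.Close(false); // false = do not prompt to save again
            source.Close(false);
        }
        finally
        {
            app.Quit(false);
        }
    }
}
```

    After the call, the pictures should be in the folder Word creates next to the .htm file. That folder's name depends on the file name and the Office language version.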

    If you want the Word object model to support this feature, you can submit feedback via the link below:
    Submit Feedback - Microsoft Office

    Hope it is helpful.

    Regards & Fei



    Thursday, January 1, 2015 8:17 AM
    Moderator
