OneNote is not giving OCRText for a (VBA) inserted image in OneNote 2010. How to force OneNote to do OCR on an image?

  • Question

  • Hi,

      I am inserting an image into OneNote programmatically.


      strNamespace = "http://schemas.microsoft.com/office/onenote/2010/onenote"
      m_xmlImageContent = "<one:Outline lang=""nl""><one:OEChildren><one:OE lang=""nl""><one:Image><one:Size width=""{1}"" height=""{2}"" isSetByUser=""true""/><one:Data>{0}</one:Data></one:Image></one:OE></one:OEChildren></one:Outline>"
      m_xmlNewOutline = "<?xml version=""1.0""?><one:Page xmlns:one=""{2}"" ID=""{1}"" lang=""nl""><one:Title><one:OE><one:T><![CDATA[{3}]]></one:T></one:OE></one:Title>{0}</one:Page>"
      pageToBeChange = "MyPage"

      Dim bitmap As IPictureDisp
      Dim strfile As String

      strfile = "C:\TEMP\Document1.jpg"
      Set bitmap = LoadPicture(strfile)

      ' Read the file and Base64-encode it for the one:Data element
      bytFile = GetFileBytes(strfile)
      base64String = EncodeBase64(bytFile())

      ' {1} is the width and {2} is the height of one:Size
      ww = Round(bitmap.Width / 30)
      hh = Round(bitmap.Height / 30)

      imageXmlStr = StringFormat(m_xmlImageContent, base64String, ww, hh)
      pageChangesXml = StringFormat(m_xmlNewOutline, imageXmlStr, newPageID, strNamespace, "")

      oneNote.UpdatePageContent pageChangesXml
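
      Editor's note: the snippet above calls StringFormat, GetFileBytes and EncodeBase64 without showing them. The following is only a minimal sketch of what such helpers might look like; the bodies are assumptions, not the poster's actual code, and EncodeBase64 assumes MSXML 6.0 is installed.

```vba
Option Explicit

' Hypothetical helper: replaces {0}, {1}, ... in template with the arguments.
' Placeholders are replaced one after another, so an argument that itself
' contains "{n}" would be substituted again by a later pass.
Public Function StringFormat(ByVal template As String, ParamArray args() As Variant) As String
    Dim i As Long
    Dim result As String
    result = template
    For i = LBound(args) To UBound(args)
        result = Replace(result, "{" & CStr(i) & "}", CStr(args(i)))
    Next i
    StringFormat = result
End Function

' Hypothetical helper: reads a whole file into a byte array.
Public Function GetFileBytes(ByVal path As String) As Byte()
    Dim fileNum As Integer
    Dim buffer() As Byte
    fileNum = FreeFile
    Open path For Binary Access Read As #fileNum
    If LOF(fileNum) > 0 Then
        ReDim buffer(0 To LOF(fileNum) - 1)
        Get #fileNum, , buffer
    End If
    Close #fileNum
    GetFileBytes = buffer
End Function

' Hypothetical helper: Base64-encodes a byte array through an MSXML node.
Public Function EncodeBase64(ByRef bytes() As Byte) As String
    Dim doc As Object
    Dim node As Object
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    Set node = doc.createElement("b64")
    node.DataType = "bin.base64"
    node.nodeTypedValue = bytes
    ' MSXML breaks the encoded text into lines; remove the line breaks
    EncodeBase64 = Replace(Replace(node.Text, vbLf, ""), vbCr, "")
End Function
```

      With helpers like these, StringFormat("{0}-{1}", "a", 2) would yield "a-2", which matches how the placeholders {0} to {3} are filled in the templates above.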

     The image gets inserted into the page in OneNote 2010. I then navigate to that page, so the page opens in OneNote and I can see the inserted image. I give OneNote enough time to do the OCR.

    Now I open another program, written by me, that gets the OCR text from the currently open page, using:

     oneNote.GetPageContent newPageID, pageXml, piBasic, xs2010

     If pageDoc.LoadXML(pageXml) Then
         Set nodes = pageDoc.DocumentElement.SelectNodes("//one:OCRText")
         ' nodes is empty when the page XML contains no OCR text for the image
         If nodes.Length > 0 Then MsgBox nodes(0).Text
     End If
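
     Editor's note: since OneNote runs OCR in the background, one way to avoid reading the page too early is to poll until a one:OCRText element shows up. The following is only a sketch under assumptions: the names ExtractOcrText, WaitForOcrText and Pause are hypothetical, the project is assumed to reference the OneNote 14.0 type library (for piBasic and xs2010) and to have MSXML 6.0 available. Polling cannot force OneNote to run OCR; it only waits for it.

```vba
Option Explicit

Private Const ONE_NS As String = "http://schemas.microsoft.com/office/onenote/2010/onenote"

' Returns the text of the first one:OCRText element in pageXml,
' or an empty string when the page XML contains none.
Public Function ExtractOcrText(ByVal pageXml As String) As String
    Dim doc As Object
    Dim node As Object
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")
    doc.async = False
    ' Bind the "one" prefix so the XPath query can resolve it
    doc.setProperty "SelectionNamespaces", "xmlns:one='" & ONE_NS & "'"
    If doc.LoadXML(pageXml) Then
        Set node = doc.SelectSingleNode("//one:OCRText")
        If Not node Is Nothing Then ExtractOcrText = node.Text
    End If
End Function

' Waits roughly the given number of whole seconds while keeping the host responsive.
Private Sub Pause(ByVal seconds As Long)
    Dim untilTime As Date
    untilTime = DateAdd("s", seconds, Now)
    Do While Now < untilTime
        DoEvents
    Loop
End Sub

' Polls the page once per second until OCR text appears or the timeout elapses.
' Returns an empty string on timeout.
Public Function WaitForOcrText(ByVal app As OneNote.Application, _
                               ByVal pageID As String, _
                               ByVal timeoutSeconds As Long) As String
    Dim pageXml As String
    Dim ocrText As String
    Dim started As Date
    started = Now
    Do
        app.GetPageContent pageID, pageXml, piBasic, xs2010
        ocrText = ExtractOcrText(pageXml)
        If Len(ocrText) > 0 Then Exit Do
        Pause 1
    Loop While DateDiff("s", started, Now) < timeoutSeconds
    WaitForOcrText = ocrText
End Function
```

     A call such as MsgBox WaitForOcrText(oneNote, newPageID, 60) would then show the OCR text, or an empty box if nothing appeared within a minute.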


    But it does not return OCR text for the programmatically inserted image. For a manually inserted image I do get the OCRText.

    One more strange thing:

       If I insert the image programmatically and run the program for getting the OCR data, no OCRText comes back.

    If I then go to OneNote and right-click the programmatically inserted image -> "Copy text from image", a progress bar appears saying 'copying text from image'.

      If I now run the program for getting the OCR data again, the result comes back and shows the OCR data as output!

    Will anyone please guide me? I have been trying this for many days.

    Regards,


    cap.

    Friday, June 21, 2013 8:25 PM

Answers

  • Hi Cap,

    Thank you for posting in the MSDN Forum.

    I'm trying to involve some senior engineers in this issue and it will take some time. Your patience will be greatly appreciated.

    Sorry for any inconvenience and have a nice day!

    Best regards,


    Quist Zhang [MSFT]

    • Marked as answer by cap flam Thursday, June 27, 2013 5:49 AM
    Monday, June 24, 2013 12:50 PM
    Moderator

All replies

  • Dear Quist,

    Thanks for your response. After posting my question I tried to fix the bug. After a couple of days of trial and error, in desperation, I finally uninstalled and reinstalled Office 2010. A miracle took place: the first run of the same VBA script posted above worked perfectly.

    I have a further question. I am now trying to set the OCR language to German in the pageChangesXml string. When I run the script with the settings below, I always get OCR text with lang="nl"; Dutch is the default editing language of my OneNote. Could you please help me with this issue?

    Details of pageChangesXml:

    strNamespace = "http://schemas.microsoft.com/office/onenote/2010/onenote"

    m_xmlImageContent = "<one:Outline lang=""de""><one:OEChildren><one:OE lang=""de""><one:Image><one:Data>{0}</one:Data></one:Image></one:OE></one:OEChildren></one:Outline>"

    m_xmlNewOutline = "<?xml version=""1.0""?><one:Page xmlns:one=""{2}"" ID=""{1}"" lang=""de""><one:Title><one:OE><one:T><![CDATA[{3}]]></one:T></one:OE></one:Title>{0}</one:Page>"

    imageXmlStr = StringFormat(m_xmlImageContent, base64String)

    pageChangesXml = StringFormat(m_xmlNewOutline, imageXmlStr, newPageID, strNamespace,

    Thanks in advance,

    cap.

    Thursday, June 27, 2013 6:03 AM
  • Hi,

    Because of its complexity your question falls into the paid support category which requires a more in-depth level of support. If the support engineer determines that the issue is the result of a bug the service request will be a no-charge case and you won't be charged. Please visit the below link to see the various paid support options that are available to better meet your needs. http://support.microsoft.com/default.aspx?id=fh;en-us;offerprophone

    Regards

    Pradip

    Saturday, August 17, 2013 4:18 AM