Convert rtf format to html format

  • Question

  • Hi all. After losing days on this, I decided to ask here, hoping somebody knows a solution, or at least a direction to go in.

    Here is my problem. In an Access back-end database I have records with Memo fields that hold RTF-formatted text. There are many records, often with long text, which was inserted via the TAM RTF control back in Access 97. I now need to convert these fields programmatically, and so far I have not found an elegant solution.

    What I have tried so far, with some success, are 3 options worth mentioning:

    1. In VBA, grab the contents of the Memo field, put it on the clipboard as Rich Text Format via the SetClipboardData API (this all works fine), then paste it into Word through automation and have Word return it as HTML. As much as this works, there are two big issues: first, it is quite far from elegant, and second (more importantly), the HTML returned by Word is a pure disaster. It would increase the size of the database by 10 times or so.

    2. Instead of pasting it into Word, I did another SetClipboardData, to another registered format, HTML Format, and then pasted it. This is of course much more elegant, but for some reason the clipboard then inserts a specific header before the HTML. This creates issues and requires extra parsing of the returned code, which is very problematic for a couple of reasons.

    3. My search on the net turned up an old VB project by Brady Hegberg, but it does a truly "manual" conversion with lots of bugs (at least in the version I got), and the results were far from acceptable. Besides, I just can't accept that these days there is no solution to the problem other than "manually" decoding RTF codes and converting them into HTML codes! We know this functionality must be built into many libraries; I just can't find it.
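
    On the header mentioned in option 2: the clipboard's "HTML Format" data starts with a short description block (Version, StartHTML, EndHTML, StartFragment, EndFragment), and the pasted fragment itself sits between <!--StartFragment--> and <!--EndFragment--> comment markers. A minimal sketch of stripping that header, assuming the clipboard data has already been read into a VBA string; the function name ExtractHtmlFragment is hypothetical, not part of any library:

```vba
' Hypothetical helper: returns only the fragment between the
' <!--StartFragment--> and <!--EndFragment--> markers of "HTML Format" data.
Public Function ExtractHtmlFragment(ByVal cfHtml As String) As String
    Const START_MARK As String = "<!--StartFragment-->"
    Const END_MARK As String = "<!--EndFragment-->"
    Dim p1 As Long
    Dim p2 As Long

    p1 = InStr(1, cfHtml, START_MARK, vbTextCompare)
    If p1 = 0 Then
        ' No marker found: return the input unchanged (header included)
        ExtractHtmlFragment = cfHtml
        Exit Function
    End If

    p1 = p1 + Len(START_MARK)
    p2 = InStr(p1, cfHtml, END_MARK, vbTextCompare)
    If p2 = 0 Then p2 = Len(cfHtml) + 1   ' no end marker: take the rest

    ExtractHtmlFragment = Mid$(cfHtml, p1, p2 - p1)
End Function
```

    This searches for the markers rather than using the numeric offsets in the header, because those offsets count bytes, not VBA string characters.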

    I could easily integrate all 3 into my Access front end, but as I said, none is anywhere close to what I need. Then I stumbled upon a little freeware app by a Bulgarian company, WonderWebWare.com, and was delighted that it is extremely lightweight yet does the job better than many commercial apps out there. It states that the process uses "IE internal conversion functions" and as such does not require any external libraries or apps to function (and indeed it does not).

    So my question (finally): does anyone know anything about those IE internal conversion functions? The IE control can of course easily be used in VBA; the problem is that, as much as I have searched, I did not find any reference that would lead me onto the right path. Or, if anyone has any other simple solution that could be usable, it would be a great help. And do you think it makes sense to ask this in the IE forums?

    Put simply: a VBA procedure should go through the table, take the value of the Memo field with RTF-encoded text, convert it to HTML and write it back to the same (or a new) field. Everything described is simple, of course, except the RTF-to-HTML conversion.
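
    The table-walking part described above could be sketched as follows. This is only an outline under assumptions: the table name tblNotes, the field names NoteRtf and NoteHtml, and the function ConvertRtfToHtml are all hypothetical placeholders, and the module is assumed to have a DAO reference. The converter is deliberately left as a stub that raises an error, since the conversion itself is the open question.

```vba
' Hypothetical names throughout: tblNotes, NoteRtf, NoteHtml, ConvertRtfToHtml.
Public Sub ConvertAllRtfFields()
    Dim db As DAO.Database
    Dim rs As DAO.Recordset

    Set db = CurrentDb
    Set rs = db.OpenRecordset( _
        "SELECT NoteRtf, NoteHtml FROM tblNotes WHERE NoteRtf Is Not Null", _
        dbOpenDynaset)

    Do Until rs.EOF
        rs.Edit
        rs!NoteHtml = ConvertRtfToHtml(rs!NoteRtf)   ' write HTML to the new field
        rs.Update
        rs.MoveNext
    Loop

    rs.Close
    Set rs = Nothing
    Set db = Nothing
End Sub

' Placeholder only: replace the body with a real RTF-to-HTML conversion.
Private Function ConvertRtfToHtml(ByVal rtf As String) As String
    Err.Raise vbObjectError + 1, "ConvertRtfToHtml", _
        "No RTF-to-HTML converter has been plugged in yet."
End Function
```

    Writing to a separate field rather than overwriting the RTF keeps the original data available until the converted output has been checked.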

    Many thanks in advance for all the help you can provide,

    Miroslav




    • Edited by Nemiroslav Friday, May 25, 2012 6:03 PM
    Friday, May 25, 2012 5:57 PM

Answers

  • OK, here it is, after some funny hiccups and a few wrong paths... Believe it or not, the biggest issue and time waster was the OpenClipboard function, which I initially called with the form's handle; this prevented the paste operation altogether. It took me a while to discover that.

    For the purpose of this thread I created a simple Access form; the code attached to it is below. Please note that this solution is for MS Access 2010 and uses the Web Browser control that ships with it. It does not need any external reference or control. For earlier versions of Access, the solution was also tested with the Microsoft Web Browser ActiveX control, and it works exactly the same without any code modification.

    You should be able to find your way around for your specific use; what is shown is the core functionality.

    Form picture:

    The elements are: tstConv, a command button; zu_Text, a regular text field holding the RTF; WBc, the Web Browser control; mHTMLCode, which shows the HTML code extracted from WBc; and cConverted, a text field with the Rich Text property on, showing how Access will display the result if this control is used.

    And the code, which turns out to be very simple in the end:

    Private Declare Function OpenClipboard Lib "user32" (ByVal hwnd As Long) As Long
    Private Declare Function RegisterClipboardFormat Lib "user32" Alias "RegisterClipboardFormatA" (ByVal lpString As String) As Long
    Private Declare Function EmptyClipboard Lib "user32" () As Long
    Private Declare Function CloseClipboard Lib "user32" () As Long
    Private Declare Function SetClipboardData Lib "user32" (ByVal wFormat As Long, ByVal hMem As Long) As Long
    Private Declare Function GlobalAlloc Lib "kernel32" (ByVal wFlags As Long, ByVal dwBytes As Long) As Long
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (ByVal Destination As Long, Source As Any, ByVal Length As Long)
    Private Declare Function GlobalUnlock Lib "kernel32" (ByVal hMem As Long) As Long
    Private Declare Function GlobalLock Lib "kernel32" (ByVal hMem As Long) As Long
    Private Declare Function GlobalFree Lib "kernel32" (ByVal hMem As Long) As Long
    
    Private Const GMEM_DDESHARE = &H2000
    Private Const GMEM_MOVEABLE = &H2
    
    Private Sub tstConv_Click()
    
        Dim myRTF As String
        myRTF = Nz(Me.zu_Text, vbNullString) 'Memo field with rtf encoded text in it
        If Len(myRTF) = 0 Then Exit Sub
    
        'Copy the contents of the Memo field with rtf to the clipboard
        Dim lRTF As Long
        Dim hGlobal As Long
        Dim lpString As Long
    
        lRTF = RegisterClipboardFormat("Rich Text Format")
    
        'Pass 0, not the form's handle: with the form's handle the paste below failed
        If OpenClipboard(0) = 0 Then Exit Sub
        EmptyClipboard
    
        'Allocate and copy one extra byte so the terminating null is included
        hGlobal = GlobalAlloc(GMEM_MOVEABLE Or GMEM_DDESHARE, Len(myRTF) + 1)
        If hGlobal = 0 Then
            CloseClipboard
            Exit Sub
        End If
        lpString = GlobalLock(hGlobal)
        CopyMemory lpString, ByVal myRTF, Len(myRTF) + 1
        GlobalUnlock hGlobal
    
        'After a successful SetClipboardData the system owns the memory,
        'so it must be freed only if the call fails
        If SetClipboardData(lRTF, hGlobal) = 0 Then GlobalFree hGlobal
        CloseClipboard
    
        'Load an empty page and wait until it is complete (READYSTATE_COMPLETE = 4)
        Me.WBc.Object.Navigate "about:blank"
        Do While Me.WBc.Object.ReadyState <> 4
            DoEvents
        Loop
    
        'Make the document editable so that it accepts a paste
        Me.WBc.Object.Document.designMode = "On"
    
        Do While Me.WBc.Object.ReadyState <> 4
            DoEvents
        Loop
    
        'Paste the RTF; the browser control converts it to HTML
        Me.WBc.Object.Document.execCommand "Paste", False, True
    
        mHTMLCode = Me.WBc.Object.Document.body.innerHTML
        cConverted = Me.WBc.Object.Document.body.innerHTML
    
    End Sub
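
    One possible refinement, not part of the tested solution: the two ReadyState loops above spin forever if the control never reaches state 4. A minimal sketch of a wait with a timeout, where the function name WaitUntilComplete and the 10-second default are assumptions of mine:

```vba
' Hypothetical helper: waits until the browser reports READYSTATE_COMPLETE (4)
' or the timeout expires. Returns True if the document became ready in time.
Private Function WaitUntilComplete(ByVal browser As Object, _
        Optional ByVal timeoutSeconds As Single = 10) As Boolean
    Const READYSTATE_COMPLETE As Long = 4
    Dim started As Single
    Dim elapsed As Single

    started = Timer
    Do While browser.ReadyState <> READYSTATE_COMPLETE
        DoEvents
        elapsed = Timer - started
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer restarts at midnight
        If elapsed > timeoutSeconds Then Exit Function  ' gives up, returns False
    Loop

    WaitUntilComplete = True
End Function
```

    It would be called as WaitUntilComplete(Me.WBc.Object) in place of each loop, with the caller deciding what to do when it returns False.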
    

    Miroslav

    • Marked as answer by Nemiroslav Tuesday, May 29, 2012 1:05 PM
    Tuesday, May 29, 2012 1:03 PM

All replies

  • Daniel, thanks. As you might guess, I googled far and wide for anything related, so I did come across the resources you list. The first is the one mentioned under my point 3, and the remaining two are not usable in VBA (they use built-in classes or controls that cannot be used in MS Access). If I'm mistaken about that, please let me know.

    Miroslav


    • Edited by Nemiroslav Friday, May 25, 2012 8:01 PM
    Friday, May 25, 2012 8:00 PM
  • What is wrong with using hidden Word automation? You can do this type of thing in Access.

    As for point 3, at least it is a starting point from which you could easily build your own custom solution.


    Daniel Pineault, 2010 Microsoft MVP
    http://www.cardaconsultants.com
    MS Access Tips and Code Samples: http://www.devhut.net


    Saturday, May 26, 2012 12:01 AM
  • Daniel, hidden Word automation is absolutely unacceptable, for a simple reason: one line of text in RTF is converted to 440 lines of HTML! Try a simple RTF text, do "Save As html" from Word and check the results.

    And there is no way I could write my own converter for this (or, even worse, modify an existing solution that fails on every second valid RTF). That would be 10 times the work of the job that needs to be done, while at the same time we know this functionality is built in at various levels in almost anything related on Windows.

    Here is one of those resources that pointed me towards IE control solution or alike:

    http://www.experts-exchange.com/Programming/System/Windows__Programming/A_3096-Convert-RTF-to-HTML-and-HTML-to-RTF.html

    If you read the section "What's going on?" you will see why I'm following this path now and might need to ask on the IE developer forums. That solution is developed in C++ and of course we cannot use it in Access, as we don't have the controls it mentions. See this note by the author:

    Also note that <META> tag on line 4, which indicates the MSHTML was used to generate the HTML.  Perhaps that's a clue to avoiding the clipboard operations.  Like I said, I stopped researching when I found this simple clipboard-based solution.

    Good for him, but since we don't have the luxury of the classes/controls he uses in C++, I will have to follow the mentioned clue and try to find documentation on the specific MSHTML built-in functionality. Such a solution would be simple and quick, with optimal results.

    Saturday, May 26, 2012 4:39 AM
  • Years ago, I DID write my own converter for MS Word -> HTML, using MS Word VBA. I'll see if I can dig up some source code. My system was not perfect (and would need some updating), but it was WAY better than the built-in "save as HTML" feature...

    Matthew Slyman M.A. (Camb.)

    Monday, May 28, 2012 11:27 AM
  • Dear Matthew, many thanks for taking the time with this issue. The last 2 days were spent on something I was hoping to avoid when posting here, but I guess there is nothing wrong with that; I learned something new. I was studying the MSHTML documentation and am happy to report that I found what looks like the best way of performing the conversion. It will take a day or so to wrap it into something usable/shareable, but when done I will post it here.

    The solution will not require any external application (just Access), will produce very good results and will be as robust as it can be, since it will use Microsoft's own HTML engine/controls available with Windows to convert from RTF to HTML.

    Monday, May 28, 2012 12:24 PM
  • Thanks for this! I have spent the past week or two wrestling with exactly this issue.

    Now to work my way through the code and implement it in my application :)

    Geoffrey

    Wednesday, May 30, 2012 10:40 AM
  • Hi Nemiroslav,

    I tried your method with MS Access 2003 and it does not work. All the controls and code lines are present; am I missing something? Please advise.

    Friday, August 24, 2012 7:38 PM
  • Ali, this solution was tested in Access 2010 and Access 2007 (where it works with the Web Browser ActiveX control, without any code modification). As for Access 2003, I really don't know and cannot try it, although it should work if IE (and the associated ActiveX control) is not some really old version. When you say it doesn't work, what does that mean? Are you receiving any errors? Can you give more info?
    Saturday, August 25, 2012 3:10 PM
  • Thanks for your response. I don't get any error, and nothing shows in any of those 3 controls after clicking the tstConv button. It seems nothing is being copied to the clipboard; is there any way to detect that while stepping through in the debugger? Again, thanks for the follow-up.
    Monday, August 27, 2012 4:48 PM
  • Ali, there are many things you should try. I guess the first would be to set a breakpoint somewhere early in the code, then step through line by line and check the return values. Use the Immediate window to check what's going on. Check the syntax of the functions used and look at their return values. See if you can use a clipboard spy utility to check whether the data gets copied or not, and so on. I really do not know whether this works with earlier Access versions, nor how it behaves on an OS earlier than Win7.

    Also, did you check that your RTF-encoded text is in a valid RTF format, something you can normally open in WordPad when saved to a file?
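
    Two of the checks above can be sketched in code and called from the Immediate window. This is only an illustration; the function names LooksLikeRtf and ClipboardHasRtf are hypothetical. IsClipboardFormatAvailable and RegisterClipboardFormat are real user32 functions, declared here in 32-bit form like the rest of the code in this thread. The first function tests only for the "{\rtf" prefix that RTF documents start with, so it is a quick sanity check, not a validator.

```vba
Private Declare Function RegisterClipboardFormat Lib "user32" Alias "RegisterClipboardFormatA" (ByVal lpString As String) As Long
Private Declare Function IsClipboardFormatAvailable Lib "user32" (ByVal wFormat As Long) As Long

' Hypothetical helper: True if the text starts like an RTF document.
Public Function LooksLikeRtf(ByVal candidate As String) As Boolean
    LooksLikeRtf = (Left$(LTrim$(candidate), 5) = "{\rtf")
End Function

' Hypothetical helper: True if the clipboard currently holds data
' in the registered "Rich Text Format" format.
Public Function ClipboardHasRtf() As Boolean
    Dim fmt As Long
    fmt = RegisterClipboardFormat("Rich Text Format")
    If fmt <> 0 Then ClipboardHasRtf = (IsClipboardFormatAvailable(fmt) <> 0)
End Function
```

    For example, after stepping past the CloseClipboard call, typing ? ClipboardHasRtf() in the Immediate window shows whether the copy step put anything on the clipboard.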

    Monday, August 27, 2012 5:13 PM
  • Nemiroslav, I tried it in Access 2007 and it worked as expected. It seems it might have something to do with the version difference between the Web Browser controls. I wonder what the best way is to make it work in both versions of the Access runtime. Any advice is greatly appreciated.
    Monday, August 27, 2012 8:21 PM
  • Ali, I'm afraid I can't help you much, but it appears you would need to install a later IE on the machine with Access 2003. I suppose you could install IE 8 or later, remove the browser control from the form, re-insert it and check for any progress. I think this issue is beyond what I can do without knowing anything about the configuration you are trying to run it on. I strongly suspect it has nothing to do with Access 2003 itself, but with the OS and IE installation. You may also want to check Microsoft's reference for the WebBrowser control (Shdocvw.dll). That's all I can help with. Good luck!
    Tuesday, August 28, 2012 4:31 AM
  • Miroslav, many thanks for your help on this issue and for all the valuable ideas. As your suggestion indicates, I agree that the web browser is the key factor to deal with in order to support multiple versions of Access for this conversion feature. If I get a working solution later, I will post it here for anyone who is seeking the same solution for their project as I was. You have done a great job!
    Tuesday, August 28, 2012 2:29 PM
  • Hi Miroslav

    I just used this method to convert data stored in TAM, and it looks like I was 80% successful. Where it is not working correctly is the Font Name (even in your example, Font is different - Bold, Underline, Italics are all fine). Also, indenting is not working quite right.

    In the process of your development, were you able to refine this further?

    Close but not quite right. But I have to admit, this is a brilliant solution that I would not have been able to figure out myself.

    Dave

    Thursday, September 25, 2014 1:40 AM
  • Hi Miroslav

    I found and used your code example to convert RTF to HTML. I use this form to convert a field in a table with many records in it, and I have written VBA code to automate the process record by record. My problem is that after the second record a dialog box appears, telling me that the content of WBc has changed and asking whether I want to save it. I have found no way to suppress this dialog box. Do you have any idea how to do this?

    Regards Markus

    Friday, May 27, 2016 9:26 AM
  • Miroslav:
    Thank you for this useful code. As mentioned by Dave it is not perfect but good enough for our purpose.

    Markus:
    I was able to solve the problem with the dialog box by moving the following code out of the click handler and into Form_Activate, see the code below. The dialog box appears if there are unsaved changes when navigating to another page, so another solution could be to save or undo the changes before navigating. But according to my understanding, those lines of code only have to be executed once.

    Private Sub Form_Activate()
        Me!axWebBrowser.Object.Navigate "about:blank"
        Do While Me!axWebBrowser.Object.ReadyState <> 4
            DoEvents
        Loop
        Me!axWebBrowser.Object.Document.designMode = "On"
        Do While Me!axWebBrowser.Object.ReadyState <> 4
            DoEvents
        Loop
    End Sub

    Before pasting the new text I use "Undo" to clear the Web Control.

    Me!axWebBrowser.Object.Document.execCommand "Undo", False, True 'Clear the control
    Me!axWebBrowser.Object.Document.execCommand "Paste", False, True 'Paste the content

    Regards, Marko



    • Edited by mLacheta Thursday, February 16, 2017 4:09 PM
    Thursday, February 16, 2017 4:08 PM