Identify symbols inserted via InsertSymbol (Needs optimization)

  • Question

  • We have a requirement to restrict our font usage to Arial, including symbols inserted
    via InsertSymbol. Our requirements are imposed by a routine (DocEval) which all of our
    users run to validate each document. This is just one of the many routines included in
    DocEval. Because of the time required to evaluate each character in a document, I have
    spent a lot of time trying to optimize this function. I recently discovered that the
    Range.XML property contains "<w:sym" when the range contains a symbol.

    There is probably a more optimized way to identify whether a range contains a symbol, and I am looking for any other ideas.

    Thanks in Advance,
    Tim

    ************************************************************
    Private Function FontSymbolChk(ByVal doc As Word.Document,
          ByVal ErrColor As Word.WdColor) As String
        '4. Characters inserted by InsertSymbol Chr(40)
        doc.Application.ScreenUpdating = False
        doc.Application.StatusBar = "  Font Check - Symbols"
        Dim sw As Stopwatch = Stopwatch.StartNew
        Dim paras As List(Of Word.Paragraph) =
          (From p As Word.Paragraph In doc.Paragraphs
          Where p.Range.Text.Contains(Chr(40)) AndAlso
          p.Range.XML.Contains("<w:sym")).ToList
        Debug.Print("Symbol Paras:" &
          (sw.ElapsedMilliseconds / 1000).ToString("0.00") & " sec" &
          ",paras:" & paras.Count.ToString)
        Dim Ret As String = "", cnt As Integer = 0
        For Each p As Word.Paragraph In paras
          Dim rngs As List(Of Word.Range) =
            (From r As Word.Range In p.Range.Characters
            Where r.Text = Chr(40) AndAlso
            r.XML.Contains("<w:sym")).ToList
          For Each r As Word.Range In rngs
            r.Select()
            Dim dlg As Word.Dialog =
              doc.Application.Dialogs(Word.WdWordDialog.wdDialogInsertSymbol)
            If dlg.Font.ToString <> "(normal text)" Then
              Ret &= vbCrLf & Lib1.GetParaInfo(r.Paragraphs(1).Range) & " - Invalid character"
              r.Font.Color = ErrColor : cnt += 1
            End If
            dlg = Nothing
          Next
          rngs = Nothing
        Next
        doc.Application.ScreenUpdating = True
        Debug.Print("Font Check - Symbols:" &
          (sw.ElapsedMilliseconds / 1000).ToString("0.00") & " sec, cnt:" & cnt.ToString)
        Return Ret
    End Function


    Tim

    Tuesday, February 7, 2012 3:56 PM

Answers

  • Hi Tim,

    If you evaluate the XML as a simple text string, you could quickly establish whether (and where in the string) the first "<w:sym" string occurs using InStr. Build this into a loop and you can collect the offset of every occurrence:

    Sub Test()
    Dim Str As String, StrOffsets As String, i As Long, j As Long
    Str = ActiveDocument.Range.XML
    j = InStr(Str, "<w:sym")
    While j > 0
      i = i + j                         ' running offset within the original XML string
      StrOffsets = StrOffsets & vbCr & i
      Str = Right(Str, Len(Str) - j)    ' drop everything up to and including this "<"
      j = InStr(Str, "<w:sym")
    Wend
    MsgBox StrOffsets
    End Sub

    With some added logic, you could get the paragraph indices, etc. This would be much faster than iterating through all the paragraphs - especially if the document is 'clean'.
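
    One way the "added logic" for paragraph indices might look, sketched in Python against a saved copy of the XML string rather than in VBA. The function name symbol_paragraphs and the SAMPLE document are made up for illustration, and the sketch assumes the XML is well formed and that paragraphs appear as w:p elements. The indices follow the order of w:p elements in the XML, which may not line up with Word's Paragraphs collection in every case (tables and text boxes in particular).

```python
import xml.etree.ElementTree as ET


def local(name):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return name.rsplit("}", 1)[-1]


def symbol_paragraphs(xml_text):
    """Return 1-based indices of <w:p> elements containing at least one <w:sym>."""
    root = ET.fromstring(xml_text)
    # Document-order list of every paragraph element, whatever its namespace URI.
    paragraphs = [el for el in root.iter() if local(el.tag) == "p"]
    hits = []
    for index, para in enumerate(paragraphs, start=1):
        if any(local(el.tag) == "sym" for el in para.iter()):
            hits.append(index)
    return hits


# Hypothetical, hand-written sample; real Range.XML output is far larger.
SAMPLE = """<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body>
<w:p><w:r><w:t>Plain paragraph</w:t></w:r></w:p>
<w:p><w:r><w:t>This is a </w:t></w:r><w:r><w:sym w:font="Symbol" w:char="F061"/></w:r><w:r><w:t>Test</w:t></w:r></w:p>
</w:body>
</w:wordDocument>"""

print(symbol_paragraphs(SAMPLE))
```

    A paragraph nested inside another (for example, in a text box) would cause the outer paragraph to be reported as well, so treat the result as a list of candidates to re-check in Word.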


    Cheers
    Paul Edstein
    [MS MVP - Word]

    • Proposed as answer by Bruce Song Thursday, February 23, 2012 9:38 AM
    • Marked as answer by Tim_Shaf Thursday, February 23, 2012 12:11 PM
    Thursday, February 16, 2012 5:11 AM

All replies

  • Hi Tim,

    Thank you for posting.

    I did some research on this problem but also didn't find a more optimized way to resolve it.

    I will try to involve others to help you. There might be some delay in the response; I appreciate your patience.

    By the way, which version of Word are you using?

    Best Regards,


    Bruce Song [MSFT]
    MSDN Community Support | Feedback to us

    Wednesday, February 8, 2012 6:26 AM
  • Hi Tim,

    You can prevent the file being saved with non-compliant fonts by adding code like the following to a class module in Word's Normal template: 

    Private Sub wdApp_DocumentBeforeSave(ByVal Doc As Document, SaveAsUI As Boolean, Cancel As Boolean)
    Dim RngTmp As Range
    ' Work on the document passed to the event; ActiveDocument may be a different one
    Set RngTmp = Doc.Range(0, 0)
    With Doc.Content
      With .Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Font.Name = "Arial"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
        .Execute
      End With
      Do While .Find.Found
        If .Duplicate.Start <> RngTmp.Start Then
          RngTmp.End = .Duplicate.Start
          RngTmp.Select
          MsgBox "The selected range has characters in a non-compliant font", vbExclamation
          Cancel = True
          Exit Sub
        Else
          Set RngTmp = .Duplicate
          RngTmp.Collapse wdCollapseEnd
        End If
        .Collapse wdCollapseEnd
        .Find.Execute
      Loop
    End With
    End Sub

    Anytime the user tries to save such a document, the first non-compliant range will be selected (for fixing) and the save action terminated. Since the document can't be saved in this state, it can't be closed either.

    To see how to implement the class module, go to: http://word.mvps.org/faqs/macrosvba/InterceptSavePrint.htm


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Wednesday, February 8, 2012 9:07 AM
  • Bruce,

    Thanks for your response.
    I am using VS 2010 and Office 2010.

    I suspect that there may be a more efficient alternative
    to 'p.Range.XML.Contains("<w:sym")'.

    Tim

    • Edited by Tim_Shaf Wednesday, February 8, 2012 1:25 PM
    Wednesday, February 8, 2012 1:20 PM
  • Paul,

    Thanks for your response.

    Unless I am missing something, I do not think
    this code will identify symbols inserted via InsertSymbol.

    Symbols inserted via InsertSymbol appear as Chr(40)
    and may be from a different font than what is applied
    to the Chr(40) character.


    Tim

    • Edited by Tim_Shaf Wednesday, February 8, 2012 1:26 PM
    Wednesday, February 8, 2012 1:24 PM
  • Tim,

    This is completely off topic, but how did you create your avatar?

    Thanks! 

    Wednesday, February 8, 2012 2:39 PM
  • Hi Tim,

    As I understand it, the only symbols (and other characters) you'd be concerned with are those that are not in Arial:
    "We have a requirement to restrict our font usage to Arial including symbols inserted via InsertSymbol"

    The code I posted finds whatever the first non-conforming block might be. It doesn't look at whether that something is a symbol character - it looks at whether it's Arial. By doing so, it traps both Symbols and ordinary text that isn't in Arial.


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Thursday, February 9, 2012 8:49 AM
  • Paul,

    Thanks for your response.

    Please try the following:

    1. Insert a paragraph "This is a Test" formatted with Arial.
    2. Immediately before "Test", insert a symbol, setting the "Font:" dropdown
        to Symbol (anything but "(normal text)" will work).
    3. Perform a search setting the Format: font:Arial.

    Note that the entire paragraph is selected.
    When the symbol is selected, it indicates that it is an Arial character even though
    it is a character from the "Symbol" font inserted via InsertSymbol.

    Please let me know if you get different results.


    Tim

    Monday, February 13, 2012 1:58 PM
  • Hi Tim,

    It turns out that, when Insert|Symbol is used with the Symbol and Wingdings fonts, at least, Word doesn't report the font name in the normal manner. For example, if you insert a character in the Symbol or Wingdings font into some Arial text, then select it, the UI reports the inserted character as Arial (even though the character clearly isn't from the Arial character set). And it's not only the UI's font display that's affected. As per your scenario, a 'Find' action for Arial finds the whole paragraph, whilst a 'Find' action for the Symbol or Wingdings font finds nothing. Only if you specifically format the inserted Symbol character with the Symbol or Wingdings font does Word report the font name in the normal manner.

    Apparently, the reason for this is so that, if the user selects a range containing one of these fonts, and applies a different font name to the range, the inserted character's font isn't affected. Characters inserted with some other fonts don't behave this way and do report correctly in the UI.

    The above behaviour is carried through to VBA. With VBA, if you select a 'Symbol' character inserted via Insert|Symbol, then query the font name, 'Symbol' will be returned. But, if you reference the font via its range, 'Arial' will be returned.

    The macro I posted above will find text if it has a font attribute that displays in the UI as something other than Arial, but that's not much help if a character was inserted via Insert|Symbol (whether in this or another document from which it has been copied) and is one of those that doesn't automatically report itself in the UI.

    In light of the above, I've revised the macro. It'll be much slower than before for two reasons:
    • it now tests all story ranges (the previous version only tested the body of the document); and
    • it now tests every character in each story range.
    I've optimised the code by turning off screen updating, but I expect you'll still find it slower than before. From your standpoint, though, the important issue is whether it runs any quicker than your own code. Even if it turns out your code is faster, you might want to implement it via the DocumentBeforeSave event as I have done.

    Private Sub wdApp_DocumentBeforeSave(ByVal Doc As Document, SaveAsUI As Boolean, Cancel As Boolean)
    Application.ScreenUpdating = False
    Dim oChr, StrFont As String, RngTmp As Range, RngStry As Range
    ' Work on the document passed to the event; ActiveDocument may be a different one
    Set RngTmp = Doc.Range(0, 0)
    With Doc
      For Each RngStry In .StoryRanges
        With RngStry
          For Each oChr In .Characters
            StrFont = oChr.Font.Name
            oChr.Duplicate.Select
            With Dialogs(wdDialogInsertSymbol)
              If StrFont <> .Font And .Font <> "(normal text)" Then Selection.Font.Name = .Font
            End With
          Next oChr
          With .Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Font.NameAscii = "Arial"
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindStop
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchAllWordForms = False
            .MatchSoundsLike = False
            .MatchWildcards = True
            .Execute
          End With
          Do While .Find.Found
            If .Duplicate.Start <> RngTmp.Start Then
              RngTmp.End = .Duplicate.Start
              RngTmp.Select
              Application.ScreenRefresh
              MsgBox "The selected range has characters in a non-compliant font", vbExclamation
              Cancel = True
              GoTo Done
            Else
              Set RngTmp = .Duplicate
              RngTmp.Collapse wdCollapseEnd
            End If
            .Collapse wdCollapseEnd
            .Find.Execute
          Loop
        End With
      Next RngStry
    End With
    Done:
    Set RngTmp = Nothing: Set RngStry = Nothing
    Application.ScreenUpdating = True
    End Sub


    Cheers
    Paul Edstein
    [MS MVP - Word]

    Tuesday, February 14, 2012 10:51 AM
  • Paul,

    Thanks for your detailed response.
    It is similar to what I am currently using in our Word 2003 template
    using VBA.

    I am in the process of moving to Word 2010 and moving all of the
    functionality to a Word Add-In and looking at ways to optimize the code.

    This is just one routine that identifies issues in a document and provides
    a text box filled with any/all discrepancies identified.

    I have found that using LINQ (as shown in FontSymbolChk) is much more
    efficient than iterating thru each character in a document.

    I recently discovered the Range.XML property. I discovered that the XML
    contains "<w:sym" when the character has been inserted via InsertSymbol.
    This appears to be a more efficient approach, but since I am very limited in
    my experience with XML, I was thinking there might be something even more
    efficient than .Range.XML.Contains("<w:sym").
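
    A possible direction, sketched in Python on a saved XML string rather than inside the add-in: parse the XML once and read the font straight off each w:sym element, instead of selecting each character and asking the Insert Symbol dialog. This assumes each w:sym element carries w:font and w:char attributes, as in the hand-written SAMPLE below; the function name non_compliant_symbols and the sample are hypothetical, and real Range.XML output should be checked before relying on this.

```python
import xml.etree.ElementTree as ET


def local(name):
    """Strip the '{namespace}' prefix ElementTree puts on tag/attribute names."""
    return name.rsplit("}", 1)[-1]


def non_compliant_symbols(xml_text, allowed_font="Arial"):
    """Return (font, char) for every <w:sym> whose font is not allowed_font."""
    root = ET.fromstring(xml_text)
    found = []
    for el in root.iter():
        if local(el.tag) != "sym":
            continue
        # Attribute names are namespace-qualified too, so normalise them first.
        attrs = {local(key): value for key, value in el.attrib.items()}
        font = attrs.get("font", "")
        if font.lower() != allowed_font.lower():
            found.append((font, attrs.get("char", "")))
    return found


# Hypothetical sample: the run is formatted as Arial, the symbol names its own font.
SAMPLE = """<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">
<w:body><w:p>
<w:r><w:rPr><w:rFonts w:ascii="Arial" w:h-ansi="Arial"/></w:rPr><w:sym w:font="Symbol" w:char="F0B7"/></w:r>
<w:r><w:sym w:font="Arial" w:char="00A9"/></w:r>
</w:p></w:body></w:wordDocument>"""

print(non_compliant_symbols(SAMPLE))
```

    If the attribute is present in practice, the same idea could be applied in the add-in by loading doc.Range.XML once into an XML parser, which would avoid both the per-character loop and the dialog call.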

    Thanks again for your response.


    Tim

    Wednesday, February 15, 2012 3:39 PM