12 October 2009

The following VBA can be copied into a module in Word to create a macro that performs text replacements on the active document. This particular case replaces a set of HTML entities that cluttered a web-based form. The pollution was caused by users pasting from Word into a textarea that did not understand the multi-byte character set. The macro does not fix the underlying encoding problem, but it did convert the bad characters to readable, nearly equivalent ones. Regardless, it is a template that can be reused for basic, repetitive replacements.

The Code

Sorry about the bad character display. I'll try to make time to work on the FCK-Drupal interaction that shredded them.

' Module-level counter shared by both procedures.
Dim mintCount As Integer

Sub StripHtmlEntities()
    mintCount = 0
    ' Start from the top of the document and reset the find/replace options.
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    ReplaceString "•", vbCrLf & "-"
    ReplaceString "’", "'"
    ReplaceString "®", "(r)"
    ReplaceString "–", "-"
    ReplaceString "“", """"
    ReplaceString "�", """"
    MsgBox "A total of " & mintCount & " replacements were made."
End Sub

Private Sub ReplaceString(strFind As String, strReplace As String)
    With Selection.Find
        .Text = strFind
        .Forward = True
        .Wrap = wdFindContinue
        .MatchWildcards = False
    End With
    ' Replace each match in turn so that it can be counted.
    Do While Selection.Find.Execute = True
        mintCount = mintCount + 1
        Selection.Text = strReplace
        ' Safety valve against a runaway loop.
        If mintCount > 10000 Then Exit Sub
    Loop
End Sub
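
If the running count is not needed, the same kind of repetitive replacement can probably be done with one Replace All call per pair instead of a match-by-match loop. What follows is only a sketch of that variation, not the macro used on the form: the name ReplaceAllPairs and the two sample pairs are made up for illustration, and it only touches the main body of the document.

```vba
' Sketch only: ReplaceAllPairs and its sample pairs are hypothetical,
' not part of the original macro.
Sub ReplaceAllPairs()
    Dim varPairs As Variant
    Dim i As Long

    ' Each entry is Array(find text, replacement text).
    varPairs = Array( _
        Array("(tm)", "TM"), _
        Array("(c)", "Copyright"))

    For i = LBound(varPairs) To UBound(varPairs)
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .MatchWildcards = False
            ' Replace every occurrence in the main document body in one call.
            .Execute FindText:=varPairs(i)(0), _
                     ReplaceWith:=varPairs(i)(1), _
                     Forward:=True, Wrap:=wdFindStop, _
                     Replace:=wdReplaceAll
        End With
    Next i
End Sub
```

The trade-off is that Replace All does not report how many replacements it made, so the closing message box from the original macro would have to go.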