Greg
Sat Mar 10 08:45:49 CST 2007
Hi macropod.
It is a derivative of your code and there is a point of checking paragraphs
of unequal length because it is faster than first checking all paragraphs
to see if they are of equal length ;-).
< I think they key difference is that my code checks all paras against each
other, whereas yours only checks adjacent paras. I might
<incorporate Helmut's revisions, though, since they seem to speed things up
a bit.
I don't know your test results but the code I posted has nothing to do with
adjacent pararapraphs. You can take those 799 unique paragraphs and add 10,
20 or a 100 duplicates anywhere in the mix, run the code and the duplicates
are removed.
All that said, there appears to be more to this that I don't understand. At
work yesterday with the The quick brown fox example Helmut gave, my code
was 69 seconds compared to Helmut's 123 seconds. Today at home with the
much longer Word2007 =(rand) text the processing 800 paragraphs takes my
method 272 seconds and Helmut's 236.
Perhaps there is no best way ;-)
--
Greg Maxey/Word MVP
See:
http://gregmaxey.mvps.org/word_tips.htm
For some helpful tips using Word.
"macropod" <invalid@invalid.invalid> wrote in message
news:OMP6W3vYHHA.4520@TK2MSFTNGP06.phx.gbl...
> Hi Greg,
>
> Looks like a derivative of something I developed & posted. My full version
> is:
>
> Dim SBar As Boolean ' Status Bar flag
> Dim TrkStatus As Boolean ' Track Changes flag
>
> Sub KillDuplicateParas()
> Call MacroEntry
> Dim i As Long, j As Long
> Dim eTime As Single
> eTime = Timer
> With ActiveDocument
> If .Paragraphs.Count > 1 Then
> ' Loop backwards to preserve paragraph count & indexing.
> ' Start at 2nd-last paragraph.
> For i = .Paragraphs.Count - 1 To 1 Step -1
> ' Ignore empty paragraphs
> If Len(.Paragraphs(i).Range.Text) > 1 Then
> ' Loop backwards to preserve paragraph count & indexing.
> ' Stop at last preceding paragraph.
> For j = .Paragraphs.Count To i + 1 Step -1
> ' Report progress on Status Bar.
> Application.StatusBar = i & " paragraphs to check. "
> ' No point in checking paragraphs of unequal length.
> If Len(.Paragraphs(i).Range) =
> Len(.Paragraphs(j).Range) Then
> ' Test strings of paragraphs of equal length.
> If .Paragraphs(i).Range = .Paragraphs(j).Range Then
> ' Delete duplicate paragraph.
> .Paragraphs(j).Range.Delete
> End If
> End If
> Next
> End If
> Next
> End If
> End With
> ' Report time taken. Elapsed time calculation allows for execution to
> extend past midnight.
> MsgBox "Finished. Elapsed time: " & (Timer - eTime + 86400) Mod 86400 & "
> seconds."
> Call MacroExit
> End Sub
>
> Private Sub MacroEntry()
> ' Store current Status Bar status, then switch on
> SBar = Application.DisplayStatusBar
> Application.DisplayStatusBar = True
> ' Store current Track Changes status, then switch off
> With ActiveDocument
> TrkStatus = .TrackRevisions
> .TrackRevisions = False
> End With
> ' Turn Off Screen Updating
> Application.ScreenUpdating = False
> End Sub
>
> Private Sub MacroExit()
> ' Clear the Status Bar
> Application.StatusBar = False
> ' Restore original Status Bar status
> Application.DisplayStatusBar = SBar
> ' Restore original Track Changes status
> ActiveDocument.TrackRevisions = TrkStatus
> ' Restore Screen Updating
> Application.ScreenUpdating = True
> End Sub
>
> I think they key difference is that my code checks all paras against each
> other, whereas yours only checks adjacent paras. I might incorporate
> Helmut's revisions, though, since they seem to speed things up a bit.
>
> Cheers
>
> --
> macropod
> [MVP - Microsoft Word]
> -------------------------
>
> "Greg Maxey" <gmaxey@gmail.com> wrote in message
> news:1173463454.081324.32510@s48g2000cws.googlegroups.com...
>>I came across an old post in Google groups for deleted duplicated
>> lines of text in a document.
>>
>> It used a For x = Count method to go through and check the range of
>> one paragraph to the the range of every other paragraph and delete any
>> duplicates.
>>
>> It had two If ... End If blocks. The first check the para range
>> length. If = then the second performed a text comparison. I assume
>> the author thought that it would save time by doing a text comparison
>> only on paras of equal length.
>>
>> The procedure worked as advertised, however with a longer document it
>> took a long time.
>>
>> I created about 800 paragraphs and determined that it was actually
>> much quicker to bypass the the first length check and just do a range
>> comparison on every paragraph. Down from 200 seconds to 75 seconds!
>>
>> Next I remembered an method that Jezebel showed me for stepping
>> through items using the .Next (property or method I am never sure
>> which).
>> I adapted the code as follows and the time taken was down to 3
>> seconds!
>>
>> Anyway, I just wanted to share this with the group:
>>
>> Sub KillDuplicateParagraphs()
>> Dim SBar As Boolean
>> Dim TrkStatus As Boolean
>> Dim eTime As Single
>> Dim oParRef As Paragraph
>> Dim oParChk As Paragraph
>> eTime = Timer
>> With ActiveDocument
>> TrkStatus = .TrackRevisions
>> .TrackRevisions = False
>> End With
>> With Application
>> SBar = .DisplayStatusBar
>> .DisplayStatusBar = True
>> .ScreenUpdating = False
>> End With
>> Set oParRef = ActiveDocument.Range.Paragraphs(1)
>> Set oParChk = oParRef.Next
>> Do
>> '*** Stet out first if block to delete duplicated empty paragraphs.
>> If Len(oParRef.Range.Text) > 1 Then
>> Do
>> 'An empty last paragraph may throw an error on the last loop.
>> On Error GoTo Err_Exit
>> If oParRef.Range = oParChk.Range Then
>> oParChk.Range.Delete
>> Else
>> Set oParChk = oParChk.Next
>> End If
>> Loop Until oParChk Is Nothing
>> End If '***
>> Set oParRef = oParRef.Next
>> 'Skip errors.
>> On Error Resume Next
>> Set oParChk = oParRef.Next
>> On Error GoTo 0
>> Loop Until oParRef Is Nothing
>> Err_Exit:
>> With Application
>> .StatusBar = False
>> .DisplayStatusBar = SBar
>> .ScreenUpdating = True
>> End With
>> ActiveDocument.TrackRevisions = TrkStatus
>> MsgBox "Finished. Elapsed time: " & (Timer - eTime + 86400) Mod 86400
>> & " seconds."
>> End Sub
>>
>