I have a huge amount of text in a Word file. I want to extract all the links
that start with http:// and all filenames ending with the .asp file extension,
and put them at the end of the document. Is this possible in Word VBA?

Re: Extract links and .asp file names from text by Jezebel

Jezebel
Fri Mar 31 05:25:52 CST 2006


You don't actually need VBA.

Method 1: use wildcard searching (Find with "Use wildcards" checked):
http[! ]{1,} for the links and [a-zA-Z]{1,}.asp for the file names. You could
use this technique to apply a special format to the matches, then delete
everything that doesn't have that format. (You can also use VBA for this, if
you're extracting the items and doing something else with them.)
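
For anyone who would rather test the two patterns outside Word, here is a
minimal Python sketch of the same searches. It is not Word VBA, and it assumes
the document text has been saved as plain text. The names URL_PATTERN,
ASP_PATTERN and extract_items are made up for illustration. One deliberate
difference: \S+ stops at any whitespace (including line breaks), whereas the
Word pattern [! ] only excludes the space character.

```python
import re

# Rough regex counterparts of the Word wildcard patterns:
#   http[! ]{1,}      -> "http" followed by one or more non-whitespace characters
#   [a-zA-Z]{1,}.asp  -> one or more letters followed by a literal ".asp"
URL_PATTERN = re.compile(r"http\S+")
ASP_PATTERN = re.compile(r"[a-zA-Z]+\.asp")


def extract_items(text):
    """Return (links, asp_names) found in text, in order of appearance."""
    links = URL_PATTERN.findall(text)
    asp_names = ASP_PATTERN.findall(text)
    return links, asp_names


sample = "See http://example.com/a for details, then open default.asp and login.asp."
urls, asp_names = extract_items(sample)
print(urls)
print(asp_names)
```

Note that a link such as http://example.com/page.asp would show up in both
lists, because the .asp pattern also matches inside the link.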

Method 2: convert all whitespace to paragraph marks, to turn the document
into a flat list of words. Copy and paste into Excel. Do a unique sort. All
your http lines will come together. Use the Find() function in the adjacent
column to find the cells that contain ".asp".
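
The steps of Method 2 can also be sketched in a few lines of Python, as an
alternative to Excel rather than a description of what Excel does internally.
This assumes the text is available as a plain string; unique_sorted_words and
pick_links_and_asp are hypothetical helper names.

```python
def unique_sorted_words(text):
    """Split on any whitespace, drop duplicates, and sort the words."""
    return sorted(set(text.split()))


def pick_links_and_asp(words):
    """From a word list, pick the http:// links and the words containing '.asp'."""
    links = [w for w in words if w.startswith("http://")]
    asp_items = [w for w in words if ".asp" in w]
    return links, asp_items


sample = ("open default.asp then http://example.com/b and "
          "http://example.com/a and default.asp again")
words = unique_sorted_words(sample)
links, asp_items = pick_links_and_asp(words)
print(links)
print(asp_items)
```

Like Excel's Find(), the ".asp" test here is case-sensitive, so "PAGE.ASP"
would be missed unless the words are lower-cased first.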




Re: Extract links and .asp file names from text by Maxi

Maxi
Fri Mar 31 05:33:52 CST 2006

I like Method 2. Can you tell me how to convert all whitespace to
paragraph marks? Is there a particular symbol that I can use in the
Find/Replace dialog?


Re: Extract links and .asp file names from text by Jonathan

Jonathan
Fri Mar 31 06:26:26 CST 2006


Place ^w in the Find What box and ^p in the Replace With box. Click Replace
All.

In fact you don't even need to copy & paste into Excel. Select the entire
document, go to Table, Sort and sort the paragraphs there. (Yes, the
paragraphs aren't in a table, but they can still be sorted!)
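
As a rough illustration of what the replace-then-sort steps produce, here is a
small Python sketch working on plain text instead of a Word document. The
function name words_to_sorted_lines is made up, and the case-insensitive sort
order is an assumption about how you would want the list ordered. Duplicates
are kept, as they are when sorting paragraphs in Word.

```python
import re


def words_to_sorted_lines(text):
    """Put each word on its own line, then sort the lines (duplicates kept)."""
    # Analogue of replacing ^w with ^p: every run of spaces, tabs, or
    # non-breaking spaces becomes a single line break.
    one_per_line = re.sub(r"[ \t\u00a0]+", "\n", text.strip())
    lines = [line for line in one_per_line.split("\n") if line]
    # Case-insensitive sort, so the http lines end up next to each other.
    return sorted(lines, key=str.lower)


sample = "b.asp http://x.example/1\tb.asp  alpha"
sorted_lines = words_to_sorted_lines(sample)
print("\n".join(sorted_lines))
```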


--
Regards
Jonathan West - Word MVP
www.intelligentdocuments.co.uk
Please reply to the newsgroup
Keep your VBA code safe, sign the ClassicVB petition www.classicvb.org


Re: Extract links and .asp file names from text by Jezebel

Jezebel
Fri Mar 31 16:15:11 CST 2006

Sorting in Excel has the advantage that you can eliminate the duplicates.
Also the .asp lines will be easier to find.