Build a corpus

by bose on March 9, 2010

Use Web Dumper to download all the URLs.

Script BBEdit to batch-convert the downloaded pages to text files. Build a new corpus from the text files, then use that corpus for crawling.
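
For anyone without BBEdit, the batch conversion step could be approximated with a short Python script. This is only a sketch and not the BBEdit workflow described above: it assumes the downloaded pages are .html or .htm files encoded as UTF-8, and the names html_to_text and convert_folder are made up for illustration. It uses only the standard library.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of one HTML page, one text chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def convert_folder(src_dir, dst_dir):
    """Write one .txt file into dst_dir for each HTML file under src_dir.

    Assumption: file stems are unique; pages with the same stem in
    different subfolders would overwrite each other.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for page in sorted(src.rglob("*.htm*")):
        if not page.is_file():
            continue
        text = html_to_text(page.read_text(encoding="utf-8", errors="replace"))
        out = dst / (page.stem + ".txt")
        out.write_text(text, encoding="utf-8")
        written.append(out)
    return written
```

Calling convert_folder("downloads", "corpus_txt") (both folder names are placeholders) would leave a folder of plain-text files that can then be assembled into the corpus.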

Should the corpus be large or small to start with?
