URL Filters:
Following the Google Sitemaps MP3 interview with John Mueller, I wanted
to share a quick tip for using the GSiteCrawler software with
DotNetNuke.
Make sure you listen to the full interview for tips on preparing your
Google sitemaps. Following one of the tips, here's a filter list that I
have created so far to remove links that Google does not require.
This filter list is currently used when crawling the DNN Creative
website. It is particularly used to exclude certain links such as the
register / login / terms / privacy links on each page, plus extra links
that are created and not relevant for Google if you are using Scott
McCulloch's News Articles Module and ActiveModules Active Forums module.
Here is a list of the filters that I currently use in GSiteCrawler:
(in the ban URL tab)
(You will notice that some of these filters are the defaults that
come with GSiteCrawler. I have left these in so that you can easily
compare.)
/_vti_bin/
/afsort/
/Categories/
/CategoryID/
/CategoryView/
/login
/NewsListing/
/PostComment
/privacy
/profile/
/register
/Search/
/sendemail/
/Syndication/
/terms
?
fsforum/
http://johannesmueller.com/_private
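
If you want to preview which of your URLs a list like this would remove before running a crawl, here is a rough Python sketch. It is not GSiteCrawler's own code: it assumes each filter is treated as a plain, case-insensitive substring of the URL, which may not match exactly how GSiteCrawler applies its ban list. The function name is_banned and the example.com URLs are made up for illustration. Under that assumption, the single "?" filter removes every URL that carries a query string.

```python
# Ban filters copied from the list above.
BAN_FILTERS = [
    "/_vti_bin/", "/afsort/", "/Categories/", "/CategoryID/",
    "/CategoryView/", "/login", "/NewsListing/", "/PostComment",
    "/privacy", "/profile/", "/register", "/Search/", "/sendemail/",
    "/Syndication/", "/terms", "?", "fsforum/",
    "http://johannesmueller.com/_private",
]


def is_banned(url, filters=BAN_FILTERS):
    """Return True if any filter occurs anywhere in the URL.

    Assumption: filters are case-insensitive substrings. This is a
    guess at the matching rule, not GSiteCrawler's documented behaviour.
    """
    lowered = url.lower()
    return any(f.lower() in lowered for f in filters)


if __name__ == "__main__":
    # Hypothetical DotNetNuke-style URLs, for illustration only.
    samples = [
        "http://www.example.com/Home/tabid/36/ctl/Login/Default.aspx",
        "http://www.example.com/Default.aspx?tabid=5",
        "http://www.example.com/Tutorials/tabid/54/Default.aspx",
    ]
    for url in samples:
        print("banned" if is_banned(url) else "kept  ", url)
```

Run it against a sample of your own page URLs to check that the filters are not removing pages you do want in the sitemap.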
Hope this is useful.