Let's start a list of the ways sites can link, refer, or point urls to your pages other than direct hrefs. Mainly, we are after ways that search engines such as Google may run into urls and use the ones they find:
- another site links to your graphics ( img src=http://www.searchengineworld.com/gfx/logo.png )
- a site links to your javascript files
- a site links to your css files?
- rss feeds and other xml feeds that people can link to without notice or referrals necessarily being generated.
- links in email that some search engines can read (Yahoo Mail, Hotmail, Gmail [webmasterworld.com])
- links marked with noindex
- links marked with nofollow
- raw urls within javascript or js comments
- raw urls within css or in css comments
- urls within meta data of graphics and video files
- urls within html comments
- urls within the head section, meta data of an html page, or other html attributes (alt, name, id, etc.)
- links or pages that may be surfed while the visitor has page rank enabled on the toolbar
- the target of a constructed, obfuscated, or encrypted js url (hidden until executed)
- links behind paywalls that Google can spider via webmaster tools
- Domains with links that have been 301 redirected.
- Links in Flash movies (games, quizzes, etc).
- non-href'ed urls (a raw url on the page, e.g. http://www.webmasterworld.com)
- Links in any documents other than web pages e.g. .doc, .pdf, .txt, etc.
- pages blocked in robots.txt. Blocking should keep bots out, but they still spider them.
- Domain registrations/Whois and DNS data
- Links in form data.
- Links in other Google produced software (gadgets, widgets)
- Non-traditional pages (irc, Twitter, Usenet, Yahoo Groups, or Google Groups).
- Advertising links (AdWords/Yahoo), and other services like Maps.
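To get a feel for how many of the "raw url" cases above turn up on one of your own pages (html comments, js comments, css comments, non-href'ed text), here is a rough Python sketch. It is only a guess at the kind of pattern matching a crawler might do, not how Google or any other search engine actually works. The regex, the find_raw_urls helper, and the example.com urls in the sample are all made up for illustration.

```python
import re

# Assumed pattern: anything starting with http:// or https:// that runs
# until whitespace, a quote, an angle bracket, or a closing paren.
URL_RE = re.compile(r"""https?://[^\s"'<>)]+""")

def find_raw_urls(text):
    """Return url-looking strings found in text, in order, without repeats."""
    seen = []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen

# Hypothetical page source mixing several of the cases from the list.
sample = """
<!-- old logo: http://www.searchengineworld.com/gfx/logo.png -->
<script>
// see http://www.webmasterworld.com for details
var u = "http://example.com/page.html";
</script>
<style>/* sprite from http://example.com/css/sprite.png */</style>
<p>Raw url on page http://www.webmasterworld.com</p>
"""

for url in find_raw_urls(sample):
    print(url)
```

None of the urls in the sample sit inside an href, yet a plain text scan picks all of them up, which is the point of the list.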
What else? Wow - blew through 20....
I will update as we go. Thanks to everyone who pitches in...
Source: webmasterworld.com