Craig Francis

Search Engines

Search engines like Google and Yahoo allow people to search through many millions of websites in a fraction of a second. They have quite a challenge trying to organise and index all that data, so they automate it with spiders, mainly because there is no chance a human would be able to do it. These spiders are usually small programs that try to find as many pages on a website as possible, then take a copy of the data found for indexing purposes. The index is used later when a search is performed: it provides a quick way to look through those millions of pages and return a small set of results.
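
As a rough illustration of why an index makes searching quick, here is a minimal sketch of an inverted index: a lookup table from each word to the pages that contain it. The function names (build_index, search) and the sample pages are made up for this example, and a real search engine does far more than this.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical copies of page text taken by a spider.
pages = {
    "/a": "Lorem ipsum dolor",
    "/b": "Ipsum only",
    "/c": "Something else",
}
index = build_index(pages)
print(search(index, "lorem ipsum"))
```

The search never re-reads the pages themselves. It only looks up each word in the table and intersects the sets, which is why the work can be done in advance by the spider.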

Helping the spider

The search engine spiders should be seen as very basic computers: they will not execute JavaScript, submit forms or spend much time looking at Flash. The only thing that can be relied on is the simple link. Without these, the search engine spiders will find it very difficult to find all the pages on a website.
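
To make this concrete, here is a small sketch of how a very basic spider might look for links, using only Python's standard html.parser module. The LinkCollector class, the find_links function and the example.com URLs are invented for illustration, and this is not how any particular search engine works. Notice that only the plain link is found; the JavaScript link, the onclick handler and the form are all invisible to it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href targets from plain <a> elements, ignoring everything else."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip missing hrefs and javascript: pseudo-links, which lead nowhere
        # for a client that does not run scripts.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(urljoin(self.base_url, href))


def find_links(html, base_url):
    """Return the absolute URLs of the simple links found in the HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical navigation markup: one simple link and three things that are not.
page = """
<a href="/about/">About</a>
<a href="javascript:openPage('contact')">Contact</a>
<span onclick="location.href='/news/'">News</span>
<form action="/search/"><input name="q"></form>
"""
print(find_links(page, "https://www.example.com/"))
```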

This is why elements like the navigation bar on the website really do need to be basic. Website authors need to make it as easy as possible to find all the pages on the website if those pages are going to stand any chance of appearing on a search engine's results page.


Several search engines run on the basis that the more good quality links a page has pointing to it, the better that resource is. As a simple example, the BBC is considered a good website; its homepage currently has a Google PageRank of 9/10. If a link were to appear on that homepage which contained the text "lorem ipsum" and linked to this website, then anyone who did a search for "lorem ipsum" should find that this website ranks higher in the search results. This is simply because a good quality link says the destination URL is about "lorem ipsum".
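
The idea that a link from a well-linked page is worth more than a link from an obscure one can be sketched with a simplified PageRank-style calculation. This is only an illustration of the general principle, not the formula any search engine actually runs: the rank_pages function, the damping value of 0.85, the iteration count and the page names are all assumptions made for the example.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages so that links from well-linked pages count for more.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its score equally between the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank


# Hypothetical web: "bbc" is linked to by three blogs, "blog-4" by nobody.
links = {
    "blog-1": ["bbc"],
    "blog-2": ["bbc"],
    "blog-3": ["bbc"],
    "bbc": ["site-a"],
    "blog-4": ["site-b"],
}
scores = rank_pages(links)
print(scores["site-a"] > scores["site-b"])
```

Both "site-a" and "site-b" have exactly one inbound link, but the link to "site-a" comes from a page that is itself well linked, so it ends up with the higher score.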

The text found within the links is also quite important. For example, if a large number of websites linked the word "failure" to a particular website, then search results for "failure" will show an interesting result. The page in question does not need to contain the word, but because of all the links the association has been made. This is one of the reasons it becomes quite important to use the alt attribute on images within links: an image has no text of its own, so the alt text is all there is to describe the link.
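
The following sketch shows how link text might be collected, and why a linked image without an alt attribute contributes nothing. It uses Python's standard html.parser module; the AnchorTextCollector class and the sample markup are hypothetical, and nested links are not handled.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Records (href, text) pairs, using alt text for images inside links."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._parts = []
        elif tag == "img" and self._href is not None:
            # An image has no text of its own, so alt is all there is to read.
            self._parts.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace into single spaces.
            text = " ".join(" ".join(self._parts).split())
            self.anchors.append((self._href, text))
            self._href = None


# Hypothetical markup: a text link, an image link with alt, and one without.
html = (
    '<a href="/a/">Lorem ipsum</a> '
    '<a href="/b/"><img src="logo.png" alt="Lorem ipsum"></a> '
    '<a href="/c/"><img src="logo.png"></a>'
)
collector = AnchorTextCollector()
collector.feed(html)
print(collector.anchors)
```

The third link ends up with an empty description, so nothing associates its destination with any particular words.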

This makes the second basic SEO rule: get good links pointing to a website, as this shows the website is a reputable and popular source.

However, website authors should not try to find ways to get a huge number of inbound links. Most search engine spiders are developed by people like Matt Cutts, who create algorithms to find website authors trying to cheat the system - if this happens then a website can get blacklisted.

A good example of how a link can be broken is through the incorrect use of JavaScript. If a link replaces content on a page with the use of tools like AJAX, then unless there is an alternative for browsers without JavaScript enabled, the search engine spiders will also be affected. First, the spider will not waste resources executing the JavaScript (which might not even be a link), and second, the result of the action might not create a URL that can be linked to.

Meta tags

A few years ago it was considered good practice to include the "keywords" and "description" meta tags, but now it can finally be said that they are slowly dying.

At the time they worked well, because search engines only had enough storage space for a few words per page, and there were not that many website authors trying to fool the search engine spiders. But now, with storage space becoming a lot cheaper, the number of words indexed is rarely a problem.

Perhaps the main issue with these meta tags is that they are not usually visible to website visitors. Because of this they are now ignored by most search engine spiders. From their point of view it's much better to index the content on the page that the website visitor will see, rather than some hidden text that might be completely unrelated or, more likely, out of date.

However, there is still a purpose to the "description" meta tag. Although its content is most likely ignored when doing the search (unconfirmed), it can be displayed on the search results page. Keeping with Google as an example, normally under the search result's title there is a block of text that highlights some of the search words and their surrounding content; this is designed to give a basic idea of the relevance of the content. But in some cases the search engine might not be able to show anything useful, so it might just show a bit of text found at the start of the document. In most cases this text is irrelevant, but it can be overridden if a "description" meta tag exists. Its content might still be irrelevant, but at least it's easier to read than a few seemingly random words.
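
The fallback described above could be sketched as follows. This is a guess at the general behaviour, not a description of what Google actually does: the SnippetSource class, the fallback_snippet function, the 150 character limit and the sample pages are all assumptions for the example.

```python
from html.parser import HTMLParser


class SnippetSource(HTMLParser):
    """Gathers the description meta tag and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.description = None
        self.text_parts = []
        self._skip = 0  # depth inside <script>, <style> or <title>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip() or None
        elif tag in ("script", "style", "title"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style", "title") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text_parts.append(data)


def fallback_snippet(html, max_length=150):
    """Prefer the description meta tag, else the opening text of the page."""
    parser = SnippetSource()
    parser.feed(html)
    if parser.description:
        return parser.description[:max_length]
    opening_text = " ".join(" ".join(parser.text_parts).split())
    return opening_text[:max_length]


# Two hypothetical pages that differ only in the description meta tag.
with_meta = (
    '<html><head><title>T</title>'
    '<meta name="description" content="A short guide to search engines.">'
    '</head><body><p>Home | About | Contact</p></body></html>'
)
without_meta = (
    '<html><head><title>T</title></head>'
    '<body><p>Home | About | Contact</p><p>Welcome</p></body></html>'
)
print(fallback_snippet(with_meta))
print(fallback_snippet(without_meta))
```

Without the meta tag, the snippet is just the navigation text that happens to come first on the page, which is the "seemingly random words" problem.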

Duplicate content

If a website contains duplicate content, then some search engines might detect this and either penalise that website or ignore that page entirely. Neither of these has been confirmed, but there is definitely a side effect of having duplicate content: when you have two pages with the same content, the links to that content will be divided between them - the fewer links a page has, the less popular it is and therefore the less relevant it might appear.

Ideally all duplicate pages, and even duplicate domain names, should be removed and set up with "301 redirects" to the page that remains.
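
A 301 is the HTTP status code for "Moved Permanently". As a minimal sketch of the decision a website could make for each request, the function below returns the redirect to send for a duplicate URL. The domain names, the duplicate paths and the redirect_for function are all hypothetical, and in practice this is usually done in the web server configuration rather than in code like this.

```python
from urllib.parse import urlsplit, urlunsplit

# All of these values are made up for the example.
CANONICAL_HOST = "www.example.com"
DUPLICATE_HOSTS = {"example.com", "www.example.org"}
DUPLICATE_PATHS = {"/index.html": "/", "/home/": "/"}


def redirect_for(url):
    """Return (301, canonical_url) if the URL is a duplicate, else None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    new_host = CANONICAL_HOST if host in DUPLICATE_HOSTS else host
    new_path = DUPLICATE_PATHS.get(path, path)
    if (new_host, new_path) == (host, path):
        return None  # already the one true URL, serve the page as normal
    canonical = urlunsplit((parts.scheme, new_host, new_path, parts.query, ""))
    return 301, canonical


print(redirect_for("http://example.com/index.html"))
print(redirect_for("http://www.example.com/about/"))
```

Because every duplicate answers with a permanent redirect to a single URL, links pointing at any of the copies should end up counting towards the same page.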

SEO Companies

In short, be careful.

No company can guarantee a number 1 ranking (ref Google); actually it would be difficult for them to guarantee anything. Unfortunately there are a lot of companies out there trying to sell ways to improve search ranking through unorthodox means, for example through the use of link farms.

Most websites can improve their ranking in the search results just by making sure the search engine spider can access the website (use of simple links), getting good links to the website from other related websites, and ensuring there is no duplicate content.

However, there is one area in which an SEO company can help, and that is ensuring the text on the page is optimised for a search engine. This does not mean stuffing it with keywords (search engine spiders are getting quite good at detecting that), but making sure that it uses words that potential visitors might search for, and that all the text remains relevant to the page without any unnecessary duplication.

Perhaps one additional area that they might help with is the addition of useful pages that other websites might link to. As these SEO companies should be looking at the website from the visitor's point of view, they might have interesting ideas that could have been overlooked by the website authors.

Any feedback would be greatly appreciated. I don't include comments due to the admin time required, but if you email me, I will reply and make appropriate updates. Also, if you would like to take a copy of this article, please read the terms this article is released under. This article was originally written Wednesday 6th December 2006.