
Search engine optimization basics

At this point in web development, virtually everyone I talk to about a project tells me they would like to “come up in Google”. In the industry, we refer to that as search engine optimization, or SEO for short. This post covers the basics of search engine optimization as a starter resource for others.

SEO Overview

It has been my experience that most people think about SEO as something one does after a website is up and running, like tuning a car. While SEO was born from trickery designed to exploit the algorithms search engines use to rank web pages, modern SEO practices need to be considered and implemented from the beginning of a site’s conception. SEO is more than just keywords; it’s a part of the site’s structure itself.

Search engines 101

In order to understand SEO, you must know a little something about how a search engine works. Essentially, there are 5 pieces to a search engine:

  1. The web page crawler, also known as the spider
  2. The index, which is the database
  3. The algorithm
  4. The search box
  5. The search results pages

The spider is a type of robot that has the task of following links on the Web to download pages. Each page that is downloaded is processed and stored in the index. When a user makes an inquiry through the search box, the algorithm determines which pages to pull from the index and in what order they are displayed on the results pages.
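If it helps to see those pieces working together, here is a toy sketch in Python. Everything in it is made up for illustration: the `WEB` dictionary stands in for the real Web, the URLs are hypothetical, and the ranking rule (count how often the query word appears) is a placeholder, not how any real search engine ranks pages.

```python
from collections import deque

# Hypothetical in-memory "Web": each URL maps to (page text, outgoing links).
WEB = {
    "a.example/home": ("knitting patterns and yarn", ["a.example/hats", "a.example/scarves"]),
    "a.example/hats": ("knitting hats with wool yarn", ["a.example/home"]),
    "a.example/scarves": ("scarves made from knitting yarn yarn", []),
    "a.example/orphan": ("no page links here", []),
}


def crawl(seed):
    """The spider: follow links breadth-first from a seed and return the pages found."""
    seen, queue, pages = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        pages[url] = text
        for link in links:
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """The index: map each word to the pages containing it, with a count per page."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            counts = index.setdefault(word, {})
            counts[url] = counts.get(url, 0) + 1
    return index


def search(index, query):
    """The algorithm (a stand-in): rank pages by how often they contain the query word."""
    counts = index.get(query.lower(), {})
    return sorted(counts, key=lambda url: (-counts[url], url))


pages = crawl("a.example/home")
index = build_index(pages)
results = search(index, "yarn")
print(results)  # ['a.example/scarves', 'a.example/hats', 'a.example/home']
```

Notice that `a.example/orphan` never shows up in the results. No page links to it, so the spider never finds it, and a page that isn't in the index can't rank for anything.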

That’s the basic overview of how a search engine works. If you’re feeling geeky and want some in-depth knowledge, read The Anatomy of a Large-Scale Hypertextual Web Search Engine, which was the original research paper written by Sergey Brin and Lawrence Page for Google while they were still at Stanford. What you need to know about how engines work, with respect to optimization, is this:

  • Spiders find pages by following links, so you’ll want to make sure they can follow every link on your site.
  • The algorithm determines the order Web pages are displayed on the search results pages, so you want to make sure your pages take advantage of that algorithm.
  • Search engines rank Web pages, not websites.

Algorithms are tightly held secrets of the major search engines, and they are updated regularly. While there are people who focus on cracking the algorithms, most people will do just fine with the known best practices for maximizing their position in the engines. In fact, Google even provides some of them on their site as webmaster guidelines. The rest of what we know comes from experience and information sharing. When I first heard about SEO, I thought it was a scam for people who didn’t know better. Then I stumbled across Webmaster World. Webmaster World is the largest online forum in the world and it’s where all of the great optimizers share their knowledge with each other. If you want more than what this post offers, I recommend turning there next.

The lowdown

SEO tactics can be broken down into two areas: on page and off page factors. On page factors are what you do with your Web pages themselves. Off page factors are the factors outside of your Web pages that influence your ranking. Most people think it’s what you do on your Web pages that makes the most difference for rankings, but that’s not true. The biggest impact on your position on a results page comes from the links pointing to your page. Take, for example, a Google search on the word “failure”. The #1 result is the actual White House bio for George W. Bush. Now, I can promise you that the White House did nothing to optimize his bio for the word failure. What happened is that a bunch of bloggers organized to put links on their blogs using the word failure for the blue text and pointed the link to the President’s bio. It’s a phenomenon called “Googlebombing”. Therefore, if you only do one thing when it comes to SEO, get links!

Content is King

After links, content is King. The only thing search engines process on a page to determine rankings is written content. In fact, Google employs artificial intelligence engineers and a grip of other really smart people to make computers that can “read” your page’s content. They can determine quite a bit about your page by the keywords that appear in the headlines, paragraphs, and more. I’ll cover more about how to write keyword friendly copy later. The most important thing you need to know about content is to have a lot of it. As I said earlier, Google ranks web pages, not websites. Therefore, the more pages you have in Google, the more traffic you’ll pull.

And the only thing search engines like more than content is fresh content. Google will visit your site a few times when the spider first comes across it. If it doesn’t see new content, it won’t come back very often. If you do regularly produce new content, then Google *will* return often, and that equals good.

Proper document structure

Having loads of fresh content is great, but to get the most mileage out of it, you need proper document structure. Proper document structure refers to correctly coded web pages. This means having complete meta data and content that is wrapped in semantically correct tags. Google values the words that appear in headlines more than the ones in paragraph copy. And, in order for Google to know what your headline is you’ll need to do more than just make it a larger font and bold. Headlines need to be wrapped in <h1>, <h2>, and the other headline tags. Paragraphs need <p> tags. And so on.
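To show why the tags matter, here is a small Python sketch that looks for headlines the way a simple program might, by reading the markup instead of the visual styling. It uses the standard library's `html.parser`. The `HeadlineCollector` class and the two sample pages are hypothetical, and this is only an illustration of the idea, not a description of how Google parses pages.

```python
from html.parser import HTMLParser

HEADLINE_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadlineCollector(HTMLParser):
    """Collect the text found inside headline tags."""

    def __init__(self):
        super().__init__()
        self.headlines = []   # list of (tag, text) pairs
        self._current = None  # headline tag currently open, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADLINE_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.headlines.append((tag, "".join(self._buffer).strip()))
            self._current = None


def find_headlines(html):
    collector = HeadlineCollector()
    collector.feed(html)
    collector.close()
    return collector.headlines


# Same visible headline, marked up two different ways.
semantic = "<h1>Knitting basics</h1><p>Start with <strong>wool yarn</strong>.</p>"
styled_only = '<p><font size="6"><b>Knitting basics</b></font></p><p>Start with wool yarn.</p>'

print(find_headlines(semantic))     # [('h1', 'Knitting basics')]
print(find_headlines(styled_only))  # []
```

Both pages look about the same to a person, but only the first one tells a program which words are the headline.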

I also highly recommend using <strong> and <em> tags in the middle of your paragraphs around keyword phrases.

Keyword density and proximity

While it’s true that search engines value keywords, gone are the days of using those keywords over and over on a single page. As I said before, Google employs artificial intelligence engineers, and search engines are widely believed to use techniques along the lines of latent semantic indexing, which means they can read your pages to determine if you are keyword spamming. One of the things search engines look for is the number of times a keyword appears on a page divided by the total number of words on the page, which is known as keyword density. The optimal percentage is between 1.5% and 3%. I wouldn’t sweat keyword density too hard unless you are in a highly competitive keyword space.
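The density arithmetic is easy to check yourself. Here is a minimal Python sketch, assuming a single-word keyword and a very simple definition of a "word" (runs of letters, digits, and apostrophes). The function name `keyword_density` and the sample text are mine, and the 1.5% to 3% range in the last line is just the rule of thumb from above.

```python
import re


def keyword_density(text, keyword):
    """Return keyword occurrences divided by total words, as a percentage."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)


# A made-up 100-word page that uses the keyword twice.
sample = " ".join(["yarn"] * 2 + ["filler"] * 98)

density = keyword_density(sample, "yarn")
print(density)                # 2.0
print(1.5 <= density <= 3.0)  # True
```

Two uses in 100 words works out to 2%, which lands inside the range.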

Keyword proximity refers to how close keywords appear to each other. Words that are next to each other are considered stronger than words that appear with other words between them. Proximity is less of an art and more of a concept to be aware of.
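If you want to put a number on proximity, one simple way is to count the words that sit between two keywords. This Python sketch does that, assuming two different keywords and text without punctuation. The `keyword_gap` function is a hypothetical helper for illustration, not a measure any search engine has published.

```python
def keyword_gap(text, first, second):
    """Return the smallest number of words between two keywords, or None if either is missing."""
    words = text.lower().split()
    first_positions = [i for i, word in enumerate(words) if word == first.lower()]
    second_positions = [i for i, word in enumerate(words) if word == second.lower()]
    if not first_positions or not second_positions:
        return None
    return min(abs(a - b) for a in first_positions for b in second_positions) - 1


print(keyword_gap("wool yarn for sale", "wool", "yarn"))    # 0
print(keyword_gap("wool and cotton yarn", "wool", "yarn"))  # 2
```

A gap of 0 means the words are right next to each other, which is the stronger case.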

Aligning keywords through links, titles, headlines, and body copy

If you are really trying to rank for a particular keyphrase, then you’ll want to align a few elements around that keyword. Alignment is the idea that a keyword appears in the anchor text of a link to a page where it appears in the title element and the headline as well as the first sentence of the first paragraph. Search engines like keyword alignment because it often means the page is a strong candidate to be about that keyword.
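Alignment is easy to check by hand, but here is a quick Python sketch of the same checklist. The `alignment_report` function and the sample shop text are made up for illustration. It only tests whether the phrase appears in each of the four places, nothing more.

```python
def alignment_report(keyword, anchor_text, title, headline, first_sentence):
    """Report, for each of the four places, whether the keyword phrase appears there."""
    key = keyword.lower()
    places = {
        "anchor text": anchor_text,
        "title": title,
        "headline": headline,
        "first sentence": first_sentence,
    }
    return {name: key in text.lower() for name, text in places.items()}


report = alignment_report(
    "wool yarn",
    anchor_text="hand-dyed wool yarn",
    title="Wool Yarn for Knitters | Example Shop",
    headline="Wool yarn, dyed by hand",
    first_sentence="Welcome to our shop.",
)
print(report["first sentence"])  # False
print(all(report.values()))      # False
```

In this example the first sentence is the weak spot, so that is the piece to rewrite.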

Themes

Themes have to do with the site structure. Google likes to break down a site to determine what it is about. Brett Tabke does a great job of explaining themes in his article Search Engine Theme Pyramids, but he’s taken it down for the moment. Essentially, you want to build your site’s information architecture such that the top level pages are more generic, and the deeper pages are more specific.

Meta data

Meta data is additional data about a Web page found in the head of the document. Search engines like pages that have accurate and complete meta data. Some of the best ones to include are:

Title element
This is one of the most important parts of your page as it not only carries lots of keyword weight, but it’s also what search engines display as the result title on the search engine results page. Use the title element as a chance to entice searchers to click on your link, not just stuff it with keywords.
Meta description
The meta description is also very important because it’s the exact text search engines display for the description of a result on the search results pages. Again, use this text for marketing, not keyword stuffing.
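To preview what a page is handing to the search engines, you can pull both values out of its markup. Here is a minimal Python sketch using the standard library's `html.parser`. The `MetaCollector` class and the sample page are hypothetical, and it assumes the description lives in a `<meta name="description">` tag in the head.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Pull the title text and the meta description out of a page."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.description = None
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content")

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            self.title = "".join(self._title_parts).strip()


page = (
    "<html><head>"
    "<title>Wool Yarn for Knitters | Example Shop</title>"
    '<meta name="description" content="Hand-dyed wool yarn, shipped in two days.">'
    "</head><body><h1>Wool yarn</h1></body></html>"
)

collector = MetaCollector()
collector.feed(page)
collector.close()
print(collector.title)        # Wool Yarn for Knitters | Example Shop
print(collector.description)  # Hand-dyed wool yarn, shipped in two days.
```

If either value comes back as `None`, the page is missing that piece of meta data.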

Inbound links

As I wrote early on, the best thing you can do is acquire links on other websites pointing to your own. While quantity is great, quality is even better. Here are some tips for determining quality:

On topic
If the site linking to yours is about the same subject, that link is worth more than a link from a site that isn’t on the same topic. For example, if you run a site about knitting, it’s better to have a link from another knitting site than a web design site.
PageRank
Google has a value that it assigns to all Web pages that ranges from 0-10, with 10 being the highest. A link from a page with a higher PageRank is worth more than a link from a page with a lower one.
Authority
Once a site reaches a high volume of content and many inbound links, it reaches what is known as authority status. Wikipedia is a great example of an authority site. Links from authority sites are worth more than non-authority site links.
.gov and .edu
Search engines, especially Google, like links from .gov and .edu better than .com and the others.
IP Addresses
Search engines will analyze how many IP addresses your inbound links come from. The more diversity you have among your links, the more it looks like a variety of people are linking to you, which search engines value.
Anchor text and surrounding copy
A link’s anchor text is the text underlined in blue. Search engines highly value those words. Additionally, they analyze the copy surrounding the text, so a contextually placed link strengthens the value of the inbound link.
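Two of these, IP diversity and anchor text, are easy to tally if you keep a list of your inbound links. Here is a rough Python sketch. The link data is invented (the addresses come from ranges reserved for documentation), and `summarize_links` is a hypothetical helper, not a tool the search engines provide.

```python
from collections import Counter

# Hypothetical inbound links: (IP address of the linking site, anchor text used).
inbound_links = [
    ("203.0.113.10", "wool yarn"),
    ("203.0.113.10", "wool yarn"),
    ("198.51.100.7", "knitting supplies"),
    ("192.0.2.44", "Wool Yarn"),
]


def summarize_links(links):
    """Count distinct linking IP addresses and tally the anchor text, ignoring case."""
    distinct_ips = {ip for ip, _ in links}
    anchors = Counter(anchor.lower() for _, anchor in links)
    return len(distinct_ips), anchors.most_common()


ip_count, anchors = summarize_links(inbound_links)
print(ip_count)  # 3
print(anchors)   # [('wool yarn', 3), ('knitting supplies', 1)]
```

Four links from three addresses tells you more than the raw link count does, since two of them come from the same place.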

Black hat SEO

There are search engine techniques that violate the terms of service set forth by the search engines. Using these techniques may produce a short term gain, but they will also lead to a site being banned, which means a search engine will manually remove your site from their index. I have never needed to use black hat techniques to rank my clients in the search engines, so for me, the risk is not worth the short-term reward. Here’s a list of things that will lead to search engine banishment:

Cloaking
Cloaking is when your site serves two versions of the same page: one to the public, and one to the search engines. Optimizers use this technique so they can make pages that are perfectly optimized, which are often not user friendly, and then serve up a user friendly page to searchers. There are rare cases where cloaking is OK, and that’s generally when a site uses text in graphics, but serves Google plain text. Cloaking is an easy technique to get banned for because Google can send out a cloaked version of their spider and then compare results.
Bad neighborhoods
Many people build links to their sites by swapping links with other sites. However, there are some sites that you don’t want to link to because Google has identified them as “bad neighborhoods”. Bad neighborhoods are sites that participate in what are known as link farms. Link farms are sites that are set up strictly to link to other sites and contain no real content of their own. You won’t be penalized if they link to you, but you can be penalized for linking to them. So, avoid trading links with sites that look really spammy.
Natural language
Google can break down the language on a page to determine if it is using natural language. If a page appears to be stuffing keywords among other random words, then they penalize that page, and potentially your whole site. Write real content because Google is smarter than you. ;)
Duplicate content
Search engines compare all of the web pages they index to determine if pages contain duplicate content. If you use the content from one web page on another one without changing it significantly, Google will ignore the other page as it considers it duplicate content.
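To get a feel for how two pages can be compared for duplicate content, here is a Python sketch using a common textbook approach: break each page into overlapping three-word chunks (often called shingles) and measure how many chunks the two pages share. I'm not claiming this is the method Google uses. The function names, the chunk size, and the sample sentences are all my own assumptions.

```python
def shingles(text, size=3):
    """Return the set of overlapping word chunks of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Shared chunks divided by all chunks (Jaccard similarity), from 0.0 to 1.0."""
    chunks_a, chunks_b = shingles(a, size), shingles(b, size)
    if not chunks_a or not chunks_b:
        return 0.0  # at least one text is too short to compare
    return len(chunks_a & chunks_b) / len(chunks_a | chunks_b)


original = "knitting a scarf is a great first project for beginners"
near_copy = "knitting a scarf is a great first project for everyone"
unrelated = "wool yarn comes in many colors and weights"

print(round(similarity(original, near_copy), 2))  # 0.78
print(similarity(original, unrelated))            # 0.0
```

Changing one word at the end still leaves the two sentences sharing most of their chunks, which is why a light edit isn't enough to make copied content look original.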

More SEO resources

If you want to learn more about SEO, I recommend this SEO 101 article.

What say you about all of this?
