URL Structure
- SEO-friendly URLs: don’t use dynamic URLs like:
www.example.com/?p=586544
- Use semantic URLs with meaningful names:
www.example.com/topic-name
- Use hyphens to separate words in URLs (don’t use underscores).
Content Best Practices
- Write for humans first, search engines second.
- Write only high-quality content.
- Encourage other trustworthy sites to link to you.
- Remove duplicate content from your website.
- Link to other websites with relevant content.
- Avoid tabbed content (content hidden inside tabs may be given less priority by Google).
- Check each page with a Googlebot simulator to confirm that Google can render it.
- Avoid Flash content and iframes.
- Include breadcrumbs on all pages (see the breadcrumb markup sketch after this list).
- Validate each page with the W3C validator (https://validator.w3.org) to catch HTML and CSS issues.
- Make sure you’re not blocking any content or sections of your website if you want it to be crawled.
- Links on pages blocked by robots.txt will not be followed. This means:
- Unless they’re also linked from other search engine-accessible pages (i.e. pages not blocked via robots.txt, meta robots, or otherwise), the linked resources will not be crawled and may not be indexed.
- No link equity can be passed from the blocked page to the link destination. If you have pages to which you want equity to be passed, use a different blocking mechanism other than robots.txt.
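A minimal breadcrumb sketch in plain HTML, referenced from the breadcrumb item above; the page names and paths are placeholders, not taken from this article:
<!-- Breadcrumb trail: placeholder pages for illustration -->
<nav aria-label="Breadcrumb">
  <ol>
    <li><a href="/">Home</a></li>
    <li><a href="/widgets/">Widgets</a></li>
    <li aria-current="page">Green Widgets</li>
  </ol>
</nav>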
Page Best Practices
Title tags
- The Title tag should be written like this: Primary Keyword – Secondary Keyword | Brand Name.
- Use a dash in between your keyword phrases and a pipe at the end before your brand name.
- Avoid duplicate title tags.
- Keep title tags to 50-60 characters in length, including spaces.
Example (Primary Keyword - Secondary Keyword | Brand Name):
8-foot Green Widgets - Widgets & Tools | Widget World
- Give every page a unique title.
Don’t overdo SEO keywords
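In the page source, the pattern above can look like this (reusing the widget example from this section):
<title>8-foot Green Widgets - Widgets &amp; Tools | Widget World</title>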
Meta Description
- While not as important for search engine rankings, meta descriptions are extremely important for getting users to click through from the search engine results page (SERP) to your website.
- 150 to 160 characters is the recommended length
- Avoid duplicate meta descriptions
- Do not use quotes or other non-alphanumeric characters (Google cuts them out of the meta description).
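As a sketch, the meta description goes in the page’s <head>; the wording below is invented for illustration:
<meta name="description" content="Shop 8-foot green widgets and widget tools at Widget World, with fast shipping and easy returns.">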
Meta keywords: Meta keywords used to be popular back in the day. However, nowadays Google (and every other major search engine) gives no weight to meta keywords, so go ahead and skip this.
Speed up your website for better rankings.
Code Best Practices
Here is the hierarchy of header tags:
- <h1> – usually reserved for webpage titles.
- <h2> – highlights the topic of the title.
- <h3> – reflects points in regards to the topic.
- <h4> – supports points from <h3>.
- <h5> – not often used, but great for supporting points of <h4>.
Use only one H1 tag per page
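A minimal sketch of that hierarchy on a product-style page; the heading text is invented for illustration:
<h1>8-foot Green Widgets</h1>          <!-- only one <h1> per page -->
<h2>Why choose a green widget?</h2>
<h3>Durability</h3>
<h4>Weather resistance</h4>
<h2>Shipping and returns</h2>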
To design tables, use <td> and <th>, not <div>.
For table headers, use only the <th> tag (don’t use <td>).
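A minimal table sketch following those rules; the widget data is invented for illustration:
<table>
  <tr>
    <th>Widget</th>                  <!-- <th> for header cells -->
    <th>Price</th>
  </tr>
  <tr>
    <td>8-foot Green Widget</td>     <!-- <td> for data cells -->
    <td>$25</td>
  </tr>
</table>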
Images
- Name all of your images in a way that describes what they are
- Use dashes between the words, rather than underscores ( purple-hat.jpg rather than purple_hat.jpg).
- Do not use non-alpha characters in your image or file names (so no %, &, $, etc…)
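As a sketch, the hat file name above used in an image tag; the alt text is an addition of this example, since descriptive alt text pairs naturally with descriptive file names:
<img src="purple-hat.jpg" alt="Purple hat">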
Use readable and meaningful URLs only.
Tip
Readable URLs
https://www.entrepreneur.com/article/social-media-marketing-steps
Warning
Stay away from URLs like the following unless the page is not indexed.
https://www.entrepreneur.com/article/272531
Use the Structured Data Testing Tool to validate markup such as FAQs (see the sketch below).
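A minimal FAQ structured data sketch using schema.org’s FAQPage type; the question and answer text are placeholders:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "How tall are your green widgets?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Our standard green widgets are 8 feet tall."
    }
  }]
}
</script>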
Robots
What is robots.txt?
Robots.txt is a text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website. The robots.txt file is part of the robots exclusion protocol (REP), a group of web standards that regulate how robots crawl the web, access and index content, and serve that content up to users. The REP also includes directives like meta robots, as well as page-, subdirectory-, or site-wide instructions for how search engines should treat links (such as “follow” or “nofollow”).
In practice, robots.txt files indicate whether certain user agents (web-crawling software) can or cannot crawl parts of a website. These crawl instructions are specified by “disallowing” or “allowing” the behavior of certain (or all) user agents.
Basic Format
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
Technicals
Robots.txt syntax can be thought of as the “language” of robots.txt files. There are five common terms you’re likely to come across in a robots file. They include:
- User-agent: The specific web crawler to which you’re giving crawl instructions (usually a search engine).
- Disallow: The command used to tell a user-agent not to crawl a particular URL. Only one “Disallow:” line is allowed for each URL.
- Allow (Only applicable for Googlebot): The command to tell Googlebot it can access a page or subfolder even though its parent page or subfolder may be disallowed.
- Crawl-delay: How many seconds a crawler should wait before loading and crawling page content. Note that Googlebot does not acknowledge this command, but the crawl rate can be set in Google Search Console.
- Sitemap: Used to call out the location of any XML sitemap(s) associated with this URL. Note this command is only supported by Google, Ask, Bing, and Yahoo.
Examples
Block all web crawlers from all content
User-agent: *
Disallow: /
Allowing all web crawlers access to all content
User-agent: *
Disallow:
Blocking a specific web crawler from a specific folder
User-agent: Googlebot
Disallow: /example-subfolder/
Tip
This syntax tells only Google’s crawler (user-agent name Googlebot) not to crawl any pages that contain the URL string www.example.com/example-subfolder/.
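Putting the directives above together, a sketch of a fuller robots.txt; the folder, page, and sitemap URL are placeholders:
User-agent: Googlebot
Disallow: /example-subfolder/
Allow: /example-subfolder/public-page.html

User-agent: *
Crawl-delay: 10

Sitemap: https://www.example.com/sitemap.xml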
Other quick robots.txt must-knows:
- In order to be found, a robots.txt file must be placed in a website’s top-level directory.
- Robots.txt is case sensitive: the file must be named “robots.txt” (not Robots.txt, robots.TXT, or otherwise).
Tip
A search engine will cache the robots.txt contents, but usually updates the cached contents at least once a day. If you change the file and want it picked up more quickly, you can submit your robots.txt URL to Google.
Hope you found this article useful. If so hit 👍🏻 and spread the word!