“Content is king”. Sound oh so familiar?
It’s not necessarily wrong, but content is only one piece of the puzzle of any SEO campaign.
Without good website health, your content will struggle to rank. It’s why technical SEO remains one of the key pillars of any SEO campaign.
We know all too well how complex technical SEO can be. So, we’ve created a simple guide to explain all of the key components of a technical SEO campaign. There’s no time like the present, so let’s dive straight in!
What we’ll be discussing in this guide:
- What is technical SEO?
- Why is it important to focus on technical SEO?
- The cost of ignoring technical SEO
- Website loading speed & user experience
- Understanding how search engines index your website
- Understanding how duplicate content impacts your website
- Website architecture
- The benefits of structured data
- HTTP response codes
- XML sitemaps
- Robots.txt
- Website security
What is technical SEO?
Technical SEO is about ensuring the overall health of a website. A site needs to be technically sound to satisfy Google’s guidelines. All effective marketing strategies are built upon the foundations of technical SEO.
To analyse the technical health of a website, it’s crucial to know what to look for so that any necessary changes can be implemented. Generally, the following factors need to be highlighted:
- Discover the elements of a website hampered by SEO-related issues. Inspect issues like broken links, pages without meta descriptions or title tags, and missing pages.
- Conduct regular health checks on a website to spot and fix any issues.
- Check if any pages on a website are being penalised. If so, why?
Why is it important to focus on technical SEO?
As algorithms become more advanced, search and SEO have evolved. For Google to rank pages, the website needs to be visible. This is where technical SEO becomes a valuable commodity. It helps improve the usability of a website so that Google can crawl and index pages effectively.
Technical SEO helps maximise a site’s crawl efficiency and reduces the risk of any pesky penalties from Google. As Google’s updates are tailored to prioritise users, it’s key that the website provides an optimum user experience. Users want rapid access to valuable information as they search, so optimising the website experience will help push a site further up the SERPs.
The cost of ignoring technical SEO
Visibility for Google = visibility for your website. The two go hand in hand, but without carrying out due diligence to check, improving factors such as user experience and search engine rankings will be a real struggle.
Google penalises websites with poor usability, causing them to drop in rankings and become less visible to users. It’s a spiral all site owners and marketers alike want to avoid. Without visibility, search rankings plummet and conversion rates start to dry up – this is when the impact on a business is compounded.
Neglecting technical SEO can have severe consequences for websites and businesses, so it’s crucial to consider the positive impact technical SEO has to ensure a strategy can thrive. With that in mind, it’s time to explore technical SEO’s intricacies in greater detail.
User experience & website loading speed
As Google algorithms are rolled out each year, one thing remains true. The end goal is to provide a great user experience. And poor website loading speed = poor user experience.
In June 2021, Google began rolling out its Page Experience Algorithm Update, which focuses on creating an optimal browsing experience for users. This algorithm assesses:
- Mobile-friendliness
- Website security
- And yes, you guessed it… website loading speed
Did you know the average mobile user will wait as little as 3 seconds for a page to load before leaving the website? So, you could be losing money every time someone visits your website if your loading speed isn’t in check.
So, how does poor loading speed impact your SEO visibility?
Poor loading speed impacts your visibility in the following ways:
- It can waste your crawl budget and prevent key pages from being crawled and indexed.
- It can result in Google penalising your website, resulting in ranking drops.
So, now you know the importance of user experience and website loading speed. You’ll probably be wondering how to spot and fix any issues. The next part of our guide will break down the key metrics and common website loading speed issues that can impact a website’s user experience.
Core Web Vitals
Following the Page Experience Update, Google released a set of metrics called Core Web Vitals, which measure three core components of a website – page loading speed, responsiveness and visual stability.
These metrics provide a clear insight into the experience users have when clicking through to your website. Google analyses the metrics for mobile and desktop and concludes how the page performs.
- FID (First Input Delay) – The time from when a user first interacts with a page (clicking a link or tapping a button, for example) to when the browser can actually respond to that interaction. Aim for 100 milliseconds or less.
- LCP (Largest Contentful Paint) – The time taken for the largest visual element on the page to load. Aim for 2.5 seconds or less. You can improve LCP by lazy loading images further down the page so bandwidth is spent on the largest element first.
- CLS (Cumulative Layout Shift) – A user-centric metric that measures how much the page layout unexpectedly shifts as it loads. A CLS score of 0.1 or less is the goal. To eradicate sudden shifts, set explicit width and height dimensions on images and graphics so the user’s browser knows how much space to reserve (see the sketch below).
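To make this more concrete, here’s a minimal sketch in HTML (the image paths and dimensions are hypothetical) showing the two fixes mentioned above: explicit width and height attributes to reserve layout space for CLS, and native lazy loading for images further down the page.

```html
<!-- Explicit dimensions let the browser reserve space, avoiding layout shifts (CLS) -->
<img src="/images/hero-banner.webp" alt="Hero banner" width="1200" height="600">

<!-- Native lazy loading defers below-the-fold images so bandwidth goes to the
     largest visible element first (helps LCP) -->
<img src="/images/gallery-item.webp" alt="Gallery item" width="400" height="300" loading="lazy">
```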
Google Search Console provides a breakdown of how each page on your website performs against these Core Web Vital metrics under the ‘Experience’ tab. This data represents how your website has performed in the previous 28 days. Optimising your website’s loading speed will show changes within the next 28 days on Google Search Console. But if you can’t wait that long, you can test single-page performance using Google’s Page Speed Insights.
For more information, read our guide on why you should pay attention to Core Web Vitals.
Any key website pages recorded as ‘poor’ in this report should be assessed as a priority. Here are some common issues that can cause the FID and LCP reports to fail:
Images
Believe it or not, images substantially impact the speed of a web page. Images not optimised (compressed) properly contribute to slower load speeds, particularly from a mobile device.
Large, high-resolution images consume a lot of bandwidth when loading, which causes your website to slow down. If your website has many images that haven’t been optimised, this could cost you greatly.
Some ways to fix these issues
- Compress images
- Serve images from a Content Delivery Network (CDN)
- Serve images (i.e. PNGs and JPEGs) in next-generation formats (such as WebP, JPEG 2000 and JPEG XR)
- Lazy load images (great for e-commerce websites with lots of product images on a category page) – see the sketch after this list
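As a rough illustration of the last three points (the CDN domain and file paths are hypothetical), a picture element lets the browser use a next-generation format where it’s supported and fall back to a standard JPEG elsewhere, all served from a CDN and lazy loaded:

```html
<picture>
  <!-- Next-generation format served from a CDN for browsers that support it -->
  <source srcset="https://cdn.example.com/images/product-hero.webp" type="image/webp">
  <!-- JPEG fallback for everything else -->
  <img src="https://cdn.example.com/images/product-hero.jpg"
       alt="Product hero"
       width="800" height="600"
       loading="lazy">
</picture>
```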
Minifying code
The minification process removes unnecessary characters from source code to reduce file sizes. Minifying HTML, CSS and JavaScript means stripping out whitespace, comments and other redundant characters, condensing file sizes whilst sustaining functionality (see the sketch after this list).
Benefits of minification include:
- Smaller file sizes that reduce load times
- Optimised performance for an improved user experience
- Faster page speeds that keep users on your site for longer
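As a simple illustration of what minification actually does, here’s a small hypothetical HTML snippet before and after:

```html
<!-- Before minification: indentation, line breaks and comments all add bytes -->
<div class="product-card">
    <!-- product title -->
    <h2>Product name</h2>
    <p>Product description goes here.</p>
</div>

<!-- After minification: identical markup, condensed into fewer characters -->
<div class="product-card"><h2>Product name</h2><p>Product description goes here.</p></div>
```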
JavaScript and render-blocking
Although JavaScript is present on almost every website, it is resource-heavy to load, which impacts a site’s usability. JavaScript and CSS add that dynamic sprinkle across websites in the form of interactivity, styling and animation. But what does this mean for the technical health of a website? Well, inundating a site with lots of JavaScript files can make it difficult for Google to render and understand all that juicy content on a web page.
Before a user is presented with a web page, the browser loads the page and parses the HTML. A link to a CSS or JavaScript file causes the browser to pause HTML parsing to fetch and run that code, which slows down the entire process. These resources are referred to as render-blocking, because they hold the page up and prevent fast loading speeds – one common workaround is sketched below.
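One common way to ease this (a sketch, assuming the scripts aren’t needed to render the initial view; the file names are hypothetical) is to load JavaScript with the defer or async attributes so it no longer blocks HTML parsing:

```html
<head>
  <!-- Critical CSS is kept small and loaded normally so the page can render -->
  <link rel="stylesheet" href="/css/critical.css">

  <!-- defer: downloads in parallel, executes only after HTML parsing finishes -->
  <script src="/js/main.js" defer></script>

  <!-- async: downloads in parallel, executes as soon as it arrives
       (suited to independent scripts such as analytics) -->
  <script src="/js/analytics.js" async></script>
</head>
```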
Google’s PageSpeed Insights is great for checking the impact of render-blocking resources on page load speeds. It flags any issues that crop up so a site can be optimised for faster page loads and better overall performance.
Understanding how search engines index your website
Ensuring URLs are indexed correctly is an absolute must to preserve and enhance the technical SEO of a website. Indexation is the process by which Google crawls a site’s URLs and adds them to its index so they can appear in search results. URLs should be presented in a clear format for Google to crawl and render. After all, if Google can’t crawl or render URLs properly, that impacts visibility and damages the health of a website.
The best way to check this is to use Google Search Console and inspect all the URLs of a website. Health checks are the most effective way to assess indexation issues:
- Submitted and indexed – This checks that the number of pages submitted and indexed matches the sitemap. If there is an anomaly, further exploration into what pages aren’t indexed and the reasons why is needed.
- Indexed not submitted in sitemap – This figure demonstrates the number of indexed pages not currently in a sitemap. There may be a good reason for this, but thoroughly checking all pages and discovering why is necessary.
- Crawl anomaly – This means Google hasn’t been able to request these URLs. The protocol here is to check for any excluded URLs that should be indexed and identify any clear issues.
- Crawled, not currently indexed – Some site owners purposely ensure specific URLs aren’t indexed, as the page may be of low quality or contain similar content to another already indexed page. Reviewing these URLs will give a clearer picture of any that should be indexed.
- Discovered, currently not indexed – An inspection of this may lead to discovering URLs that need indexing. Requesting them to be indexed quickly will minimise the impact on your site.
- Excluded by no index tag – A no-index tag may be applied to a URL that doesn’t offer any value to the site. Ensuring no URLs have slipped through the net and need indexing is vital.
- Dropped from index – If a URL has dropped from the index, Google can’t detect the page. Checking the impact this may have on any quality backlinks is crucial so they can be redirected to another URL if necessary.
We’ve also created a helpful guide on how to make sure that your website is being crawled properly.
Understanding how duplicate content impacts your website
If more than one page on your website has exactly the same content, you could have a duplicate content issue. It’s a common misconception that Google will penalise your website for duplicate content – there’s no specific penalty – but that doesn’t mean it’s harmless.
The implications of duplicate content
So, how harmful is duplicate content for a website? Simply put, pages with duplicate content compete for the same terms, which often means neither can rank as well as it should.
Our in-depth duplicate content guide explains the common causes of duplicate content, how to spot them, and how to fix the issues.
Canonicalisation
For SEOs, canonicalisation is a valuable method to improve site structure and ensure visibility for Google, and it’s a popular tactic for combatting the adverse effects of duplicate content. The process involves selecting one URL as the main version of a page for ranking.
Setting a URL as the canonical tells Google which version should be prioritised and recognised in its index. Setting canonical URLs is a simple process:
- Choosing the page – Depending on the content, which URL needs to be canonicalised may be apparent. If it is harder to decide, then use traffic rates as an indicator for prioritisation.
- Adding a rel=canonical link – To notify Google which page is the canonical, a rel=canonical link element needs to be added to the <head> of the non-canonical page. Alternatively, the Yoast plugin can add a suitable canonical tag for a WordPress site, which is a nice little convenient feature!
Be wary when handling separate mobile subdomains: the desktop page needs an additional rel=alternate link pointing to the mobile version, while the mobile page points its rel=canonical back to the desktop URL, so Google understands the relationship between the two (see the sketch below).
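Here’s a minimal sketch of both scenarios, using hypothetical example.com URLs – each tag sits in the <head> of the relevant page:

```html
<!-- On the duplicate (non-canonical) page: point to the preferred URL -->
<link rel="canonical" href="https://www.example.com/preferred-page/">

<!-- Separate mobile URLs: the desktop page declares its mobile alternate... -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="https://m.example.com/preferred-page/">

<!-- ...while the mobile page points its canonical back to the desktop version -->
<link rel="canonical" href="https://www.example.com/preferred-page/">
```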
Looking at canonical URLs is essential to aid the technical health of a website. A thorough check using Screaming Frog and Google Search Console should analyse the following elements for canonicalisation:
- Alternate page with proper canonical tag – Indicates the page correctly points to a canonical.
- Duplicate, Google chose a different canonical than user – An occurrence where Google chooses a different canonical. It generally means there is an issue with duplicate content, which may cause indexation issues.
- Duplicate, submitted URL not selected as canonical – Similar to the previous status: the submitted URL hasn’t been chosen as the canonical because Google considers it a duplicate of another URL.
- Duplicate without user-selected canonical – Highlights any duplicates that haven’t been marked explicitly with a canonical. Quickly addressing these helps minimise the impact on a website.
How to fix duplicate content
Notifying Google of any canonical changes is crucial. Without explicitly communicating to Google any changes, there isn’t likely to be a dramatic overnight surge in rankings. Creating duplicate content should be avoided, but redirecting duplicate content to the canonical URL is the next best option if it can’t be. Adding a canonical link element to a duplicate page or an HTML link from a duplicate page to a canonical page are also effective solutions.
Aside from canonicals, other methods can be used to minimise duplicate content:
- 301 Redirects – These are response codes that inform Google a page has permanently moved to a new URL. They effectively condense duplicates to original URLs and combine multiple duplicates into one page.
- Noindex tags – A noindex tag tells Google not to index a page, so it doesn’t show in search results. The tag can be added to the <head> of a webpage, or sent as an X-Robots-Tag noindex header in the HTTP response (see the sketch after this list). The pages must remain crawlable so Google can see the tag.
- URL parameter tools – Google’s URL parameter tool gives websites with over 1,000 URLs control over how Google treats a site’s parameters.
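For reference, here’s a minimal sketch of the two noindex approaches mentioned above (the header version is shown as a comment because it’s configured on the server, not in the HTML):

```html
<!-- Option 1: a robots meta tag in the page's <head> -->
<meta name="robots" content="noindex">

<!-- Option 2: an X-Robots-Tag sent in the HTTP response headers,
     set on the server rather than in the page itself:
     X-Robots-Tag: noindex -->
```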
Duplicate content is a common issue that can be found on most websites. The key is highlighting it by conducting frequent technical checks and dealing with it swiftly to minimise the impact on rankings and visibility.
Website architecture
Imagine entering a supermarket without any signs indicating where specific food items are. If you know the store’s layout, it’s probably not too much of an issue. But somebody walking into the store for the first time will likely shop somewhere else next time.
Applying this analogy to websites is how site architecture should be viewed. Time is precious, and users are likely to click away from a site if the structure is jumbled and the information they want is hard to reach. To retain users and boost conversions, a healthy website structure is necessary.
A good website structure helps keep bounce rates low, traffic healthy, and engagement rates high. It aids rankings and propels a website further up the SERPs.
Site architecture is important for the following elements:
- It enables Google to crawl websites effectively.
- A strong internal linking structure helps build topical authority.
- A clear hierarchy of pages improves site navigation.
- Google can identify pages easily, so content can generate leads and increase conversions.
- It distributes page authority more evenly across the site.
How to improve site architecture for SEO
To ensure the architecture of a website is optimised for SEO, always consider the following factors:
Website consistency – Links, designs, and formatting should all resemble each other, maximising the potential for users to remain on a page.
Internal linking – Develop pillar content with cluster pages containing related topics that internally link. It strengthens linkage across a website and helps users discover helpful content related to the subject matter.
URL structure – URLs should be user-friendly and hierarchical in structure (see the sketch below). It elevates the user experience and aids Google when crawling.
Simplified navigation – Minimising top-level menu options keeps a website streamlined, so it’s easier to develop specific content for users to view. It also helps ease the process when creating navigation paths.
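As a rough illustration of hierarchical URLs and internal linking working together (the categories and paths are hypothetical), a simple breadcrumb trail mirrors the site’s structure and links each level back up the hierarchy:

```html
<nav aria-label="Breadcrumb">
  <ol>
    <li><a href="/">Home</a></li>
    <li><a href="/menswear/">Menswear</a></li>
    <li><a href="/menswear/jackets/">Jackets</a></li>
    <li>Waterproof jacket</li> <!-- current page: /menswear/jackets/waterproof-jacket/ -->
  </ol>
</nav>
```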
The benefits of structured data
Structured data is code added to a web page that describes its content in a format search engines can read. It’s written using Schema markup (a shared vocabulary that Google uses to decipher the content on a web page). By translating a page’s content into machine-readable code, structured data makes it easy for Google to detect and display rich results on the SERPs where the content is detailed enough. The main benefit is it makes life simpler for Google to understand and rank pages.
As Google advances and Search becomes more technologically advanced, the need for structured data becomes more apparent. It helps crawlers understand the content on a page to serve the SERPs with better content.
How to audit structured data
If schema is added to a website, then it’s best to check how it is performing, as this can affect the health of a website. Google Search Console’s Enhancements tab helps site owners identify the schema markup present across the website (e.g. Logos, Products, FAQs).
Each schema type displays as a chart of valid and invalid items, so any broken markup is easy to spot. All valid URLs are eligible for Google’s rich results, meaning more potential for visibility and rankings.
Types of structured data
Several structured data types can be applied to a website to increase the potential for rich results. These include:
- Breadcrumbs – Helps Google and users understand a page’s position within a site’s hierarchy.
- FAQs – Mark up FAQ pages or sections on a web page to increase the possibility of appearing in ‘People also ask’ or with an FAQ rich result on the SERPs (see the sketch after this list).
- Sitelinks searchbox – Adding this structured data enables Google to show a search box for your site directly in the SERP if it recognises value for users.
- How-To – Marks up step-by-step instructions so they can appear as a rich result when users search for ‘how-to’ content on the SERPs.
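For illustration, here’s a minimal sketch of FAQ markup added to a page with JSON-LD (the question and answer text is hypothetical):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is technical SEO?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Technical SEO is the practice of improving a website's technical health so search engines can crawl and index it effectively."
    }
  }]
}
</script>
```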
How to add structured data to your website
Google’s Structured Data Markup Helper is the best way to add structured data to a website. Navigate to the webpage you want to add schema to and highlight the appropriate areas to add data tags. An option to create the HTML will appear, and the schema markup will then be generated. It appears on the right-hand side of the page, ready to be published. Copy and paste the markup into your CMS or the source code of the web page and click ‘Finish’ to complete the process.
Google’s structured data testing tool is adept at diagnosing any issues with HTML markup. We recommend downloading the Structured Data Testing Chrome plugin as it is suitable for use with websites in their development stage to check all structured data is clean before launching.
A handy tip is to use tracking software like getSTAT to highlight potential opportunities for structured data markup. It shows priority keywords with SERP features like popular products or FAQs to increase the scope for appearing in Google’s rich results.
HTTP response codes
When a web browser attempts to connect to a server, an HTTP response code is returned. The nature of the request determines which response code is generated. They can easily be mistaken for errors, but not all response codes mean there is an error. 200s are success codes that require no further action, 300s indicate redirections, 400s are client errors, and 500s are server errors. Let’s dive into some of the most common response codes and the best solutions to fix them:
- 301 – A webpage has moved permanently to another URL. Check the redirect setup is correct.
- 302 – A URL has been temporarily redirected to a different URL. If the move is actually permanent, swap it for a 301 redirect (a WordPress redirect plugin can handle this).
- 401 – Unauthenticated request received by the server, so the page doesn’t load. The URL may be incorrect. Alternatively, clearing the cache and cookies may solve the issue.
- 404 Not Found – The server cannot find the page the user has requested. Fix this by amending the broken link, or by setting up a redirect to a relevant alternative URL.
- 500 – A concerning sign for technical health, the 500 code indicates a generic internal server error. Something has gone wrong on the server itself and will need investigating and fixing.
- 502 Bad Gateway – Returned when one server receives an invalid response from another. Clearing the browser cache, or simply waiting a couple of days if a site has newly migrated, are common fixes.
Finding HTTP response codes
Use Screaming Frog to crawl the URL of a website and inspect the response codes within the internal links tab. The right-hand side tab will exhibit the response codes from the full range of status code classes.
XML Sitemaps
An XML sitemap is a file on your website that provides information to search engines about pages, blogs and videos ready to be indexed. If managed correctly, this handy file is a surefire way for Google (and other search engines) to find your website’s priority pages easily.
Submitting your sitemap to Google Search Console or Bing Webmaster Tools ensures that search engine crawlers can find and index your key pages.
Once submitted, sitemaps display under the ‘Sitemaps’ tab on the main dashboard. When conducting a health check, this helps to identify any anomalies, such as URLs that are submitted in the sitemap but not indexed.
Signposting your sitemap in your robots.txt file ensures that search engine crawlers can easily find it as they enter your website, which should look like this:
Sitemap: https://candidsky.com/sitemap_index.xml
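The sitemap file itself is simply an XML list of your URLs with some optional metadata. As a rough sketch (the URLs and dates here are hypothetical), a basic sitemap looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2023-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/services/technical-seo/</loc>
    <lastmod>2023-01-10</lastmod>
  </url>
</urlset>
```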
Common XML sitemap errors and solutions
Submitted URL has a crawl issue
This is Search Console’s catch-all error for a submitted URL it couldn’t crawl without specifying why. Loading the page in a browser can often provide the answer, as the page may load slowly, contain too many redirects or return an error response code. Once fixed, submit a request to Google for the page to be reindexed.
Submitted URL not found (404)
Any 404s should be removed from the sitemap and redirected to an alternative page on the website. Screaming Frog is the tool to use to discover the crawl status of all pages in a sitemap to ensure thorough error checking. Resubmit the sitemap to Google once these issues are fixed.
Submitted URL is a soft 404
A soft 404 means the page looks empty or missing to the user, but it returns a 200 response code, so Google treats it as a live page. To resolve this, either add proper content to the page or create a 301 redirect to send the user to an existing page.
Submitted URL returns unauthorized request (401)
Only users with login access can reach this type of page. Remove the URL from the sitemap, along with any internal links that point to it.
Submitted URL marked ‘noindex’
The ‘noindex’ tag allows site owners to prevent page indexing. Review the pages in Screaming Frog, then either remove the noindex tag or remove the pages from the sitemap.
Submitted URL not selected as canonical
Site owners should specify the canonical URL they want included in the sitemap to ensure Google knows which URL to index. If the sitemap lists a different version to the primary page, check the canonical tags in the source code.
Robots.txt
A robots.txt file tells search engine crawlers which pages of a website they can and cannot access. It is an important file for site owners, as it controls which content crawlers are allowed to fetch and, by extension, what can be surfaced to visitors searching on Google. The robots.txt file is one of the first files a crawler requests, containing the requisite information so Google knows how it is allowed to crawl the website.
There are several different robots.txt directives, each with a different purpose. Let’s have a look at the main ones, using the Candidsky website as an example URL.
Robots.txt file URL: www.candidsky.com/robots.txt
User-agent: * Disallow: / – Tells all crawlers not to crawl any pages on www.candidsky.com
User-agent: * Disallow: – Tells all crawlers they may crawl all pages on www.candidsky.com
User-agent: Googlebot Disallow: /candidsky-subfolder/ – Tells Google’s crawler not to crawl any pages within this subfolder.
However, the robots.txt file isn’t one-dimensional and offers more than crawling access. It can be deployed to prevent duplicate content from appearing in SERPs, prevent specific elements on a page from being crawled and specify the location of a sitemap. It also comes in handy for preventing staging sites from being crawled. All these elements help boost technical SEO – the sketch below shows how a few of them can sit together in one file.
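As a rough sketch (the disallowed paths are hypothetical), a single robots.txt file might combine several of these jobs:

```
# Hypothetical robots.txt combining common directives
User-agent: *
Disallow: /staging/      # keep a staging area out of crawlers' reach
Disallow: /checkout/     # stop thin checkout URLs being crawled

# Point crawlers at the XML sitemap
Sitemap: https://candidsky.com/sitemap_index.xml
```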
Mistakes to avoid
Some pitfalls can be costly if implemented, so it’s always best practice to check for the following items:
- Ensure all content that needs to be visible isn’t blocked to crawlers.
- Pages blocked by a robots.txt file won’t be crawled, so links on those pages won’t be followed, which means no link equity will be passed through.
- Any sensitive data should be blocked using an alternative means (e.g. a password or a noindex tag), as it may still be indexed.
Website security
A website should always be secured to protect it and its visitors from harm. It’s also a big factor in Google’s page experience ranking signals. Strong website security profoundly impacts SEO performance, so it’s important to have airtight measures in place. Beefing up security can strengthen ranking signals, improve user trust, and keep connections across the website secure.
HTTP v HTTPS
The key difference between the two is that HTTPS uses SSL encryption and verification to secure HTTP requests and responses. Any potential attacker monitoring a website’s traffic would see a random string of characters instead of the actual text. For this reason, HTTPS offers much greater security than HTTP.
Websites that don’t use HTTPS are more vulnerable to security breaches, as anybody monitoring the session can read all requests and responses. HTTP doesn’t ask for verification – it relies on a degree of trust – whereas with HTTPS the server uses an SSL certificate to prove its legitimacy as the website host. Aside from SSL certification, enabling the HTTP/3 protocol, reducing access over unsecured HTTP connections, or upgrading to TLS 1.3 are additional ways to enhance website security.
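As one small related measure (a sketch, assuming the site already serves its pages over HTTPS), a Content-Security-Policy directive can ask browsers to upgrade any remaining insecure subresource requests:

```html
<!-- Ask the browser to upgrade insecure (http://) subresource requests to HTTPS,
     helping avoid mixed-content issues after a migration -->
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
```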
Conclusion
There are so many facets to maintaining the technical health of your website for SEO, and a deluge of information can be hard to digest. Each strand of SEO matters, and we hope our extensive guide has covered exactly why.
We understand there are lots of different strands to maintain and get to grips with, and that’s why the assistance of an award-winning SEO agency like Candidsky can be a great asset to help grow your business.
Our team of SEO experts will ensure the technical health of your website elevates your business. Want to find out more about why we’re regarded as one of the best SEO agencies in Manchester? Then contact us with any queries to discuss how we can help you.