This article describes what log files are, why they are important, what to be aware of, and which tools to use. Finally, it covers nine practical ways you can analyze them for SEO.
What is a server log file?
A server log is a log file (or files) that is automatically created and maintained by a server and consists of a list of activities the server performed.
For SEO purposes, we're interested in web server logs, which contain a history of page requests to a website from both humans and robots. This is sometimes called the access log, and the raw data looks like this:
[Image: raw access log data]
Yes, the data looks a bit overwhelming and confusing at first, so let's break it down and take a closer look at the "hits".
All servers typically provide similar information organized into fields, although exactly how hits are logged differs from server to server.
Below is a sample hit from an Apache web server (simplified; some fields have been removed).
220.127.116.11 - - [01/Mar/2018:12:21:17 +0100] "GET /wp-content/themes/esp/help.php" 404 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" www.example.com
As you can see, each hit provides important information such as the date and time, the response code for the requested URI (in this case 404), and the user agent the request came from (in this case Googlebot). As you can imagine, log files consist of thousands of hits every day: every time a user or bot visits your site, many hits are recorded, one for each resource requested, such as images, CSS, and other files needed to render the page.
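To make the fields concrete, here is a minimal sketch of pulling them out of a hit with a regular expression. The field order (IP, timestamp, request, status code) is assumed from the simplified Apache sample above; real log formats vary, so the pattern would need adjusting to your own configuration.

```python
import re

# Regex for a simplified Apache-style access-log hit, matching the sample above.
# The field order is an assumption based on that sample, not a universal format.
HIT_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '            # client IP, identd, user
    r'\[(?P<timestamp>[^\]]+)\] '      # e.g. 01/Mar/2018:12:21:17 +0100
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '  # request line
    r'(?P<status>\d{3})'               # response code
)

def parse_hit(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = HIT_PATTERN.search(line)
    return match.groupdict() if match else None

hit = parse_hit(
    '220.127.116.11 - - [01/Mar/2018:12:21:17 +0100] '
    '"GET /wp-content/themes/esp/help.php" 404 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
print(hit["status"], hit["path"])  # -> 404 /wp-content/themes/esp/help.php
```

In practice you would run every line of the log through a parser like this and load the results into a spreadsheet, database, or analysis tool.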
Why are they important?
So you know what log files are, but why is it worth analyzing them?
There is only one real record of how search engines such as Googlebot process your website: your server log files.
Search consoles, third-party crawlers, and search operators don't give us a complete picture of how Googlebot and other search engines interact with your website. Only access log files can provide this information.
How do I use log file analysis for SEO?
Analyzing log files gives you a great deal of useful insight, including the ability to:
See exactly what search engines can and cannot crawl.
Check the responses search engines encountered during the crawl (e.g. 302, 404, soft 404).
Identify crawl shortcomings that may have broader site-wide causes (such as hierarchy or internal link structure).
See which pages search engines prioritize and may consider the most important.
Discover areas of wasted crawl budget.
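As a small illustration of the kind of analysis above, the sketch below tallies response codes for hits whose user-agent string mentions Googlebot. The log-line layout is assumed, not taken from any specific server, and note that user agents can be spoofed; strict verification would also reverse-DNS-check the client IP.

```python
import re
from collections import Counter

# Status code sits just after the quoted request line in common log formats.
# This is an assumption about the layout; adjust to your own format.
STATUS_RE = re.compile(r'" (\d{3}) ')

def googlebot_status_counts(lines):
    """Tally response codes for hits whose user agent mentions Googlebot.
    The user-agent string can be spoofed, so treat this as a first pass."""
    counts = Counter()
    for line in lines:
        if "Googlebot" in line:
            match = STATUS_RE.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts

# Hypothetical, abbreviated log lines for illustration only.
log_lines = [
    '203.0.113.1 - - [t] "GET /a HTTP/1.1" 200 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.1 - - [t] "GET /b HTTP/1.1" 404 99 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '198.51.100.7 - - [t] "GET /c HTTP/1.1" 200 99 "-" "Mozilla/5.0 (human browser)"',
]
print(googlebot_status_counts(log_lines))  # -> Counter({'200': 1, '404': 1})
```

A high share of 404s or redirects in this tally is one quick signal of wasted crawl budget.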
This article introduces some of the tasks you can perform while analyzing log files and shows how they can provide practical insights into your website.
How do I get the log file?
This type of analysis requires raw access logs from all web servers in the domain, without filtering or modification. Ideally, you'll need a lot of data for your analysis to be valuable. How many days or weeks is appropriate depends on the size and authority of your site and the amount of traffic it generates. For some sites a week may be sufficient; others may require a month or more of data.
Your web developers should be able to send you these files. Before they do, it's worth asking whether the log contains requests from multiple domains and protocols, and whether that information is recorded in the log. If it isn't, you won't be able to correctly attribute requests: for example, you can't distinguish between requests for http://www.example.com/ and https://example.com/. In that case, ask the developer to update the log configuration to include this information going forward.
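For reference, on Apache the stock "vhost_combined" log format prefixes each hit with the virtual host and port, which lets you tell domains (and, when HTTP and HTTPS use separate ports or vhosts, protocols) apart. This is a sketch of a common setup, not a prescription; your server's configuration may differ, so confirm the details with your developers.

```apacheconf
# %v = canonical server name, %p = port: each hit is attributed to its vhost.
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
CustomLog ${APACHE_LOG_DIR}/access.log vhost_combined
```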