Which HTML elements are used most: an analysis of 8 million pages from an SEO perspective

When you have been doing SEO for a long time, many things start to seem obvious. Fill in the title, the description, and the alt text for images: what could be simpler? In practice, though, many sites fail to follow even these basic recommendations from Yandex and Google.







We present the results of a large study on how various HTML elements are used on site pages. Which HTML elements are used most often? Do webmasters fill in meta tags, and if so, which ones? What is the state of structured data markup? Spoiler: everything is far from perfect.







Contents
Meta tags that Google understands
<meta name="description" content="...">
<title>
<meta name="robots | googlebot">
<meta name="viewport" content="...">
<meta charset="...">
<meta http-equiv="refresh" content="...; url=...">
<meta name="rating" content="..." />
<meta name="google" content="nositelinkssearchbox">
<meta name="google-site-verification" content="...">
<meta name="google" content="notranslate">
Structured Data (JSON-LD)
rel="canonical" attribute
<meta name="keywords">
Headings h1-h6
Alt attribute
Language definition
Google Tag Manager
rel="nofollow" attribute
Facebook Open Graph
Social Media Links
rel="prev" / "next"
To summarize: there is still work to do



The study of HTML elements was conducted by specialists from AdvancedWebRanking, who analyzed 8 million pages from Google's top 20 results for 30 million search queries. The original study is published by AdvancedWebRanking; a detailed analysis with refined data is available in Catalin Rosu's article for Moz.







Below we summarize the results of the study, which are useful for understanding current trends in technical page optimization.







Meta tags that Google understands



This section covers the meta tags listed in the Search Console Help. Googlebot recognizes these tags when crawling pages.







<meta name="description" content="...">

The description meta tag summarizes the contents of the page, usually in 110-150 characters. In effect, it is an annotation that lets both the search engine and the user understand what the page contains.







Google usually takes the description into account when generating snippets for the search results, whereas Yandex often generates snippets at its own discretion.

The study showed that 54.9% of sites have a filled-in description. On 4.7% of sites the meta tag is present but its content attribute is empty. On a further 0.2% of sites the tag is present without any content attribute at all.
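
The three states the study distinguishes (filled, empty, no content attribute) can be checked on your own pages. The following is a minimal sketch using only Python's standard html.parser module; the class and function names are our own illustration, not the tooling used in the study.

```python
from html.parser import HTMLParser


class DescriptionFinder(HTMLParser):
    """Records the state of the first <meta name="description"> tag seen."""

    def __init__(self):
        super().__init__()
        self.state = "missing"  # no description tag found so far

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.state != "missing":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").strip().lower() != "description":
            return
        if "content" not in attr_map:
            self.state = "no_content_attribute"
        elif not (attr_map["content"] or "").strip():
            self.state = "empty"
        else:
            self.state = "filled"


def classify_description(html: str) -> str:
    """Return 'filled', 'empty', 'no_content_attribute' or 'missing'."""
    finder = DescriptionFinder()
    finder.feed(html)
    return finder.state


if __name__ == "__main__":
    print(classify_description('<meta name="description" content="">'))
```

Whitespace-only content is treated as empty here; that is an assumption of this sketch, since the study does not say how it counted such cases.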







<title>



Technically, title is not a meta tag. But because it sits in the <head> section and conveys information about the contents of the page, it is often called one, by analogy with description.







From an SEO perspective, title is critical. Its content is displayed in Google and Yandex search results as the snippet heading, and search engines use it when determining the relevance of a page.

Despite its importance, the title is filled in on only 78.3% of pages. Interestingly, on only 5.6% of the pages that have a title does its content exactly duplicate the h1. In other words, most sites make h1 and title different, which is correct from an SEO point of view.







As for length, the usual advice is to keep the title to no more than 60 characters, which is roughly how much is displayed in snippet headings. A longer title is acceptable, but then the most important information should go in the first 50-60 characters.







A title of up to 30 characters is not necessarily bad: the name of a product or an article may simply be short. On the other hand, you should not leave space in the title unused if it can be filled, for example by adding a feature such as color, size, or material to the product name.
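
The two title checks discussed above (length against the roughly 60-character display limit, and exact duplication of the h1) are easy to automate. Here is a rough sketch with the standard library; the function name, the result keys, and the case-insensitive comparison are our own assumptions rather than the study's method.

```python
from html.parser import HTMLParser


class TitleAndH1(HTMLParser):
    """Collects the text of the first <title> and the first <h1>."""

    def __init__(self):
        super().__init__()
        self.texts = {"title": None, "h1": None}
        self._current = None  # tag whose text is being collected right now

    def handle_starttag(self, tag, attrs):
        if tag in self.texts and self.texts[tag] is None:
            self._current = tag
            self.texts[tag] = ""

    def handle_data(self, data):
        if self._current:
            self.texts[self._current] += data

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None


def audit_title(html: str, max_len: int = 60) -> dict:
    """Report the title length and whether it duplicates the h1."""
    parser = TitleAndH1()
    parser.feed(html)
    parser.close()  # flush any buffered text
    title = (parser.texts["title"] or "").strip()
    h1 = (parser.texts["h1"] or "").strip()
    return {
        "title": title,
        "length": len(title),
        "too_long": len(title) > max_len,
        "duplicates_h1": bool(title) and title.casefold() == h1.casefold(),
    }


if __name__ == "__main__":
    page = "<title>Red wool scarf</title><h1>Red wool scarf</h1>"
    print(audit_title(page))
```

The 60-character threshold is passed as a parameter because it is a rule of thumb, not a fixed limit.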









<meta name="robots | googlebot">

The robots meta tag tells search engines whether they may index a page and follow the links on it. The googlebot meta tag sets these rules for Googlebot only (the Yandex equivalent is name="yandex").







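Compared with Disallow directives in robots.txt, rules in the robots meta tag are considered a more reliable way to keep individual pages out of the index. The reason is that robots.txt blocks crawling rather than indexing, so a disallowed URL can still end up in the index if other pages link to it.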







The robots meta tag is found on 19.7% of sites, and the googlebot meta tag on 1.7%. In other words, on most pages the crawling rules are set through robots.txt or the X-Robots-Tag header, or are not set at all.







The most popular value of the robots meta tag is "index, follow", which allows the page to be indexed and its links to be followed. The original study charts the top 5 values in a histogram.
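
To see how such directives translate into a yes/no answer about indexing, here is a small sketch that collects the robots and googlebot meta tags and checks for noindex (or none, which is shorthand for "noindex, nofollow"). Names are illustrative; the sketch ignores robots.txt and the X-Robots-Tag header, and assumes that the most restrictive rule wins when the two tags disagree.

```python
from html.parser import HTMLParser


class RobotsMeta(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # crawler name -> set of lowercase directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").strip().lower()
        if name not in ("robots", "googlebot"):
            return
        content = attr_map.get("content") or ""
        tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
        self.directives.setdefault(name, set()).update(tokens)


def is_indexable(html: str, crawler: str = "googlebot") -> bool:
    """True unless the general or crawler-specific tag forbids indexing."""
    parser = RobotsMeta()
    parser.feed(html)
    general = parser.directives.get("robots", set())
    specific = parser.directives.get(crawler, set())
    return not ({"noindex", "none"} & (general | specific))


if __name__ == "__main__":
    print(is_indexable('<meta name="robots" content="noindex, follow">'))
```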







<meta name="viewport" content="...">

This meta tag tells the browser how to display the page on a mobile device. The presence of a viewport tag signals to Google that the page is optimized for mobile devices. The tag appears on 62.4% of pages, which suggests that the remaining 37.6% are not optimized for mobile.







Given the transition to mobile-first indexing that began in July 2019, mobile optimization is very important. This means not only the viewport meta tag but also better mobile usability and page loading speed. We have conducted research on these topics, and so far the situation is far from ideal.
















<meta charset="...">

In HTML5, the charset attribute declares the character encoding of the document (usually UTF-8). This meta tag is used to set the encoding on 48.8% of sites.







This does not mean that the other sites specify no encoding. It can be declared in another way, for example with the meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8">. The main thing is that the encoding is declared; otherwise the browser may display the page as unreadable characters instead of text, which still happens.
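
Both ways of declaring the encoding in markup can be detected with one pass over the page. The sketch below is an illustration with made-up names; it looks only at meta tags and does not consider the HTTP Content-Type header, which can also carry the charset.

```python
import re
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Finds the first charset declared via <meta charset> or http-equiv."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.source = None  # which kind of tag declared the charset

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset:
            return
        attr_map = dict(attrs)
        if attr_map.get("charset"):
            self.charset = attr_map["charset"].strip().lower()
            self.source = "meta charset"
        elif (attr_map.get("http-equiv") or "").strip().lower() == "content-type":
            content = attr_map.get("content") or ""
            match = re.search(r"charset\s*=\s*([\w-]+)", content, re.IGNORECASE)
            if match:
                self.charset = match.group(1).lower()
                self.source = "http-equiv"


def find_charset(html: str):
    """Return (charset, source), or (None, None) if nothing is declared."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset, finder.source


if __name__ == "__main__":
    print(find_charset('<meta charset="UTF-8">'))
```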







<meta http-equiv="refresh" content="...; url=...">

This meta tag sends the user to a different URL and serves as a simple redirect method. Note that not all browsers support it and that it may confuse users.







The W3C does not recommend redirecting with meta tags; the better option is a 301 redirect. It is not surprising, then, that only 0.1% of sites use the refresh meta tag. These are probably pages on outdated site builders, where there is no server-level access and no way to configure 301 redirects in the admin panel.







<meta name="rating" content="..." />

This meta tag marks the page as containing adult content. Such a page does not appear in the results when SafeSearch is on.







Google recommends using these meta tags for adult content:

<meta name="rating" content="adult" />

<meta name="rating" content="RTA-5042-1996-1400-1577-RTA" />







These meta tags are on 1.7% of the pages.







<meta name="google" content="nositelinkssearchbox">

This is an exotic meta tag: it is used on only about one thousand sites out of 8 million.







Google search results sometimes display a search box for your site. The nositelinkssearchbox meta tag tells Google not to display it.







It is hard to imagine a situation in which giving Google such an instruction is worthwhile, so the low adoption of this meta tag is not surprising.







<meta name="google-site-verification" content="...">



One way to verify site ownership in the Google Search Console is through this meta tag. It is placed on 16.6% of the analyzed sites.







<meta name="google" content="notranslate">

Another specialized meta tag. It tells Google not to offer a translation of the page when the user's language differs from the language of the page. Only 0.1% of sites use this tag.







Structured Data (JSON-LD)



Structured data is a standardized format for providing information about a page and classifying its content.







Structured data is described using Microdata, RDFa, or JSON-LD. Google understands all of these formats but recommends JSON-LD (Yandex search does not yet support this format).

JSON-LD markup of any type is present on 34.1% of sites. It is most often used to mark up site search, which can make an additional site search box appear in the snippet in Google search results. Markup for social profiles, logos, and local business data is also popular.
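
To check which JSON-LD types a page declares, you can pull the application/ld+json script blocks out of the HTML and read their @type fields. The sketch below is a simplified illustration: the sample markup, the example.com URLs, and the function names are hypothetical, and nested @graph structures are not unpacked.

```python
import json
from html.parser import HTMLParser

# Hypothetical site search markup of the kind described above.
SAMPLE = """<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "WebSite",
 "url": "https://example.com/",
 "potentialAction": {"@type": "SearchAction",
   "target": "https://example.com/search?q={search_term_string}",
   "query-input": "required name=search_term_string"}}
</script>"""


class JsonLdCollector(HTMLParser):
    """Collects parsed JSON-LD blocks from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed block: skip it instead of failing


def jsonld_types(html: str) -> list:
    """Return the top-level @type of every JSON-LD item on the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    types = []
    for block in collector.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                types.append(item["@type"])
    return types


if __name__ == "__main__":
    print(jsonld_types(SAMPLE))
```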














rel="canonical" attribute

This attribute tells the search engine which URL is the preferred (canonical) version of a page and should be indexed. It is used to deal with duplicate pages that arise for various reasons (print versions, pagination pages, pages with dynamic parameters, and so on). It occurs on 40% of the pages.
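
As an illustration of how a duplicate (say, a print version) points to its preferred URL, the sketch below reads the canonical link from a page and resolves it against the page address. The function name and the example.com URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalFinder(HTMLParser):
    """Finds the href of the first <link rel="canonical">."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.href = attr_map["href"].strip()


def canonical_url(html: str, page_url: str):
    """Return the absolute canonical URL, or None if none is declared."""
    finder = CanonicalFinder()
    finder.feed(html)
    return urljoin(page_url, finder.href) if finder.href else None


if __name__ == "__main__":
    page = '<link rel="canonical" href="/scarves/red">'
    print(canonical_url(page, "https://example.com/scarves/red?print=1"))
```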







<meta name="keywords">

Google announced back in 2009 that it does not take the keywords meta tag into account when ranking pages. Nevertheless, the tag is filled in on 32.2% of pages. On a further 3.2% of pages the tag is present but empty.







Interestingly, Yandex may take the keywords meta tag into account when determining how well pages match search queries. Even so, the tag is now left unfilled more often than not.


Headings h1-h6

The h1-h6 headings let you structure a document, which makes them important for SEO. Despite this, h1 headings are found on only 59.6% of pages, h2 on 58.9%, and h3 on 49.6%.







Across all the headings collected, h3 turned out to be the most frequently used, accounting for 42% of the total.







Interestingly, the study found 23,116 h7 headings and even 7,276 h8 headings. Whether their use is justified is questionable: HTML defines only h1 through h6, and even h5 and h6 are rarely used.
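
A quick way to see the heading structure of a page, and to catch non-standard levels such as h7, is to count every tag whose name is "h" followed by a number. This is a minimal sketch with illustrative names, not the counting method used in the study.

```python
import re
from collections import Counter
from html.parser import HTMLParser

HEADING_TAG = re.compile(r"h(\d+)$")


class HeadingCounter(HTMLParser):
    """Counts opening tags named h<number>, including non-standard levels."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if HEADING_TAG.match(tag):
            self.counts[tag] += 1


def heading_report(html: str):
    """Return (counts per heading tag, sorted list of non-standard tags)."""
    counter = HeadingCounter()
    counter.feed(html)
    nonstandard = sorted(
        tag for tag in counter.counts if not 1 <= int(tag[1:]) <= 6
    )
    return dict(counter.counts), nonstandard


if __name__ == "__main__":
    print(heading_report("<h1>A</h1><h2>B</h2><h2>C</h2><h7>D</h7>"))
```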







Alt attribute



This is an attribute of the <img> tag. It indicates alternate text for images. This text is displayed instead of the image if its display is disabled in the browser.







For SEO, the alt attribute is important because it helps search engines understand what the image shows. This lets you rank well in image search and attract additional traffic. In practice, however, alt is filled in for only 11.9% of images. In 6.4% of cases alt is present but has an empty value.
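
The same three-way split (filled, empty, missing) can be computed for the images on any page. The sketch below uses illustrative names and treats whitespace-only alt text as empty, which is our assumption. Keep in mind that an empty alt is legitimate for purely decorative images, so the "empty" bucket is not always an error.

```python
from collections import Counter
from html.parser import HTMLParser


class AltCounter(HTMLParser):
    """Classifies every <img> by the state of its alt attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            self.counts["missing"] += 1
        elif (attr_map["alt"] or "").strip():
            self.counts["filled"] += 1
        else:
            self.counts["empty"] += 1


def alt_stats(html: str) -> dict:
    """Return counts of images with filled, empty and missing alt text."""
    counter = AltCounter()
    counter.feed(html)
    return dict(counter.counts)


if __name__ == "__main__":
    page = '<img src="a.png" alt="Red scarf"><img src="b.png" alt=""><img src="c.png">'
    print(alt_stats(page))
```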














Language definition



To indicate the main language of the page (document) or of individual words in the content, the lang="*" attribute is used in the HTML markup. This attribute is present on 65% of the pages analyzed.







The hreflang="*" attribute is used to indicate alternative language versions. Google recommends hreflang specifically for localized versions of pages, and Yandex gives a similar recommendation. This attribute was found on 21.6% of the pages.
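
The two attributes answer different questions: lang describes the page itself, while hreflang points to its other language versions. A small sketch, with hypothetical names and example.com URLs, that reads both from a page:

```python
from html.parser import HTMLParser


class LanguageInfo(HTMLParser):
    """Reads <html lang> and the <link rel="alternate" hreflang> entries."""

    def __init__(self):
        super().__init__()
        self.page_lang = None
        self.alternates = {}  # hreflang value -> URL of that version

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html" and attr_map.get("lang"):
            self.page_lang = attr_map["lang"]
        elif tag == "link" and attr_map.get("hreflang") and attr_map.get("href"):
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            if "alternate" in rel_tokens:
                self.alternates[attr_map["hreflang"]] = attr_map["href"]


def language_info(html: str):
    """Return (page language or None, dict of alternate language versions)."""
    info = LanguageInfo()
    info.feed(html)
    return info.page_lang, info.alternates


if __name__ == "__main__":
    page = (
        '<html lang="en"><head>'
        '<link rel="alternate" hreflang="de" href="https://example.com/de/">'
        '<link rel="alternate" hreflang="en" href="https://example.com/">'
        "</head></html>"
    )
    print(language_info(page))
```

Only the lang attribute on the html element is read here; lang on inner elements, which the text above also mentions, is ignored to keep the sketch short.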







Google Tag Manager

Google Tag Manager is a tag management system that lets you add and update tracking codes and other code fragments (tags) on a website or in a mobile application.







The analysis showed that the snippet *googletagmanager.com/gtm.js appears on only 4.3% of the pages.







rel="nofollow" attribute

If Google sees a link with the rel="nofollow" attribute, it does not follow the link and does not pass link weight through it. This attribute is typically used on links to untrusted sources and in advertising content.







An analysis of 8 million pages revealed 12.8 million links with the rel="nofollow" attribute. That is an average of 1.6 nofollow links per page.







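In September 2019, Google announced that in addition to rel="nofollow" it would recognize two more values of the rel attribute: rel="sponsored", for advertising and sponsored links, and rel="ugc", for links in user-generated content such as comments and forum posts.

In other cases where you do not want to pass on the weight of the page, rel="nofollow" is still used.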







Webmasters have gradually begun to adopt the new values: two weeks after the announcement, the study found 278 sponsored links and 123 ugc links.







Yandex has not yet introduced any changes regarding rel="nofollow".
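
To see how the links on your own pages are split between nofollow, sponsored, and ugc, you can count the tokens in each link's rel attribute. This is an illustrative sketch with our own names; since rel may hold several space-separated values, a link with rel="sponsored nofollow" is counted under both.

```python
from collections import Counter
from html.parser import HTMLParser

TRACKED_VALUES = ("nofollow", "sponsored", "ugc")


class RelCounter(HTMLParser):
    """Counts <a> tags whose rel attribute contains each tracked value."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        rel_tokens = (dict(attrs).get("rel") or "").lower().split()
        for value in TRACKED_VALUES:
            if value in rel_tokens:
                self.counts[value] += 1


def count_rel(html: str) -> dict:
    """Return how many links carry nofollow, sponsored and ugc."""
    counter = RelCounter()
    counter.feed(html)
    return dict(counter.counts)


if __name__ == "__main__":
    page = (
        '<a href="/a" rel="nofollow">a</a>'
        '<a href="/b" rel="sponsored nofollow">b</a>'
        '<a href="/c" rel="ugc">c</a>'
        '<a href="/d">d</a>'
    )
    print(count_rel(page))
```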







Facebook Open Graph



The Open Graph markup (protocol) makes the content you share on social networks look exactly the way you want. The protocol was developed by Facebook, but it is also supported by VKontakte, Pinterest, Twitter, LinkedIn, Telegram, WhatsApp, Viber, and others.







The og:description (page description) and og:locale (site locale, that is, language) tags are found about half as often as the most common Open Graph tags.







Social Media Links



An analysis of links to social networks showed that Facebook is the most popular: 77.3% of pages link to it. Twitter is in second place (65.2%). Keep in mind, though, that this is a Western study; in Russia the situation is naturally different.







Interestingly, 12.7% of pages still link to Google+, even though this social network was shut down in April 2019. This is probably just inertia.







rel="prev" / "next"

Since March 2019, Google has not supported these pagination attributes. Interestingly, support had in fact been switched off several years earlier, but Google officially announced it only in 2019.







Now, instead of the rel="prev" / "next" attributes, Google recommends placing content on one page rather than splitting it across multiple pages.







The study found that rel="prev" is used on 0.3% of the pages and rel="next" on 3% of the pages.







To summarize: there is still work to do



Understanding what an average web page looks like gives an idea of current trends. And then questions arise.







Why do webmasters so rarely fill in alt for images? Why are there h1 headings on only 60% of the pages? Why are titles and descriptions so often left unfilled? Why is there no hurry to implement JSON-LD markup? This is basic SEO, after all.







It is tempting to say that none of this matters. After all, the pages analyzed came from Google's top 20, so on the whole Google considers them authoritative, even without alt attributes.







But do not forget that SEO is a combination of factors, and we have looked at only a small technical part of it. On-site optimization alone covers more than 60 separate tasks. Then there are links, mentions, localization, behavioral factors, and more.







So tightening up basic technical SEO will certainly not hurt your site. And given that things are far from perfect on other sites, there is real room for growth.







