Algorithms

Over the past year I have come across heated debates about Latent Semantic Indexing (LSI) in the SEO industry: is it used by Google, and can it help your search engine optimization strategy? Many well-known websites claim that LSI keywords can help rankings, but what does that really mean?

What Is LSI?

Latent Semantic Indexing, also known as latent semantic analysis, is a natural language processing technique developed in the 1980s. It identifies relationships between the terms contained in a collection of documents and derives a set of underlying concepts from those terms.

With LSI, computer systems can understand relations between different terms. Most importantly, latent semantic indexing groups related phrases into clusters so that these systems can better understand queries and retrieve results. The phrases in a cluster aren’t necessarily synonyms of the original keyword; they are phrases that tend to appear in the same articles.
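To make the description above more concrete, here is a minimal sketch of the classic technique, assuming NumPy is available. It builds a term-document count matrix from a tiny made-up corpus, applies a truncated singular value decomposition, and compares terms in the resulting "concept" space. The corpus, the choice of two concepts (`k = 2`), and the helper name `similarity` are all illustrative assumptions, and none of this describes what any search engine actually runs.

```python
import numpy as np

# Toy corpus (made up): three documents about SEO, two about winter clothing.
docs = [
    "seo link building improves search rankings",
    "search rankings depend on content link building",
    "seo content marketing drives search traffic",
    "winter coat warm apparel cold weather",
    "warm winter clothing coat apparel",
]

# Build the vocabulary and a term-document count matrix (rows = terms).
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        counts[index[word], j] += 1

# Truncated SVD: keep only the k strongest latent dimensions ("concepts").
k = 2  # assumption: two concepts are enough for this toy corpus
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
term_vectors = U[:, :k] * S[:k]  # each row places one term in concept space


def similarity(a: str, b: str) -> float:
    """Cosine similarity of two vocabulary terms in the latent space."""
    va, vb = term_vectors[index[a]], term_vectors[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))


print(similarity("improves", "traffic"))  # same topic, never in the same document
print(similarity("seo", "coat"))          # different topics
```

On this toy corpus, "improves" and "traffic" never appear in the same document, yet they should score close to 1 because both sit in the SEO cluster, while "seo" and "coat" should score close to 0. That is the sense in which the related phrases are not synonyms but belong to the same concept.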

 What Is the Meaning of ‘LSI Keywords’ in SEO? 

Plenty of influential sources across the web, such as HubSpot, claim that including LSI keywords in your keyword research and using them in your on-site optimization can deliver a lot of benefits to your rankings. Can it really?

Before we decide whether ‘LSI keywords’ can boost your SEO efforts, it is important to understand what SEO specialists mean by the term.

In Search Engine Optimization, ‘Latent Semantic Indexing’ refers to context vocabulary: synonyms of, and terms related to, the words and phrases the page is optimized for. When indexing a page, Google looks at its context in order to interpret its purpose and deliver the most relevant search results. So the term covers more than synonyms; it is about using strategically chosen keywords together.

LSI Examples

For instance, when you are doing keyword research for ‘SEO Services’ and optimizing your page, think of other keywords you would like to include on it. These keywords might not be synonyms at all, meaning that one word cannot ‘replace’ the other. However, we know they are likely to match the ‘intent’ of a person searching for that service.

A good example: when you search for ‘institutions in Toronto’, Google returns a variety of schools, educational websites, and universities, some of which do not even contain the word ‘institutions’. Google simply interprets the intent: it assumes I am looking for educational institutions based in Toronto.

And that is what search engine optimization specialists mean when they talk about LSI keywords: using variations of keywords to build context vocabulary. In a paper published in 2016, Google refers to this approach as “Improving semantic topic clustering for search queries with word co-occurrence and bigraph co-clustering”.

Pampers and Scotch: What Do They Have In Common?

Being of Russian origin, I can’t help but refer to my native language.

When talking about diapers, Russians will often say ‘pampers’. There is, of course, a traditional word for diapers in Russian, but ‘pampers’ dominates in the spoken language. Pampers was the first, and for a long time the only, brand that sold disposable diapers on the Russian market, so people started calling all disposable diapers ‘pampers’. People will say things like “I bought Huggies pampers”, “Can you tell me more about the Moonies pampers”, “buy pampers for adults”, etc.

Another example is ‘scotch’, which I actually had to deal with for one of my Russian clients a couple of years back. Russians call sticky tape ‘scotch’, after the famous Scotch brand. So when searching for sticky tape, they search for ‘scotch’ written in the Cyrillic alphabet. However, the company behind Scotch has (rightfully, of course) claimed the name written in Cyrillic, so sellers are no longer allowed to use it to name a category on their website, or to optimize a page for it, unless it refers exclusively to the Scotch brand. Whoever tries gets an angry email threatening a lawsuit unless the name is removed. Still, I doubt Russians will ever stop calling sticky tape ‘scotch’.

The same thing happened with LSI. Although the term as used in SEO does not refer to the original technology and has been ‘re-adopted’, we are all well aware of what someone means when talking about the benefits of LSI keywords for SEO.

 So Does Google Use LSI Technology? 

The answer is a firm ‘NO’. However, does Google use ‘LSI keywords’ in the sense that search engine optimization specialists mean? Absolutely. We could of course argue that LSI and Google’s semantic approaches to keywords are completely different things. The bottom line is that whatever you prefer to call it (semantic search, clustering of niche keywords by intent, context vocabulary), it will most certainly benefit your overall rankings and traffic.

So even though the term LSI has been somewhat misused here, the intent with which we use it for SEO remains correct, much like the Russians with their pampers and scotches. They might be wrong to call every single brand of diapers ‘pampers’, but in Russia, when you say it, everyone knows that you are not looking for that specific brand; you are just looking for diapers.

The same goes here: when talking about LSI, everyone realizes that you are not talking about some technology developed in the 80s, right? So what is the point of the debate? Sure, the SEO industry ‘re-adopted’ the meaning of LSI, but doesn’t that happen with a lot of words and terms in use today?

Why Doesn’t Google Use LSI?

Based on everything said so far, it would seem logical that Google would use this technology. However, there are good reasons why this isn’t the case.

First and foremost, LSI is an obsolete technology. While the basic premise is sound, much better solutions have since been developed for determining how relevant content is to a specific search query. Furthermore, LSI works much better on smaller document collections; it doesn’t work that well when applied to the entire web.

Regardless, that doesn’t mean semantics is useless for marketing and search engine optimization. Most notably, using the right semantic phrases is crucial for matching search intent. Basically, when you cover a specific topic, Google expects you to go through its related subtopics, each signaled by its own related phrases (what SEO specialists call LSI keywords).

This methodology pertains to both the appearance of unique phrases and repetitions. For example, if your main keyword is “winter coat,” it is expected that the keyword will appear numerous times in the post. Furthermore, the text will also include related phrases like “fashion,” “apparel,” and “winter clothing.”
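As a rough illustration of the “winter coat” example, the sketch below counts how often a main keyword and a few related phrases appear in a draft. The function name `phrase_counts` and the sample text are made up for this post; the counting is a plain whole-word, case-insensitive match and makes no claim about how Google itself evaluates a page.

```python
import re


def phrase_counts(text: str, phrases: list[str]) -> dict[str, int]:
    """Count whole-word, case-insensitive occurrences of each phrase."""
    lowered = text.lower()
    counts = {}
    for phrase in phrases:
        # \b anchors keep "coat" from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        counts[phrase] = len(re.findall(pattern, lowered))
    return counts


# Made-up sample draft.
sample = (
    "A good winter coat keeps you warm. When picking a winter coat, "
    "think about fashion as well as function, since winter clothing "
    "is still apparel you wear every day."
)

result = phrase_counts(
    sample, ["winter coat", "fashion", "apparel", "winter clothing"]
)
print(result)
```

In this sample the main keyword appears twice and each related phrase once, which is the kind of mix described above.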

Google’s Methodology

With that in mind, here’s how Google’s algorithms assess different articles and match them with specific search queries:

  • Knowledge Graph is a type of semantic network that holds a plethora of information about different subjects and entities. It can also establish relationships between different terms, making it crucial for establishing relevancy
  • Natural language processing is a technology that allows programs and systems to comprehend human language. Most importantly, it allows algorithms to better interpret the meaning behind certain phrases, sayings, and terminology
  • Artificial intelligence takes a look at the global meaning of words and puts them into context. The best thing about AI is that it has become increasingly sophisticated over time, which makes its analysis and use better and better

Obviously, Google doesn’t rely on just one thing to interpret phrases. Instead, it utilizes different systems to deliver the best-possible results to its users.

Utilizing Latent Semantic Analysis in SEO

Basically, LSA, or latent semantic analysis, is a mathematical process that allows us to find related words in various types of texts. These words might not be directly tied to the main keyword the way synonyms or extended and shortened versions of it are; instead, they are simply phrases you’d expect to find in an article on that topic.

During analysis, computer systems can better understand relations between related and seemingly unrelated phrases. It allows them to discover patterns and the most frequent usages. For example, a semantic cluster for “SEO” could include “search engine optimization” but also “marketing,” “content writing,” “link building,” and “website.”

In other words, when you write an article about SEO, it is expected that all these keywords will appear in it. Although we might intuitively know some of the phrases that belong to a semantic cluster, we need analysis to find the other ones.

That being said, here are five main ways marketing experts put this concept to use:

Analyze Top-Ranking Pages

The best way to determine what Google wants from a page is to go through the search results. By checking out the top-ranking pages and reading their content, you can find patterns to use in your own content creation process. Among other things, this type of analysis helps you find recurring phrases and keywords.

Then, by utilizing this knowledge, you can create pages that are tailored for specific search intent.
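If you have already copied the text of a few top-ranking pages, part of this pattern hunting can be automated. The sketch below is one simple way to do it, under the assumption that “recurring phrases” means two-word phrases that show up on several pages. The function name `recurring_phrases`, the `min_pages` threshold, and the three sample snippets are all hypothetical.

```python
import re
from collections import Counter


def recurring_phrases(pages: list[str], min_pages: int = 2) -> list[tuple[str, int]]:
    """Return (phrase, page_count) for two-word phrases found on at least min_pages pages."""
    page_counts = Counter()
    for text in pages:
        words = re.findall(r"[a-z0-9]+", text.lower())
        # Use a set so a phrase counts once per page, however often it repeats there.
        bigrams = {" ".join(pair) for pair in zip(words, words[1:])}
        page_counts.update(bigrams)
    kept = [(phrase, n) for phrase, n in page_counts.items() if n >= min_pages]
    # Most widely shared phrases first, ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Made-up snippets standing in for the text of three top-ranking pages.
pages = [
    "Link building is a core part of search engine optimization.",
    "Search engine optimization relies on content writing and link building.",
    "Good content writing supports link building campaigns.",
]

shared = recurring_phrases(pages)
print(shared)
```

Here “link building” turns up on all three pages, while “content writing,” “engine optimization,” and “search engine” each turn up on two. A real analysis would also want to filter out filler phrases and look at longer phrases, which this sketch deliberately leaves out.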

Keep in mind that this analysis goes way beyond just LSI keywords. You can also learn more about the ideal categories and subtopics to include in your piece. For example, if several top-ranking articles share a similar structure, this is a good indication you should follow their lead.

While this is the best way to go about things, it is also the most time-consuming methodology. Not only do you have to read all these articles, but you also need to perform an extensive analysis that can take hours upon hours to finish. This is why most marketers nowadays rely on tools to assist them with this task.

Use Content Tools

The three most popular solutions on the market right now are SurferSEO, MarketMuse, and Frase. Keep in mind that each one of them uses a significantly different approach and different algorithms, so they produce noticeably different results. Still, they all work under the same basic premise:

  • A user should input their required keyword and start the analysis process
  • The tool will produce a list of keyword suggestions that you can introduce in your piece
  • Besides receiving the exact phrases you need to add, you’ll also get keyword density (the number of repetitions)
  • This type of software can also generate subheading ideas
  • Each platform comes with its content optimization score, which allows you to track how many relevant keywords you’ve added and how you compare to other top-ranking pages
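The scoring inside these tools is proprietary, so the following is only a toy illustration of the two numbers mentioned in the list above. It assumes keyword density means the share of a text’s words taken up by a phrase (the tools may simply report a repetition count instead), and it assumes a content score is the fraction of suggested phrases that appear at least once. The function names `keyword_density` and `coverage_score` are made up and do not correspond to any tool’s API.

```python
import re


def keyword_density(text: str, phrase: str) -> float:
    """Share of the text's words taken up by occurrences of phrase (0.0 to 1.0)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words over the text and count exact matches.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return hits * n / len(words)


def coverage_score(text: str, suggested: list[str]) -> float:
    """Fraction of suggested phrases that appear at least once in text."""
    if not suggested:
        return 0.0
    found = sum(1 for phrase in suggested if keyword_density(text, phrase) > 0)
    return found / len(suggested)


# Made-up draft and suggestion list.
draft = "SEO takes time. Good SEO needs content writing and link building."
suggestions = ["seo", "link building", "content writing", "website"]

print(keyword_density(draft, "seo"))       # 2 of the 11 words
print(coverage_score(draft, suggestions))  # 3 of the 4 suggestions are present
```

A score like this only tells you which suggested phrases are missing; it says nothing about whether the text reads well.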

It’s worth mentioning that these tools don’t utilize LSI analysis. Similar to Google, they rely on more advanced technology and are tailored for modern search engine algorithms. In other words, they utilize natural language processing as well as AI. That way, their processes are in line with what the search engine expects to see.

While the retrieved keyword list isn’t the same as the potential LSI keyword list, you can easily notice some similarities (of course, this might vary depending on the tool you’re using). There is a strong correlation between optimal LSI phrases and sets produced by natural language processing algorithms.

Use Logic

For the most part, you can optimize keywords for specific clusters by using common sense. This is especially true if you’re a veteran content writer. For example, it doesn’t take a genius to determine that “marketing” related topics will usually go with phrases like “business” and “customer.”

In fact, the more you use tools, the better you become at figuring out which keywords would be optimal for specific queries. Generally speaking, if several main keywords are part of the same semantic cluster, they will share similar LSI keywords.

Of course, this doesn’t mean you shouldn’t utilize the two previously mentioned analyses. Each process has its worth, as it might help you further optimize a post for a specific search query.

Write Naturally

Sometimes, marketers go out of their way to optimize something that doesn’t necessarily require extensive optimization. What do we mean by that?

There are a lot of situations where content creators and SEO experts will make all these small tweaks in an attempt to improve an article’s relevancy for particular queries. While doing that, they often completely forget about creating content for people. So, even if you hit all the LSI marks, you’ll still lose traffic due to low engagement.

Furthermore, you’ll likely introduce the majority of relevant phrases just by properly covering the topic. As long as your article is comprehensive and answers users’ potential questions, there’s a good chance the LSI keywords will end up in the text one way or another.

In fact, writing naturally can also eliminate some overoptimization drawbacks. Among other things, you won’t overstuff your posts with LSI keywords, something that actually lowers your optimization score in content tools. The practice will also spread keywords around in a natural fashion, which could be another potential benefit.

Increase Word Count

One of the coolest tricks for content optimization is increasing the total word count.

The basic premise is simple: the longer your post, the more likely it is that you’ll include relevant keywords and LSI phrases. Word count isn’t a ranking metric, and bumping it won’t have a direct impact on rankings, but it can provide lots of secondary benefits. Aside from introducing more LSI phrases, this method can also improve user engagement, as readers may stay longer on the page.

However, this practice should be utilized with a bit of caution. Increasing word count for word count’s sake can cause some issues. This is especially true if you’re covering a relatively shallow topic or if you’re being too repetitive. In other words, Google will likely notice you sacrificed quality for quantity, which might reflect in your rankings.

Conclusion

While the LSI concept is a bit convoluted and might not be important for Google’s algorithms, that doesn’t mean it’s completely irrelevant. Search engines determine relevancy based on the phrases included within the posts, so semantics is a major consideration when creating pages. Even if you’re not relying on LSI clusters, the required keywords will be somewhat similar.

In a nutshell, your best bet is to create organic posts that would answer all users’ questions. To do so, you can check what other top-ranking pages are doing and mold your approach according to them. Of course, if you think something relevant was omitted, you should also add it to your copy.

When everything’s said and done, you can run the piece through content tools like SurferSEO and add keywords that are missing.
