Summary of English Google SEO office-hours from December 31, 2021


This is a summary of the questions and answers from the Google SEO Office Hours with John Mueller on December 31st, 2021.


1. Blog posts and pages are not getting indexed regularly, but tag pages are getting indexed on Google. Why?

– Post in the webmaster help forum with the exact URLs and the searches you are doing so other people can take a look. One possible reason is that the homepage is indexed with a link to your blog post, or that you have a separate tag page listing all of the blog posts with that tag, and that page is being indexed. From a practical point of view, that can sometimes happen, and these pages get indexed before the actual content.

2. Sometimes our URL is indexed properly, but we are unable to find the cached copy.

– That is normal, in the sense that the cached pages in Search are handled separately from indexing. It can happen that we have a page in the index that doesn’t have a cached copy. That’s essentially how the system works.

3. Can a poor, automatically generated translation in a new language version of a website negatively affect the SEO of an established domain’s main language version?

– The short answer is yes. The main issue is less that these are translated versions of the content and more that, for some things, Google looks at the quality of the site overall. If a significant portion of the site is lower quality, it doesn’t matter to Google why that is, whether it’s just a bad translation or terrible content. If Google sees that significant parts are low in quality, then Google might think that the website overall is not good.

4. How does Google assess if something is an automated translation or if something is of poor quality?

– I don’t know if we have anything that specifically looks for low-quality translations. At least the way I understand it, it’s more a matter of us trying to understand the quality of the website overall. And that’s usually not something where there are individual things we could point at and say, for example, that five spelling mistakes on a page are a sign of low quality. These things happen. Each of these factors, on its own, is hard to call a sign of low quality; rather, we have to take everything together and look at the overall mix.

And that’s also a reason why, when you significantly improve the quality of a website overall, or when things get significantly worse, it takes a lot of time for our systems to figure out that the overall view of this website is now better or worse. So from that point of view, it’s not that we have anything specific that we could point at. The best that I could do, if you want individual items, is to point to the blog post we did on core updates last year, which has a bunch of different questions that you could ask yourself about the website.

5. We have a site with 100 pages that carry a meta robots noindex tag but are accessible to users. A lot of good authority sites in the industry link back to these pages. So although we are getting referral traffic, we aren’t getting anything from search, because the pages are obviously not indexed. What if we set up 301 redirects for Googlebot on these URLs to some relevant pages? Would that be against Google’s guidelines?

– That seems kind of borderline. It also feels like the kind of thing where you might just use a rel="canonical" pointing at the page that you do want to have indexed, and leave it at that. If you’re doing this redirect specifically for Googlebot, then from a technical point of view it’s very easy to get that wrong and to have something messed up. From a user point of view, I don’t think it would be a big issue, because we would probably only index the target page anyway. So it’s not that a user would click on a link in the search results and end up on a page that looks very different from what they clicked on. So it feels like there are easier solutions to this, with a rel="canonical" doing essentially the same thing, but I don’t think it would be super problematic.

From my point of view, I would prefer to use the rel="canonical" as much as possible, so that you don’t have to set up any separate infrastructure to cloak to Googlebot, with all of the problems associated with that, because it feels like you have to put a lot of work in to make that work the way that you want it. And the other approach is just so much simpler, easier, and less error-prone.
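As a rough way to see what a page currently tells crawlers, the sketch below reads the meta robots directives and the rel="canonical" target out of a page's HTML using only Python's standard library. The class name, the function name, and the example.com URL are made up for illustration. This is not a Google tool, and it only inspects the static HTML it is given, not anything added later by JavaScript.

```python
from html.parser import HTMLParser


class IndexingHintParser(HTMLParser):
    """Collects meta robots directives and the rel="canonical" target of a page."""

    def __init__(self):
        super().__init__()
        self.robots = []       # directives found in <meta name="robots">
        self.canonical = None  # href found in <link rel="canonical">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")


def indexing_hints(html):
    """Return whether the page is noindexed and which URL it names as canonical."""
    parser = IndexingHintParser()
    parser.feed(html)
    return {"noindex": "noindex" in parser.robots, "canonical": parser.canonical}


# Hypothetical page that points at the page the owner wants indexed.
page = """
<html><head>
  <link rel="canonical" href="https://example.com/preferred-page/">
</head><body>...</body></html>
"""
print(indexing_hints(page))
```

Running this over the affected URLs gives a quick list of which pages still carry a noindex and which canonical target each one names, without any Googlebot-specific redirect logic.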

6. If a text block is available in the source code, but there’s no way for users to see that content, can that text get indexed?

– Maybe. If it’s in the normal HTML of the page and it’s just hidden, then it’s possible that we pick that up and use it for indexing. I don’t think it’s a great idea to do it on purpose, but it can happen. And it is something to keep in mind, especially if you’re trying to avoid indexing some specific kind of text. For example, one thing that I saw recently is that someone had an error message in a hidden part of the page. It was only shown if there was an error on the page, but it was always in the page. Our systems picked that up and thought, well, this page is an error page that we can ignore. So if you want something indexed, make sure it’s visible and indexable. If you don’t want it indexed, then make sure it’s not indexable and not actually on the page at all.
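To spot text like that hidden error message, a minimal sketch along these lines can list text that sits inside elements hidden with the `hidden` attribute or an inline `display:none` style. It is only an approximation under stated assumptions: it uses Python's standard library, the names are hypothetical, it assumes well-formed HTML with matching closing tags, and it does not see elements hidden through external CSS or JavaScript.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextParser(HTMLParser):
    """Collects text found inside elements hidden via attribute or inline style."""

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0  # greater than 0 while inside a hidden element
        self.hidden_text = []

    @staticmethod
    def _is_hidden(attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        return "hidden" in attrs or "display:none" in style

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a hidden element, every nested element is hidden too.
        if self.hidden_depth or self._is_hidden(attrs):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html):
    parser = HiddenTextParser()
    parser.feed(html)
    return parser.hidden_text


# Hypothetical product page with an always-present but hidden error message.
page = """
<div><h1>Product</h1>
<div style="display: none"><p>Sorry, an error occurred.</p></div>
<p>Visible text</p></div>
"""
print(find_hidden_text(page))
```

Anything this reports is text that is in the page even though users do not see it, which is the situation described in the answer above.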

7. I did a website migration three months ago to a new domain. I cloned the whole website and updated internal and external links before doing it. I had AMP enabled on the old one, and my old AMP articles always ranked in Google Top Stories. But my new ones do not. I have AMP disabled right now on my new domain because I didn’t like it and it gave too much trouble, so I don’t want to use it anymore. But AMP is no longer required for inclusion in Top Stories. Why is my new domain not ranked in Google Top Stories?

– It’s hard to say. I think if you’re doing a domain migration and switching off AMP at the same time, then especially with something like Top Stories, that might be a little bit confusing. But it sounds like otherwise, things are being picked up well. So probably, you’re on the right track there. The thing with Top Stories, in particular, is it’s an organic search feature. And it’s not something that the site gets because they deserve it, but it’s more that we try to figure out what we should be showing in a Top Stories section. And sometimes that can be more, sometimes that can be less. Sometimes that includes content from individual sites or individual types of articles and sometimes less. What I would consider doing here is, on the one hand, giving it a little bit more time.

The other thing is to double-check things around Page Experience, because, as we mentioned in the blog post when we turned off the AMP requirement, pages with a very good Page Experience score can appear in Top Stories as well. So it’s not the case that we would take any page and show it in Top Stories, but rather we would use the Page Experience score almost as a ranking factor to determine what we show within the Top Stories section.

8. I sell handmade shoes. They’re all produced for a specific age range, with the same material, technique, etc. Only the design is different. Would it be counted as duplicate content by Google if I write one high-quality product description for all? Or is it better to have unique descriptions for each one, which reduces the quality of the content?

– I don’t know if unique descriptions would reduce the quality of the content. I would argue that you can have descriptions that are both unique and high-quality. So I would ignore that last part.

But the general question with regards to duplicate content is, we would probably see this as duplicate content, but we would not demote a website because of duplicate content. So from a practical point of view, what would happen is, if someone is searching for a piece of text that is within this duplicated description on your pages, then we would recognize that this piece of text is found on a bunch of pages on your website, and we would try to pick maybe one or two pages from your website to show. It’s not that we would demote or penalise your website in any way because it has duplicate content. It’s more from the practical point of view that we recognize you have this content on a lot of pages. So if someone is searching specifically for that content, it doesn’t make sense for us to show all of those pages. And that’s reasonable when people are searching for a piece of content. They don’t need to find all of the pages within your website that have that piece of content.

The thing to watch out for here is, if you don’t have anything in the textual content at all that covers the visual element of your products, then it makes it very hard for us to show these properly in the search results.
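If you want to see how much description text your own product pages share, a small sketch like the following groups URLs by their normalized description. The URLs, descriptions, and function name are hypothetical, and this only illustrates an audit you might run on your own data; it says nothing about how Google itself detects duplicates.

```python
from collections import defaultdict


def group_by_description(products):
    """Map each normalized description to the URLs that share it."""
    groups = defaultdict(list)
    for url, description in products.items():
        # Normalize case and whitespace so trivial differences do not hide duplicates.
        key = " ".join(description.lower().split())
        groups[key].append(url)
    # Keep only descriptions used on more than one page.
    return {text: urls for text, urls in groups.items() if len(urls) > 1}


# Hypothetical product data.
products = {
    "/shoes/red": "Handmade leather shoes.",
    "/shoes/blue": "Handmade  leather shoes.",
    "/shoes/green": "Handmade leather shoes with a green leaf pattern.",
}
print(group_by_description(products))
```

Pages that show up in the same group are candidates for a sentence or two about what makes that particular design visually different.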

9. I have a question about a notification from Search Console. It’s about my author archive pages, which are missing a “url” field. I would like to noindex my author archive pages. Will that have an impact on my site appearing in search? The author archive pages don’t have any keywords anyway. Are they important for E-A-T? Will my E-A-T score go down if I noindex the author archive pages?

– We don’t have an E-A-T score. So from that point of view, you don’t have to worry about that. In general, the notification you received in Search Console is probably about structured data that you’re using on these pages. And if you don’t want those pages indexed, then by noindexing those pages, you will remove that notification as well. If you’re using a plugin on your site that generates structured data there, then maybe you can disable it for those author pages, and it’ll be fixed as well. Or maybe you can fix the fields that the structured data provides, and that will also solve the problem.

My guess is that the structured data you’re using on these pages is not critical for your site and is not something that we would show in the search results directly anyway. So from that point of view, you’re probably fine with either removing the structured data from those pages or noindexing those pages if they’re not critical for your site. All of that would be fine.

I would see this slightly differently if I knew that this was a site that focused a lot on the authority, knowledge, and the name of the authors, where if people are actively searching for the name of the author, then your collection of content by that author might be useful to have in the search results. So for those sites, I think it would be useful to keep that indexed. But then you would probably already want to keep that indexed because they’re getting traffic from search. So if you’re not seeing any traffic at all to these author pages and they’re just random people who are writing for your blog, then probably noindexing them would be fine.
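If you choose to fix the structured data rather than remove it, a quick check like the one below can show which JSON-LD items on a page lack a given field. This is a sketch with assumptions: the names are hypothetical, the list of fields to check (`name`, `url`) is chosen for illustration and is not Google's list of required properties, and it only reads JSON-LD, not microdata or RDFa.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the raw contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            self.blocks[-1] += data


def missing_fields(html, fields=("name", "url")):
    """Return (type, missing field names) for each JSON-LD item lacking a field."""
    parser = JsonLdParser()
    parser.feed(html)
    report = []
    for block in parser.blocks:
        data = json.loads(block)
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                continue
            missing = [f for f in fields if f not in item]
            if missing:
                report.append((item.get("@type"), missing))
    return report


# Hypothetical author archive page whose Person markup has no "url".
page = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person", "name": "Jane Doe"}
</script>
"""
print(missing_fields(page))
```

The Search Console report remains the authoritative source for what is actually flagged; a check like this only helps confirm whether a plugin change added the field.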


