Magento makes it easy to group and bundle products, but that can lead to SEO problems.
Magento is the world’s most-used e-commerce platform, powering online retail sites of all shapes and sizes. Although Magento is widely used, lots of merchants still have the same issues from an SEO perspective, which impacts their visibility, traffic and revenue. This piece covers a few of the more technical issues that Magento has out of the box—you can also read the definitive guide piece I wrote.
Product setup
Lots of merchants choose Magento because of its options around product setup, including configurable products, simple products, grouped products and bundled products. Although this is a great feature, it can cause issues around duplication and cannibalisation. Lots of retailers will use a configurable product alongside individual simple products, for example, either to showcase the variants on product list and search pages, or for Google Shopping and its Product Listing Ads. Doing this means that you have several different variants (e.g. shirts with colour and size variants) all being crawled and indexed, in most cases with the same copy and other information.
In this scenario, I would advise that you use the canonical tag to prevent all of the different versions of the product from being indexed. The simple fix would be to set the canonical URL on every variant (simple product) URL to the configurable version, giving search engines a single primary version to serve.
This will be complex to do (I’ve pushed developers to create a back-end field at product level in most cases), but it will solve issues around duplicate content and help to improve product-level visibility. It’s worth noting that if you’re doing this for an existing site, make sure you do your due diligence into the amount of traffic and revenue the variants are generating beforehand, as there is an element of risk and it’s a big structural change.
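To make the canonical approach concrete, here is a minimal sketch of the logic a back-end field like this would drive. It isn't Magento code: the `PARENT_URL` mapping, the example.com URLs and the `canonical_tag` function are all hypothetical names I've made up for illustration.

```python
from html import escape

# Hypothetical mapping from simple-product URLs to their configurable parent.
# In a real store this would come from the product-level back-end field.
PARENT_URL = {
    "https://example.com/blue-shirt-small.html": "https://example.com/shirt.html",
    "https://example.com/blue-shirt-large.html": "https://example.com/shirt.html",
}


def canonical_tag(page_url: str) -> str:
    """Return the canonical link element for a product page.

    Variants point at their configurable parent; any other page
    (including the configurable product itself) points at itself.
    """
    target = PARENT_URL.get(page_url, page_url)
    return f'<link rel="canonical" href="{escape(target, quote=True)}" />'


print(canonical_tag("https://example.com/blue-shirt-small.html"))
```

Every variant ends up declaring the configurable product as its primary version, while the configurable product simply references itself.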
Magento rewrites
Magento’s rewrite engine is well known for being troublesome at times—causing issues around random redirect loops, old URLs being indexed alongside the new URLs, etc. The most common issue with this is for the old /catalog/product/xxx URLs to be indexed, alongside the clean version of the product, as in this example:

In the early days, Magento structured URLs in this way. Now they’re rewritten to appear cleaner and be more search friendly. But lots of sites have issues where the rewrites stop working and the two URLs are both accessible, and in some cases the old URLs are served via category and product list pages. This can be an annoying issue, and you should be performing regular crawls and checking whether these URLs are being indexed. You should then work with your developers to fix the rewrite logic.
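As a rough illustration of the kind of check I mean, the sketch below scans a list of crawled URLs for the un-rewritten /catalog/product/ path. The function name and the sample URLs are hypothetical, and I'm assuming your crawler can export a plain list of URLs.

```python
from urllib.parse import urlsplit

# Path fragment used by the old, un-rewritten product URLs.
RAW_PRODUCT_FRAGMENT = "/catalog/product/"


def find_raw_product_urls(crawled_urls):
    """Return the crawled URLs whose path still uses the un-rewritten format."""
    return [
        url for url in crawled_urls
        if RAW_PRODUCT_FRAGMENT in urlsplit(url).path
    ]


crawl = [
    "https://example.com/blue-shirt.html",
    "https://example.com/catalog/product/view/id/42/",
    "https://example.com/index.php/catalog/product/view/id/7/s/red-hat/",
]
for url in find_raw_product_urls(crawl):
    print(url)
```

Anything this returns is worth passing to your developers alongside the page that linked to it.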
Another issue around this comes from when product URLs are rewritten and numbers are appended to the URLs. This is very common with Magento and is known for causing issues with product-level rankings. In some versions of Magento, there is a bug that causes URLs to be rewritten repeatedly, meaning infinite loops are created. I’ve seen this have a big impact on organic visibility in the past and it can be a tricky one to fix: it often requires an upgrade, depending on the version you’re using. The same problem can also be caused by using the same URL key on more than one product (which is also an issue when products are used in different multi-store instances).
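One way to spot the appended-number problem is to group URL keys that differ only by a trailing number. This is a sketch under my own assumptions: that you can export the URL keys as plain strings, and that the appended number takes the form of a "-1", "-2" style suffix. The function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed suffix format for rewritten keys, e.g. "blue-shirt-2".
NUMERIC_SUFFIX = re.compile(r"-\d+$")


def group_suffixed_keys(url_keys):
    """Group URL keys that are identical once a trailing -<number> is removed.

    Only groups with more than one member are returned, so a key such as
    "top-10" is not flagged unless another key shares its base.
    """
    groups = defaultdict(list)
    for key in url_keys:
        base = NUMERIC_SUFFIX.sub("", key)
        groups[base].append(key)
    return {base: keys for base, keys in groups.items() if len(keys) > 1}


keys = ["blue-shirt", "blue-shirt-1", "blue-shirt-2", "red-hat", "top-10"]
print(group_suffixed_keys(keys))
```

Each group it reports is a candidate for a duplicated URL key or a product that has been rewritten repeatedly, and is worth checking by hand.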
Dynamic pages
Dynamic pages being indexed is probably the biggest and most common issue for Magento sites. Magento stores commonly use layered navigation out of the box, and the filtered pages it generates are left indexable by default. The same applies to search pages and to sort and order parameters, which are also a common issue for e-commerce sites.
Layered navigation can result in thousands and thousands of URLs being indexed, and I’ve seen it have a huge impact on a store’s visibility. I usually suggest trying to prevent the URLs from being crawled, particularly for larger sites. That said, it’s important to keep an eye on how search engines are crawling your website (using the server logs) if you do go down this route. I’ve seen issues arise where some products aren’t being crawled as regularly because they’re not linked to from other pages. In this instance, you may want to be more strategic about which types of pages are being crawled and maybe use meta-robots tags to instruct search engines to crawl the links on the pages, but not to index them.
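The meta-robots approach can be sketched as a simple rule: if a URL carries a filter, sort or search parameter, tell search engines to follow its links but not index it. The parameter names in `FILTER_PARAMS` are assumptions for illustration and should be replaced with whatever your own store actually uses.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical filter, sort and search parameters; adjust to your store.
FILTER_PARAMS = {"color", "size", "price", "dir", "order", "q"}


def robots_directive(url: str) -> str:
    """Return the meta-robots content value for a given URL."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if FILTER_PARAMS.intersection(query):
        # Filtered page: let crawlers follow links, but keep it out of the index.
        return "noindex, follow"
    return "index, follow"


print(robots_directive("https://example.com/shirts.html?color=blue&size=m"))
print(robots_directive("https://example.com/shirts.html"))
```

Keeping the rule in one place like this also makes it easier to be selective later, for instance if you decide that one particular filter should stay indexable.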
Lots of merchants use the canonical tag, but find that it rarely works when the pages change a lot as a result of products being filtered. I’ve seen parameter handling work quite well in the past too, but I’m very cautious about relying on this alone.
Pagination is another issue. I’d recommend using a noindex, follow meta-robots tag alongside rel next and prev. Out of the box, Magento uses the canonical tag on these pages, which should really be removed if you’re using the robots tag.
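Here is a minimal sketch of the head tags that recommendation produces for a paginated category. I'm assuming that pages are addressed with a `p` query parameter and that page one stays indexable; both are assumptions, and the function name is hypothetical. No canonical tag is emitted, in line with the advice above.

```python
def pagination_head_tags(base_url: str, page: int, last_page: int) -> list:
    """Build meta-robots and rel prev/next tags for one page of a series."""

    def page_url(n: int) -> str:
        # Assumption: page 1 is the bare category URL, later pages use ?p=N.
        return base_url if n == 1 else f"{base_url}?p={n}"

    tags = []
    if page > 1:
        # Assumption: only pages after the first are kept out of the index.
        tags.append('<meta name="robots" content="noindex, follow" />')
        tags.append(f'<link rel="prev" href="{page_url(page - 1)}" />')
    if page < last_page:
        tags.append(f'<link rel="next" href="{page_url(page + 1)}" />')
    return tags


for tag in pagination_head_tags("https://example.com/shirts.html", 2, 3):
    print(tag)
```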
Session IDs are also often used on Magento stores—usually as part of tracking sessions through the checkout. These URLs can very easily be crawled and indexed and need to be blocked.
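Before blocking anything, it helps to know how many session-ID URLs are actually being crawled. The sketch below doesn't do any blocking: it only audits a list of crawled URLs and works out the clean equivalent of each offender. I'm assuming the session ID appears as a `SID` or `___SID` query parameter; check what your own store generates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; verify against your own URLs.
SESSION_PARAMS = {"SID", "___SID"}


def strip_session_id(url: str) -> str:
    """Return the URL with any session parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def find_session_urls(crawled_urls):
    """Map each crawled URL that carries a session ID to its clean version."""
    return {
        url: strip_session_id(url)
        for url in crawled_urls
        if strip_session_id(url) != url
    }


crawl = [
    "https://example.com/shirt.html?SID=abc123&color=blue",
    "https://example.com/shirt.html",
]
print(find_session_urls(crawl))
```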
Magento 2
Lots of merchants are now starting to look at migrating over to Magento 2, which is the recently released rewrite of version 1. Magento 2 has a number of key improvements, particularly around performance, which will in turn impact SEO, especially from a crawlability perspective.
Magento 2 also has structured data out of the box (unlike Magento 1.x), and sorting and ordering parameters have been moved to AJAX.
These are just a few of the changes I’ve noticed. Magento is promising to act on lots of feature requests over the next few months, so it’ll be interesting to see what else is added.
Paul Rogers provides enterprise-level Magento consulting and auditing services.
