There are many reports of WPML and W3TC WordPress plugin users hitting 404 error pages, though they all seem to stem from incorrect configuration, either in the permalink settings or in the language installation.

What I have been experiencing at work is different. Our Analytics reports show that existing pages are frequently – though not always – emitting HTTP 404 errors. Aside from the negative SEO impact this is having on our website, in some cases our users are unable to access our login page.

After considerable time debugging, I found the culprit and wanted to share the solution. I’ll be submitting a bug report to the affected plugins shortly.

Background and Root Cause

The WPML plugin stores multiple entries in the wp_6_posts table, each with the same post_name value. The wp_6_icl_translations table maps these records to language codes.

When attempting to load a page, WordPress uses the get_page_by_path() function to query the database for the page whose post_name matches the requested path. WPML hooks the query filter (SitePress::filter_queries()) to rewrite the query generated by get_page_by_path(); specifically, it adds a JOIN to the wp_6_icl_translations table and an ORDER BY clause that sorts rows in the current language first.

This ensures that the localised version of the page is matched first. If the post found does not match the current language, the user is either i) redirected to the default language version, if the current language is not the default, or ii) shown a 404 error page, if the current language is the default.
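
The decision described above can be sketched as a small function. This is a simplified Python model for illustration only, not WPML's actual PHP code; the function name and the returned labels are hypothetical.

```python
def resolve_request(found_lang, current_lang, default_lang):
    """Model what happens to the first post matched by the query.

    found_lang   -- language of the post returned at the top of the result set
    current_lang -- language of the current request
    default_lang -- the site's default language
    """
    if found_lang == current_lang:
        return "render"               # the localised page exists
    if current_lang != default_lang:
        return "redirect-to-default"  # case i) in the text
    return "404"                      # case ii) in the text


print(resolve_request("en-us", "th-th", "en"))
print(resolve_request("en-us", "en", "en"))
```

The second call is the interesting one: it only produces a 404 because the wrong row came back first, which is exactly what the caching conflict described below causes.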

That behaviour is all fine. By WordPress standards, at least.

The W3TC plugin is where conflicts are introduced.

For performance reasons, the W3TC plugin includes a wrapper method for wpdb::query(), W3_DbCache::query(). This will cache the resultset to reduce load on the database and improve response times.

However, the "cache key" – used to identify queries and their resultset – is computed as the MD5-sum of the query before the query filter is applied. That conflicts with the behaviour of WPML: when a cached result is available, get_page_by_path() will always return the same page instance, regardless of the current language. Ultimately, this causes WordPress to erroneously conclude that a page is not available in the current language, so a 404 error page is shown.
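
To make the key collision concrete, here is a minimal Python sketch of the two ways a key could be derived. It is a model, not W3TC's code: the SQL string, the apply_language_filter() helper, and the comment it appends are all assumptions standing in for the real query and the real WPML rewrite.

```python
import hashlib


def apply_language_filter(sql, lang):
    # Hypothetical stand-in for the rewrite done by the query filter;
    # the real JOIN and ORDER BY text is not reproduced here.
    return f"{sql} /* prefer language: {lang} */"


def key_before_filter(sql, lang):
    # Mirrors the behaviour described above: the language never reaches the hash.
    return hashlib.md5(sql.encode("utf-8")).hexdigest()


def key_after_filter(sql, lang):
    # Hashing the rewritten query makes the key language-aware.
    filtered = apply_language_filter(sql, lang)
    return hashlib.md5(filtered.encode("utf-8")).hexdigest()


# Illustrative query text only; the query WordPress really generates differs.
BASE = "SELECT ID, post_name FROM wp_6_posts WHERE post_name = 'about'"

collides = key_before_filter(BASE, "th-th") == key_before_filter(BASE, "en")
distinct = key_after_filter(BASE, "th-th") != key_after_filter(BASE, "en")
print(collides, distinct)
```

Both values are true: hashing before the filter gives every language the same cache entry, while hashing after it keeps them apart.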

Example

A Thai-version of the About page does not yet exist. However, consider the following requests…

Request 1: /th-th/about/

  1. The get_page_by_path("about") function is called, requesting that the results are ordered with the "th-th" version at the top of the list, with any other language versions in no particular order.
  2. There are no cached results to return, so the query filter is applied, executed on the database and the resultset is stored in cache.
  3. Results (from DB):

    1. About (en-us)
    2. About (en) Default language
  4. The returned page (en-us) is not the current language (th-th).
  5. The current language is not the default; the user is redirected to /about/.

Request 2: /about/

  1. The get_page_by_path("about") function is called, requesting that the results are ordered with the "en" version at the top of the list, with any other language versions in no particular order.
  2. There are cached results to return, so the query filter is not applied and the resultset is returned from cache.
  3. Results (from cache):

    1. About (en-us)
    2. About (en) Default language
  4. The returned page (en-us) is not the current language (en).
  5. The current language is the default; a 404 error is shown.

Note, however, that if the default language were "en-us", the bug would be disguised in this example.
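
The two requests can be replayed with a small in-memory model. This is a Python sketch under stated assumptions, not the plugins' code: the ROWS list, the handle() function, and the sort used to mimic the language-first ordering are all hypothetical.

```python
import hashlib

# Stored rows for post_name "about"; there is no th-th translation.
ROWS = [("About", "en-us"), ("About", "en")]
DEFAULT_LANG = "en"
cache = {}


def run_filtered_query(sql, lang):
    # Stand-in for the database: rows in the requested language first,
    # everything else in stored order (sorted() is stable).
    return sorted(ROWS, key=lambda row: row[1] != lang)


def cached_query(sql, lang):
    # The key is derived from the unfiltered query, so it ignores the language.
    key = hashlib.md5(sql.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = run_filtered_query(sql, lang)
    return cache[key]


def handle(current_lang):
    rows = cached_query("get_page_by_path('about')", current_lang)
    found_lang = rows[0][1]
    if found_lang == current_lang:
        return "200"
    if current_lang != DEFAULT_LANG:
        return "redirect to /about/"
    return "404"


first = handle("th-th")  # Request 1: cache miss, result set is stored
second = handle("en")    # Request 2: cache hit, stale ordering is reused
print(first, "|", second)
```

With an empty cache, handle("en") on its own returns "200", which matches the observation that the 404s are frequent but not constant: the outcome depends on which language populated the cache first.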

Solution

  1. Use the Ignored query stems feature of the W3TC settings to disable caching of the get_page_by_path() query.
  2. Patch W3TC so that the query filter is applied before generating the cache key.
  3. Patch WPML to use an alternative method of modifying the get_page_by_path() query.

I will soon submit a bug report to W3TC and WPML for them to consider applying a patch (options 2 or 3). In the meantime, go with the first option as an interim solution.
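
For reference, option 2 amounts to something like the following Python sketch. It models the idea only; apply_filter(), run_query(), and the data are hypothetical, and a real patch would have to be made inside W3TC's PHP code.

```python
import hashlib

# Stored rows for post_name "about"; there is no th-th translation.
ROWS = [("About", "en-us"), ("About", "en")]
cache = {}


def apply_filter(sql, lang):
    # Hypothetical stand-in for the language-aware rewrite of the query.
    return f"{sql} /* prefer language: {lang} */"


def run_query(filtered_sql, lang):
    # Stand-in for the database: requested language first, rest in stored order.
    return sorted(ROWS, key=lambda row: row[1] != lang)


def cached_query(sql, lang):
    filtered = apply_filter(sql, lang)  # apply the filter first ...
    key = hashlib.md5(filtered.encode("utf-8")).hexdigest()  # ... then hash
    if key not in cache:
        cache[key] = run_query(filtered, lang)
    return cache[key]


thai = cached_query("get_page_by_path('about')", "th-th")
english = cached_query("get_page_by_path('about')", "en")
print(thai[0], english[0], len(cache))
```

Each language now gets its own cache entry, so the request for /about/ sees the "en" row first even after /th-th/about/ has been served. The trade-off is a larger cache, since the same underlying query is stored once per language.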