Bug #1211

"SEO-friendly" URLs pretty much duplicate the first page of anything

Added by Peter Hall over 1 year ago. Updated 4 months ago.

Status: Closed
Start date: 09/04/2010
Priority: Normal
Due date:
Assignee: Tom Moore
% Done: 100%
Category: Other
Target version: 1.6.5
Reproducibility: Always
Database Type:
Reported In MyBB Version: 1.6.0
Database Version:
PHP Version:
SQA assignments: Stefan T.
Browser:

Description

When using SEO-friendly URLs, the first page of a forum or topic has two different URLs: one used by the pagination, one used by the rest of the software. For example, the index page links to the forum list like this:

http://community.mybb.com/forum-127.html

But the pagination on page 2 of that forum links to it like this:

http://community.mybb.com/forum-127-page-1.html

A search engine or a browser will see this as two pages with duplicate content - the "newpost" and "lastpost" links have a similar effect. Perhaps not a major issue, but...

Associated revisions

Revision 5236
Added by Tom Moore over 1 year ago

Fixes SEO-friendly URLs pretty much duplicate the first page of anything (fixes:1211)

Revision 5358
Added by Tom Moore 12 months ago

Fixes SEO-friendly URLs pretty much duplicate the first page of anything (fixes:1211)

Revision 5469
Added by Tom Moore 8 months ago

Fixes "SEO-friendly" URLs pretty much duplicate the first page of anything (fixes #1211)

Revision 5571
Added by Tom Moore 5 months ago

Fixes SEO-friendly URLs pretty much duplicate the first page of anything (fixes #1211)

Revision 5575
Added by Tom Moore 5 months ago

Fixes SEO-friendly URLs pretty much duplicate the first page of anything (fixes #1211)

History

Updated by Ryan Gordon over 1 year ago

I'm no expert in SEO, how would you fix the issue? Header redirection?

Updated by Tom Loveric over 1 year ago

Couldn't you just check if page-1 is in the URL and strip it out? Or just make it so if you can't find page-1 in the URL, it redirects to the URL with page-1 included?
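The "strip it out" idea can be sketched roughly as below. This is only an illustration: `canonical_first_page_url()` is a hypothetical helper, not a MyBB function, and it assumes the SEO URL form `…-page-N.html` shown in the description.

```php
<?php
// Hypothetical helper (not part of MyBB): returns the canonical URL for a
// request that points at "page 1", or null if nothing needs to change.
function canonical_first_page_url(string $url): ?string
{
    // Remove "-page-1" only when it is directly followed by ".html",
    // so "-page-10.html" or "-page-12.html" are left untouched.
    $canonical = preg_replace('/-page-1(?=\.html)/', '', $url);

    return $canonical === $url ? null : $canonical;
}

// Possible use at the top of a request handler (assumption, shown as a comment
// so the sketch has no side effects):
//
//   $target = canonical_first_page_url($requested_url);
//   if ($target !== null) {
//       header('Location: ' . $target, true, 301); // permanent redirect
//       exit;
//   }
```

A permanent (301) redirect would tell search engines which of the two URLs is the real one, at the cost of an extra request for anyone following an old link.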

Updated by Peter Hall over 1 year ago

Just make sure only one or the other is linked to. Either use "page-1" in all links to forum view or topic view, or stop the pagination from putting "page-1" in the links.
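The second option (never emitting "page-1" in the first place) might look like the sketch below. `forum_page_link()` is a made-up name for illustration and does not reflect how MyBB actually builds its links.

```php
<?php
// Hypothetical link builder (illustrative only, not MyBB's API).
// Page 1 always gets the plain forum URL, so only one URL exists for it.
function forum_page_link(int $fid, int $page = 1): string
{
    if ($page <= 1) {
        return "forum-{$fid}.html";
    }

    return "forum-{$fid}-page-{$page}.html";
}

echo forum_page_link(127), "\n";    // forum-127.html
echo forum_page_link(127, 2), "\n"; // forum-127-page-2.html
```

If every link to a forum goes through one function like this, the pagination and the rest of the software cannot disagree about the first page.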

Updated by Nayar Joolfoo over 1 year ago

This also exists with non-SEO URLs: http://community.mybb.com/thread-79043.html

Updated by Michael Malin over 1 year ago

  • Status changed from New to Confirmed

You could check whether it is the first page and, if it is, either redirect it to the normal page or output the normal link from the beginning (that would take some work to change).

Updated by Tom Moore over 1 year ago

  • Category set to Other
  • Status changed from Confirmed to Assigned
  • Assignee set to Tom Moore
  • Target version set to 1.6.1

Preparing patch.

Updated by Tom Moore over 1 year ago

  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100

Applied in changeset r5236.

Updated by Stefan T. over 1 year ago

  • Status changed from Resolved to Closed

Updated by Chris Köcher about 1 year ago

r5236 causes a small issue for plugins that use multipage with URL formats other than the default ones. For example: http://www.example.com/{page}/
The "{page}" placeholder was replaced for every page number except the first; the link to the first page was left unparsed.
A possible fix would be to replace lines 963-969:

        $find = array(
            "-page-{page}",
            "&page={page}",
        );

        // Remove "Page 1" to the defacto URL
        $url = str_replace($find, array("", ""), $url);

with:

        $find = array(
            "-page-{page}",
            "&page={page}",
            "{page}"
        );

        // Remove "Page 1" to the defacto URL
        $url = str_replace($find, array("", "", $page), $url);

Would be nice to see this fixed in MyBB 1.6.2... :-)
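To show what the proposed fallback does, here is the same replacement wrapped in a standalone function. The wrapper name `first_page_url()` is invented for this sketch; only the `$find` array and the `str_replace()` call come from the suggestion above.

```php
<?php
// Standalone sketch of the suggested replacement (wrapper name is hypothetical).
// str_replace() applies the search strings in order: the two default patterns
// are removed first, and any "{page}" that is still left becomes the number.
function first_page_url(string $url, int $page = 1): string
{
    $find = array(
        '-page-{page}',
        '&page={page}',
        '{page}',
    );

    return str_replace($find, array('', '', (string) $page), $url);
}

echo first_page_url('forum-127-page-{page}.html'), "\n";     // forum-127.html
echo first_page_url('http://www.example.com/{page}/'), "\n"; // http://www.example.com/1/
```

Default URLs still lose their page part, while a custom format keeps a valid link to page 1 instead of a literal "{page}".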

Updated by Andreas Klauer about 1 year ago

Good catch...

This issue even affects MyBB. In private.php the parameter for multipage is not page= but read_page= and unread_page=. So if you go to your private message folder, and to page 2 of either the PM inbox or tracking folder, and hover over the link back to page 1, you see it literally points to page={page} instead of page=1 or nothing.

However you still end up in the right place :) because "{page}" evaluates to 0 and page 0 and page 1 both cause page 1 to be shown, usually... so most people wouldn't even ever notice this bug. Google SEO is affected by this as well (uses ?page= and not &page= - and before you get any ideas, you can't remove the ?), however it also removes the page parameter by itself so no one notices that one either...
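Why the broken link still lands on page 1 can be illustrated like this. The function below is an assumption about typical input handling (integer cast plus a lower bound), not a copy of MyBB's code.

```php
<?php
// Hypothetical input handling: cast the raw parameter to an integer and
// treat anything below 1 as page 1.
function requested_page($raw): int
{
    $page = intval($raw); // a non-numeric string such as "{page}" becomes 0

    return $page < 1 ? 1 : $page;
}

var_dump(intval('{page}'));         // int(0)
var_dump(requested_page('{page}')); // int(1)
var_dump(requested_page('2'));      // int(2)
```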

The suggested fix solves this issue as it provides a fallback - it ensures {page} is replaced properly, if it couldn't be removed.

Updated by Tom Moore about 1 year ago

I'm not sure we should be fixing issues that occur from plugins adding extra pages, although if there are places in MyBB (I've noticed the breadcrumb still has it in, for example) where there is still page=1, then please report it.

If the page is not publicly viewable - as Andreas mentioned - then this issue doesn't really relate to it (unless it causes issues somewhere else). It was originally to fix duplicate content from search engines.

Updated by Andreas Klauer about 1 year ago

multipage(), or more specifically fetch_page_url(), produces broken links, for example private.php?action=tracking&unread_page={page}. This is a new issue in MyBB (a regression introduced by r5236). The fix provided above by Chris Köcher seems to work fine; please apply it. Thank you.

Updated by Chris Köcher about 1 year ago

Tom Moore wrote:

It was originally to fix duplicate content from search engines.

Yes, but the intention wasn't to break plugins... :-P

As Andreas said, it's only a fallback for the case where the {page} couldn't be replaced/removed. The fix doesn't cause any performance issues. It's only a small change of 3 lines, which isn't a big amount of work... ;-)

Updated by Stefan T. about 1 year ago

  • Status changed from Closed to Feedback

Updated by Stefan T. about 1 year ago

  • SQA assignments set to Stefan T.

Updated by Michael Malin 12 months ago

Stefan, maybe you forgot to write some text? :P

Updated by Stefan T. 12 months ago

Everything was explained in the comments above...

Updated by Tom Moore 12 months ago

  • Status changed from Feedback to Resolved

Applied in changeset r5358.

Updated by Chris Köcher 12 months ago

Thank you for fixing that! :-)

But I've found another little issue: on showthread, the breadcrumb popup links to forum-*id*-page-1.html instead of forum-*id*.html
I don't know whether this has been fixed since the release of 1.6.1. Maybe someone could check this...

Updated by Damian Moskalski 10 months ago

In the 1.6.2 package in the repo, this bug has not been fixed.

Updated by Stefan T. 9 months ago

  • Status changed from Resolved to Feedback

Chris Köcher wrote:

Thank you for fixing that! :-)

But I've found another little issue: on showthread, the breadcrumb popup links to forum-*id*-page-1.html instead of forum-*id*.html

The breadcrumb popup seems to use different URLs: forum-*id*.html?page=1
But I don't see the reason for this.

Updated by Tom Moore 8 months ago

  • Status changed from Feedback to Resolved

Applied in changeset r5469.

Updated by Stefan T. 8 months ago

  • Status changed from Resolved to Closed

Updated by Stefan T. 6 months ago

  • Status changed from Closed to Feedback
  • Target version deleted (1.6.1)

The link in the navigation still shows thread-*-page-1.html.

Updated by Tom Moore 5 months ago

  • Status changed from Feedback to Resolved

Applied in changeset r5571.

Updated by Andreas Klauer 5 months ago

Doesn't this turn page=123 into 23?

Updated by Tom Moore 5 months ago

  • Status changed from Resolved to Feedback

Indeed it will. Looks like another case for preg_x.
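The problem can be reproduced with a small sketch. Both functions are hypothetical and only assume that the change removed a literal "page 1" substring; they are not the code from r5571 or r5575.

```php
<?php
// Naive version: plain substring removal also hits page=123 or -page-12.
function strip_first_page_naive(string $url): string
{
    return str_replace(array('-page-1', '&page=1'), '', $url);
}

// Regex version: the negative lookahead (?!\d) makes sure the "1" is the
// whole page number and not just its first digit.
function strip_first_page_anchored(string $url): string
{
    return preg_replace('/(?:-page-|&page=)1(?!\d)/', '', $url);
}

echo strip_first_page_naive('showthread.php?tid=5&page=123'), "\n";    // showthread.php?tid=523
echo strip_first_page_anchored('showthread.php?tid=5&page=123'), "\n"; // showthread.php?tid=5&page=123
echo strip_first_page_anchored('thread-5-page-1.html'), "\n";          // thread-5.html
```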

Updated by Tom Moore 5 months ago

  • Status changed from Feedback to Resolved

Applied in changeset r5575.

Updated by Stefan T. 4 months ago

  • Status changed from Resolved to Closed
  • Target version set to 1.6.5
