Wednesday, April 21, 2010 at 9:05 AM
Webmaster Level: Intermediate
That is the question we hear often. Onward to the answers! Historically, it’s common for URLs with a trailing slash to indicate a directory, and those without a trailing slash to denote a file:
http://example.com/foo/ (with trailing slash, conventionally a directory)
http://example.com/foo (without trailing slash, conventionally a file)
But they certainly don’t have to. Google treats each URL above separately (and equally), regardless of whether it’s a file or a directory, or whether it contains a trailing slash.
Different content on / and no-/ URLs okay for Google, often less ideal for users
From a technical, search engine standpoint, it’s certainly permissible for these two URL versions to contain different content. Your users, however, may find this configuration horribly confusing -- just imagine if www.google.com/webmasters and www.google.com/webmasters/ produced two separate experiences.
For this reason, trailing slash and non-trailing slash URLs often serve the same content. The most common case is when a site is configured with a directory structure:
http://example.com/parent-directory/child-directory/
Your site’s configuration and your options
You can do a quick check on your site to see how these two URLs respond:
- http://<your-domain-here>/<some-directory-here>/ (with trailing slash)
- http://<your-domain-here>/<some-directory-here> (no trailing slash)
- If only one version can be returned (i.e., the other redirects to it), that’s great! This behavior is beneficial because it reduces duplicate content. In the particular case of redirects to trailing slash URLs, our search results will likely show the version of the URL with the 200 response code (most often the trailing slash URL) -- regardless of whether the redirect was a 301 or 302.
- If both slash and non-trailing-slash versions contain the same content and each returns 200, you can:
- Consider changing this behavior (more info below) to reduce duplicate content and improve crawl efficiency.
- Leave it as-is. Many sites have duplicate content. Our indexing process often handles this case for webmasters and users. While it’s not totally optimal behavior, it’s perfectly legitimate and a-okay. :)
- Rest assured that for your root URL specifically, http://example.com is equivalent to http://example.com/ and can’t be redirected even if you’re Chuck Norris.
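The cases above can be sketched as a small Python check. This is only an illustration, not an official tool: `classify_pair` is a hypothetical helper, and it assumes you have already recorded each URL's status code and `Location` header without following redirects, and that the `Location` value is an absolute URL.

```python
from typing import Optional, Tuple

# (status_code, Location header or None), observed WITHOUT following redirects.
Response = Tuple[int, Optional[str]]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def classify_pair(slash_url: str, slash: Response,
                  no_slash_url: str, no_slash: Response) -> str:
    """Describe how a trailing-slash / no-trailing-slash URL pair behaves."""
    slash_status, slash_location = slash
    no_slash_status, no_slash_location = no_slash

    # Both versions serve content directly.
    if slash_status == 200 and no_slash_status == 200:
        return "both return 200: possible duplicate content"
    # Only the trailing-slash version serves content.
    if (slash_status == 200 and no_slash_status in REDIRECT_CODES
            and no_slash_location == slash_url):
        return "no-slash redirects to slash: single version"
    # Only the no-trailing-slash version serves content.
    if (no_slash_status == 200 and slash_status in REDIRECT_CODES
            and slash_location == no_slash_url):
        return "slash redirects to no-slash: single version"
    return "other: inspect manually"


result = classify_pair(
    "http://example.com/foo/", (200, None),
    "http://example.com/foo", (301, "http://example.com/foo/"),
)
print(result)
```

Anything that falls into the last branch (for example, one version returning 404) is left for a human to judge, since it may or may not be intentional.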
What if your site serves duplicate content on these two URLs:
http://<your-domain-here>/<some-directory-here>/
http://<your-domain-here>/<some-directory-here>
meaning that both URLs return 200 (neither has a redirect or contains rel="canonical"), and you want to change the situation?
- Choose one URL as the preferred version. If your site has a directory structure, it’s more conventional to use a trailing slash with your directory URLs (e.g., example.com/directory/ rather than example.com/directory), but you’re free to choose whichever you like.
- Be consistent with the preferred version. Use it in your internal links. If you have a Sitemap, include the preferred version (and don’t include the duplicate URL).
- Use a 301 redirect from the duplicate to the preferred version. If that’s not possible, rel="canonical" is a strong option. rel="canonical" works similarly to a 301 for Google’s indexing purposes, and for other major search engines as well.
- Test your 301 configuration through Fetch as Googlebot in Webmaster Tools. Make sure your URLs:
http://example.com/foo/
http://example.com/foo
are behaving as expected. The preferred version should return 200. The duplicate URL should 301 to the preferred URL.
- Check for Crawl errors in Webmaster Tools, and, if possible, your webserver logs as a sanity check that the 301s are implemented.
- Profit! (just kidding) But you can bask in the sunshine of your efficient server configuration, warmed by the knowledge that your site is better optimized.
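To stay consistent in internal links and Sitemap entries, it can help to pass every URL through one normalizing function. The following is a minimal Python sketch under stated assumptions: `to_preferred` is a hypothetical helper, and it assumes every URL passed in is a directory-style URL (it would also add a slash to something like /page.html, so file URLs should be excluded by the caller).

```python
from urllib.parse import urlsplit, urlunsplit


def to_preferred(url: str, trailing_slash: bool = True) -> str:
    """Rewrite a URL's path to the preferred trailing-slash convention."""
    parts = urlsplit(url)
    path = parts.path
    if path in ("", "/"):
        # Root URL: an empty path and "/" are equivalent, so always use "/".
        path = "/"
    elif trailing_slash and not path.endswith("/"):
        path += "/"
    elif not trailing_slash and path.endswith("/"):
        # Strip trailing slashes, but never reduce the path to nothing.
        path = path.rstrip("/") or "/"
    # Query string and fragment are carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path,
                       parts.query, parts.fragment))


print(to_preferred("http://example.com/foo"))
print(to_preferred("http://example.com/foo/?a=1", trailing_slash=False))
```

This only rewrites strings; the server still needs the 301 redirect (or rel="canonical") so that links you don't control end up at the preferred version.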


42 comments:
Excellent. The great debate is officially settled. It seems like either configuration is OK so long as your redirects are implemented properly, and you are covering all your bases with regard to duplicate content. Thanks for the clarification.
Good, I have a question about that.
I don't think it's necessary to keep a version with the slash working. What do you think?
My website has this pattern. Is it good or bad?
www.example.com/my-first-page -> 200
and
www.example.com/my-first-page/ -> 404
And what about this?
http://www.google.com/codesearch
http://www.google.com/codesearch/
:P
1. /underpants
2. /underpants/
3. Profit!
Why can't the root be redirected? Chuck Norris? What do you mean?
http://www.linkingdesign.com has its root redirected.
Greetings,
Bart.
And what about this too?
http://www.google.com/adwords
http://www.google.com/adwords/
:P
I love how Chuck Norris was thrown in here haha.
Hi,
For a directory is good :)
http://cedricsolignac.free.fr/creations/
For a file is good :)
http://cedricsolignac.free.fr
Cheers
@Nei Rauni S:
Thanks for your comment. In a vast majority of cases, this behavior:
www.example.com/my-first-page/ -> 404
causes a poor user experience. It's often best for the slash and no-trailing-slash version to serve the same content. For example, I'm happy that when I type www.google.com/webmasters I'm not served a 404 -- I'm redirected to the content at the trailing slash URL.
However, there may be a specific reason why:
www.example.com/my-first-page -> 200
www.example.com/my-first-page/ -> 404
works for your users. If so, that's great!
@Kangarrou, @Dayvid:
Did you see our SEO report card? Sometimes we're not a shining example of best practices...
* Matt's ignite talk on our SEO Report Card
* SEO Report Card blog post
@30something: That's a money maker!
There is a slight performance hit with configuring a server to drop the trailing slash from sub-directories, as its natural behaviour would be to leave the slash appended to the URL.
Removing the slash invokes additional requests, and seeing as speed is now an important consideration, this may help with the decision some webmasters have to make.
The code below, when put in an .htaccess file, should help you Apache users; it automatically redirects the trailing-slash version to the non-trailing-slash version:
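# Remove and redirect trailing slashes
RewriteEngine On
RewriteCond %{HTTP_HOST} !^\.yourdomain\.co\.uk$ [NC]
# Skip real directories; Apache's mod_dir would redirect them back to the slash form
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]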
Good highlight; many SEOs and search analysts miss this in their busy schedules :)
Best,
Jag
SEWM
WordPress's permalink structure uses the version without the slash. That's why the Google blog mentioned this post.
I've often wondered about this so I always take the slash off. At least then it's one less character to type!
If Google is only indexing one version, is there anything to worry about???
I know you're Google and all, but Chuck Norris can redirect http://example.com to http://example.com/
Ditto to Lance's comment. You can't say Chuck Norris can't do something, because he can do everything. The http://example.com URL actually redirected itself to http://example.com/ while begging for mercy from Chuck Norris.
That's good to hear, as I very rarely end my URLs with a trailing slash and have always assumed both are treated equally.
In short, which choice is better: slash or not?
I don't have much to add except to note that I LOVE the STYLE of the author's approach!
Maile,
Google reps have said that 301 redirects dampen the amount of PageRank passed. Perhaps in light of this post you can clarify how Google handles internal 301 redirects and how it differs from redirecting external links?
Will 301 redirecting a / to no-/ 'bleed' PageRank?
Thank you
i learn something now.
and what about dash (-) sign and | ?
thank google :)
It's good that Google has given us this information, but one glaringly obvious fault here is this: how often do you serve different content on the same URL with or without a trailing slash? Like me, your answer will probably be 'never'! So the best solution given here, for me, is to 301 redirect all pages without a trailing slash to the versions with one, which seems more like a quick-fix plaster than a long-term solution. Or look into server config to do the redirects behind the scenes. I would rather not implement thousands of redirects, so why doesn't Google just list the pages as the same URL?
I idly ran 'Fetch as Googlebot' on my Elto EMA-1 Power Meter page hosted on Google Sites:
This returns 200:
http://sites.google.com/site/ema1powermeter/home/friendly-instructions
This returns 404:
http://sites.google.com/site/ema1powermeter/home/friendly-instructions/
Curiously, if I use the second link in Firefox or Safari, the correct page is returned to the browser.
So there are no problems here: Google will index the correct page, and any user mistakenly adding a trailing slash will also be redirected to the correct page.
I'm unsure what mechanism makes it work in the browser and not in Googlebot, but the outcome overall is good.
I have always followed the HTTP specs on this one.
example.com/page - is the extensionless URL for a page in the root.
example.com/folder/ - points to the index file in the folder.
example.com/folder/page - is the extensionless URL for a page in that folder.
and that works well.
For folders, the non-canonical "without-slash" is always 301 redirected to "with-slash".
For pages, the "without-slash" version serves the content, and the non-canonical version "with-slash" either returns 404 or is 301 redirected to the "without-slash" URL.
A URL is a /folder/ if there are further URLs like /folder/page on the site, otherwise it is a /page only. That is, /folder/page is valid and /page/page is not.
This scheme also affects the linking to other objects on the page.
A link to href="logo.png" from "example.com/something" points to "example.com/logo.png"
A link to href="logo.png" from "example.com/something/" points to "example.com/something/logo.png"
That's something that often trips the unwary.
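The two link examples above can be reproduced with Python's standard `urllib.parse.urljoin`, which resolves a relative reference against a base URL the same way a browser does. This is just a small illustrative sketch; the `resolved` dictionary is a name chosen for the example.

```python
from urllib.parse import urljoin

bases = ("http://example.com/something", "http://example.com/something/")

# Resolve the same relative href against each base URL.
resolved = {base: urljoin(base, "logo.png") for base in bases}

for base, target in resolved.items():
    print(base, "->", target)
```

Without the trailing slash, the last path segment ("something") is treated as a file name and replaced; with it, the segment is treated as a folder and kept.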
I believe that URLs for folders should always have a trailing slash.
It's good to have posts like this one confirming how certain URL scenarios are treated by Google. Moving on from this discussion, it would be good to clarify how Google treats non-encoded URLs vs. encoded URLs as well. Would this essentially be treated as a duplicate content scenario?
Now I get it, but how about underscores and dashes in the URL? Do they have an effect? This has been asked, but no one has answered it yet. Hopefully this one will be answered.
Thanks in advance.
@designtorontoweb: Yes, if you serve both a version with underscores and a version with dashes in the URL, that makes two unique URLs (= duplicate content). Either underscore or dash is well suited for separating words; just make sure there is only one URL per unique page.
Thanks for the update. I have frequently wondered about this issue.
As indicated by Dayvid above, I am also curious why Google is set up this way with AdWords:
http://www.google.com/adwords - works as expected
http://www.google.com/adwords/ - 404 error
Is this 404 error better for the user experience?
Maybe Google can get Chuck Norris to fix this for them. I am sure he could do that.
Hi webmaster,
Nice Post,
I have one point of confusion: does PR stay constant when we redirect from one URL to the other?
Help! Our site is giving really conflicted "Fetch as Googlebot" results - not to mention that the root URL is the ONLY indexed URL (out of 40) from our sitemap.
We're using WordPress. When our permalink structure is set up as /%postname%/, Googlebot only reports success for URLs that DO NOT have the trailing slash.
So when I tell Googlebot to fetch
http://www.example.com/about-us
Googlebot reports "Success!" with a 301 redirect to http://www.example.com/about-us/
BUT - when I tell Googlebot to fetch http://www.example.com/about-us/ - Googlebot returns a 404 Not Found!
Both URLs open the appropriate webpage when loaded in a browser. . . but my concern is that Google can't index our site!
I had a lot of trouble getting our permalinks to work when set up in WordPress as /%postname%/. To finally get it working, I had to edit our .htaccess file and insert the following:
# BEGIN WordPress
ErrorDocument 404 /index.php?error=404
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
Now, I'm wondering if that bit of code is preventing Google from seeing any pages deeper than the root. Does anyone know what's happening here?
"Leave it as-is. Many sites have duplicate content." :-/
What about duplicate pages made for synonyms? They should not be considered a cheat for spiders, as they improve the user experience.
It is hard to follow google guidelines. Some people are blacklisted, some people not. Some people are sandboxed, others not.
This page explains the problem well enough. But where is the actual CODE? I need the code to make non-slash URLs go to slash URLs and be 301 friendly. What is the code?
Great thanks for clearing this up! I've been wondering about this for a very long time. What I want to know is how bad can it be if I get this wrong?
The current CMS of choice that I use actually doesn't redirect and has been creating duplicate pages of content, but we've never seen any negative effect.
Notice that if you use URLs with a trailing slash, Internet Explorer will usually strip the trailing slash away when you revisit the URL from the browser history.
So even if you always use trailing slashes on your site, you should 301 redirect the non-trailing-slash URLs.
Now you need to tell us that Google doesn't prefer directory-structured URLs (commonly called SEO-optimized) to ones with query strings in them: /keyword/anotherkeyword gets as much PageRank as index.html?this=keyword&that=anotherkeyword
We can control it as long as the linking is done by us, but when someone else links to us - there's no way we can control it.
I'm glad to know that there's no difference :)
While I have not tested them thoroughly (use at your own risk!), I have written Apache mod_rewrite directives that force a trailing forward slash (/) for non-real URIs and perform the QSA (query-string-append) rewrite that most CMSs (Drupal, Joomla, WordPress, etc.) require for clean-URLs:
RewriteEngine on
Options All
# Modify the RewriteBase if you are using a subdirectory and the
# rewrite rules are not working properly:
#RewriteBase /
# Force a trailing slash on non-real URIs:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ %{REQUEST_URI}/ [R=301,L]
# Rewrite URLs of the form 'index.php?q=x':
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
It should be noted that the line
RewriteRule ^(.*)$ %{REQUEST_URI}/ [R=301,L]
seems preferable to the alternative
RewriteRule ^(.*)$ $1/ [R=301,L]
because the former does not require the RewriteBase to be defined for sites that exist within sub-directories (relative to the Web-server's Document Root), whereas the latter does.
It should be trivial to reverse the behavior and force URLs WITHOUT the trailing slash; the following two lines should be changed from
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ %{REQUEST_URI}/ [R=301,L]
to
RewriteCond %{REQUEST_URI} ^(.*)/$
RewriteRule ^(.*)$ %1 [R=301,L]
I did not get one point. Is it better to have the root domain
www.example.com or
www.example.com/
I mean, people usually type the URL without a slash. Isn't it bad if they get redirected when doing this? Also, most of the links pointing to the root of a website are without a slash.
Second question: can I treat the rest of the URLs differently from the root URL? Meaning, if I do the root without a slash, can I have the rest with a slash, or the other way around?
Hi everyone,
Since over a year has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Central Help Forum.
Thanks and take care,
The Webmaster Central Team