Apache Redirect Craziness

July 21, 2008

Update July 30, 2008: The Google crawl appears to be fixed and the search results look good now, so this issue is resolved! Hoorah.

Update July 23, 2008: There is a great discussion going on at Sphinn about this issue.

I was shocked after searching Google today: the search results currently redirect to my homepage and _not the actual page searched for._


“Why is this?” I thought to myself. I tried the URL that you should get… http://marcgrabanski.com/pages/code/jquery-ui-datepicker … OK, that works fine.


Next I tried the old URL, which has most of the link juice attached to it and 301 redirects to the new page… http://marcgrabanski.com/code/ui-datepicker/ … OK, that works fine too. So what is going on?


First, I went to Google Webmaster Tools and saw this, an SEO’s nightmare:



Time to check my 301 redirects:



And on another 301 redirect tool:



Well, that worked great. Still, what is the issue?


To make absolutely sure my 301 redirects work, I dumped the text redirects into an .htaccess file. My .htaccess code now looks like this:



<FilesMatch "\.(htm|html|css|js|php)$">
AddDefaultCharset UTF-8
DefaultLanguage en-US
</FilesMatch>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php http://marcgrabanski.com/articles [R=301,L]
RewriteRule ^code\.html http://marcgrabanski.com/pages/code [R=301,L]
RewriteRule ^code/beyond-flash(/?) http://marcgrabanski.com/pages/code/beyond-flash [R=301,L]

# MANY RedirectRules … ALL WORK FINE
RewriteCond %{HTTP_HOST} ^www.marcgrabanski.com$ [NC]
RewriteCond %{REQUEST_URI} !.*tags.php.* [NC]
RewriteRule ^(.*)$ http://marcgrabanski.com/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_URI} !/$
RewriteRule (.*) $1\.php [L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]
RewriteRule ^$ webroot/ [L]
RewriteRule (.*) webroot/$1 [L]
AddHandler php5-script .php



Update July 23, 2008 1:24AM: I changed all of the Rules to RewriteRule’s, which has cleaned up most of my 301 redirects. One issue remains: I need to figure out how to make a RewriteRule to convert:
http://marcgrabanski.com/tags.php?tag=FreeTools
to…
http://marcgrabanski.com/tag/free-tools


13 comments

#1. Meredith Speier on July 21, 2008

You’re right…everything checks out on your pages, so the problem’s got to be on Google’s end. But what? No clue. I would keep checking online to see if anyone is reporting the same thing, misery loves company! Good luck figuring it out.

MS

#2. Herus Armstrong on July 21, 2008

Have you tried re-uploading a new sitemap of your site to Webmaster Tools? I had some problems like this on eGames.com, along with huge SEO problems… some 404 errors disappeared after I uploaded a new sitemap. Try it, it can’t hurt.

#3. Marc Grabanski on July 21, 2008

Herus Armstrong: I did upload a new sitemap right away when I saw this problem. I hope things clear up shortly! Thanks for your suggestion.

#4. aimClear on July 21, 2008

Let’s see who reads your blog at the ’plex :). Great case study, Marc.

#5. g1smd on July 21, 2008

There are several errors in your .htaccess file.


  1. Do not mix rules from different modules, otherwise you will not be able to guarantee the order they are processed in. Redirect comes from mod_alias and RewriteRule comes from mod_rewrite. Use only mod_rewrite for all of these.

  2. Make sure that all Redirects are processed before all of your Rewrites. Failure to do so will expose your internal filepaths to browsers and bots.

  3. Make sure that the most specific Redirects are processed first, and that they fix the domain etc. within their own redirect. List the general “catch-all” stuff last (like the generic non-www to www redirect).

  4. Test, test, and test again – for all expected inputs and as many UNexpected inputs as you can think of (www and non-www, with and without trailing “/” on folder names, etc.). Use the Live HTTP Headers extension for Mozilla Firefox to make sure that you do not have “chained redirects”, as those can cause major issues.

That said, there are several problems noted with the data in WMT at the moment. There are several threads over at WebmasterWorld that have been running since the end of June 2008 with various related notes and observations.
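The ordering advice in points 1 to 3 could be sketched roughly as below. This is a minimal, untested outline rather than the site's actual file: it assumes everything is done with mod_rewrite, reuses the old and new datepicker URLs from the post as the one specific redirect, and keeps the post's own www to non-www direction for the catch-all. The `!^/webroot/` condition is an added assumption to stop the internal rewrite from being applied to its own result.

```apache
RewriteEngine On
RewriteBase /

# 1. Most specific external redirects first. Each one names the final,
#    canonical host directly, so no second (chained) redirect is needed.
RewriteRule ^code/ui-datepicker/?$ http://marcgrabanski.com/pages/code/jquery-ui-datepicker [R=301,L]

# 2. General catch-all redirect last among the redirects:
#    send any remaining www request to the bare domain.
RewriteCond %{HTTP_HOST} ^www\.marcgrabanski\.com$ [NC]
RewriteRule ^(.*)$ http://marcgrabanski.com/$1 [R=301,L]

# 3. Internal rewrites only after every redirect, so the
#    webroot/ path never appears in a redirect target.
#    (Assumed guard: skip requests already rewritten into webroot/.)
RewriteCond %{REQUEST_URI} !^/webroot/
RewriteRule ^(.*)$ webroot/$1 [L]
```

With this layout, a request for the old datepicker URL on the www host should be answered by rule 1 with a single 301 to the final URL, instead of hopping through the host redirect first.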

#6. Marc Grabanski on July 21, 2008

Thank you very much for your comment, I will check my website against your points and post my findings and changes.

#7. Marc Grabanski on July 22, 2008

I changed all of the Rules to RewriteRule’s which has cleaned up most of my 301 redirects. One issue remains, I need to figure out how to make a RewriteRule to convert:
http://marcgrabanski.com/tags.php?tag=FreeTools
to…
http://marcgrabanski.com/tag/free-tools

#8. Mr. LSI on July 23, 2008

You should be able to do that by simply escaping the ? like you escape a .

I think you meant to say you want to rewrite /tag/bla to tags.php?tag=bla though, right?

You can use variables in .htaccess for that much like you use them in this line:
RewriteRule ^(.*)$ http://marcgrabanski.com/$1 [R=301,L]

Except here you’d use:
RewriteRule ^tags/(.*)$ tags.php?tag=$1 [L]

(I’d drop the 301 and just make your link show up as /tags/bla in the URL.)

#9. Marc Grabanski on July 23, 2008

Mr. LSI: Thanks for your comment, I get what you are doing and it may help in the future, but it doesn’t work for my case…

“I think you meant to say you want to rewrite /tag/bla to tags.php?tag=bla though, right?”


I meant what I said originally. I need to map tags.php?tag=FreeTools to tag/free-tools.
I tried escaping the ? and it doesn’t work. It seems like RewriteRule doesn’t factor in anything but the URL – it seems to ignore the query params.

#10. Mike on July 23, 2008

You need RewriteCond to match the query string portion, something like (untested):

RewriteCond %{REQUEST_URI} /tags\.php$
RewriteCond %{QUERY_STRING} ^tag=([A-Za-z0-9\-]+)
# The trailing ? drops the original query string from the redirect target
RewriteRule ^(.*)$ /tag/%1? [R=301,L]

#11. Marc Grabanski on July 23, 2008

Mike: That would work well for the normal keywords, so thanks for that!

The catch is that I have a few keywords that are camel-cased, such as “FreeTools”, which need to be translated to “free-tools”. The rewrite rule was too complex for me, so I opted to create a tags.php file that detects these specific types of words and 301 redirects them in PHP. That seems to work fine and is much simpler for me to do.

Everything appears to be buckled up here in 301 redirect land.
Thank you everyone for your help! Now I have to wait for googlebot’s next crawl.
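For only a handful of camel-cased tags, one alternative to the PHP approach might be to list each of them explicitly ahead of a generic rule. The following is a rough, untested sketch: the FreeTools to free-tools mapping is the only one taken from the post, and the generic pattern and its character class are assumptions.

```apache
RewriteEngine On
RewriteBase /

# Explicit mapping for one camel-cased tag (the example from the post).
# Further camel-cased tags would each need their own pair of lines.
RewriteCond %{QUERY_STRING} ^tag=FreeTools$
RewriteRule ^tags\.php$ http://marcgrabanski.com/tag/free-tools? [R=301,L]

# Generic fallback for every other tag (assumed pattern); the tag value
# is passed through unchanged. The trailing ? drops the old query string.
RewriteCond %{QUERY_STRING} ^tag=([A-Za-z0-9-]+)$
RewriteRule ^tags\.php$ http://marcgrabanski.com/tag/%1? [R=301,L]
```

This only stays manageable while the list of special cases is short. A general camel-case to hyphen conversion is not something these rules attempt, which is why handling it in tags.php remains the simpler option.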

#12. terry kernan on July 28, 2008

you should just use wordpress!
i myself didn’t want to use it because i often preferred the hand coded option, but after seeing the feature list and the even bigger list of plugins that are available, well, it made my legs turn to jelly and made me cry. it’s wonderful.

#13. Marc Grabanski on July 28, 2008

Wordpress is great and I do a lot of contracting work with it – I do recommend it for clients. However, this is my personal website and it is meant to be a sandbox and a development environment for me to test completely custom code. I get to innovate and do whatever I want with my own website and it works great. Many of my great successes in the workplace have come as a direct result of my work on this website and other personal projects.
I recommend Wordpress for most people. I do however recommend writing your own website to people who want to invest a lot of time into personal career growth and code development skills.
