URL redirection
Web Design & Development Guide
URL redirection, also called URL forwarding, domain
redirection and domain forwarding, is a technique on the
World Wide Web for making a web page available under many URLs.
Purposes
There are several reasons for a webmaster to use redirection:
Similar domain names
Users might look for the same information under slightly different URLs,
e.g. gooogle.com and googel.com. An organization can register these misspelled
domains and redirect them to the correct location: google.com. Alternatively, a
third party can register such domains and redirect them to its own website,
thus catching the traffic of careless typists.
Moving a site to a new domain
A Web site might change its domain name for several reasons. An author might
move his or her pages to a new domain or two sites might merge. With URL
redirects, incoming links to the old URL can be directed to the new location.
These links might come from other sites that have not noticed the change, or
from bookmarks/favorites that users have saved in their browsers.
The same applies to search engines. They have the older domain in their
database and will send visitors to the URLs found previously. By using a
"moved permanently" redirect to the new URL, visitors will still end up at the
correct page. Also, in the next crawl, the search engine should detect and use
the newer URL.
Load balancing
Redirects issued by the server or by a redirect page are sometimes used to
distribute requests across several hosts and so reduce bandwidth usage on any
single one, usually by rotating the redirects between the main site and its
mirrors.
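The rotation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names and the make_balancer helper are hypothetical, and a real deployment would also need to account for mirrors that are down or out of date.

```python
from itertools import cycle


def make_balancer(hosts):
    """Return a function that redirects each request to the next host in turn."""
    rotation = cycle(hosts)  # endless round-robin iterator over the hosts

    def redirect_for(path):
        host = next(rotation)
        location = "http://" + host + "/" + path.lstrip("/")
        # 302 marks the redirect as temporary, so clients keep asking the main site.
        return "302 Found", [("Location", location)]

    return redirect_for


# Hypothetical main site plus two mirrors.
balancer = make_balancer(
    ["www.example.com", "mirror1.example.com", "mirror2.example.com"]
)
```

Each call to balancer("/download/file.iso") yields a redirect to the next host in the list, wrapping around after the last mirror.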
Logging outgoing links
The access logs of most web servers keep detailed information about where
visitors came from and how they browsed the hosted site. They do not, however,
log which links visitors left by. This is because the visitor's browser has no
need to communicate with the original server when the visitor clicks on an
outgoing link.
This information can be captured in several ways. One way involves URL
redirection. Instead of sending the visitor straight to the other site, links on
the site can direct to a URL on the original website's domain that automatically
redirects to the real target. This added request will leave a trace in the
server logs saying exactly which link was followed. This technique is also used
by some corporate websites to have a "warning" page that the content is off-site
and not necessarily affiliated with the corporation. The downside of this
technique is the delay caused by the additional request to the original
website's server. For websites that display a "warning" page before
automatically forwarding, the time the warning is displayed is a further delay.
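As a rough sketch of the logging technique above, the following Python functions build the on-site stand-in URL and handle the extra request. The /out path, the url parameter name, and both function names are assumptions made for illustration; the list passed as log stands in for the server's access log.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def tracking_link(target_url, redirect_path="/out"):
    """Build the on-site URL that is placed in the page instead of the real link."""
    return redirect_path + "?" + urlencode({"url": target_url})


def handle_outgoing(request_uri, log):
    """Record the target in `log`, then return (status, headers) for the redirect."""
    query = parse_qs(urlsplit(request_uri).query)
    target = query.get("url", [""])[0]
    # Refuse anything that is not a plain web URL (e.g. javascript: targets).
    if urlsplit(target).scheme not in ("http", "https"):
        return "400 Bad Request", []
    log.append(target)
    return "302 Found", [("Location", target)]
```

A page would then link to tracking_link("http://www.example.com/") rather than to the target itself, and every click passes through handle_outgoing before the visitor leaves the site.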
Short, meaningful, persistent aliases for long or changing URLs
Web engineers often pass descriptive attributes in the URL to represent data
hierarchies, command structures, transaction paths and session information.
This results in a URL that is aesthetically unpleasant and difficult to
remember. Sometimes the URL of a page changes even though the content stays the
same. A short, stable alias that redirects to the current URL addresses both
problems.
Manipulating search engines
Some years ago, redirect techniques were used to fool search engines. For
example, one page could show popular search terms to search engines but redirect
the visitors to a different target page. There are also cases where redirects
have been used to "steal" the page rank of one popular page and use it for a
different page, usually involving the 302 HTTP status code of "moved
temporarily."
Search engine providers noticed the problem and took appropriate actions.
Usually, sites that employ such techniques to manipulate search engines are
punished automatically by reducing their ranking or by excluding them from the
search index.
As a result, today, such manipulations usually result in less rather than
more site exposure.
Satire and criticism
In the same way that a Google bomb can be used for satire and political
criticism, a domain name that conveys one meaning can be redirected to any
other web page, sometimes with malicious intent.
Manipulating visitors
URL redirection is sometimes used as a part of phishing attacks that confuse
visitors about which web site they are visiting.
Techniques
There are several techniques to implement a redirect. In many cases, the
Refresh meta tag is the simplest one to set up. However, there are strong
objections to this method, described in its section below.
Manual redirect
The simplest technique is to ask the visitor to follow a link to the new
page:
Please follow <a href="http://www.example.com/">link</a>!
This method is often used as a fallback for one of the following methods: If
the visitor's browser does not support the automatic redirect method, the
visitor can still reach the target document by clicking on the link.
HTTP status codes 3xx
In the HTTP computer protocol used by the World Wide Web, a redirect is a response with a
status code beginning with 3 that induces a browser to go to another
location.
The HTTP standard defines several status codes for redirection:
- 300 multiple choices (e.g. offer different languages)
- 301 moved permanently
- 302 found (e.g. temporary redirect)
- 303 see other (e.g. for results of CGI scripts)
- 307 temporary redirect
All of these status codes require that the URL of the redirect target is
given in the Location: header of the HTTP response. The 300 multiple choices
will usually list all choices in the body of the message and show the default
choice in the Location: header.
Within the 3xx range, there are also some status codes that are quite
different from the above redirects (they are not discussed here with their
details):
- 304 not modified
- 305 use proxy
- 306 not used
This is a sample of an HTTP response that uses the 301 "moved permanently"
redirect:
HTTP/1.1 301 Moved Permanently
Location: http://www.example.org/
Content-Type: text/html
Content-Length: 57

Please follow <a href="http://www.example.org/">link</a>!
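A response of this shape can be assembled programmatically. The Python sketch below is an illustration only: build_redirect_response is a hypothetical helper, not part of any library, and it computes Content-Length from the actual body so that the header and the body cannot drift apart.

```python
from html import escape


def build_redirect_response(target, status="301 Moved Permanently"):
    """Return the raw bytes of a redirect response with a fallback link in the body."""
    # Escape the target so it is safe inside the HTML attribute.
    body = f'Please follow <a href="{escape(target, quote=True)}">link</a>!'
    body_bytes = body.encode("ascii")
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Location: {target}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body_bytes)}\r\n"
        "\r\n"  # blank line separates the headers from the body
    )
    return head.encode("ascii") + body_bytes
```

Passing a different status string, such as "302 Found", produces a temporary redirect with the same structure.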
Using server side scripting for Redirection
Often, web authors don't have sufficient permissions to produce these status
codes: the HTTP header is generated by the web server program and not read from
the file for that URL. Even for CGI scripts, the web server usually generates
the status code automatically and only allows the script to add custom headers.
To produce arbitrary HTTP status codes with CGI scripts, one needs to enable
non-parsed headers.
Sometimes, it is sufficient to print the "Location: 'url'" header line from a
normal CGI script. Many web servers choose one of the 3xx status codes for such
replies.
HTTP headers must be sent before any part of the response body. As a result, a
web programmer who uses a scripting language to redirect the user's browser to
another page must ensure that the redirect header is emitted before any page
output. In the ASP scripting language, this can be accomplished with
Response.Buffer = True followed by Response.Redirect "http://www.example.com/".
In PHP, one can use header("Location: http://www.example.com/"); before any
output is sent.
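The same idea can be sketched in Python using the WSGI calling convention, where the status and headers are handed to the server before any body is returned. The name redirect_app and the fixed target URL are assumptions for illustration.

```python
def redirect_app(environ, start_response):
    """Minimal WSGI application that redirects every request to one fixed URL."""
    target = "http://www.example.com/"  # hypothetical redirect target
    body = b"Redirecting to " + target.encode("ascii")
    # Status line and headers are passed to the server first ...
    start_response(
        "302 Found",
        [
            ("Location", target),
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ],
    )
    # ... and only then is the (optional) body returned.
    return [body]
```

For a quick local trial, the application can be served with the standard library, for example wsgiref.simple_server.make_server("", 8000, redirect_app).serve_forever().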
The original HTTP/1.1 standard (RFC 2616) required the Location header to
contain an absolute URI; its later revision (RFC 7231) also permits relative
references. When redirecting from one page to another within the same site,
using a relative URI has always been common, so most browsers accept relative
URIs in the Location header, although some browsers display a warning to the
end user.
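A client that accepts a relative Location value has to resolve it against the URL it originally requested. A minimal Python sketch using the standard library's urllib.parse.urljoin follows; the resolve_location name and the example URLs are chosen for illustration.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Turn a Location header value into an absolute URL, as a client would."""
    return urljoin(request_url, location)


base = "http://www.example.com/docs/old.html"
same_dir = resolve_location(base, "new.html")       # relative to the directory
from_root = resolve_location(base, "/new.html")     # relative to the site root
absolute = resolve_location(base, "http://www.example.org/")  # already absolute
```

A server that wants to avoid any ambiguity can perform the same resolution itself and always send the absolute form.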
Using .htaccess for Redirection
When using the Apache web server, a directory-specific .htaccess file can be
used.
To move a single page:
Redirect 301 /oldpage.html http://www.example.com/newpage.html
To change domain names:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^.*oldwebsite\.com$ [NC]
RewriteRule ^(.*)$ http://www.preferredwebsite.net/$1 [R=301,L]
This method usually does not require administrator permissions.
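To make the effect of the domain-change rules easier to follow, here is a rough Python analogue. It is not how Apache evaluates rewrite rules internally; the rewrite function is hypothetical and only mirrors the host test (case-insensitive, like the [NC] flag) and the path-preserving 301 target.

```python
import re

# Same pattern as the RewriteCond above; re.IGNORECASE plays the role of [NC].
OLD_HOST = re.compile(r"^.*oldwebsite\.com$", re.IGNORECASE)


def rewrite(host, path):
    """Return (status, location) for requests to the old domain, else None."""
    if OLD_HOST.match(host):
        # Keep the requested path, as $1 does in the RewriteRule.
        return 301, "http://www.preferredwebsite.net/" + path.lstrip("/")
    return None  # other hosts are served normally
```

For example, a request for /a/b.html on www.oldwebsite.com maps to http://www.preferredwebsite.net/a/b.html, while requests to unrelated hosts are left alone.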
Refresh Meta tag and HTTP refresh header
Netscape introduced a feature to refresh the displayed page after a certain
amount of time. This method is often called meta refresh. It is possible to specify the URL of the new page, thus replacing
one page after some time by another page:
A timeout of 0 seconds means an immediate redirect.
This is an example of a simple HTML document that uses this technique:
<html><head>
<meta http-equiv="Refresh" content="0; url=http://www.example.com/">
</head><body>
Please follow <a href="http://www.example.com/">link</a>!
</body></html>
- This technique is usable by all web authors because the meta tag is
contained inside the document itself.
- The meta tag must be placed in the "head" section of the HTML file.
- The content attribute combines the delay in seconds and the target URL in a
single string, separated by a semicolon.
- The number "0" in this example may be replaced by another number to
achieve a delay of as many seconds.
- Many users regard a delay of this kind as annoying unless there is a
reason for it.
- This began as a proprietary, non-standard extension by Netscape. It is
supported by most web browsers.
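The format of the content attribute discussed in the list above can be illustrated with a small parser. This Python sketch is deliberately lenient and is not a full implementation of what browsers accept; parse_refresh and its regular expression are assumptions made for the example.

```python
import re

# <delay> optionally followed by ";" (or ",") and "url=<target>".
_REFRESH = re.compile(
    r"^\s*(\d+)\s*(?:[;,]\s*(?:url\s*=\s*)?(.*?))?\s*$", re.IGNORECASE
)


def parse_refresh(content):
    """Split a Refresh value into (delay_seconds, url); url is None if absent."""
    match = _REFRESH.match(content)
    if match is None:
        return None  # not a recognizable refresh value
    delay = int(match.group(1))
    # Some pages quote the URL; strip the quotes if present.
    url = (match.group(2) or "").strip("'\"") or None
    return delay, url
```

Applied to the value from the HTML example, parse_refresh("0; url=http://www.example.com/") separates the delay 0 from the target URL; a bare number such as "5" means "reload this page after five seconds" and yields no URL.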
This is an example of achieving the same effect by issuing an HTTP refresh
header:
HTTP/1.1 200 OK
Refresh: 0; url=http://www.example.com/
Content-Type: text/html
Content-Length: 57

Please follow <a href="http://www.example.com/">link</a>!
This response is easier to generate by CGI programs because one does not need
to change the default status code. Here is a simple CGI program that effects
this redirect:
#!/usr/bin/perl
use strict;
use warnings;
# Emit the Refresh header, a blank line, then a fallback link as the body.
print "Refresh: 0; url=http://www.example.com/\r\n";
print "Content-type: text/html\r\n";
print "\r\n";
print "Please follow <a href=\"http://www.example.com/\">link</a>!";
Note: Usually, the HTTP server adds the status line and the Content-length
header automatically.
The W3C considers this a poor method of redirection, since it does not
communicate any information about either the original or the new resource to
the browser (or search engine). The W3C's Web Content Accessibility Guidelines
(7.4) discourage the creation of auto-refreshing pages, since most web browsers
do not allow the user to disable or control the refresh rate. W3C articles on
the issue include:
- W3C Web Content Accessibility Guidelines (1.0): Ensure user control of
time-sensitive content changes
- Use standard redirects: don't break the back button!
JavaScript redirects
JavaScript offers several ways to display a different page in the current
browser window. Quite frequently, they are used for a redirect. However, there
are several reasons to prefer an HTTP header or the refresh meta tag (whenever
it is possible) over JavaScript redirects:
- There are several reasons for some users to disable JavaScript:
- Security considerations
- Some browsers don't support JavaScript
- Many crawlers (e.g. from search engines) don't execute JavaScript.
- There is no "standard" way of doing it: A search for "you are being
redirected" will find that virtually each JavaScript redirect employs
different methods. This makes it difficult for Web client programmers to
honor your redirect request without implementing all of JavaScript.
Frame redirects
A slightly different effect can be achieved by creating a single HTML
frame that contains the target page:
<frameset rows="100%">
<frame src="http://www.example.com/">
</frameset>
<noframes>
Please follow <a href="http://www.example.com/">link</a>!
</noframes>
One main difference from the above redirect methods is that for a frame
redirect, the browser displays the URL of the frame document and not the URL of
the target page in the URL bar.
This technique is commonly called cloaking. This may be used so that
the reader sees a more memorable URL or, with fraudulent intentions, to conceal
a phishing site as part of website spoofing.[1]
Redirect loops
It is quite possible that one redirect leads to another redirect. For
example, the URL http://www.wikipedia.com/wiki/URL_redirection (note the
differences in the domain name) is first redirected to
http://www.wikipedia.org/wiki/URL_redirection and again redirected to the
correct URL: http://en.wikipedia.org/wiki/URL_redirection. This is appropriate:
the first redirection corrects the wrong domain name. The second redirection
selects the correct language section. Finally, the browser displays the correct
page.
Sometimes, however, a mistake can cause the redirection to point back to the
first page, leading to an infinite loop of redirects. Browsers usually break
that loop after a few steps and display an error message instead.
The HTTP standard states:
- A client SHOULD detect infinite redirection loops, since such loops
generate network traffic for each redirection.
Previous versions of this specification recommended a maximum of five
redirections; some clients may exist that implement such a fixed limitation.
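The loop detection a client is asked to perform can be sketched as follows. This Python example works on a plain dictionary that maps each URL to its redirect target instead of making network requests; follow_redirects, RedirectError and the default limit of five hops are assumptions for illustration.

```python
class RedirectError(Exception):
    """Raised when a redirect chain loops or grows too long."""


def follow_redirects(start, redirects, max_hops=5):
    """Follow `redirects` (url -> target) from `start`; return the visited chain."""
    visited = [start]
    current = start
    while current in redirects:
        if len(visited) > max_hops:  # max_hops redirects already followed
            raise RedirectError(f"gave up after {max_hops} redirects")
        current = redirects[current]
        if current in visited:  # this URL was already seen: a loop
            raise RedirectError("redirect loop: " + " -> ".join(visited + [current]))
        visited.append(current)
    return visited


# The two-step chain from the example above.
chain = follow_redirects(
    "http://www.wikipedia.com/wiki/URL_redirection",
    {
        "http://www.wikipedia.com/wiki/URL_redirection":
            "http://www.wikipedia.org/wiki/URL_redirection",
        "http://www.wikipedia.org/wiki/URL_redirection":
            "http://en.wikipedia.org/wiki/URL_redirection",
    },
)
```

Remembering the visited URLs catches true loops immediately, while the hop limit also stops chains that never repeat but never end.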
Services
There exist services that can perform URL redirection on demand, with no need
for technical work or access to the webserver your site is hosted on.
URL redirection services
URL redirection services exist to shorten long URLs.
Some web publishers have criticized the use of these services, arguing that
replacing a URL with an encoded shortcut effectively erases information from a
document. For instance, a redirected URL may link to a blacklisted site.
Hyperlinks involving URL redirection services are frequently used in spam
messages directed at blogs and wikis. Thus, one way to reduce spam is to reject
all edits and comments containing hyperlinks to known URL redirection services;
however, this will also remove legitimate edits and comments and may not be an
effective method to reduce spam.
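How such a service might map long URLs to short codes can be sketched as below. This is one possible scheme, not a description of any particular service: the Shortener class, the base-62 alphabet, and the in-memory tables are all assumptions for illustration.

```python
import string

# 62 symbols: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_id(number):
    """Encode a non-negative integer as a short base-62 string."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class Shortener:
    """In-memory table of short codes; a real service would persist this."""

    def __init__(self):
        self._url_by_code = {}
        self._code_by_url = {}

    def shorten(self, url):
        """Return the existing code for `url`, or assign the next free one."""
        if url not in self._code_by_url:
            code = encode_id(len(self._url_by_code))
            self._url_by_code[code] = url
            self._code_by_url[url] = code
        return self._code_by_url[url]

    def resolve(self, code):
        """Return the long URL to redirect to, or None for an unknown code."""
        return self._url_by_code.get(code)
```

Because the short code carries no trace of the target, a reader cannot tell where the link leads without resolving it, which is the loss of information the critics quoted above object to.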
URL obfuscation services
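There exist redirection services for hiding the referrer using META refresh.
This is easy to do with PHP, as in the example below. The target is checked and
HTML-escaped before it is echoed, because printing a request parameter
unchanged would allow script injection.
<?php
/* This code is placed into the public domain */
/* Will redirect a URL */
$u = isset($_GET['url']) ? $_GET['url'] : '';
/* Accept only http and https targets */
if (!preg_match('#^https?://#i', $u)) {
    http_response_code(400);
    exit('Invalid URL');
}
/* Escape the value before writing it into HTML attributes and text */
$u = htmlspecialchars($u, ENT_QUOTES, 'UTF-8');
?>
<html>
<head><title>Redirect</title>
<meta http-equiv="refresh" content="0; URL=<?php echo($u); ?>">
</head>
<body>
You should be redirected to <a href="<?php echo($u); ?>"><?php echo($u); ?></a>.
</body></html>
This code can then be accessed at, for example,
http://example.org/redirect.php?url=http://www.google.com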
References
[1] "Anti-Phishing Technology", Aaron Emigh, Radix Labs, 19 January 2005