Web Design & Development Guide
Cross-site scripting (XSS) is a type of
computer security vulnerability typically found in
web applications which allow
code injection by malicious web users into the web pages viewed by other
users. Examples of such code include HTML code and client-side scripts.
An exploited cross-site scripting vulnerability can be used by attackers
to bypass access controls such as the same origin policy. Recently,
vulnerabilities of this kind have been exploited to craft powerful
phishing attacks and browser exploits. Cross-site scripting was
originally referred to as CSS, although this usage has been largely abandoned.
Allowing a Web server to send executable code to a browser (even if only in a
browser sandbox) carries security risks. One key problem with this is the case where users have more than one
browser window open at once. In some instances, a script from one page should be
allowed to access data from another page or object, but in others, this should
be strictly forbidden, as a malicious Web site could attempt to steal sensitive
information this way. In order to fix this problem, browsers introduced the same
origin policy. Essentially, this policy allows any interaction between objects
and pages that originated from the same domain and over the same protocol. That
way, a malicious Web site would not be able to access sensitive data belonging
to another site.
Since then, other similar access-control policies have been adopted in other
browsers and client-side scripting
languages to protect users from malicious Web sites. In general, cross-site
scripting holes can be seen as vulnerabilities present in web pages which allow
attackers to bypass these mechanisms. By finding clever ways of injecting
malicious script into pages served by other domains, an attacker can gain
elevated access privileges to sensitive page content, session cookies, and a
variety of other objects.
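The same origin check described above can be sketched in a few lines of Python. This is only an illustration: the function names `origin` and `same_origin` and the example hosts are hypothetical, and comparing the port in addition to the protocol and domain is an assumption that goes beyond what the text states.

```python
from urllib.parse import urlsplit

# Default ports, so that http://example.com and http://example.com:80 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple used here to identify an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share protocol, host, and port."""
    return origin(url_a) == origin(url_b)


# Same domain and protocol: interaction would be allowed.
print(same_origin("http://bank.example/account", "http://bank.example/login"))
# Different protocol, or different domain: interaction would be refused.
print(same_origin("http://bank.example/account", "https://bank.example/account"))
print(same_origin("http://bank.example/account", "http://evil.example/steal"))
```

An XSS hole sidesteps this check rather than defeating it: the injected script is served by the trusted site, so it passes the comparison.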
The acronym CSS was often used in the early days to refer to cross-site
scripting vulnerabilities, but this quickly became confusing in technical
circles because both
Cascading Style Sheets and the
Content-scrambling system shared the same acronym. Perhaps the first use of the
abbreviation XSS was by Steve Champeon in his Webmonkey article
"XSS, Trust, and Barney". In 2002, Steve also posted the suggestion of using
XSS as an alternative abbreviation to the
mailing list. In a rare show of unity, the security community quickly adopted
the alternative, and CSS is seldom used today to refer to cross-site scripting,
although a few existing pages still use it this way.
As of now three distinct types of XSS vulnerability are known. (These will be
labeled Type 0, Type 1, and Type 2 for the purposes of this
discussion, but these names are by no means industry standard nomenclature.
Where possible, other names for these will be provided.)
The Type 0 form of XSS vulnerability has been referred to as DOM-based or
local cross-site scripting, and while it is not new by any means, a recent
write-up on DOM-based cross-site scripting does a good job of defining its characteristics. With
Type 0 cross-site scripting vulnerabilities, the problem exists within a page's
client-side script itself. For instance, if a piece of script reads a URL
request parameter and uses this information to write some HTML to its own page,
and this information is not encoded using HTML entities, an XSS hole will likely
be present, since this written data will be re-interpreted by browsers as HTML
which could include additional client-side script.
In practice, exploiting such a hole would be very similar to the exploit of
Type 1 vulnerabilities (see below), except in one very important situation.
Because of the way
Internet Explorer treats client-side script in objects located in the "local
zone" (for instance, on the client's local hard drive), an XSS hole of this kind
in a local page can result in remote execution vulnerabilities. For example, if
an attacker hosts a malicious website, which contains a link to a vulnerable
page on a client's local system, a script could be injected and would run with
privileges of that user's browser on their system. This bypasses the entire
client-side sandbox, not just the cross-domain restrictions that are normally
bypassed with XSS exploits.
The Type 1 kind of cross-site scripting hole is also referred to as a
non-persistent or reflected vulnerability, and is by far the most
common type. These holes show up when data provided by a web client is used
immediately by server-side scripts to generate a page of results for that user.
If unvalidated user-supplied data is included in the resulting page without HTML
encoding, this will allow client-side code to be injected into the dynamic page.
A classic example of this is in site search engines: if one searches for a
string which includes some HTML special characters, often the search string will
be redisplayed on the result page to indicate what was searched for, or will at
least include the search terms in the text box for easier editing. Unless all
occurrences of the search terms are HTML entity encoded, an XSS hole will
result.
At first blush, this does not appear to be a serious problem since users can
only inject code into their own pages. However, with a small amount of
social engineering, an attacker could convince a user to follow a malicious
URL which injects code into the results page, giving the attacker full access to
that page's content. Due to the general requirement of the use of some social
engineering in this case (and normally in Type 0 vulnerabilities as well), many
programmers have disregarded these holes as not terribly important. This
misconception is sometimes applied to XSS holes in general (even though this is
only one type of XSS) and there is often disagreement in the security community
as to the importance of cross-site scripting vulnerabilities.
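The site search example above can be sketched as follows. This is a minimal illustration, not any particular site's code: both function names and the page template are hypothetical, and `html.escape()` from the Python 3 standard library stands in for whatever encoding routine an application uses.

```python
from html import escape


def results_page_unsafe(query):
    # Vulnerable: the search string is copied into the page verbatim.
    return "<p>Results for: " + query + "</p>"


def results_page_safe(query):
    # The search string is HTML entity encoded before it is redisplayed.
    return "<p>Results for: " + escape(query) + "</p>"


payload = "<script>alert('xss');</script>"
print(results_page_unsafe(payload))  # the browser would run the script
print(results_page_safe(payload))    # the browser would show it as text
```

In a reflected attack the payload arrives in the URL that the victim is persuaded to follow, so the vulnerable version runs the attacker's script in the victim's session.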
The Type 2 kind of XSS vulnerability is also referred to as a stored,
persistent, or second-order vulnerability, and it allows the most
powerful kinds of attacks. A Type 2 XSS vulnerability exists when data provided
to a web application by a user is first stored persistently on the server (in a
database, filesystem, or other location), and later displayed to users in a web
page without being encoded using HTML entities. A classic example of this is
with online message boards, where users are allowed to post HTML formatted
messages for other users to read.
These vulnerabilities are usually more significant than other types because
an attacker needs to inject the script only once. A single injection could
potentially hit a large number of other users with little need for social
engineering, and the web application could even be infected by a
cross-site scripting virus.
The methods of injection can vary a great deal, and an attacker may not need
to use the web application itself to exploit such a hole. Any data received by
the web application (via email, system logs, etc) that can be controlled by an
attacker must be encoded prior to re-display in a dynamic page, else an XSS
vulnerability of this type could result.
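A minimal sketch of the rule above, assuming a message board whose storage is replaced here by an in-memory list. The names `messages`, `post_message`, and `render_board` are hypothetical. The point it illustrates is that data is stored as received and encoded every time it is written into a page, whatever route it took into the store.

```python
from html import escape

# Hypothetical in-memory stand-in for the database behind a message board.
messages = []


def post_message(author, body):
    # Store the data exactly as received; encoding happens at display time.
    messages.append({"author": author, "body": body})


def render_board():
    # Every stored field is encoded before it is written into the page.
    rows = []
    for m in messages:
        rows.append("<li><b>%s</b>: %s</li>" % (escape(m["author"]), escape(m["body"])))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


post_message("mallory", "<script>alert('xss');</script>")
post_message("alice", "Hello, world")
print(render_board())
```

Encoding at output time rather than at input time means that records which reached the store by another path (email, logs, an import job) are covered as well.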
Attackers intending to exploit cross-site scripting vulnerabilities must
approach each class of vulnerability differently. For each class, a specific
attack vector is described here. (The names below come from the
cast of characters commonly used in computer security.)
Type 0 attack
- Mallory sends a URL to Alice (via email or another mechanism) of a
maliciously constructed web page.
- Alice clicks on the link.
- The malicious web page's script opens a vulnerable HTML page stored
locally on Alice's computer.
- The vulnerable HTML page contains the injected script, which executes in
the local zone of Alice's computer.
- Mallory's malicious script now may run commands with the privileges
Alice holds on her own computer.
Type 1 attack
- Alice often visits a particular website, which is hosted by Bob. Bob's
website allows Alice to log in with a username/password pair and store
sensitive information, such as billing information.
- Mallory observes that Bob's website contains a reflected XSS
vulnerability.
- Mallory crafts a URL to exploit the vulnerability, and sends Alice an
email, making it look as if it came from Bob (i.e., the email is
spoofed).
- Alice visits the URL provided by Mallory while logged into Bob's
website.
- The malicious script embedded in the URL executes in Alice's browser, as
if it came directly from Bob's server. The script steals sensitive
information (authentication credentials, billing info, etc.) and sends this
to Mallory's web server without Alice's knowledge.
Type 2 attack
- Bob hosts a web site which allows users to post messages and other
content to the site for later viewing by other members.
- Mallory notices that Bob's website is vulnerable to a type 2 XSS attack.
- Mallory posts a message, controversial in nature, which may encourage
many other users of the site to view it.
- Upon merely viewing the posted message, site users' session cookies or
other credentials could be taken and sent to Mallory's webserver without
their knowledge.
- Later, Mallory logs in as other site users and posts messages on their
behalf.
Please note, the preceding examples are merely a representation of common
methods of exploit and are not meant to encompass all vectors of attack.
There are literally hundreds of examples of cross-site scripting
vulnerabilities available publicly. Just a few examples to illustrate the
different types of holes will be listed here.
- An example of a type 0 vulnerability was once found in an error page
where client-side script wrote the current URL, through the
document.location variable, to the page without any filtering or encoding.
In this case, an attacker who controlled the URL might have been able to
inject script, depending on the behavior of the browser in use.
This vulnerability was fixed by encoding the special characters in the
document.location string prior to writing it to the page.
- A famous example of type 1 XSS vulnerabilities: two XSS vulnerabilities
in the Google.com website were identified and published by Yair Amit in December
2005. The vulnerabilities allowed an attacker to impersonate legitimate
members of Google's services or to mount a phishing attack. This publication
presented an obscure way to bypass common XSS countermeasures by using UTF-7
encoding.
- Two type 1 XSS vulnerabilities were exploited humorously in August
2006 through a
fake news summary which claimed President Bush had appointed a 9-year-old
boy to be the chairperson of the Information Security Department. This claim
was backed up with links to cbsnews.com and www.bbc.co.uk, both of which
were vulnerable to separate XSS holes which allowed the attackers to inject
an article of their choosing.
- An example of a type 2 vulnerability was found in Hotmail, in October
2001 by Marc Slemko, which allowed an attacker to steal a user's Microsoft
.NET Passport session cookies. The exploit for
this vulnerability consisted of sending a malicious email to a Hotmail
user, which contained malformed HTML. The script filtering code in Hotmail's
site failed to remove the broken HTML and
Internet Explorer's parsing algorithm happily interpreted the malicious
code. This problem was quickly fixed, but multiple similar problems were
found in Hotmail and other Passport sites later on.
- Netcraft announced on June 16, 2006 that a security flaw in the PayPal
web site was being actively exploited by fraudsters to steal credit card
numbers and other personal information belonging to PayPal users. The issue
was reported to Netcraft via its
anti-phishing toolbar. Soon after,
PayPal reported that a "change in some of the code" on the PayPal
website had removed the vulnerability.
- On October 13, 2005 Samy exploited a security flaw in MySpace with a
self-propagating script, resulting in over one million friend requests being
made to its creator's profile. Qualifying as a type 2 vulnerability, it used
multiple XMLHttpRequests to propagate itself.
- An XSS vulnerability in
Community Architect Guestbook was disclosed by
Susam Pal on April 19, 2006, which could
be exploited by malicious people to conduct script insertion attacks. As a
result, many free web-hosting services which used the guestbook were
vulnerable to such attacks.
- On November 8, 2006 Rajesh Sethumadhavan discovered a type 2
vulnerability in the social network site Orkut which would make it possible
to inject script into member profiles. Rodrigo Lacerda used this vulnerability to create a
cookie stealing script known as the
Orkut Cookie Exploit which was injected into the Orkut profiles of the
attacking member(s). By merely viewing these profiles unsuspecting targets
had the communities they owned transferred to a fake account of the
attacker. On December 12th, Orkut fixed the vulnerability.
Avoiding XSS vulnerabilities
Reliable avoidance of cross-site scripting vulnerabilities currently requires
the encoding of all HTML special characters in potentially malicious data. This
is generally done right before display by web applications (or client-side
script), and many programming languages have built-in functions or libraries
which provide this encoding (in this context, also called quoting or
escaping).
An example of this kind of quoting is shown below, from within the
Python interpreter:
Python 2.3.5 (#2, Aug 30 2005, 15:50:26)
Type "help", "copyright", "credits" or "license" for more information.
>>> import cgi
>>> print "<script>alert('xss');</script>"
<script>alert('xss');</script>
>>> print cgi.escape("<script>alert('xss');</script>")
&lt;script&gt;alert('xss');&lt;/script&gt;
Here, the first print statement produces executable client-side script,
whereas the second print statement outputs a string which is an HTML-quoted
version of the original script. The quoted versions of these characters will
appear as literals in a browser, rather than with their special meaning as HTML
tags. This prevents any script from being injected into HTML output, but it also
prevents any user-supplied input from being formatted with benign HTML.
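The session above uses Python 2. In Python 3, cgi.escape() was deprecated and then removed in Python 3.8, and the standard library offers html.escape() in its place. The sketch below repeats the same quoting with that function; passing quote=False is a choice made here to mirror the old default, under which only &, < and > are encoded.

```python
import html

raw = "<script>alert('xss');</script>"

# quote=False mirrors the default behaviour of the old cgi.escape():
# only &, < and > are encoded, quote characters are left alone.
print(html.escape(raw, quote=False))

# html.unescape() reverses the encoding, which is handy for round-trip checks.
print(html.unescape(html.escape(raw)) == raw)
```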
The ultimate problem with trying to avoid XSS vulnerabilities is that every
situation is different. For any given situation, the needs and the issues
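change. For instance, if user input is going into the src attribute of an
image tag, cgi.escape() would not be sufficient, because it does not encode
single quotes. Let's say a picture was to be added to a page of pictures, with
the user-supplied file name placed inside a single-quoted src attribute.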
An attacker could enter "doesntexist.jpg' onerror='alert(document.cookie)" to
add an event which triggers when the browser fails to load "doesntexist.jpg",
executing the code.
If one were to implement a function like cgi.escape() (which comes with
Python), one would be best off converting <, >, &, " and ' characters to their
equivalent HTML entities.
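The attribute injection described above can be reproduced with a short sketch. The `<img src='...'>` template is an assumption, since the original markup is not shown, and html.escape() is the Python 3 standard library function rather than the cgi.escape() discussed in the text.

```python
import html

payload = "doesntexist.jpg' onerror='alert(document.cookie)"

# Encoding only &, < and > leaves the single quote intact, so the payload
# closes the src attribute and adds an onerror handler.
weak = "<img src='%s'>" % html.escape(payload, quote=False)
print(weak)

# With quote=True (the default), " and ' are encoded as well, so the whole
# payload stays inside the attribute value.
strong = "<img src='%s'>" % html.escape(payload)
print(strong)
```

Entity encoding only addresses breaking out of the attribute. A src or href value can still carry an unwanted URL scheme, which needs a separate check.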
As stated above, the unfortunate consequence of this fix is that users are
prevented from embedding non-malicious HTML into pages. Because HTML standards
do not provide any simple mechanism to disable client-side scripts in specific
portions of a web-page, it is difficult to reliably cleanse script from normal
HTML. The most reliable method is for web applications to parse the HTML, strip
tags and attributes that do not appear in a whitelist, and output valid HTML.
However, even this parsing can sometimes be exploited by providing
malformed HTML that breaks the parsing algorithm. A
similar attack targeted Hotmail in October 2001, as described above.
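The parse, strip, and re-emit approach described above might look roughly like the sketch below, built on the standard library's html.parser module. The whitelist contents, the class name `WhitelistSanitizer`, and the restriction of href values to http and https are all assumptions made for illustration. It is not a vetted sanitizer, and real applications should prefer a maintained library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attributes allowed on that tag.
ALLOWED = {"b": set(), "i": set(), "p": set(), "a": {"href"}}
SAFE_SCHEMES = ("http://", "https://")


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []  # whitelisted tags that are currently open

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text is still emitted, encoded
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # rejects javascript: and other unexpected schemes
            kept.append(' %s="%s"' % (name, escape(value)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only close tags that were actually opened, innermost first.
        if tag in self.open_tags:
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append("</%s>" % top)
                if top == tag:
                    break

    def handle_data(self, data):
        self.out.append(escape(data))  # text is always re-encoded


def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    # Close anything the input left open, so the output is well-formed.
    while parser.open_tags:
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)


print(sanitize("<b onclick='evil()'>hi</b><script>alert(1)</script>"))
```

Because the output is rebuilt from parsed tokens instead of being patched in place, the browser never sees the attacker's original bytes, which is what makes this approach more robust than pattern-based filtering.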
Other forms of mitigation
The easiest way to eliminate XSS vulnerabilities is to encode (HTML quote)
all user-supplied HTML special characters, thereby preventing them from being
interpreted as HTML. Unfortunately, users of many kinds of web applications
(commonly forums and webmail) wish to use some of the features HTML provides.
There are some web applications which attempt to identify all "evil" HTML and
neutralize it, either by removing it or encoding it. These algorithms usually
end up being incredibly complex, and for this reason it is almost impossible to
know for sure if all possible injections are eliminated. This is partly due to
the fact that browser and web technologies are still heavily under development. In
order to eliminate certain injections, any server-side algorithm must either
reject broken HTML, understand how every browser will interpret broken HTML, or
(preferably) fix the HTML to be well-formed.
Besides content filtering, other methods for XSS mitigation are also commonly
used. One example is that of cookie security. Many web applications rely on
session cookies for authentication between individual HTTP requests, and because
client-side scripts generally have access to these cookies, most simple XSS
exploits are written to steal these cookies. To mitigate this particular threat
(though not the XSS problem in general), many web applications tie session
cookies to the IP address of the user who originally logged in, and only permit
that IP to use that cookie. This is effective in most situations (if an attacker
is only after the cookie), but obviously breaks down in situations where an
attacker is behind the same NATed IP address or web proxy. Internet Explorer
also has a feature, called the HttpOnly flag, which
allows a web server to set a cookie which is unavailable to client-side scripts.
While this seems like a useful feature, it does not prevent the use of XSS to
steal passwords or perform
cross-site request forgery attacks.
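As an illustration of the flag mentioned above, the sketch below builds a Set-Cookie header with Python's standard http.cookies module. The cookie name and value are hypothetical, and whether a given browser honours the attribute is outside the scope of this example.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "d41d8cd9"        # hypothetical session identifier
cookie["session"]["httponly"] = True  # hide the cookie from client-side script
cookie["session"]["path"] = "/"

# Emits a Set-Cookie header line carrying the HttpOnly attribute.
print(cookie.output())
```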
An additional common mitigation is to use input validation of all
potentially malicious data sources. This is a common theme in application
development (even outside of web development) and is generally very useful. For
instance, if a form accepts a field which is supposed to contain a phone
number, a server-side routine could remove all characters other than digits,
parentheses, and dashes, such that the result cannot contain a script.
(Incidentally, this can be used to prevent other injection attacks, such as
SQL injection, from being successful.) While effective for most types of input,
there are times when an application, by design, must be able to accept special
HTML characters, such as '<' and '>'. In these situations, HTML entity encoding
is the only option.
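The phone number routine described above can be sketched with a single regular expression. The function name `clean_phone` is hypothetical, and the character set is exactly the one named in the text: digits, parentheses, and dashes.

```python
import re


def clean_phone(raw):
    # Remove every character other than digits, parentheses, and dashes.
    return re.sub(r"[^0-9()\-]", "", raw)


print(clean_phone("(555) 123-4567"))
print(clean_phone("555<script>alert('xss')</script>"))
```

Stripping in this way guarantees a harmless result but can silently alter what the user typed. Rejecting the input and asking for correction is a common alternative design.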
Finally, some web applications are written to (sometimes optionally) operate
completely without the need for client-side scripts. This allows users, if they
choose, to completely disable scripting in their browsers before using the
application. In this way, even potentially malicious client-side scripts could
be inserted unescaped on a page, and users would not be susceptible to XSS
attacks. Unfortunately external content can still be loaded into the page with
tags like <iframe> or <object>, which often is enough to trick users.
Many browsers can also be configured to disable client-side scripts on a
per-domain basis. Blocking only selected sites is a common mistake, because it
blocks bad sites only after the user knows that they are bad, which is normally
too late. Only functionality that blocks all scripting and external inclusions
by default, and then allows the user to enable them on a per-host basis, can
make such a system effective. Doing this has been possible for a long time in
Internet Explorer (since version 4) by setting up the so-called "Security
Zones", and in Opera since version 9 using its "Site Specific Preferences",
which are slightly more discoverable and usable.
A user-friendly solution for Firefox and other Gecko-based browsers is provided by the open source
NoScript extension, which also features specific
anti-XSS protection.
One of the major drawbacks to this mitigation is that most users are ignorant
of such measures, and would not know how to properly secure their browsers for
such applications, if these security settings were disabled by default. The
other major drawback is that many insecure sites simply don't work without
client-side scripts, thereby forcing the user to disable the protection for that site
and opening their system to the threat.
There are several classes of vulnerabilities or attack techniques which are
related, and worth mentioning:
- Cross Zone Scripting vulnerabilities, which exploit "zone" concepts in
software and usually execute code with a greater privilege.
- HTTP Header Injection vulnerabilities, which can be used to create
cross-site scripting conditions in addition to allowing attacks such as
HTTP response splitting.
- Cross-site request forgery (CSRF/XSRF), which is almost the opposite of XSS, in
that rather than exploiting the user's trust in a site, the attacker
exploits the site's trust in the client software, submitting requests that
the site believes come from its own pages.
Cross-site request forgery
Evil twin (wireless networks)
HTTP response splitting
IDN homograph attack