Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a guide page on XSS #36412

Merged
merged 34 commits into from
Dec 17, 2024
Merged

Add a guide page on XSS #36412

merged 34 commits into from
Dec 17, 2024

Conversation

wbamberg
Copy link
Collaborator

@wbamberg wbamberg commented Oct 19, 2024

Here's a guide page on XSS, covering (1) what it is and (2) defenses. It's a replacement for https://developer.mozilla.org/en-US/docs/Web/Security/Types_of_attacks#cross-site_scripting_xss.

I think it could be part of a series of articles on "attacks" so I've put it under Web/Security/Attacks. Other articles here might include:

  • clickjacking
  • xs leaks (CORP, COEP etc)
  • MITM perhaps
  • CSRF
  • supply chain attacks?

I wanted it to be quite accessible and not assume too much knowledge.

I thought of including more on "what an XSS can do" but didn't, although there is little about that. I also thought of something specifically on cross-site styling since that doesn't get talked about much.

@github-actions github-actions bot added Content:Security Security docs size/m [PR only] 51-500 LoC changed labels Oct 19, 2024
Copy link
Contributor

github-actions bot commented Oct 19, 2024

Preview URLs

External URLs (6)

URL: /en-US/docs/Web/Security/Attacks/XSS
Title: Cross-site scripting (XSS)

(comment last updated: 2024-12-17 01:36:31)

* origin/xss-guide: (284 commits)
  Add information on default entryPoint property values (mdn#36633)
  Bump husky from 9.1.6 to 9.1.7 (mdn#36863)
  fix(performance): Typo '50ms seconds' (mdn#36861)
  Add spec_url & add note for bcd for `<frequency>` and `<frequency-percentage>` (mdn#36848)
  addresses 36583: summary icon styles (mdn#36691)
  Remove "simple" part 3: change to "basic"  (mdn#36762)
  the default option of a select (mdn#36658)
  docs(css): Add support for `<string>` in `syntax` descriptor of @Property at-rule (mdn#36655)
  Fix parameter syntax for `Navigation.updateCurrentEntry()` (mdn#36852)
  Update CSP source expression reference (mdn#36792)
  chore(http): Refresh headers docs (d-k) (mdn#36075)
  chore(http): Refresh headers r-s (mdn#36590)
  Updated index.md (mdn#36845)
  fix : wrong method name (mdn#36843)
  Remove all redirects to other locales (mdn#36811)
  fix typos (mdn#36837)
  docs: update Accept-Charset status (mdn#36822)
  updateSelection: make more intuitive (mdn#36834)
  updateText: Remove false information (mdn#36832)
  DOMRect instance properties (mdn#36704)
  ...
@wbamberg wbamberg marked this pull request as ready for review November 28, 2024 01:08
@wbamberg wbamberg requested a review from a team as a code owner November 28, 2024 01:08
@wbamberg wbamberg requested review from hamishwillee and removed request for a team November 28, 2024 01:08

In one architecture, the user navigates to the blog post by following a normal link. The browser makes an HTTP GET request to the server, which retrieves the blog post and comments from a database, then feeds them into a template to construct the response as an HTML document. The browser receives the HTML document containing the malicious code, and renders it.

![Diagram of server-side XSS](server-side-xss.svg)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope these diagrams are understandable! Really I just want to illustrate the two different architectures, to show that the XSS can happen in either client or server.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the diagrams. I'm just not sure I believe the point :-( See comment below https://github.com/mdn/content/pull/36412/files#r1865193925

Comment on lines 199 to 214
- **HTML contexts**: input inserted between the tags of most HTML elements (except for {{htmlelement("style")}} or {{htmlelement("script")}}) is in the HTML context, as in the example above, and the encoding applied by template engines is mostly concerned with this context.
- **HTML attribute contexts**: inserting input as HTML attribute values is sometimes safe and sometimes not, depending on the attribute. In particular, event handler attributes like `onblur` are unsafe, as is the [`src`](/en-US/docs/Web/HTML/Element/iframe#src) attribute of the {{htmlelement("iframe")}} element.

It's important to quote placeholders for inserted attribute values, or an attacker may be able to insert an additional unsafe attribute in the value provided. For example, this template does not quote an inserted value:

```django example-bad
<div class=\{{ my_class }}>...</div>
```

An attacker can exploit this to inject an event handler attribute, by using input like `some_id onmouseover="alert('XSS!')"`. To prevent the attack, we can quote the placeholder:

```django example-good
<div class="\{{ my_class }}">...</div>
```

- **JavaScript and CSS contexts**: inserting input inside {{htmlelement("script")}} or {{htmlelement("style")}} tags is almost always unsafe.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a rather simplified version of https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html#output-encoding. I didn't talk about URL contexts, perhaps I should? I wondered about whether to include this at all but I think especially understanding the dangers of using input variables in HTML attributes is important.

In a successful XSS attack, the attacker is able to subvert the same-origin policy, executing their code in the context of the target site. The code can then do anything that the site's own code can do, including, for example:

- access all the content of the site's loaded pages, and any content in local storage
- make HTTP requests with the user's credentials, enabling them to impersonate the user or access sensitive data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So above I point to the ability to modify local data. The worry here is access, but also modification. Dunno if we need to got to this level of details.

BTW note the MDN list style https://developer.mozilla.org/en-US/docs/MDN/Writing_guidelines/Writing_style_guide#lists

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand the MDN style guide. It says:

Sentences and phrases (i.e., sentence fragments missing a verb or a subject or both) in bulleted lists should include standard punctuation — sentences end with periods, phrases don't.

...but then gives as an example:

In this example, we should include:

  • A condition, with a brief explanation.
  • A similar condition, with a brief explanation.
  • Yet another condition, with some further explanation.

...but none of these are sentences, why do they end with periods?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say typo ^^^

I find correct punctuation ugly so what I hope all this means is that this should be:


for example:

  • Access all the content of the site's loaded pages, and any content in local storage
  • Make HTTP requests with the user's credentials, enabling them to impersonate the user or access sensitive data

Anyway, @dipikabh is on leave for the rest of the year, so I guess as long as it is consistent it goes through.


### Output encoding

_Output encoding_ is the process by which characters in the input string that potentially make it dangerous are escaped, so they are treated as text instead of being treated as part of a language like HTML.
Copy link
Collaborator

@hamishwillee hamishwillee Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but it would be nice if there was a glossary for "escaping text" or similar to link here.

We might perhaps use https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's a very friendly page! Agree that a glossary page would be good -> #37206.

<p>You searched for \{{ search_term }}.</p>
```

Most modern templating engines automatically perform output encoding. For example, if you pass `<img src=x onerror=alert('hello!')>` into the Django template above, it will be rendered as text:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we dont have any link to escape we might modify this bit to more concretely demonstrate what is happening. Just a thought. This "pseudo-codes" one option:

Suggested change
Most modern templating engines automatically perform output encoding. For example, if you pass `<img src=x onerror=alert('hello!')>` into the Django template above, it will be rendered as text:
Most modern templating engines automatically perform output encoding. For example, if you pass `<img src=x onerror=alert('hello!')>` into the Django template above, it will be converted to Xxxxxx, and rendered as thought the original content was text:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did something a little different here, listing the conversions that Django does -> 4557801. I think this should make it more concrete. The "is converted to" is a repetitive but I couldn't think of anything prettier, I tried a table but the rendering of them in Yari is ugly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even so #36412 (review)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(which is to say I like what you did, but I still think what I suggested is useful)

@hamishwillee
Copy link
Collaborator

hamishwillee commented Dec 12, 2024

I love the changes. Excellent "flow" now - concepts build on each other progressively. No unnecessary repetition.

A couple of spot suggestions - happy to merge when you've had a look.

Edit: You might add comments like "this is referred to as reflected XSS" etc in appropriate sections. The argument against is that they aren't really useful. The argument for is so that someone reading this doesn't later get confused by these definitions in other places. I'm OK either way.

@wbamberg
Copy link
Collaborator Author

Edit: You might add comments like "this is referred to as reflected XSS" etc in appropriate sections. The argument against is that they aren't really useful. The argument for is so that someone reading this doesn't later get confused by these definitions in other places. I'm OK either way.

Initially I was going to keep this somewhere but once I'd done all this rework I thought it would be distracting and not very helpful. The thing is that "reflected XSS" is kind of hard to explain quickly (or at least, the explanations I've read, which are generally something like "input in the HTTP request is included in the response" are not easy for me to grasp). So either you say "this is sometimes called reflected XSS" which makes me think, what is? What is it about this example that's reflected XSS? Or you have to take time to explain it, which we don't really want to do...


In this case the code just displays an alert, but in a real banking website, the attacker code would be able to do anything that the bank's own front-end code could.

### Code injection in the back end
Copy link
Collaborator

@hamishwillee hamishwillee Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it just me, or is this a double ententre?

Even if not, suggest it is more clear to use "server" and "browser" for these cases?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I've been living in Canada too long!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> 190f86a

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm a sick sick man.

Copy link
Collaborator

@hamishwillee hamishwillee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great - much better than what it replaces. Thanks for your patience with the nitpicking @wbamberg

@hamishwillee hamishwillee merged commit 023a970 into mdn:main Dec 17, 2024
8 checks passed
allan-bonadio pushed a commit to allan-bonadio/content that referenced this pull request Dec 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Content:Security Security docs size/m [PR only] 51-500 LoC changed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants