Check for Unicode escape sequences in CVE descriptions and/or anywhere else relevant #3

Open
Chris-Turner-NIST opened this issue Sep 14, 2023 · 2 comments

Comments

@Chris-Turner-NIST

The CVE schema expects data to be in UTF-8. However, many records contain Unicode escape sequences (for example, a literal `\xc3\xa0`) instead of the characters they are meant to represent.

We should add this as a check for the lint tool.

@mprpic
Owner

mprpic commented Sep 14, 2023

This might be a good candidate for a warning-level finding instead of an error, since there may be valid uses of escape sequences, such as in the CVE-2021-26929 record:

"value": "An XSS issue was discovered in Horde Groupware Webmail Edition through 5.2.22 (where the Horde_Text_Filter library before 2.3.7 is used). The attacker can send a plain text e-mail message, with JavaScript encoded as a link or email that is mishandled by preProcess in Text2html.php, because bespoke use of \\x00\\x00\\x00 and \\x01\\x01\\x01 interferes with XSS defenses."

but we probably want to flag this as a mistake (CVE-2021-27253):

"value": "Ho\\xc3\\xa0ng Th\\xe1\\xba\\xa1ch Nguy\\xe1\\xbb\\x85n, Lucas Tay"

Wdyt, @Chris-Turner-NIST?
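
One way to tell the two cases apart, as a rough sketch only: collect each run of literal `\xHH` escapes in the parsed description, turn it back into bytes, and see whether those bytes decode as UTF-8 into non-ASCII characters. The function name `find_escaped_utf8` and the heuristic itself are assumptions for illustration, not something the lint tool does today. The input is assumed to be the string after JSON parsing, so `\\x` in the record is a single backslash followed by `x`.

```python
import re

# Runs of one or more literal "\xHH" escapes (backslash, 'x', two hex digits).
HEX_ESCAPE_RUN = re.compile(r"(?:\\x[0-9a-fA-F]{2})+")


def find_escaped_utf8(text):
    """Return (escaped_run, decoded_text) pairs for runs that look like escaped UTF-8.

    Hypothetical helper: the name and heuristic are illustrative assumptions.
    """
    findings = []
    for match in HEX_ESCAPE_RUN.finditer(text):
        # Strip the "\x" prefixes and rebuild the raw bytes the escapes describe.
        raw = bytes.fromhex(match.group(0).replace("\\x", ""))
        try:
            decoded = raw.decode("utf-8")
        except UnicodeDecodeError:
            continue  # not valid UTF-8, so not flagged by this heuristic
        # Only non-ASCII results suggest a character that was escaped by mistake.
        if any(ord(ch) > 0x7F for ch in decoded):
            findings.append((match.group(0), decoded))
    return findings
```

With this heuristic, the CVE-2021-27253 value would be reported (each run decodes to an accented letter), while the `\x00\x00\x00` and `\x01\x01\x01` runs in CVE-2021-26929 decode to ASCII control characters and would pass through without a finding.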

@Chris-Turner-NIST
Author

I do not have an issue with this being a warning since some cases may be legitimate.
Generally I see this tool as a first step in ensuring better data quality amongst CVE Program participants.
So long as a known problem area is being tracked in some way, we can have proper awareness and thus drive better submissions by those responsible.
A secondary goal would be to retroactively clean up older records that don't meet expectations, which a warning still helps achieve.

Scope creep: should we add a false-positive filter so that rules can skip known, acceptable edge cases?
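
A minimal sketch of what such a filter could look like, assuming findings are plain dicts with `rule` and `cve_id` keys. The rule ID `hex-escape-sequence`, the `KNOWN_ACCEPTABLE` set, and `filter_findings` are all hypothetical names, not part of the existing tool.

```python
# Hypothetical allowlist: (rule ID, CVE ID) pairs that a human has reviewed
# and accepted as legitimate uses.
KNOWN_ACCEPTABLE = {
    ("hex-escape-sequence", "CVE-2021-26929"),
}


def filter_findings(findings, allowlist=KNOWN_ACCEPTABLE):
    """Drop findings whose (rule, cve_id) pair appears on the allowlist."""
    return [f for f in findings if (f["rule"], f["cve_id"]) not in allowlist]
```

Keying on the rule and record together means an accepted edge case is only silenced for the specific rule that was reviewed, so other rules would still report on the same record.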
