Check for Unicode escape sequences in CVE descriptions and/or anywhere else relevant #3

Chris-Turner-NIST · 2023-09-14T19:06:38Z

The CVE schema expects data to be in UTF-8. However, many records contain Unicode escape sequences instead of the actual Unicode characters.

We should add this as a check for the lint tool.

mprpic · 2023-09-14T19:50:42Z

This might be a good candidate for a warning-level finding instead of an error, since there may be valid uses of escape sequences, such as in the CVE-2021-26929 record:

"value": "An XSS issue was discovered in Horde Groupware Webmail Edition through 5.2.22 (where the Horde_Text_Filter library before 2.3.7 is used). The attacker can send a plain text e-mail message, with JavaScript encoded as a link or email that is mishandled by preProcess in Text2html.php, because bespoke use of \\x00\\x00\\x00 and \\x01\\x01\\x01 interferes with XSS defenses."

but we probably want to flag this as a mistake (CVE-2021-27253):

"value": "Ho\\xc3\\xa0ng Th\\xe1\\xba\\xa1ch Nguy\\xe1\\xbb\\x85n, Lucas Tay"

Wdyt, @Chris-Turner-NIST?
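
To make the distinction between the two records above concrete, here is a minimal Python sketch of one possible heuristic. It is not the lint tool's implementation; the function name `find_escape_runs` and the classification labels are hypothetical. It assumes the value has already been JSON-decoded, so the escapes appear as a literal backslash followed by `xHH`. A run whose bytes include values of 0x80 or above and decode as valid UTF-8 is treated as probably mis-encoded text, while a run of plain ASCII bytes (such as `\x00`) is treated as possibly intentional.

```python
import re

# Matches one or more consecutive literal "\xHH" escapes in a JSON-decoded string.
ESCAPE_RUN = re.compile(r"(?:\\x[0-9A-Fa-f]{2})+")


def find_escape_runs(text):
    """Return (run, classification, decoded) tuples for each run of \\xHH escapes.

    The classification labels are hypothetical and only illustrate the idea.
    """
    findings = []
    for match in ESCAPE_RUN.finditer(text):
        run = match.group(0)
        # Turn the textual escapes back into the bytes they describe.
        raw = bytes.fromhex(run.replace("\\x", ""))
        if any(b >= 0x80 for b in raw):
            try:
                # Valid multi-byte UTF-8 suggests text that was escaped by mistake.
                findings.append((run, "likely-mis-encoded", raw.decode("utf-8")))
            except UnicodeDecodeError:
                findings.append((run, "unknown", None))
        else:
            # ASCII-only bytes (e.g. NUL) may be a deliberate part of the description.
            findings.append((run, "possibly-intentional", None))
    return findings


if __name__ == "__main__":
    for sample in (
        "Ho\\xc3\\xa0ng Th\\xe1\\xba\\xa1ch Nguy\\xe1\\xbb\\x85n, Lucas Tay",
        "bespoke use of \\x00\\x00\\x00 and \\x01\\x01\\x01 interferes",
    ):
        for finding in find_escape_runs(sample):
            print(finding)
```

With this heuristic, both cases would still be reported at warning level, but the CVE-2021-27253 style of finding could carry the decoded text as a suggested fix.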

Chris-Turner-NIST · 2023-09-15T16:20:16Z

I do not have an issue with this being a warning since some cases may be legitimate.
Generally I see this tool as a first step in ensuring better data quality amongst CVE Program participants.
So long as a known problem area is being tracked in some way, we can have proper awareness and thus drive better submissions by those responsible.
A secondary goal would be to retroactively clean up older records that don't meet expectations, which a warning still helps achieve.

Scope creep: should we add a false-positive filter so that rules can skip known, acceptable edge cases?
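
One way such a filter could look, sketched in Python under assumptions: the `Finding` type, the rule name `escape-sequence`, and the `KNOWN_ACCEPTABLE` allowlist are all hypothetical and not part of the tool. The idea is simply to key accepted exceptions by record ID and rule, and to drop matching findings before reporting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A single lint result (hypothetical shape, for illustration only)."""
    cve_id: str
    rule: str
    message: str


# Hypothetical allowlist of (CVE ID, rule) pairs reviewed and accepted as legitimate.
KNOWN_ACCEPTABLE = {
    ("CVE-2021-26929", "escape-sequence"),
}


def filter_findings(findings, allowlist=KNOWN_ACCEPTABLE):
    """Drop findings whose (cve_id, rule) pair is on the allowlist."""
    return [f for f in findings if (f.cve_id, f.rule) not in allowlist]


if __name__ == "__main__":
    results = [
        Finding("CVE-2021-26929", "escape-sequence", "escape run in description"),
        Finding("CVE-2021-27253", "escape-sequence", "escape run in credits"),
    ]
    for finding in filter_findings(results):
        print(finding.cve_id, finding.rule)
```

Keying on the rule as well as the record ID means an accepted escape-sequence exception would not hide findings from other rules on the same record.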
