Created ordered list for HTTP headers #313

Closed
wants to merge 4 commits into from

nakulbajaj

Modified ZGrab2's HTTP package to extract headers as a list of
HeaderField objects and then extract as a map via the textproto package.

Created new HeaderField object with UTF-8 string for header names
paired with a byte slice for header values.
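A minimal sketch of the shape described above, not the PR's actual code: the `HeaderField` struct layout and the `toMIMEHeader` helper are assumptions made for illustration. Only `textproto.MIMEHeader` and its `Add`/`Get` methods come from the Go standard library.

```go
package main

import (
	"fmt"
	"net/textproto"
)

// HeaderField is a hypothetical stand-in for the type the PR describes:
// a UTF-8 header name paired with the raw bytes of its value.
type HeaderField struct {
	Name  string
	Value []byte
}

// toMIMEHeader folds an ordered list of fields into a textproto.MIMEHeader.
// Keys are canonicalized by Add. The relative order of different header
// names is lost in the map; only the order of repeated values under the
// same name is kept.
func toMIMEHeader(fields []HeaderField) textproto.MIMEHeader {
	h := make(textproto.MIMEHeader)
	for _, f := range fields {
		h.Add(f.Name, string(f.Value))
	}
	return h
}

func main() {
	fields := []HeaderField{
		{Name: "server", Value: []byte("SimpleHTTP/0.6 Python/2.7.9")},
		{Name: "Content-Type", Value: []byte("text/html")},
	}
	h := toMIMEHeader(fields)
	fmt.Println(h.Get("Server"))
	fmt.Println(h.Get("content-type"))
}
```

The list keeps wire order and raw bytes; the map is a convenience view derived from it.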

@nakulbajaj nakulbajaj marked this pull request as ready for review June 4, 2021 18:27
@mzpqnxow
Contributor

mzpqnxow commented Jun 21, 2021

@nakulbajaj does this change how the headers are represented in the NDJSON output?

(Aside from ordering, of course)

@nakulbajaj
Author

@mzpqnxow This doesn't change the "headers" dictionary in the JSON output at all. Instead, it adds a new field to the JSON output called "header_list", which is composed of a list of header objects (each with a name and a value string).

@phillip-stephens
Contributor

phillip-stephens commented Jan 27, 2025

Hey @nakulbajaj, thanks for the PR and apologies it's taken 4 years to get any eyes on it!

I'm curious, though, what the value-add is for this ordered list. From a quick HTTP scan, it looks like header_list's fields are the same as those in headers, except that the values are base64-encoded rather than decoded into human-readable strings as they are in headers.
Are you wanting a way to store the exact order of header fields sent by the server, or something else? From initial impressions, this seems to require twice the output for the one benefit of preserving the header order sent by the server, which I'd imagine we could do in a less verbose way: "header_ordered_list": ["Server", "Date",...].

          "headers": {
            "content_length": [
              "102"
            ],
            "content_type": [
              "text/html"
            ],
            "date": [
              "Mon, 27 Jan 2025 19:40:23 GMT"
            ],
            "last_modified": [
              "Mon, 27 Jan 2025 19:38:52 GMT"
            ],
            "server": [
              "SimpleHTTP/0.6 Python/2.7.9"
            ]
          },
          "header_list": [
            {
              "name": "Server",
              "value": "U2ltcGxlSFRUUC8wLjYgUHl0aG9uLzIuNy45"
            },
            {
              "name": "Date",
              "value": "TW9uLCAyNyBKYW4gMjAyNSAxOTo0MDoyMyBHTVQ="
            },
            {
              "name": "Content-Type",
              "value": "dGV4dC9odG1s"
            },
            {
              "name": "Content-Length",
              "value": "MTAy"
            },
            {
              "name": "Last-Modified",
              "value": "TW9uLCAyNyBKYW4gMjAyNSAxOTozODo1MiBHTVQ="
            }
          ],

@phillip-stephens
Contributor

I'm going to close this due to the length of time this has been opened. Please feel free to re-open if the PR still feels justified.

@Seanstoppable
Contributor

Seanstoppable commented Jan 28, 2025

So, my understanding is that in some scenarios header order matters for things like fingerprinting some services/systems. So returning them in a map, where ordering is non-deterministic and values are concatenated, prevents this.
My org maintains a fork of zgrab2 with some modifications to the http module to capture headers in their more 'raw' representation (though as a block of strings, akin to curl output) to make sure we can do similar fingerprinting.
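To illustrate why order matters here, a toy sketch that hashes the sequence of header names. It needs an ordered slice as input, which a Go map cannot provide because map iteration order is unspecified. The `orderFingerprint` function and its hashing scheme are purely hypothetical and are not taken from zgrab2 or any fork of it.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// orderFingerprint hashes the lowercased header names in the order given.
// Two servers sending the same headers in a different order get different
// fingerprints. This scheme is an assumption made for illustration only.
func orderFingerprint(names []string) string {
	lowered := make([]string, len(names))
	for i, n := range names {
		lowered[i] = strings.ToLower(n)
	}
	sum := sha256.Sum256([]byte(strings.Join(lowered, ",")))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := orderFingerprint([]string{"Server", "Date", "Content-Type"})
	b := orderFingerprint([]string{"Date", "Server", "Content-Type"})
	fmt.Println(a == b) // same headers, different order
}
```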

@Seanstoppable
Contributor

This is probably still obsoleted by #349
