libpcap should fully support pcapng files #1321
Comments
exactly. I don't really know what the right model is going to be. |
Some initial requirements for pcapng support

Support for all defined pcapng blocks, as well as vendor and custom blocks

All blocks should be made available to readers and should be writable by writers. This includes locally-defined block types and custom blocks. Readers should not have to byte-swap fields in officially-defined blocks or the officially-defined fields in custom blocks. However, they might have to provide code to byte-swap fields in locally-defined blocks and, if necessary, subfields of the Custom Data field in custom blocks, although we could provide that code ourselves for some such blocks.

A program using these APIs should be able to read both pcapng and pcap files

There should not be separate "open a pcap file" and "open a pcapng file", unlike what Apple did in their internal (and not publicly available) APIs in libpcap. Having to try opening the file as pcapng and, if that fails, try opening it as pcap is clumsy and unnecessarily awkward. If a pcap file is opened, the program should receive a simulated Section Header Block (SHB), followed by a simulated Interface Description Block (IDB) (without if_name or if_description options, as that information is not available in pcap files), followed by a sequence of Enhanced Packet Blocks (EPBs).

Capturing and reading should have calls similar to
|
@guyharris take a look at https://github.com/Technica-Engineering/LightPcapNg for inspiration, pure C pcapng library. |
This sounds like a good idea. Some questions: The ticket title specifies pcapng files, but you also intend live capture to support the pcapng format, blocks etc? The API for modules will also need appropriate updates. When configuring live capture, should libpcap offer IDBs to describe the available interfaces (pcap_findalldevs() equivalent)? Should libpcap create a SHB when starting live capture, and emit it as the first block (followed by IDBs etc), or is that the application's responsibility? The application may or may not be intending to write a pcapng file. |
Yes. For example, capturing on either the Linux or macOS "any" device should provide IDBs for each interface, and EPBs for packets that indicate the interface on which the packet arrived, as well as providing various EPB options such as the flag word, so that packet direction, etc. can be provided.
Yes.
Yes, as per the above.
Yes.
The application can ignore blocks that contain no information of interest to it. If the application is writing a pcap file:
and return 0 on success and a warning or error code on a warning condition or error. |
With Section Length set to -1 presumably. In support of lossless Section rotation (and potentially file rotation) in live capture it would be helpful for the application to be able to request a new SHB. In particular it would be helpful to support requesting a new SHB before a specific timestamp is passed or size is exceeded. For example starting capture at 1:30pm and requesting a new SHB be emitted at 2:00pm. Libpcap would then continue supplying EPBs until a block with timestamp after 2:00pm is captured, at which point it would emit the new SHB first, followed by IDBs, and the EPB inside the new Section. The application could then choose whether to additionally rotate the files at the SHB boundary (without packet drop). Alternatively the application could cache the initial SHB and IDBs and do the Section rotation itself, including updating details which may have changed in the meantime. |
Yes, given that it doesn't know how big the section will be - and, given that it might be writing to a pipe, it might not even be able to fix it later (I guess it could check whether it's writing to something other than a regular file, and fix it up if not when it closes the file, but Wireshark doesn't bother doing that, so it's not a high priority).
The only reason I've ever seen for multiple SHBs was to allow simple concatenation of pcapng files, which I think was the original reason for supporting it. Is there another use case for that? |
That is a shame as the fix up is relatively cheap, provided the file is seekable.
If the Section Lengths are updated after the fact, then starting new Sections within a file periodically could make a large file more easily navigable/seekable. You could potentially direct Wireshark to dissect only a specific Section within a file to avoid memory exhaustion. Alternatively the application could simply break the file at the Section boundary and continue capturing without losing packets. |
Presumably for "navigable" you mean "navigable by the end user", e.g. having a display of sections, with start and end dates. Is the same the case for "seekable"? (Wireshark does all seeks in most file types, including pcapng, as random accesses to particular file offsets.) |
Right, navigable meaning an index of Sections like book chapters in the UI. With Section Lengths present this could typically be generated in less than a second. Seeking meaning selecting and loading one (or more) Section(s) from a large file without having to scan linearly through the file, or dissect all blocks into memory. |
But section boundaries are arbitrary; there's no guarantee that, for example, a request won't be in one section and the reply to it in another. Do they really correspond to book chapters, unless you have a book in which one character asks a question at the end of a chapter the reply to which appears early in the next chapter?
Wireshark makes only one linear scan through the file, building a table of frames, each entry in which has an offset in the file to the beginning of the block/record for that frame. It doesn't load all blocks into memory, nor does it save in memory the dissection of all frames. And to find a given frame within a section requires scanning linearly through the section and building an index for the section. |
Sure, file rotation in live capture is always arbitrary, but still useful. Only a relatively small number of flows will be broken across the boundary. If you can navigate by Section within a single file you might choose to start filling your packet list a few thousand packets before the end of the previous Section, just as you might flip a few pages back before the start of a new chapter to remind yourself of context. Reverse seeking in pcapng makes this easy to implement and fast.
Right, but even the 'table of frames' grows in memory linearly with the number of packets scanned? Isn't there also some stream indexing done at the first pass, e.g. for TCP?
True, but linear scanning a single 1GB Section in a 100GB file is still ~100x faster. I apologise this discussion is getting off topic, Wireshark features are likely best discussed there. I think that multiple Sections per capture or file can be useful, and would be worth supporting in libpcap. The capturing application can do most of the work. |
So a call that causes the next block provided to the callback to be an SHB, followed by a set of IDBs? Providing a mechanism to patch the section size in the previous SHB can't be done in the capture/read code path, as there's no guarantee that the application is writing a capture file. That's probably best done in the "write a pcapng file" code when it sees a new SHB to be written; it would do that when writing to a regular file. |
...and, of course, only do that if the total number of bytes worth of blocks written to the section, after the SHB was written, is < 0x7FFFFFFF; otherwise, it won't fit in a signed 32-bit integer field. |
Either that, or require the application to either cache them or explicitly re-request them for the new Section. Perhaps that is simpler.
Yes, it would be helpful if closing a Section in the writing code updated the SHB automatically. |
SHB Section Length is 64-bits? |
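For reference, the pcapng specification defines Section Length as a signed 64-bit field, 16 bytes into the SHB, with -1 meaning "unspecified". Below is a minimal sketch of patching that field in an in-memory copy of an SHB. The helper names (`shb_get_section_length`, `shb_patch_section_length`) are hypothetical and not part of libpcap, and the sketch assumes the SHB is in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHB_BLOCK_TYPE        0x0A0D0D0Au
#define SHB_BYTE_ORDER_MAGIC  0x1A2B3C4Du
#define SHB_SECTION_LEN_OFF   16   /* type(4) + length(4) + magic(4) + version(2+2) */
#define SHB_MIN_LEN           28   /* fixed fields plus trailing Block Total Length */

/* Returns 1 if buf looks like an SHB written in host byte order, else 0. */
int shb_is_host_order(const unsigned char *buf, size_t len)
{
    uint32_t type, magic;

    assert(buf != NULL);
    if (len < SHB_MIN_LEN)
        return 0;
    memcpy(&type, buf, 4);
    memcpy(&magic, buf + 8, 4);
    return type == SHB_BLOCK_TYPE && magic == SHB_BYTE_ORDER_MAGIC;
}

/* Hypothetical helper: read Section Length; returns 0 on success, -1 on error. */
int shb_get_section_length(const unsigned char *buf, size_t len, int64_t *out)
{
    assert(out != NULL);
    if (!shb_is_host_order(buf, len))
        return -1;
    memcpy(out, buf + SHB_SECTION_LEN_OFF, 8);
    return 0;
}

/*
 * Hypothetical helper: overwrite Section Length with a known size.
 * Negative sizes are rejected; returns 0 on success, -1 on error.
 */
int shb_patch_section_length(unsigned char *buf, size_t len, int64_t section_len)
{
    if (!shb_is_host_order(buf, len) || section_len < 0)
        return -1;
    memcpy(buf + SHB_SECTION_LEN_OFF, &section_len, 8);
    return 0;
}
```

A writer targeting a regular file could presumably remember the file offset of each SHB, seek back to that offset plus 16 when the section is closed, and write the 8-byte value there; on a pipe that seek fails and the field would stay at -1.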
I.e., having an API to switch sections is simpler? |
If packet blocks and other blocks are compressed, but SHBs are not, then updating the Section Length would still be possible. |
Yes, provided the Section can be switched after a block has been received for the new section. I.e. dumpcap wants to rotate files on a 5 minute boundary. It receives an EPB with a timestamp in the new period from the capture API, so closes the existing Section/file with the write API, opens a new file/Section/IDBs, then writes the current EPB into the new file/Section. No packets/EPBs should be dropped during the rotation process, provided capture buffers are not overrun. |
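To make the 5-minute example concrete, here is a rough sketch of the boundary check an application might run on each EPB timestamp before writing it. The `struct rotator` type and `rotator_check()` function are made up for illustration, and timestamps are assumed to be already converted to nanoseconds since the epoch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side state for time-based rotation. */
struct rotator {
    uint64_t period_ns;       /* rotation period, e.g. 5 minutes in ns */
    uint64_t current_period;  /* index of the period being written */
    int      started;         /* 0 until the first block is seen */
};

/*
 * Returns 1 if the block with timestamp ts_ns falls in a later period
 * than the one currently being written, meaning the caller should close
 * the current file/Section and open a new one BEFORE writing this block.
 * Returns 0 otherwise, including for the first block and for blocks
 * whose timestamps are slightly out of order.
 */
int rotator_check(struct rotator *r, uint64_t ts_ns)
{
    uint64_t period;

    assert(r != NULL && r->period_ns != 0);
    period = ts_ns / r->period_ns;
    if (!r->started) {
        r->started = 1;
        r->current_period = period;
        return 0;
    }
    if (period > r->current_period) {
        r->current_period = period;
        return 1;
    }
    return 0;
}
```

Because the check runs before the write, the block that triggers the rotation lands in the new file/Section rather than being dropped.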
The most common forms of compression are 1) running a compression program or 2) putting a compression library at a low level in the file-writing path; neither those programs nor those libraries have any clue about pcapng blocks, so individual blocks will not be compressed, the entire file will be compressed. The pcapng-extras document has a "Compression Block (experimental)", which says that the contents of the "Compressed Data" is, when decompressed, "made of other blocks", so it can presumably contain one or more blocks, presumably full blocks. As far as I know, there are no implementations of that whatsoever. So there is currently no mechanism in existence to arrange that "packet blocks and other blocks are compressed, but SHBs are not". That might be something to consider for the future. |
Or the new file; a "start a new section" call would also be useful when rotating files.
That's a bit more complicated, as "[the program writing a file, which isn't necessarily dumpcap] receives an EPB with a timestamp in the new period from the capture API" means that program's callback for a new block has already been called. Perhaps having libpcap trigger a file/section break before each block would be the right answer here. I'd prefer not to have all rotation policies wired into libpcap itself, so perhaps having a separate "time to rotate/stop/whatever" callback, which can use any criterion it wants to rotate/stop, and hand the block to that callback before handing it to the "process this block" callback, would be the right answer. |
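One possible shape for that ordering is sketched below, with every name invented for illustration (none of this is an existing or proposed libpcap API). The dispatcher hands each block to a policy callback first; the policy can ask for a new section or a stop, and only then is the block handed to the processing callback. The demo policy here is size-based, to show that the criterion is entirely up to the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical. */
enum block_action { ACTION_CONTINUE, ACTION_NEW_SECTION, ACTION_STOP };

struct block {
    uint32_t type;    /* pcapng block type */
    uint32_t length;  /* total block length in bytes */
};

typedef enum block_action (*precheck_cb)(const struct block *, void *user);
typedef void (*section_cb)(void *user);
typedef void (*block_cb)(const struct block *, void *user);

/* Deliver blocks in order; returns how many reached on_block. */
size_t dispatch_blocks(const struct block *blocks, size_t n,
                       precheck_cb precheck, section_cb on_new_section,
                       block_cb on_block, void *user)
{
    size_t delivered = 0;

    assert(blocks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        enum block_action a = precheck(&blocks[i], user);
        if (a == ACTION_STOP)
            break;
        if (a == ACTION_NEW_SECTION)
            on_new_section(user);      /* break happens before the block */
        on_block(&blocks[i], user);
        delivered++;
    }
    return delivered;
}

/* Demo policy: start a new section when the size limit would be exceeded. */
struct demo_state {
    uint32_t max_section_bytes;
    uint32_t section_bytes;
    int      rotations;
};

enum block_action demo_precheck(const struct block *b, void *user)
{
    struct demo_state *s = user;
    if (s->section_bytes != 0 &&
        s->section_bytes + b->length > s->max_section_bytes)
        return ACTION_NEW_SECTION;
    return ACTION_CONTINUE;
}

void demo_new_section(void *user)
{
    struct demo_state *s = user;
    s->rotations++;
    s->section_bytes = 0;
}

void demo_block(const struct block *b, void *user)
{
    struct demo_state *s = user;
    s->section_bytes += b->length;
}
```

The same effect could be had by doing the check at the top of the block callback itself; the separate callback only matters if the library, rather than the application, has to act on the decision.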
Understood. It sounds like the experimental Compression Block would achieve what I suggested. Ideally someone would contribute an implementation. I think there may be more demand for this capability (compared to just compressing the entire file) if reading applications implemented multi-Section 'navigation' discussed previously. |
Agreed, implementing rotation policies in libpcap would not be ideal. Better to keep that logic in the application. An additional 'rotation check' callback before every 'new block' callback seems redundant, provided the application could implement the rotation logic from the 'new block' callback? |
libpcap should have full support added for pcapng files.
The current APIs of libpcap for capturing packets and reading from files were designed before pcapng existed; they were designed for pcap, which was the file format created for tcpdump and supported by libpcap when the libpcap code was extracted from tcpdump and put into a separate library.
Thus, it has no notion of a pcapng block type, and provides only packet block information to the callback, as that's the only type of record pcap has.
To support capturing and reading, new APIs, designed to fully support pcapng, should be added; those APIs should also support pcap files by, for example, providing a fake Section Header Block synthesized from the byte-order in the pcap file header, a fake Interface Description Block synthesized from the time stamp resolution indicated by the magic number in the pcap file header and a link-layer type and snapshot length using the values from the pcap file header (it can't provide an interface name as that's not stored in pcap files), and fake Enhanced Packet Blocks synthesized from packet headers in the packet records in the pcap file.
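As a rough illustration of that synthesis, the sketch below derives the byte order, time stamp resolution, snapshot length, and link-layer type from a 24-byte pcap file header. The `struct synth_info` type and `synth_from_pcap_header()` function are hypothetical; only the magic numbers and field offsets come from the pcap file format. The `tsresol` value follows the pcapng if_tsresol convention (6 for microseconds, 9 for nanoseconds).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of what a simulated SHB and IDB would need. */
struct synth_info {
    int      swapped;   /* 1 if the file's byte order differs from the host's */
    uint8_t  tsresol;   /* 6 = microseconds, 9 = nanoseconds */
    uint32_t snaplen;   /* snapshot length from the pcap file header */
    uint32_t linktype;  /* link-layer type field from the pcap file header */
};

uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/*
 * Parse a pcap file header: magic at offset 0, snaplen at offset 16,
 * link-layer type at offset 20.
 * Returns 0 on success, -1 if the magic number is not recognized.
 */
int synth_from_pcap_header(const unsigned char hdr[24], struct synth_info *out)
{
    uint32_t magic, snaplen, linktype;

    assert(hdr != NULL && out != NULL);
    memcpy(&magic, hdr, 4);
    memcpy(&snaplen, hdr + 16, 4);
    memcpy(&linktype, hdr + 20, 4);

    switch (magic) {
    case 0xA1B2C3D4u: out->swapped = 0; out->tsresol = 6; break;
    case 0xA1B23C4Du: out->swapped = 0; out->tsresol = 9; break;
    case 0xD4C3B2A1u: out->swapped = 1; out->tsresol = 6; break;
    case 0x4D3CB2A1u: out->swapped = 1; out->tsresol = 9; break;
    default:
        return -1;
    }
    out->snaplen  = out->swapped ? swap32(snaplen)  : snaplen;
    out->linktype = out->swapped ? swap32(linktype) : linktype;
    return 0;
}
```

A reader built along these lines would emit the fake SHB and IDB once, right after parsing the file header, and then turn each packet record into a fake EPB referring to interface 0.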
The APIs for writing files are also pcap-only; new APIs should be added to support writing pcapng files. Those APIs should also support, for example, reporting errors when writing and when closing a file, as the last part of the file might not be written until the file is closed, all the way down to the lowest-level OS call to close the file, such as close() in UN*Xes. For NFS and possibly other remote file systems, the write to the underlying file system isn't done immediately; it may be buffered on the client side and later sent asynchronously to the server, with errors such as "file system full" or "quota exceeded" or, rarely, a real I/O error, returned by the server and reported in subsequent writes, which might be done as part of a close operation.
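A minimal sketch of what "report errors on close" could look like on top of standard C stdio is shown here. The function name `dump_close_checked()` is hypothetical; `fflush()` and `fclose()` are the standard calls, and both can fail with errors that were deferred from earlier writes.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical close routine for a dump file: flushes buffered data and
 * closes the stream, returning 0 on success or the errno value of the
 * first failure. The stream is closed in either case, so the caller
 * must not use fp afterwards.
 */
int dump_close_checked(FILE *fp)
{
    int err = 0;

    assert(fp != NULL);
    if (fflush(fp) == EOF)
        err = errno;               /* e.g. ENOSPC or EDQUOT from a late write */
    if (fclose(fp) == EOF && err == 0)
        err = errno;               /* error surfaced by the underlying close */
    return err;
}
```

A caller that ignores the return value is in the same position as one calling a close routine that returns nothing, which is the gap described above.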