-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments on 3.1. General Block Structure - endian and padding #41
Comments
Is backward navigation really a necessity? It adds 4 bytes at the end of each recorded packet, the overhead which in cases of very chatty protocols quickly adds up and penalizes applications that don't need backward navigation. |
Good point, backwards navigation can hardly be a necessity. It would be nice to see these redundant 4 bytes removed from the general block structure. However, I'm guessing the Wireshark community might wanna use the current structure for the final 1.0 spec, since they have been using this structure for quite some time already. Hopefully there will be a version 1.1 where we can discuss improvements like dropping the tailing block total length. |
Backwards navigation is critical for me to have to be able to navigate backwards. There's two main reasons for this:
For me the 4 additional bytes are easily worth the overhead. |
Though it seems that (1) is easily achievable by storing file's checksum (e.g. SHA/MD5) in the database. Regarding (2), usually the damage happens at the end of a file due to an abnormal writer's termination. In which case the backwards navigation would be useless anyway, since the end of the file might be partially written, so the reader would have to read from the beginning until the last consistent packet is found. On the other hand, in very chatty protocols where payload of each packet is very small (e.g. financial FAST market data), containing 8-16 bytes per packet with 100's of Gigabytes of data per day, any extra bytes of overhead create an unnecessary storage burden. So I would really like to see this "backward navigation" option either obviated or made optional. |
@packetfoo This is why it's a bad idea to allow multiple SHB's with different endian in the same pcapng file. There is no way you can know whether to interpret the Block Total Length of the last frame as big endian or little endian. You can, of course, make an educated guess, but it would be nice if we didn't have to rely on guesswork to implement the pcapng spec. |
+1 |
@saleyn no, MD5 or any kind of hash is not going to cut it. Not even the file size is, because you can add comments (or other things) to a pcapng file and the packets stay completely the same (which is all that matters to meta data regarding packets), but hash and size change. Also, calculating the hashes would require reading the full file each and every time, and I can't afford to do that. @erik4711 I know what you mean - so far it worked fine since all I encounter are little endian files. Forcing a single endianess is something we should look at in the future, but first 1.0 needs to be done. |
By the way, I need to find out what the reason for the 32bit padding is - I think there is a very specific reason for this, probably extremely high capture performance. Meaning: not talking about lazy PC hardware, but 10G/40G/100G/200G FPGA based captures where writing to disk needs to be ultra fast - it's possible that there are I/O problems if not writing flat 32bit values. |
@packetfoo, I understand that there might be legitimate reasons for backward navigation. The question is if this requirement is general enough to make it a "hard-core" feature of the format. |
I think backwards movement is more useful to all capture files than avoiding 4 bytes for an extremely special capture application regarding tiny packets. Which wouldn't even be Ethernet compliant at that size, but there are other L2 protocols that do, of course, so I can see the point. But I still think 4 bytes overhead are completely insignificant compared to their usefulness of being able to move back and forth. And making the trailing size optional kills the "read the last timestamp" feature completely, because how would I know if the last four bytes are the block size or part of the frame data? |
Regarding the "usefulness" - it's really all use-case driven. In your case the benefit of backward navigation seems to outweigh the overall file size. In my case, it's the other way around, and when frequently dealing with very large pcap files, I really never encountered the need to read a file from the end, and if that ever occurred I would likely come up with a method to index the files. Regarding turning that to being an optional feature - one possibility could be to add a flag to the header block that would indicate that the file supports backward navigation, in which case the writer would append the size of last block at the end of each written block. |
You're right, it's a question of point of view. Problem with the flag is that the header block for the last frame could be anywhere since we can have multiple SHBs, so even if I read the first SHB I cannot be sure that the flag tells the truth about the last block in the file. |
You just named another reason to not allow multiple SHBs. :-) |
I know, they're complicating things. It has to be determined how useful it is to allow these, but that is again not something for the 1.0 spec. 1.0 will have to live with multiple SHBs. |
3.1. General Block Structure
My recommendation is to not allow SHB's to change the endianness, or even better only allow ONE Section Header Block per pcapng file.
The text was updated successfully, but these errors were encountered: