J. Doe - 28 days ago
C Question

Parsing a .pcap file in plain C

I'm trying to create my own pcap files parser. According to Wireshark's docs:

Global Header

This header starts the libpcap file and will be followed by the first packet header:

typedef struct pcap_hdr_s {
    guint32 magic_number;   /* magic number */
    guint16 version_major;  /* major version number */
    guint16 version_minor;  /* minor version number */
    gint32  thiszone;       /* GMT to local correction */
    guint32 sigfigs;        /* accuracy of timestamps */
    guint32 snaplen;        /* max length of captured packets, in octets */
    guint32 network;        /* data link type */
} pcap_hdr_t;

magic_number: used to detect the file format itself and the byte ordering. The writing application writes 0xa1b2c3d4 with its native byte ordering format into this field. The reading application will read either 0xa1b2c3d4 (identical) or 0xd4c3b2a1 (swapped). If the reading application reads the swapped 0xd4c3b2a1 value, it knows that all the following fields will have to be swapped too. For nanosecond-resolution files, the writing application writes 0xa1b23c4d, with the two nibbles of the two lower-order bytes swapped, and the reading application will read either 0xa1b23c4d (identical) or 0x4d3cb2a1 (swapped).
version_major, version_minor: the version number of this file format (current version is 2.4)
thiszone: the correction time in seconds between GMT (UTC) and the local timezone of the following packet header timestamps. Examples: If the timestamps are in GMT (UTC), thiszone is simply 0. If the timestamps are in Central European time (Amsterdam, Berlin, ...) which is GMT + 1:00, thiszone must be -3600. In practice, time stamps are always in GMT, so thiszone is always 0.
sigfigs: in theory, the accuracy of time stamps in the capture; in practice, all tools set it to 0

snaplen: the "snapshot length" for the capture (typically 65535 or even more, but might be limited by the user), see: incl_len vs. orig_len below

network: link-layer header type, specifying the type of headers at the beginning of the packet (e.g. 1 for Ethernet, see tcpdump.org's link-layer header types page for details); this can be various types such as 802.11, 802.11 with various radio information, PPP, Token Ring, FDDI, etc.

/!\ Note: if you need a new encapsulation type for libpcap files (the value for the network field), do NOT use ANY of the existing values! I.e., do NOT add a new encapsulation type by changing an existing entry; leave the existing entries alone. Instead, send mail to tcpdump-workers@lists.tcpdump.org , asking for a new link-layer header type value, and specifying the purpose of the new value.


The first integer in the file should be either 0xA1B2C3D4 or 0xD4C3B2A1, but my code's output:

#include <stdio.h>

typedef unsigned int guint32;
typedef unsigned short guint16;

int main()
{
    FILE * file = fopen("test.pcap", "rb");

    guint32 magic_number;

    fscanf(file, "%d", &magic_number);

    printf("%x\n", magic_number);

    return 0;
}


Is 0x8. Why is that?

Answer

The magic number is the first 4 bytes of the file. fscanf with %d does not read those 4 bytes; it tries to interpret the beginning of the file as the ASCII representation of a number, i.e. "1234" rather than "\x01\x02\x03\x04". Because a pcap file does not start with decimal digits, the conversion fails and magic_number is never assigned, so the value printed is indeterminate. Instead of fscanf, use fread to read exactly 4 bytes:

  fread(&magic_number, sizeof magic_number, 1, file);  /* returns 1 on success */
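
To make the fread approach concrete, here is a minimal sketch that reads the magic number as raw bytes and checks it against the four values listed in the quoted documentation. The names read_magic, classify_magic and the pcap_order enum are hypothetical helpers invented for this illustration, not part of libpcap, and uint32_t from <stdint.h> is assumed in place of the guint32 typedef.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative classification of the first 4 bytes (names are hypothetical). */
enum pcap_order { PCAP_NATIVE, PCAP_SWAPPED, PCAP_NOT_PCAP };

/* Classify a magic number that was read in the host's byte order. */
static enum pcap_order classify_magic(uint32_t magic)
{
    switch (magic) {
    case 0xa1b2c3d4u: /* microsecond timestamps, same byte order as the host */
    case 0xa1b23c4du: /* nanosecond timestamps, same byte order as the host */
        return PCAP_NATIVE;
    case 0xd4c3b2a1u: /* microsecond timestamps, opposite byte order */
    case 0x4d3cb2a1u: /* nanosecond timestamps, opposite byte order */
        return PCAP_SWAPPED;
    default:
        return PCAP_NOT_PCAP;
    }
}

/* Read the first 4 raw bytes of the file at path into *magic.
 * Returns 0 on success, -1 if the file cannot be opened or is too short. */
static int read_magic(const char *path, uint32_t *magic)
{
    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;

    /* fread copies the bytes unchanged; no text parsing takes place. */
    size_t items = fread(magic, sizeof *magic, 1, file);
    fclose(file);
    return items == 1 ? 0 : -1;
}
```

A caller would pass "test.pcap" to read_magic, print the result with "%x", and treat PCAP_SWAPPED as the signal that every following header field has to be byte-swapped. Checking the return values of fopen and fread also avoids printing an unassigned variable, which is what the original program did.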