Leif - 3 months ago
Python Question

Parsing pcap files with dpkt (Python)

I'm trying to parse a previously-captured trace for HTTP headers using the dpkt module:

import dpkt
import sys

f = file(sys.argv[1], "rb")
pcap = dpkt.pcap.Reader(f)

for ts, buf in pcap:
    eth = dpkt.ethernet.Ethernet(buf)
    ip = eth.data
    tcp = ip.data

    if tcp.dport == 80 and len(tcp.data) > 0:
        try:
            http = dpkt.http.Request(tcp.data)
            print http.uri
        except:
            print 'issue'
            continue

f.close()


While it parses most of the packets correctly, I'm receiving a NeedData("premature end of headers") exception on some of them. They appear to be valid packets in Wireshark, so I'm a bit confused as to why the exceptions are being thrown.

Some output:
/ec/fd/ls/GlinkPing.aspx?IG=4a06eefebcc1495f8f4de7cb41f0ce5c&CID=2265e1228f3451ff8011dcbe5e0cdff7&ID=API.YAds%2C5037.1&1307036510547
issue
issue #misses one packet here, two exceptions
/?ld=4vyO5h1FkjCNjBpThUTGnzF50sB7QUGL0Ok8YefDTWNmO6RXghgDqHXtcp1OqeXATbCAHliIkglLj95-VEwG6ZJN3fblgd3Lh5NvTp4mZPcBGXUyKqXn9FViBAsmt1T96oumpCL5gm7gZ3qlZqSdLNUWjpML_9I8FvB2TLKPSYcJmb_VwwvJhiHpiUIvrjRdzqdVVnuQZVjQmZIIlfaMq0LOmgew_plopjt7hYvOSzBi3VJl4bqOBVk3zdhIvgZK0SfJp3kEWTXAr2_UU_q9KHBpSTnvuhY2W1xo3K2BOHKGk1VAlMiWtWC_nUaJdZmhzzWfb6yRAmY3M9YkUzFGs9z10-70OszkkNpVMSS3-p7xsNXQnC3Zpaxks


Help is appreciated; perhaps an alternative library recommendation is needed.

Answer

I have encountered the same problem while working with HTTP requests and dpkt.

The problem is that dpkt's HTTP header parser uses the wrong logic: this exception is raised whenever the HTTP data it is given doesn't end with \r\n\r\n. (And as you say, there are a lot of good packets with no \r\n\r\n at the end.)
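One possible workaround, sketched below, is to hold back the payload of each connection until the \r\n\r\n terminator has actually shown up, and only then parse it. This is a minimal illustration and not part of dpkt: the RequestBuffer class, its feed method, and the flow tuple are hypothetical names, it is written for Python 3 using only the standard library, and it assumes the segments of a flow arrive in order with nothing lost. It extracts the URI by hand from the request line; in the original script the buffered bytes could presumably be handed to dpkt.http.Request instead.

```python
HEADER_END = b"\r\n\r\n"  # blank line that terminates the HTTP headers


class RequestBuffer:
    """Hypothetical helper: collect TCP payload per flow until headers are complete."""

    def __init__(self):
        self._pending = {}  # flow key -> bytes buffered so far

    def feed(self, flow, payload):
        """Add one segment's payload for `flow`.

        Returns the request URI once the header terminator has been seen,
        otherwise None (more data is needed, or the request line is malformed).
        """
        data = self._pending.get(flow, b"") + payload
        if HEADER_END not in data:
            # Headers are still incomplete: keep the bytes and wait.
            self._pending[flow] = data
            return None
        # Headers are complete: stop tracking this flow and parse.
        self._pending.pop(flow, None)
        request_line = data.split(b"\r\n", 1)[0].decode("latin-1")
        parts = request_line.split(" ")
        # A request line looks like "GET /path HTTP/1.1".
        return parts[1] if len(parts) == 3 else None


# Usage: a request whose headers are split across two TCP segments.
buf = RequestBuffer()
flow = ("10.0.0.1", 12345, "10.0.0.2", 80)  # assumed (src, sport, dst, dport) key
first = buf.feed(flow, b"GET /index.html HTTP/1.1\r\nHost: exam")
second = buf.feed(flow, b"ple.com\r\n\r\n")
print(first, second)
```

The first call returns None because the headers are not finished yet, and the second returns the URI once the terminator arrives. Anything after the headers (a body or a pipelined request) is simply dropped here, and a flow that never sends the terminator stays buffered indefinitely, so a real script would need to handle both cases.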

Here is the bug report for your problem.