How do I decode gzipped HTTP replies from packet captures?
It's fairly straightforward with tshark, the command-line companion to Wireshark.
Let's suppose you have a capture.pcap file that was created by a command like this:
tcpdump 'port 80' -A -s65536 -w capture.pcap
From that capture.pcap file, you want to extract the HTTP replies that were gzipped when sent by the server.
Worry not, here is some Python code to do exactly that:
#!/usr/bin/env python3
import shlex
import subprocess
import sys
import zlib

# Have tshark print the "data" field, as hex, for packets matching the gzip filter.
cmd = '''\
tshark \
-r %s \
-Y 'http.content_encoding == "gzip"' \
-T fields -e data''' % shlex.quote(sys.argv[1])
args = shlex.split(cmd)
p = subprocess.Popen(args, stdout=subprocess.PIPE)
output, err = p.communicate()

# Each output line holds one or more comma-separated hex strings.
decoded = []
for ll in output.splitlines():
    for l in ll.split(b","):
        if len(l) > 0:
            decoded.append(bytes.fromhex(l.rstrip().decode('ascii')))
decoded = b"".join(decoded)

# 16 + zlib.MAX_WBITS makes zlib expect a gzip header and trailer.
data = zlib.decompress(decoded, 16 + zlib.MAX_WBITS)
sys.stdout.buffer.write(data)
Run it as follows in your terminal: python3 <name of file> capture.pcap. You'll be greeted with the decompressed content of the gzipped replies in the capture file.
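If the capture holds more than one gzipped reply, note that the script joins all the payloads together, and a single zlib.decompress call handles at most the first gzip stream. Here is a minimal sketch of one way to deal with that, assuming the payloads end up back to back as complete gzip streams. The helper names hex_fields_to_bytes and gunzip_members are made up for this example, and the "tshark output" is simulated with gzip.compress so that the snippet runs without a capture file:

```python
import gzip
import zlib


def hex_fields_to_bytes(line: bytes) -> bytes:
    """Turn one line of comma-separated hex strings into raw bytes."""
    return b"".join(
        bytes.fromhex(part.strip().decode("ascii"))
        for part in line.split(b",")
        if part.strip()
    )


def gunzip_members(blob: bytes) -> list:
    """Decompress every gzip stream found back to back in blob."""
    bodies = []
    while blob:
        # 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
        d = zlib.decompressobj(16 + zlib.MAX_WBITS)
        bodies.append(d.decompress(blob) + d.flush())
        # Bytes after the end of this stream are left in unused_data.
        blob = d.unused_data
    return bodies


if __name__ == "__main__":
    # Simulated output: two gzipped bodies, hex-encoded, one per line.
    fake_output = b"\n".join(
        gzip.compress(body).hex().encode("ascii")
        for body in (b"first reply", b"second reply")
    )
    raw = b"".join(hex_fields_to_bytes(l) for l in fake_output.splitlines())
    for body in gunzip_members(raw):
        print(body.decode())
```

A fresh decompressobj is created for each stream because a decompression object stops at the end of the stream it started on. If the joined bytes contain anything that is not a gzip stream, zlib.error is raised, so real captures may need extra filtering first.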