taterhead/PCAP-Index
UPDATE - APR 2012

It looks like your best bet for "production" (not that this was ever anywhere near prod...) performance is cxtracker (https://github.com/gamelinux/cxtracker). It turns out my approach of indexing every packet wasn't a winning one :) But I'll leave this code up as it might be interesting to someone else, especially the pyparsing code.

OVERVIEW

This is a set of scripts to index a PCAP file into a SQLite3 DB and allow you to recover packets from it VERY quickly.

WHY

Going through PCAPs with tcpdump is slow. Precious minutes eat away at concentration.

HISTORY

The first results for PCAP indexing from Google are:

- http://geek00l.blogspot.com/2008/03/sancp-pcap-index.html
- http://blog.vorant.com/2008/04/pcap-indexing.html

I followed the same idea, and it showed very good speed improvements. The problem was too much coupling. SANCP is a nice tool, but I didn't want to couple a metadata extraction tool to direct packet recovery. One of the problems is with SANCP sessionizing - I'm not sure how good it is, as I've seen it do some very odd things when analyzing PCAP files.

I started this project because I was bored during a few presentations :)

BUILDING

The only thing that needs to be built is the index_pkts.c file. Simply run:

    $ gcc -o index_pkts index_pkts.c -lpcap

USAGE

To index a pcap file:

    $ /path/to/pcap_index/index_pkts.sh /path/to/pcap_file /path/to/new/sqlite_db

To retrieve packets:

    $ /path/to/pcap_index/get_pkts.py -s /path/to/sqlite_db -f "src = 1.2.3.4 and dst = 5.6.7.8 and time < 6-apr-2009 12:03:05 and time > 6-apr-2009 12:03:00"

The -f flag is for the database filter to use. It's pretty free form - you can bracket expressions, and/or/not things. A time span isn't required. If the -w flag isn't given, the PCAP data will be written to stdout.

Possible filter keywords:

- time
- src
- sport
- dst
- dport
- ether_type
- ip_proto

Arguments to time, ether_type, ip_proto, sport and dport can be integers. They must be decimal - I haven't built it to parse hex.
Arguments to src and dst can be IPv4 or IPv6 addresses. For packets without an IP layer, you can recover them using their MAC addresses as src/dst.

The major filter option missing is a BPF-like 'host' option that checks for both src and dst. You can still do this manually by plugging in 'src = y and dst = x or src = x and dst = y'.

STATS

- 13GB PCAP: 2h 36min to create the index. SQLite DB file size: 4.6GB. 9s to recover 3k pkts between two IPs.
- 1.8GB PCAP: 2m 48s to create the index. SQLite DB file size: 764MB. A couple of seconds to recover approx 2k pkts between two hosts.

FUTURE WORK

This would likely be faster and more efficient using MySQL rather than SQLite. I avoided this for the first version in order to make it portable.

It would also be neat to integrate this into something like OpenFPC, I think.
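
Until a 'host' keyword exists, the manual workaround described under USAGE can be generated rather than typed by hand. The following is only a rough sketch and not part of this repo: the function names host_pair_filter and host_filter are made up for illustration, and it assumes the filter parser accepts nested brackets (the README only promises that expressions can be bracketed).

```python
def host_pair_filter(a: str, b: str) -> str:
    """Build a filter string for traffic between a and b, in either direction.

    Hypothetical helper: it expands to the manual
    'src = x and dst = y or src = y and dst = x' form, with explicit
    brackets so it can be safely and-ed with further conditions.
    """
    return f"((src = {a} and dst = {b}) or (src = {b} and dst = {a}))"


def host_filter(addr: str) -> str:
    """Build a filter string for any packet with addr as source or destination."""
    return f"(src = {addr} or dst = {addr})"


if __name__ == "__main__":
    # The resulting string is what would be passed to get_pkts.py via -f.
    print(host_pair_filter("1.2.3.4", "5.6.7.8"))
    # prints: ((src = 1.2.3.4 and dst = 5.6.7.8) or (src = 5.6.7.8 and dst = 1.2.3.4))
```

The outer brackets are there so the result can be combined with other conditions, for example by appending ' and dport = 80', without the and/or precedence changing the meaning.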