Priority: Best Effort
---------------------

o remind users to set ReportPrefixFormat wisely.  E.g. if you have a
  class B network, and 256 subnets, configuring ReportPrefixFormat to
  "%H:%M" in "SubNetIO.cf" (to preserve only one day's worth of
  reports) will create 73,728 HTML files (288 five-minute reports per
  day x 256 subnets)!

o Some of CampusIO's options can't be used properly with multiple
  exporters.  Perhaps the configuration should be changed so that you
  must specify the exporter IP address with each value in
  OutputIfIndexes and WebProxyIfIndex.  E.g.

     border1.our.domain:1, border1.our.domain:2, border2.our.domain:3

o For LFAP/slate/lfapd flow records:

  o Add an option to lfapd to be able to specify how many bytes to
    subtract per packet.  This is necessary because LFAP appears to
    include the layer 2 header/frame in the packet size, whereas
    NetFlow counts just the IP header and payload.

  o Have lfapd pay attention to the LFAP timestamps and discard any
    flows that are not within a certain number of seconds of the
    current time.  (This is similar to Tobi's "X-Files-Factor" with
    RRDTool.)  Currently, if you shut down sfas for a while, it can
    write very old flows to "flows.current", because the router might
    send old flows that couldn't be sent in a timely fashion because
    sfas was down.  (During testing, I actually got flows in
    "flows.current" that were about 24 hours old!)

  o Figure out what's causing the huge spikes in FlowScan graphs based
    on bytes when config changes are made while LFAP is running.  For
    instance, whenever I change the rate-limit on an interface running
    LFAP, I get a huge spike of traffic.  Since there isn't a dramatic
    spike in the number of flows, it seems as though LFAP might be
    sending some huge pkt/byte update values?

o use SNMP_Session to collect the ifNames so that users can use
  ifNames rather than ifIndexes to specify the OutputIfIndexes and
  WebProxyIfIndex values in CampusIO.cf.

o If a large flood (such as a DoS) of TCP ACK packets with dynamically
  forged src/dst addresses is destined for port 21 (ftp), it causes
  %CampusIO::FTPSession to grow without bound.  In one such DoS, I saw
  the flowscan process grow to >300MB in size, and it seemed to stop
  functioning, blocked in an "uninterruptible sleep" under Linux,
  e.g.:

     2000/11/11 11:20:26 %CampusIO::FTPSession -> 683/65536
     2000/11/11 11:25:02 %CampusIO::FTPSession -> 59362/131072
     2000/11/11 11:25:03 %CampusIO::FTPSession -> 59227/131072
     2000/11/11 11:32:13 %CampusIO::FTPSession -> 424790/1048576
     2000/11/11 11:32:20 %CampusIO::FTPSession -> 424633/1048576
     2000/11/11 11:46:50 %CampusIO::FTPSession -> 591817/1048576
     2000/11/11 13:02:48 %CampusIO::FTPSession -> 591723/1048576

  This needs to be addressed, perhaps by suppressing maintenance of
  these hash/cache data objects once they reach a certain size, or
  perhaps just by invoking the purge algorithm from within CampusIO's
  wanted function whenever the hash gets too large.  (I don't think
  Net::Patricia will really help here, as a Patricia Trie, while
  smaller than a hash, will become very large too.)

o Jeff B. suggested that maybe we can detect suspected TCP
  retransmissions (due to packet drops from rate-limits) based on an
  imbalance in the number of inbound and outbound packets in a TCP
  flow.  Perhaps we can match up pairs of TCP flows (that occur in the
  same 5 minute flow file) that have the same address/port pairs.
  Limiting this to just flows that have SYN|ACK|FIN is probably
  sufficient; then report discrepancies between the # of packets in
  one direction vs. the other.  (A discrepancy means retransmissions
  probably happened, and may be very interesting to correlate with
  dropped packets based on CAR stats.)

o Change graphs Makefile ("graphs.mf.in") to do calculations in
  bits-per-second rather than megabits-per-second, since RRDtool does
  a nice job of displaying things with the appropriate metric
  abbreviation on its own.
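
A rough sketch of the "purge when the hash gets too large" idea for
%CampusIO::FTPSession noted above, written in Python rather than
FlowScan's Perl purely for illustration.  The class name, the size
limit, and the idle timeout are all hypothetical; nothing here is
taken from CampusIO.pm.  It combines both suggestions: purge stale
entries when the table is full, and refuse new entries if purging did
not free any room.

```python
import time


class BoundedSessionCache:
    """Session table that purges stale entries once it grows too large."""

    def __init__(self, max_entries=65536, max_age=1800):
        self.max_entries = max_entries  # hypothetical size limit
        self.max_age = max_age          # hypothetical idle timeout (seconds)
        self.sessions = {}              # (src, dst) -> last-seen timestamp
        self.suppressed = 0             # new entries refused while full

    def purge(self, now):
        """Drop entries idle for more than max_age; return how many."""
        stale = [key for key, seen in self.sessions.items()
                 if now - seen > self.max_age]
        for key in stale:
            del self.sessions[key]
        return len(stale)

    def note(self, src, dst, now=None):
        """Record activity for (src, dst); return False if suppressed."""
        now = time.time() if now is None else now
        key = (src, dst)
        if key not in self.sessions and len(self.sessions) >= self.max_entries:
            # Table is full: try purging first, then give up on this entry.
            self.purge(now)
            if len(self.sessions) >= self.max_entries:
                self.suppressed += 1
                return False
        self.sessions[key] = now
        return True
```

With this shape a flood of forged address pairs costs at most
max_entries table slots; the trade-off is that legitimate new sessions
are also ignored while the table is full.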

Priority: LBE
-------------

o Fix missing 554*.rrd problem that some folks saw with FlowScan-1.005.
  (For the time being the workaround is to create it manually with
  "rrdtool create" as posted to the mailing list.)

o Add ICMPTypes option in CampusIO?  This won't work with LFAP because
  it does not include ICMP type/code info in its flows.

o Write a new AutoAS report.  This will assume that peer-as is
  configured (so that we won't get too many AS src/dst pairs) and will
  automatically create RRD files for them.

  The list of RRD files that are updated after processing each raw
  flow file should be the entire set of all AS RRD files that exist,
  not just those AS pairs for which traffic was seen during this
  sample.

  Then we can use a utility like "maxfetch" to determine the most
  active AS pairs and automagically graph them (without using the
  graphs.mf Makefile technique).  Perhaps the graph colors should be
  based on the 8(?) gnuplot default colors.

o Add flowscan.rrd, flowscan_cpu.rrd functionality into "flowscan"
  script.  (This should be configurable via an option, since it means
  that FlowScan needs RRDtool even if used w/o CampusIO.)

  These RRD files contain performance info about FlowScan itself.

  "flowscan.rrd" should contain:
     bytes, pkts, flows, and perhaps some stuff about caches such as:
     realservers, napservers, ftppasv, etc.

  "flowscan_cpu.rrd" should contain:
     find_real, find_user, find_sys,
     report_real, report_user, report_sys, report_latesecs

o Attempt to identify other collaborative file sharing apps such as
  scour or gnutella, which have no central rendezvous server(s).

     SX (Scour eXchange) - http://sx.scour.com/
     SX spec: http://sx.scour.com/stp-1.0pre6.html
     psx (Perl Scour eXchange) - http://sixpak.cs.ucla.edu/psx/,
        http://psx.sourceforge.net
     gnapster - http://download.sourceforge.net/gnapster/
     Gnutella Homepage - http://gnutella.wego.com
     gnutella protocol spec:
        http://gnutella.wego.com/go/wego.pages.page?groupId=116705&view=page&pageId=119598&folderId=116767&panelId=-1&action=view
     Knowbuddy FAQ - http://www.rixsoft.com/Knowbuddy/gnutellafaq.html

o Make the "--step" time configurable (according to the flowscan wait
  time).  Currently, even though "flowscan.cf" seems to indicate that
  it's configurable, it probably makes absolutely no sense to change
  "WaitSeconds" (or the value given with "-s" on the cflowd command
  line) because the "--step 300" is hard-coded in "CampusIO.pm".

o Fix CampusIO.pm regarding ':' in ".rrd" file names

  Perhaps this should be written as a patch to RRDTOOL so that it
  handles ":" in file names?

  Currently, RRD files for the configured ASPairs contain a ':' in the
  file name.  This is apparently a no-no with RRDTOOL since, although
  it allows you to create files with these names, it doesn't let you
  graph using them because of how the API uses ':' to separate
  arguments.

  For the time being, if you want to graph AS information, you must
  manually create symbolic links in your graphs sub-dir, i.e.

     $ cd graphs
     $ ln -s 0:42.rrd Us2Them.rrd
     $ ln -s 42:0.rrd Them2Us.rrd

  Perhaps the simple fix is to do what packages such as Cricket do,
  i.e. change the ':' to '_'.

o Fix "flowscan" and its rc script so that "/etc/init.d/flowscan stop"
  doesn't kill flowscan in a "critical section".  Although I haven't
  seen it happen, I think if the timing is off it could kill(1)
  flowscan during RRD operations, possibly resulting in a corrupt
  ".rrd" file.  This should probably be implemented by having the
  script "ask" flowscan to shut down ASAP, possibly by creat(2)ing a
  file or writing into a fifo.
  Then flowscan should check for this signal before it starts RRD
  updates.  It should also, of course, be able to be interrupted for
  shutdown while it's sleeping.

o Allow flowscan logfile to be specified in "flowscan.cf".  e.g.:

     LogFile /var/log/flowscan.log

  Then have flowscan open this and dup/select it for both STDOUT and
  STDERR to catch warnings from reporting packages.

  Have flowscan periodically rename the log file, and open a new one
  (every day or whatever) so that we don't have to shut down flowscan
  to trim the log file.

o ? Unify configuration files so that we don't need to redundantly
  specify things like "OutputDir" in the configuration file for each
  report class.  Perhaps introducing a "FlowScan.cf" would suffice,
  and it would be accessed in the report packages as
  $self->{FlowScan}{OutputDir}.

o Add "by Application" graphs (Mbps, pkts, flows) to "graphs.mf.in"
  which show I/O by applications such as:

     web client (http_src in + http_dst out +
                 https_src in + https_dst out),
     web server (http_src out + http_dst in +
                 https_src out + https_dst in),
     news (nntp),
     file transfer (ftp (+nfs?)),
     email (smtp + pop + imap),
     Napster (NapUser + NapUserMaybe),
     RealMedia (Real),
     MCAST,
     and unknown (based on subtracting from total).

  It would be nice if this graph split it out by in and out.

  Once this graph is done, "RealServer I/O" should be taken out of the
  "Well Known Services" graphs.

o Write a new "FlowDivert" report which controls how flows are saved
  by diverting them to the files specified in this report's
  configuration.

  Note that Jay Ford has essentially this.  See the discussion in the
  flowscan mailing list archive.
  (Nov 2, 2000)

  If source and destination addresses were the only selection criteria
  allowed, a sample "FlowDivert_subnets.boulder" file might look like
  this (note that a specific host can be specified as a "/32" subnet):

     SUBNET=10.42.42.42/32
     DESCRIPTION=our interesting host
     SAVEDIR=saved/host/our_host
     =
     SUBNET=10.0.1.0/24
     DESCRIPTION=our first subnet
     SAVEDIR=saved/subnet/first
     =
     SUBNET=10.0.2.0/24
     DESCRIPTION=our second subnet
     SAVEDIR=saved/subnet/second

  Alternatively, the entries in the configuration file could have
  arbitrary bits of perl code to be evaluated (like the expression to
  "flowdumper -e "), but I'm scared that that could be slow.  E.g.
  "FlowDivert.boulder":

     SAVEDIR=saved/host/our_host
     DESCRIPTION=our interesting host
     EXPR=unpack("N", inet_aton("10.42.42.42")) == $srcaddr || unpack("N", inet_aton("10.42.42.42")) == $dstaddr
     =
     SAVEDIR=saved/subnet/our_subnet
     DESCRIPTION=our subnet
     EXPR=unpack("N", inet_aton("10.0.1.0")) == (0xffffff00 & $srcaddr) || unpack("N", inet_aton("10.0.1.0")) == (0xffffff00 & $dstaddr)
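
To make the subnet-based variant concrete, here is a minimal sketch of
how such a file might be read and matched against a flow's source and
destination addresses.  It is in Python (not FlowScan's Perl) and uses
only the standard "ipaddress" module.  The function names, and the
assumption that stanzas are separated by a line containing only "=",
are mine and are not an existing FlowDivert implementation.

```python
import ipaddress

# Sample configuration, copied from the stanza format proposed above.
SAMPLE = """\
SUBNET=10.42.42.42/32
DESCRIPTION=our interesting host
SAVEDIR=saved/host/our_host
=
SUBNET=10.0.1.0/24
DESCRIPTION=our first subnet
SAVEDIR=saved/subnet/first
=
SUBNET=10.0.2.0/24
DESCRIPTION=our second subnet
SAVEDIR=saved/subnet/second
"""


def parse_stanzas(text):
    """Split KEY=value lines into one dict per "="-separated stanza."""
    stanzas, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line == "=":              # stanza separator
            if current:
                stanzas.append(current)
            current = {}
        elif line:
            key, _, value = line.partition("=")
            current[key] = value
    if current:
        stanzas.append(current)
    return stanzas


def save_dirs(stanzas, srcaddr, dstaddr):
    """Return every SAVEDIR whose SUBNET contains either flow address."""
    src = ipaddress.ip_address(srcaddr)
    dst = ipaddress.ip_address(dstaddr)
    dirs = []
    for stanza in stanzas:
        net = ipaddress.ip_network(stanza["SUBNET"])
        if src in net or dst in net:
            dirs.append(stanza["SAVEDIR"])
    return dirs
```

For example, save_dirs(parse_stanzas(SAMPLE), "10.0.1.5", "10.0.2.5")
returns both "saved/subnet/first" and "saved/subnet/second", since a
flow can match more than one stanza.  This loop is linear in the number
of subnets per flow; a real report would likely want a faster lookup.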