Priority: Best Effort
---------------------

  o Remind users to set ReportPrefixFormat wisely.  E.g. if you have a
    class B network, and 256 subnets, configuring ReportPrefixFormat to
    "%H:%M" in "SubNetIO.cf" (to preserve only one day's worth of
    reports) will create 73,728 HTML files!
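
    The figure above follows from simple arithmetic, assuming one HTML
    report per subnet for each five-minute flow file (the 300-second
    interval is an assumption, taken from the "--step 300" noted
    further down).  A quick check, sketched in Python:

```python
# Assumption: one HTML report per subnet per 5-minute sample.
SECONDS_PER_DAY = 24 * 60 * 60
SAMPLE_SECONDS = 300          # assumed flow-file interval
SUBNETS = 256                 # a class B split into /24 subnets

# Number of distinct "%H:%M" prefixes produced in one day.
samples_per_day = SECONDS_PER_DAY // SAMPLE_SECONDS
html_files = samples_per_day * SUBNETS
print(samples_per_day, html_files)   # prints "288 73728"
```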

  o Some of CampusIO's options can't be used properly with multiple
    exporters.  Perhaps the configuration should be changed so that
    you must specify the exporter's name or IP address with each value
    in OutputIfIndexes and WebProxyIfIndex.  E.g.

       border1.our.domain:1, border1.our.domain:2, border2.our.domain:3
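
    One possible way to read such a value, sketched in Python rather
    than FlowScan's Perl.  The function name and the returned
    structure are hypothetical; nothing like this exists in CampusIO
    today:

```python
from collections import defaultdict

def parse_exporter_ifindexes(value):
    """Parse 'exporter:ifIndex, exporter:ifIndex, ...' into a dict
    mapping each exporter to the set of its ifIndexes."""
    by_exporter = defaultdict(set)
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last ':' so the ifIndex is always the final field.
        exporter, sep, ifindex = item.rpartition(":")
        if not sep or not exporter or not ifindex.isdigit():
            raise ValueError(f"expected exporter:ifIndex, got {item!r}")
        by_exporter[exporter].add(int(ifindex))
    return dict(by_exporter)

config = "border1.our.domain:1, border1.our.domain:2, border2.our.domain:3"
print(parse_exporter_ifindexes(config))
```

    A flow would then match only if both its exporter and its output
    ifIndex appear together in the parsed table.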

  o For LFAP/slate/lfapd flow records:

       o Add an option to lfapd to be able to specify how many bytes to
	 subtract per packet.  This is necessary because LFAP appears
	 to include the layer 2 header/frame in the packet size,
	 whereas NetFlow counts just the IP header and payload.

       o Have lfapd pay attention to the LFAP timestamps and discard
	 any flows that are not within a certain seconds tolerance of
	 the current time.  (This is similar to Tobi's "X-Files-Factor"
	 with RRDTool.)  Currently, if you shut down sfas for a while,
	 it can write very old flows to "flows.current", because the
	 router might send old flows that couldn't be sent in a timely
	 fashion while sfas was down.  (During testing, I actually
	 got flows in "flows.current" that were about 24 hours old!)

       o Figure out what's causing the huge spikes in FlowScan graphs
	 based on bytes when config changes are made while LFAP is
	 running.  For instance, whenever I change the rate-limit on an
	 interface running LFAP, I get a huge spike of traffic.  Since
	 there isn't a dramatic spike in the number of flows, it seems
	 as though LFAP might be sending some huge pkt/byte update
	 values?
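
    The first two lfapd items above could look roughly like this,
    sketched in Python rather than lfapd's own code.  The function
    names, the 18-byte overhead, and the 600-second tolerance are all
    illustrative assumptions, not lfapd options:

```python
import time

def adjust_bytes(byte_count, packet_count, overhead_per_packet):
    """Subtract an assumed per-packet layer 2 overhead from a flow's
    byte count, never going below zero."""
    return max(0, byte_count - packet_count * overhead_per_packet)

def is_fresh(flow_time, tolerance_seconds, now=None):
    """Return True if the flow's timestamp is within
    tolerance_seconds of the current time."""
    if now is None:
        now = time.time()
    return abs(now - flow_time) <= tolerance_seconds

# Hypothetical values: 10 packets, 18 bytes of framing each.
print(adjust_bytes(15180, 10, 18))            # prints 15000

# A flow that is 24 hours old fails a 10-minute tolerance.
now = 1_000_000
print(is_fresh(now - 24 * 3600, 600, now))    # prints False
```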

  o Use SNMP_Session to collect the ifNames so that users can use
    ifNames rather than ifIndexes to specify the OutputIfIndexes and
    WebProxyIfIndex values in CampusIO.cf.

  o If a large flood (such as a DoS) of TCP ACK packets with
    dynamically forged src/dst addresses is destined for port 21 (ftp),
    it causes %CampusIO::FTPSession to grow without bound.  In one such
    DoS, I saw the flowscan process grow to >300MB in size, and it
    seemed to stop functioning, blocked in an "uninterruptible
    sleep" under Linux, e.g.:

       2000/11/11 11:20:26 %CampusIO::FTPSession -> 683/65536
       2000/11/11 11:25:02 %CampusIO::FTPSession -> 59362/131072
       2000/11/11 11:25:03 %CampusIO::FTPSession -> 59227/131072
       2000/11/11 11:32:13 %CampusIO::FTPSession -> 424790/1048576
       2000/11/11 11:32:20 %CampusIO::FTPSession -> 424633/1048576
       2000/11/11 11:46:50 %CampusIO::FTPSession -> 591817/1048576
       2000/11/11 13:02:48 %CampusIO::FTPSession -> 591723/1048576

    This needs to be addressed, perhaps by suppressing maintenance of
    these hash/cache data objects once they reach a certain size, or
    perhaps just invoking the purge algorithm from within CampusIO's
    wanted function whenever the hash gets too large.  (I don't think
    Net::Patricia will really help here as a Patricia Trie, while
    smaller than a hash, will become very large too.)
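
    A minimal sketch of both ideas combined, in Python rather than
    FlowScan's Perl: purge when the cache is full, and stop adding
    entries if the purge does not free any room.  The class name,
    limits, and purge rule (drop entries older than a maximum age)
    are assumptions, not CampusIO's actual purge algorithm:

```python
class BoundedSessionCache:
    """Session cache that refuses to grow past max_entries."""

    def __init__(self, max_entries, max_age_seconds):
        self.max_entries = max_entries
        self.max_age_seconds = max_age_seconds
        self.last_seen = {}          # key -> time last seen

    def purge(self, now):
        """Drop every entry not seen within max_age_seconds."""
        cutoff = now - self.max_age_seconds
        stale = [k for k, t in self.last_seen.items() if t < cutoff]
        for k in stale:
            del self.last_seen[k]

    def note(self, key, now):
        """Record key.  Return False if it was dropped because the
        cache is full even after purging."""
        if key not in self.last_seen and len(self.last_seen) >= self.max_entries:
            self.purge(now)
            if len(self.last_seen) >= self.max_entries:
                return False         # suppress maintenance during a flood
        self.last_seen[key] = now
        return True
```

    During a flood of forged addresses this trades accuracy (some
    real FTP sessions go untracked) for a bounded process size.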

  o Jeff B. suggested that maybe we can detect suspected TCP
    retransmissions (due to packet drops from rate-limits) based on an
    imbalance in the number of inbound and outbound packets in a TCP
    flow.

    Perhaps we can match up pairs of TCP flows (that occur in the same
    5 minute flow file) that have the same address/port pairs.
    Limiting this to just flows that have SYN|ACK|FIN is probably
    sufficient, then report discrepancies between the # of packets in
    one direction vs. the other.  (This means retransmissions probably
    happened and may be very interesting to correlate with dropped
    packets based on CAR stats.)
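
    The pairing step could be sketched as follows, in Python rather
    than FlowScan's Perl.  The Flow record layout and the min_ratio
    threshold are hypothetical; some asymmetry is normal for TCP, so
    the threshold would need tuning against real data:

```python
from collections import namedtuple

# Hypothetical flow record; real flow files carry more fields.
Flow = namedtuple("Flow", "src sport dst dport pkts tcp_flags")

FIN, SYN, ACK = 0x01, 0x02, 0x10
COMPLETE = SYN | ACK | FIN

def packet_imbalances(flows, min_ratio=1.5):
    """Pair each complete TCP flow with its reverse flow and report
    pairs whose packet counts differ by at least min_ratio."""
    by_key = {}
    for f in flows:
        # Keep only flows in which SYN, ACK and FIN were all seen.
        if f.tcp_flags & COMPLETE == COMPLETE:
            # If a tuple repeats within the file, the last record wins.
            by_key[(f.src, f.sport, f.dst, f.dport)] = f
    reports = []
    for key, fwd in by_key.items():
        rev_key = (fwd.dst, fwd.dport, fwd.src, fwd.sport)
        rev = by_key.get(rev_key)
        # Comparing the keys reports each pair only once.
        if rev is None or key > rev_key:
            continue
        hi, lo = max(fwd.pkts, rev.pkts), min(fwd.pkts, rev.pkts)
        if lo > 0 and hi / lo >= min_ratio:
            reports.append((fwd, rev, hi / lo))
    return reports
```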

  o Change graphs Makefile ("graphs.mf.in") to do calculations in
    bits-per-second rather than megabits-per-second since RRDtool does
    a nice job of displaying things with the appropriate metric
    abbreviation on its own.

Priority: LBE
-------------

  o Fix missing 554*.rrd problem that some folks saw with
    FlowScan-1.005.  (For the time being the workaround is to create it
    manually with "rrdtool create" as posted to the mailing list.)

  o Add ICMPTypes option in CampusIO?
    This won't work with LFAP because it does not include ICMP
    type/code info in its flows.

  o Write a new AutoAS report.  This will assume that peer-as is
    configured (so that we won't get too many AS src/dst pairs) and
    will automatically create RRD files for them.  The list of RRD
    files that are updated after processing each raw flow file should
    be the entire set of all AS RRD files that exist, not just those
    AS pairs for which traffic was seen during this sample.  Then
    we can use a utility like "maxfetch" to determine the most active
    AS pairs and automagically graph them (without using the graphs.mf
    Makefile technique).  Perhaps the graph colors should be based on
    the 8(?) gnuplot default colors.

  o Add flowscan.rrd, flowscan_cpu.rrd functionality into "flowscan"
    script.  (This should be configurable via an option since it
    means FlowScan would need RRDtool even if used w/o CampusIO.)
    These RRD files contain performance info about FlowScan itself.
    "flowscan.rrd" should contain:
       bytes, pkts, flows,
    and perhaps some stuff about caches such as:
       realservers, napservers, ftppasv, etc.
    "flowscan_cpu.rrd" should contain:
       find_real,   find_user,   find_sys,
       report_real, report_user, report_sys,
       report_latesecs

  o Attempt to identify other collaborative file sharing apps such
    as: scour or gnutella, which have no central rendezvous server(s).
    SX (Scour eXchange) - http://sx.scour.com/
    SX spec: http://sx.scour.com/stp-1.0pre6.html
    psx (Perl Scour eXchange) http://sixpak.cs.ucla.edu/psx/, http://psx.sourceforge.net
    gnapster - http://download.sourceforge.net/gnapster/
    Gnutella Homepage - http://gnutella.wego.com
    gnutella protocol spec: http://gnutella.wego.com/go/wego.pages.page?groupId=116705&view=page&pageId=119598&folderId=116767&panelId=-1&action=view
    Knowbuddy FAQ - http://www.rixsoft.com/Knowbuddy/gnutellafaq.html

  o Make the "--step" time configurable (according to the flowscan wait
    time).  Currently, even though the "flowscan.cf" seems to indicate
    that it's configurable, it probably makes absolutely no sense to
    change "WaitSeconds" (or the "-s" value on the cflowd command line)
    because the "--step 300" is hard-coded in "CampusIO.pm".

  o Fix CampusIO.pm regarding ':' in ".rrd" file names

    Perhaps this should be written as a patch to RRDTOOL so that it
    handles ":" in file names?

    Currently, RRD files for the configured ASPairs contain a ':' in
    the file name.  This is apparently a no-no with RRDTOOL since,
    although it allows you to create files with these names, it doesn't
    let you graph using them because of how the API uses ':' to
    separate arguments.

    For the time being, if you want to graph AS information, you must
    manually create symbolic links in your graphs sub-dir, i.e.

       $ cd graphs
       $ ln -s 0:42.rrd Us2Them.rrd
       $ ln -s 42:0.rrd Them2Us.rrd

    Perhaps the simple fix is to do what packages such as Cricket do,
    i.e. change the ':' to '_'.
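
    A sketch of that renaming, in Python rather than FlowScan's Perl,
    together with a check of why the ':' is a problem in a graph
    "DEF" argument.  The helper names and the "bytes" data source
    name are assumptions for illustration:

```python
def rrd_safe_name(as_pair):
    """Map an AS pair such as '0:42' to a file name without ':'."""
    return as_pair.replace(":", "_") + ".rrd"

def def_argument(vname, rrd_file, ds_name, cf="AVERAGE"):
    """Build a 'DEF:vname=rrdfile:ds-name:CF' graph argument."""
    return f"DEF:{vname}={rrd_file}:{ds_name}:{cf}"

# With ':' in the file name the argument gains an extra field.
bad = def_argument("out", "0:42.rrd", "bytes")
good = def_argument("out", rrd_safe_name("0:42"), "bytes")
print(len(bad.split(":")), len(good.split(":")))   # prints "5 4"
```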

  o Fix "flowscan" and its rc script so that "/etc/init.d/flowscan
    stop" doesn't kill flowscan in a "critical section".  Although I
    haven't seen it happen, I think if the timing is off it could
    kill(1) flowscan during RRD operations, possibly resulting in a
    corrupt ".rrd" file.  This should probably be implemented by having
    the script "ask" flowscan to shut down ASAP - possibly by
    creat(2)ing a file or writing into a fifo.  Then flowscan should
    check for this signal before it starts RRD updates.  It should also
    be, of course, able to be interrupted for shutdown while it's
    sleeping.
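
    The file-based variant might work along these lines, sketched in
    Python rather than FlowScan's Perl.  The flag file and function
    names are hypothetical.  The rc script would create the flag
    file; flowscan would check it before starting RRD updates and
    while sleeping:

```python
import os
import time

def shutdown_requested(flag_path):
    """True once the rc script has created the flag file."""
    return os.path.exists(flag_path)

def interruptible_sleep(seconds, flag_path, poll=1.0):
    """Sleep for up to `seconds`, waking every `poll` seconds to
    check for a shutdown request.  Return True if one was seen."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        if shutdown_requested(flag_path):
            return True
        remaining = max(0.0, deadline - time.monotonic())
        time.sleep(min(poll, remaining))
    return shutdown_requested(flag_path)
```

    Because the flag is only examined between RRD operations, a
    shutdown request never interrupts an update in progress.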

  o Allow flowscan logfile to be specified in "flowscan.cf". e.g.:

       LogFile /var/log/flowscan.log

    Then have flowscan open this and dup/select it for both STDOUT and
    STDERR to catch warnings from reporting packages.  Have flowscan
    periodically rename the log file, and open a new one (every day or
    whatever) so that we don't have to shut-down flowscan to trim the
    log file.

  o ? Unify configuration files so that we don't need to redundantly
    specify things like "OutputDir" in the configuration file for each
    report class.  Perhaps introducing a "FlowScan.cf" would suffice,
    and it would be accessed in the report packages as
    $self->{FlowScan}{OutputDir}.

  o Add "by Application" graphs (Mbps, pkts, flows) to "graphs.mf.in"
    which show I/O by applications such as web client (http_src in +
    http_dst out + https_src in + https_dst out), web server (http_src
    out + http_dst in + https_src out + https_dst in), news (nntp),
    file transfer (ftp (+nfs?)), email (smtp + pop + imap), Napster
    (NapUser + NapUserMaybe), RealMedia (Real), MCAST, and unknown
    (based on subtracting from total).  It would be nice if this graph
    split it out by in and out.

    Once this graph is done, "RealServer I/O" should be taken out of
    the "Well Known Services" graphs.

  o Write a new "FlowDivert" report which controls how flows are saved
    by diverting them to the files specified in this report's
    configuration.

    Note that Jay Ford <jay-ford@uiowa.edu> has essentially this.  See
    the discussion in the flowscan mailing list archive.  (Nov 2, 2000)

    If source and destination address were the only selection criteria
    allowed, a sample "FlowDivert_subnets.boulder" file might look like
    this (note that a specific host can be specified as "/32" subnet):

       SUBNET=10.42.42.42/32
       DESCRIPTION=our interesting host
       SAVEDIR=saved/host/our_host
       =
       SUBNET=10.0.1.0/24
       DESCRIPTION=our first subnet
       SAVEDIR=saved/subnet/first
       =
       SUBNET=10.0.2.0/24
       DESCRIPTION=our second subnet
       SAVEDIR=saved/subnet/second

    Alternatively, the entries in the configuration file could have
    arbitrary bits of perl code to be evaluated (like the expression to
    "flowdumper -e <expr>"), but I'm scared that that could be slow.
    E.g. "FlowDivert.boulder":

       SAVEDIR=saved/host/our_host
       DESCRIPTION=our interesting host
       EXPR=unpack("N", inet_aton("10.42.42.42")) == $srcaddr || unpack("N", inet_aton("10.42.42.42")) == $dstaddr
       =
       SAVEDIR=saved/subnet/our_subnet
       DESCRIPTION=our subnet
       EXPR=unpack("N", inet_aton("10.0.1.0")) == (0xffffff00 & $srcaddr) || unpack("N", inet_aton("10.0.1.0")) == (0xffffff00 & $dstaddr)
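
    For comparison, the subnet-based selection could be sketched like
    this, in Python rather than FlowScan's Perl.  The table mirrors
    the sample "FlowDivert_subnets.boulder" entries; the function
    names are hypothetical.  The second function performs the same
    mask-and-compare test as the EXPR lines:

```python
import ipaddress

# (subnet, save directory) pairs from the sample configuration.
DIVERSIONS = [
    (ipaddress.ip_network("10.42.42.42/32"), "saved/host/our_host"),
    (ipaddress.ip_network("10.0.1.0/24"), "saved/subnet/first"),
    (ipaddress.ip_network("10.0.2.0/24"), "saved/subnet/second"),
]

def save_dirs(srcaddr, dstaddr):
    """Return every SAVEDIR whose subnet contains the flow's source
    or destination address."""
    src = ipaddress.ip_address(srcaddr)
    dst = ipaddress.ip_address(dstaddr)
    return [d for net, d in DIVERSIONS if src in net or dst in net]

def in_subnet_by_mask(addr, network, mask=0xFFFFFF00):
    """Same test as the EXPR form: network == (mask & address)."""
    return int(ipaddress.ip_address(network)) == (
        mask & int(ipaddress.ip_address(addr)))

print(save_dirs("10.0.1.7", "192.0.2.1"))
print(in_subnet_by_mask("10.0.1.200", "10.0.1.0"))
```

    A linear scan like this costs one comparison per configured
    subnet for every flow, so a long list might call for a
    longest-prefix lookup structure instead.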