Preliminary information & background
Issues to consider
- How should I set up the development environment? I was thinking of a virtual machine running Lubuntu. Is there an online tutorial on developing for Cuckoo?
- Are there any coding guidelines, or am I supposed to follow the standard Python coding guidelines? What about documentation; any tips?
- Which data should I use for the initial development? Should we start with a static dataset, maybe the sample that I used for writing the proposal? At some point it was mentioned that some kind of online learning will be required. Should I worry about it now?
- Should the development be integrated into Cuckoo straight away, or should I start by writing it as a standalone application?
Short term plan
Once the development environment is sorted out, I’d like to start with a basic clustering algorithm and focus on feature extraction. This naturally leads to defining the clustering goals and the similarity metrics (heuristics).
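To make this more concrete, a first pass could look roughly like the sketch below. Everything in it is an assumption for illustration: the report fields (`api_calls`, `files`) are hypothetical and not taken from Cuckoo's actual report format, Jaccard similarity is just one possible metric, and the 0.5 threshold is arbitrary. The clustering is plain single-linkage grouping, chosen only because it is easy to reason about.

```python
from itertools import combinations


def extract_features(report):
    """Flatten a report dict into a set of string tokens.

    The keys "api_calls" and "files" are hypothetical placeholders,
    not the real Cuckoo report schema.
    """
    features = set()
    for call in report.get("api_calls", []):
        features.add("api:" + call)
    for name in report.get("files", []):
        features.add("file:" + name)
    return features


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / float(len(a | b))


def cluster(samples, threshold=0.5):
    """Single-linkage grouping: merge samples whose similarity >= threshold.

    `samples` maps a sample name to its feature set. Returns a sorted
    list of sorted name lists, one per cluster.
    """
    parent = {name: name for name in samples}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        if jaccard(samples[a], samples[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in samples:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Made-up toy reports, only to exercise the functions above.
reports = {
    "a": {"api_calls": ["CreateFileW", "WriteFile", "RegSetValueExW"],
          "files": ["x.dat"]},
    "b": {"api_calls": ["CreateFileW", "WriteFile", "RegSetValueExW"],
          "files": ["y.dat"]},
    "c": {"api_calls": ["connect", "send"], "files": []},
}
samples = {name: extract_features(r) for name, r in reports.items()}
clusters = cluster(samples, threshold=0.5)
print(clusters)
```

With the toy data, samples "a" and "b" share three of five distinct features and end up together, while "c" stays on its own. Swapping in a different metric or algorithm later should only require changing `jaccard` or `cluster`, which is why feature extraction is kept separate.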
Any comments are welcome.