6Forest

6Forest partitions the seed set with repeated Divisive Hierarchical Clustering (DHC) passes, trains multiple decision trees over address features inside each partition, and combines the resulting Address Space Forest to emit probes from high-confidence leaves.
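
To make the partitioning step concrete, here is a minimal sketch of divisive clustering over IPv6 nybbles. It is an illustration under stated assumptions, not rmap's implementation: the names to_nybbles, dhc, and max_leaf_size are hypothetical, the split rule (leftmost nybble position where the seeds disagree) is one common choice, and max_leaf_size is not necessarily the same thing as --min-region-size.

```python
import ipaddress
from collections import defaultdict


def to_nybbles(addr: str) -> str:
    """Expand an IPv6 address into its 32 hex nybbles (no colons)."""
    return ipaddress.IPv6Address(addr).exploded.replace(":", "")


def dhc(seeds, max_leaf_size=16):
    """Recursively split seeds on the leftmost nybble position where they differ.

    Hypothetical stopping rule: a region is left alone once it holds
    max_leaf_size seeds or fewer, or once all of its seeds are identical.
    """
    if len(seeds) <= max_leaf_size:
        return [seeds]
    # Leftmost position at which at least two seeds disagree.
    split = next(
        (i for i in range(32) if len({s[i] for s in seeds}) > 1),
        None,
    )
    if split is None:
        return [seeds]
    groups = defaultdict(list)
    for s in seeds:
        groups[s[split]].append(s)
    regions = []
    for key in sorted(groups):
        regions.extend(dhc(groups[key], max_leaf_size))
    return regions


if __name__ == "__main__":
    demo = [to_nybbles(a) for a in (
        "2001:db8::1", "2001:db8::2", "2001:db8:1::1", "2001:db8:1::2",
    )]
    for region in dhc(demo, max_leaf_size=2):
        print(region)
```

With the four demo seeds and max_leaf_size=2, the split happens at the last nybble of the third group, giving two regions of two seeds each.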

  • Reference: Placeholder citation (coming soon).

Train

rmap train --seeds seeds/hitlist.txt --output models/6forest.bin six-forest \
  --min-region-size 32 --iterations 2

Generate

rmap generate --model models/6forest.bin --count 250000 \
  --unique --output 6forest.txt

Configuration

  • --min-region-size <usize> – minimum number of seeds in a DHC region before it is considered stable (default 16).
  • --iterations <usize> – number of outlier refinement passes (default 1). Each additional pass feeds the outliers from the previous pass back into DHC to mine additional patterns.

Model notes

  • Training records each cluster pattern and a queue of outliers so the generator can fall back to random emission when a partition would otherwise be empty.
  • Expect noticeably longer training time than 6Gen because each iteration runs multiple DHC passes and tree builders in parallel.
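
The pattern-plus-fallback idea in the notes above can be sketched as follows. This is an assumption-laden illustration rather than the stored model format: region_pattern, emit, and the '*' wildcard encoding are hypothetical, and the fallback shown here (fully random nybbles when no pattern is available) is only one way to read "random emission".

```python
import ipaddress
import random

HEX = "0123456789abcdef"


def region_pattern(region):
    """Keep nybbles shared by every seed in the region; mark the rest '*'."""
    return "".join(
        col[0] if len(set(col)) == 1 else "*"
        for col in zip(*region)
    )


def emit(patterns, count, rng):
    """Draw up to `count` unique addresses by filling wildcards at random.

    If `patterns` is empty, fall back to an all-wildcard pattern
    (hypothetical fallback, i.e. a fully random address).
    """
    out = set()
    attempts = 0
    # Bound the loop so a tiny pattern space cannot spin forever.
    while len(out) < count and attempts < count * 20:
        attempts += 1
        pattern = rng.choice(patterns) if patterns else "*" * 32
        nybbles = "".join(
            rng.choice(HEX) if c == "*" else c for c in pattern
        )
        out.add(str(ipaddress.IPv6Address(int(nybbles, 16))))
    return sorted(out)


if __name__ == "__main__":
    region = ["20010db8" + "0" * 23 + "1", "20010db8" + "0" * 23 + "2"]
    pattern = region_pattern(region)
    print(pattern)
    print(emit([pattern], 5, random.Random(0)))
```

Collecting results in a set mirrors what a --unique style option would need to do: duplicates drawn from a small pattern space are discarded instead of being emitted twice.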