6Gen
6Gen starts with one cluster per seed address, repeatedly merges the most promising pairs, and grows the cluster pattern until a budget is exhausted. The algorithm prefers expansions that increase density while keeping the generated pattern as specific as possible.
- Reference: Placeholder citation (coming soon).
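The growth loop described above can be sketched in a few lines. This is a simplified illustration, not rmap's implementation: it assumes addresses are equal-length hex strings (32 nybbles for a full IPv6 address, shortened here for readability), that patterns mark free nybbles with `?`, and that "most promising" means the merge with the highest seed density, with ties going to the smaller pattern. The names `merge`, `size`, `covers`, and `grow` are hypothetical.

```python
from itertools import combinations

WILDCARD = "?"  # assumed notation for a free nybble


def merge(a: str, b: str) -> str:
    """Smallest pattern covering both inputs: differing nybbles become wildcards."""
    return "".join(x if x == y else WILDCARD for x, y in zip(a, b))


def size(pattern: str) -> int:
    """Number of addresses a pattern covers (16 per wildcard nybble)."""
    return 16 ** pattern.count(WILDCARD)


def covers(pattern: str, other: str) -> bool:
    """True if every address matched by `other` is also matched by `pattern`."""
    return all(p == WILDCARD or p == o for p, o in zip(pattern, other))


def grow(seeds: list[str], budget: int) -> list[str]:
    """Greedily merge clusters until the next merge would exceed the budget."""
    clusters = sorted(set(seeds))  # one cluster per distinct seed
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            merged = merge(a, b)
            density = sum(covers(merged, s) for s in seeds) / size(merged)
            key = (density, -size(merged))  # prefer dense, then specific
            if best is None or key > best[0]:
                best = (key, merged)
        merged = best[1]
        # Drop every cluster the merged pattern absorbs, then add the pattern.
        candidate = [c for c in clusters if not covers(merged, c)] + [merged]
        if sum(size(c) for c in candidate) > budget:
            break  # budget exhausted: keep the previous clustering
        clusters = candidate
    return clusters


print(grow(["2001", "2002", "2003", "3fff"], budget=32))
```

With these four toy seeds, the three neighbouring addresses collapse into `200?` (16 addresses), while the outlier `3fff` stays a singleton because absorbing it would require a pattern of 65536 addresses, far beyond the budget of 32.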
Train
rmap train --seeds seeds/hitlist.txt --output models/6gen.bin six-gen \
  --budget 1000000
Generate
rmap generate --model models/6gen.bin --count 200000 \
  --unique --output sixgen.txt
Configuration
--budget <u128> – maximum aggregate size of all generated cluster patterns (default 1_000_000). Increasing the budget allows more aggressive growth at the cost of runtime.
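To get a feel for how quickly the budget is consumed, the arithmetic below assumes that patterns use wildcards at nybble granularity, so each wildcard nybble multiplies a pattern's size by 16. That granularity is an assumption, not something stated here, and `pattern_size` and `max_wildcards` are hypothetical helpers rather than part of rmap.

```python
DEFAULT_BUDGET = 1_000_000  # default value documented for --budget


def pattern_size(wildcard_nybbles: int) -> int:
    """Addresses covered by one pattern, assuming nybble-granular wildcards."""
    return 16 ** wildcard_nybbles


def max_wildcards(budget: int) -> int:
    """Largest wildcard count for which a single pattern still fits the budget."""
    n = 0
    while pattern_size(n + 1) <= budget:
        n += 1
    return n


print(pattern_size(4))                 # 65536
print(pattern_size(5))                 # 1048576
print(max_wildcards(DEFAULT_BUDGET))   # 4
```

Under that assumption, a single pattern with five wildcard nybbles already exceeds the default budget, so the default leaves room for roughly fifteen patterns with four wildcard nybbles each.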
Model notes
- The serialized model stores every discovered cluster pattern and its relative weight. Generation samples clusters in proportion to their size.
- Provide a few thousand high-quality seeds to avoid early convergence on sparse regions; noisy seeds can cause the budget to be wasted on low-density patterns.
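Size-proportional sampling, as described in the first note, could look like the sketch below. It is illustrative only: the `?` wildcard notation, the short 4-nybble patterns, and the `sample_address` and `generate` helpers are assumptions, and the deduplication merely mirrors what the `--unique` and `--count` options suggest rather than how rmap implements them.

```python
import random

WILDCARD = "?"  # assumed notation for a free nybble
HEX_DIGITS = "0123456789abcdef"


def pattern_size(pattern: str) -> int:
    """Number of addresses a pattern covers (16 per wildcard nybble)."""
    return 16 ** pattern.count(WILDCARD)


def sample_address(patterns: list[str], rng: random.Random) -> str:
    """Pick a pattern with probability proportional to its size, then fill its wildcards."""
    weights = [pattern_size(p) for p in patterns]
    pattern = rng.choices(patterns, weights=weights, k=1)[0]
    return "".join(rng.choice(HEX_DIGITS) if c == WILDCARD else c for c in pattern)


def generate(patterns: list[str], count: int, seed: int = 0) -> list[str]:
    """Collect up to `count` distinct addresses, giving up after 100 * count draws."""
    rng = random.Random(seed)
    seen: set[str] = set()
    for _ in range(100 * count):
        if len(seen) >= count:
            break
        seen.add(sample_address(patterns, rng))
    return sorted(seen)


addresses = generate(["200?", "3fff"], count=5, seed=1)
print(addresses)
```

Because the weights follow pattern size, a large sparse pattern draws most of the samples. That is why noisy seeds that inflate low-density patterns also dilute the generated output.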