FileRandomizer Guide: Automate Random File Selection & Distribution
Introduction
FileRandomizer is a workflow-focused tool designed to automate the random selection, shuffling, renaming, and distribution of files. Whether you need to create randomized test datasets, anonymize filenames for privacy, sample media for A/B tests, or distribute assets evenly among team members, FileRandomizer reduces repetitive manual work and adds reproducibility to random operations.
Key Use Cases
- Creating randomized dataset subsets for machine learning or QA testing.
- Anonymizing filenames to remove identifying information for privacy or blind review.
- Building randomized media playlists for exhibitions, kiosks, or social content experiments.
- Distributing files evenly across folders, drives, or team members for manual review or processing.
- Shuffling and renaming files before archival to avoid predictable patterns.
Core Features
- Random selection by count, percentage, or conditional filters (extension, size, date).
- Deterministic shuffling via seed values to reproduce results.
- Batch renaming with patterns, prefixes, suffixes, timestamps, and randomized tokens.
- Distribution algorithms: round-robin, weighted distribution, and size-aware balancing.
- Dry-run mode to preview actions without changing files.
- Logging and output reports (CSV/JSON) detailing chosen files and operations performed.
- Cross-platform support and command-line interface for scripting.
Installation & Setup
- Download the appropriate binary or package for your OS (Windows/macOS/Linux), or install via a package manager if one is available.
- Ensure you have permissions to read/write target directories.
- (Optional) Add FileRandomizer to PATH for convenience.
Example (Linux/macOS using a hypothetical installer):
curl -sSL https://example.com/filerandomizer/install.sh | bash
filerandomizer --version
Command-line Basics
FileRandomizer uses a consistent CLI pattern: filerandomizer [action] [options] [targets]
Common actions:
- select — pick files randomly
- shuffle — reorder files within a folder or playlist
- rename — batch rename chosen files
- distribute — move/copy files into multiple destinations
Example: randomly select 50 JPGs and copy them to ./sample
filerandomizer select --count 50 --ext .jpg ./photos --out ./sample --copy
Selection Options
- --count N: choose N files.
- --percent P: choose P% of matching files.
- --ext EXT: filter by extension (accepts multiple values).
- --min-size / --max-size: filter by file size.
- --older-than / --newer-than: filter by modification date.
- --seed N: set the RNG seed for reproducible selections.
- --pattern: glob or regex to match filenames.
Example: deterministic 10% selection of .csv files modified in the last 30 days
filerandomizer select --percent 10 --ext .csv --newer-than 30d --seed 42 ./datasets --out ./mini
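To make the effect of a seed concrete, here is a minimal Python sketch of seeded percentage selection. It is not FileRandomizer's implementation: the function name select_percent and its parameters are assumptions made for illustration. The two ideas it shows are sorting the candidates first, so the result does not depend on directory listing order, and drawing from a private seeded generator.

```python
import random
from pathlib import PurePath


def select_percent(paths, ext, percent, seed):
    """Pick about `percent`% of the paths that have extension `ext`.

    Hypothetical helper for illustration; not part of FileRandomizer.
    """
    # Filter by extension, then sort so the input order cannot change the result.
    candidates = sorted(p for p in paths
                        if PurePath(p).suffix.lower() == ext.lower())
    # Round to a whole number of files, never more than are available.
    count = min(len(candidates), round(len(candidates) * percent / 100))
    # A private generator keeps the draw independent of global random state.
    rng = random.Random(seed)
    return rng.sample(candidates, count)


files = [f"data/run{i:02d}.csv" for i in range(20)] + ["data/notes.txt"]
first = select_percent(files, ".csv", 10, seed=42)
second = select_percent(list(reversed(files)), ".csv", 10, seed=42)
print(first == second)  # same seed and same file set give the same picks
```

Note that the picks are only guaranteed to repeat on the same Python version; a tool that promises reproducibility across platforms would need to pin its generator and sampling algorithm.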
Renaming Patterns
Rename tokens:
- {index} — sequential number (padding configurable).
- {rand:N} — random alphanumeric token of length N.
- {timestamp} — ISO timestamp.
- {orig} — original filename without extension.
- {ext} — file extension.
Example: rename files with randomized token and original name
filerandomizer rename --pattern "{rand:6}_{orig}.{ext}" ./sample --dry-run
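The token list above could be expanded along the following lines. This is a sketch under stated assumptions rather than the tool's actual behavior: expand_pattern, the default padding of 3, the lowercase alphanumeric alphabet, and the compact ISO 8601 timestamp (chosen because colons are not allowed in Windows filenames) are all illustrative choices. All tokens are substituted in a single pass so that an original filename containing text like "{ext}" is not expanded a second time.

```python
import random
import re
import string
from datetime import datetime, timezone
from pathlib import PurePath

# Matches {index}, {timestamp}, {orig}, {ext}, and {rand:N}.
TOKEN = re.compile(r"\{(index|timestamp|orig|ext|rand:(\d+))\}")
ALPHABET = string.ascii_lowercase + string.digits


def expand_pattern(pattern, path, index, rng, pad=3):
    """Expand rename tokens for one file (hypothetical helper)."""
    p = PurePath(path)

    def replace(match):
        name = match.group(1)
        if name == "index":
            return str(index).zfill(pad)
        if name == "timestamp":
            # ISO 8601 basic format, safe for filenames on all platforms.
            return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
        if name == "orig":
            return p.stem
        if name == "ext":
            return p.suffix.lstrip(".")
        # The only remaining case is rand:N.
        return "".join(rng.choices(ALPHABET, k=int(match.group(2))))

    return TOKEN.sub(replace, pattern)


rng = random.Random(42)
print(expand_pattern("{rand:6}_{orig}.{ext}", "photos/IMG_0001.jpg", 1, rng))
```

Passing the generator in, rather than creating it inside the function, lets one seed cover a whole batch of renames.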
Distribution Strategies
- Round-robin: distribute files evenly in turn across destinations.
- Weighted: give destinations different weights (e.g., reviewer A: 2, B: 1).
- Size-aware: aim to balance total bytes per destination rather than file count.
Example: distribute files to three reviewers with weights
filerandomizer distribute --strategy weighted --weights 2,1,1 ./batch --dest ./A ./B ./C --move
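The three strategies can be sketched in a few lines of Python. These are simplified illustrations, not FileRandomizer's algorithms: the function names are hypothetical, weights are assumed to be positive integers, and the size-aware version uses a common greedy heuristic (largest file first, into the currently lightest destination) that balances well in practice but is not guaranteed to be optimal.

```python
import heapq
from itertools import cycle


def round_robin(files, dests):
    """Deal files to destinations in turn."""
    plan = {d: [] for d in dests}
    for f, d in zip(files, cycle(dests)):
        plan[d].append(f)
    return plan


def weighted(files, dests, weights):
    """Give each destination a share proportional to its integer weight."""
    # With weights 2,1,1 the slot sequence is A, A, B, C, repeated.
    slots = [d for d, w in zip(dests, weights) for _ in range(w)]
    plan = {d: [] for d in dests}
    for f, d in zip(files, cycle(slots)):
        plan[d].append(f)
    return plan


def size_aware(sizes, dests):
    """Balance total bytes; `sizes` maps filename -> size in bytes."""
    # Heap entries are (total_bytes, tie_breaker, destination).
    heap = [(0, i, d) for i, d in enumerate(dests)]
    heapq.heapify(heap)
    plan = {d: [] for d in dests}
    # Largest files first; ties broken by name for a stable result.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, i, d = heapq.heappop(heap)
        plan[d].append(name)
        heapq.heappush(heap, (total + size, i, d))
    return plan


files = [f"f{i}" for i in range(8)]
print(weighted(files, ["A", "B", "C"], [2, 1, 1]))
print(size_aware({"a": 10, "b": 7, "c": 5, "d": 4, "e": 2}, ["X", "Y"]))
```

Each function returns a plan (destination to file list) instead of moving anything, which makes it easy to inspect the result before acting on it.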
Integrations & Automation
- Use in CI pipelines to generate randomized test artifacts.
- Combine with rsync or cloud CLI tools for remote distribution.
- Integrate into media servers to create shuffled playlists automatically.
- Expose as a library or API for programmatic control in Python/Node scripts.
Example (bash + cron): generate daily randomized promo images
0 2 * * * /usr/local/bin/filerandomizer select --count 10 --ext .png /srv/promos --out /srv/today --copy
Safety & Best Practices
- Always run with --dry-run first to preview actions.
- Use --seed when you need reproducibility.
- Keep logs (CSV/JSON) for auditability.
- Test distribution weights on a small sample before full runs.
- Ensure backups exist before destructive operations like --move or --delete.
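For readers scripting their own file operations around the tool, the dry-run idea can be sketched as a function that always builds the list of actions and only touches the filesystem when asked to. The name copy_files and the action strings are assumptions for illustration; this is not how FileRandomizer itself reports a dry run.

```python
import shutil
from pathlib import Path


def copy_files(pairs, dry_run=True):
    """Copy (source, target) pairs; with dry_run=True, only report.

    Hypothetical helper. Existing targets are skipped, never overwritten.
    """
    actions = []
    for src, dst in pairs:
        src, dst = Path(src), Path(dst)
        if dst.exists():
            actions.append(f"SKIP {src} -> {dst} (target exists)")
            continue
        actions.append(f"COPY {src} -> {dst}")
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also preserves timestamps
    return actions


# Preview only: nothing is read or written here.
for line in copy_files([("photos/a.jpg", "sample/a.jpg")], dry_run=True):
    print(line)
```

Defaulting dry_run to True means a forgotten argument produces a preview instead of a change.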
Troubleshooting
- Permission errors: verify user has read/write on source/destination.
- Missing files after distribution: check for filename collisions; enable overwrite warnings.
- Unexpected selection counts: confirm filters and hidden files handling (dotfiles).
- Performance issues on very large directories: use filesystem-level indexing or provide path lists via stdin.
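Filename collisions are easy to check for before any file is moved. The sketch below is an independent illustration, not a FileRandomizer feature: find_collisions is a hypothetical helper that takes planned (source, destination folder) pairs and reports every target path claimed by more than one source. Targets are compared in lowercase as a conservative assumption, because case-insensitive filesystems treat "A.png" and "a.png" as the same file.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_collisions(moves):
    """Return {target: [sources]} for targets claimed more than once.

    `moves` is an iterable of (source_path, destination_dir) pairs.
    """
    by_target = defaultdict(list)
    for src, dest_dir in moves:
        target = PurePosixPath(dest_dir) / PurePosixPath(src).name
        # Lowercase the key so case-only differences count as collisions.
        by_target[str(target).lower()].append(src)
    return {t: srcs for t, srcs in by_target.items() if len(srcs) > 1}


moves = [
    ("batch/x/report.pdf", "A"),
    ("batch/y/report.pdf", "A"),
    ("batch/z/other.pdf", "A"),
]
print(find_collisions(moves))
```

Running a check like this on the planned moves shows whether "missing" files were overwritten by a same-named file from another source folder.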
Example Workflows
- Create a blind review set: anonymize 200 PDFs, copy to review folder, keep mapping in CSV.
- Prepare training subset: select 25% of labeled images per class using stratified selection (if supported), export list for reproducibility.
- Rotate kiosk playlist: shuffle media and create a timestamped playlist file every hour.
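The stratified selection mentioned in the training-subset workflow can be sketched as follows, in case the tool does not support it directly. The function name, the (path, label) input shape, and the rounding rule are assumptions for illustration. The idea is to sample the same fraction from each class separately, so that small classes are not dropped by chance.

```python
import random
from collections import defaultdict


def stratified_select(labeled, fraction, seed):
    """Sample `fraction` of the files from each class.

    `labeled` is an iterable of (path, class_label) pairs.
    Hypothetical helper; not part of FileRandomizer.
    """
    by_class = defaultdict(list)
    for path, label in labeled:
        by_class[label].append(path)

    rng = random.Random(seed)
    selected = []
    # Visit classes and files in sorted order so the seed fully
    # determines the outcome.
    for label in sorted(by_class):
        paths = sorted(by_class[label])
        count = round(len(paths) * fraction)
        selected.extend(rng.sample(paths, count))
    return selected


labeled = [(f"cat/{i}.png", "cat") for i in range(8)]
labeled += [(f"dog/{i}.png", "dog") for i in range(4)]
print(stratified_select(labeled, 0.25, seed=42))
```

With 8 cat images and 4 dog images, a fraction of 0.25 yields 2 cats and 1 dog; saving the returned list alongside the seed covers the "export list for reproducibility" step.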
Output & Reporting
FileRandomizer can produce:
- CSV mapping original -> new path/name
- JSON report with metadata (size, checksum, selection method)
- Log file with timestamps and command options used
Example CSV header: filename,orig_path,new_path,size,sha256,seed,action,timestamp
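A report with that header could be produced with Python's csv module as sketched below. The column meanings are assumptions (for instance, that filename holds the new name and that size is in bytes), as are the helper names make_row and write_report; the file contents are passed in as bytes so the example runs without touching the disk.

```python
import csv
import hashlib
import io
from datetime import datetime, timezone
from pathlib import PurePosixPath

FIELDS = ["filename", "orig_path", "new_path", "size", "sha256",
          "seed", "action", "timestamp"]


def make_row(orig_path, new_path, data, seed, action):
    """Build one report row; `data` is the file's content as bytes."""
    return {
        "filename": PurePosixPath(new_path).name,  # assumed: the new name
        "orig_path": orig_path,
        "new_path": new_path,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "seed": seed,
        "action": action,
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }


def write_report(rows, stream):
    """Write the header and all rows to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
row = make_row("photos/IMG_0001.jpg", "sample/k3x9q2.jpg",
               b"example bytes", 42, "copy")
write_report([row], buffer)
print(buffer.getvalue())
```

Recording the checksum next to both paths lets a reviewer confirm later that a renamed file is byte-for-byte the original.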
Development Notes (for contributors)
- Keep RNG implementation compatible across platforms for seed reproducibility.
- Design modular filters so new criteria (e.g., image dimensions, audio length) can be added.
- Provide thorough unit tests for distribution algorithms and collision handling.
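One way to approach the first note is to avoid a platform or language RNG altogether and derive the order from a hash. The sketch below is a suggestion under that assumption, not a description of the current implementation: stable_shuffle is a hypothetical name, and it orders files by the SHA-256 digest of the seed combined with the filename, which any language with a SHA-256 library can reproduce exactly.

```python
import hashlib


def stable_shuffle(names, seed):
    """Order names pseudo-randomly, determined only by seed and name.

    Hypothetical helper. Uses SHA-256 instead of a language-specific
    random number generator, so any platform computes the same order.
    """
    def key(name):
        digest = hashlib.sha256(f"{seed}:{name}".encode("utf-8")).digest()
        # Include the name as a tie-breaker for identical digests.
        return digest, name

    return sorted(names, key=key)


names = [f"clip{i:02d}.mp4" for i in range(6)]
print(stable_shuffle(names, seed=42))
print(stable_shuffle(names, seed=43))
```

A side effect of this design is that adding or removing one file leaves the relative order of all other files unchanged, which keeps seeded selections stable as a directory grows.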
Conclusion
FileRandomizer streamlines repetitive file-handling tasks by combining flexible selection filters, deterministic randomness, and robust distribution strategies. With dry-run previews, logging, and CLI automation, it fits into testing, privacy, media, and team workflows while minimizing manual overhead.