Add summary flag to CLI and add scan_directory function to Python #12
Description
When the user runs Magika in the CLI, they now have the option of providing a --summary flag.
If this flag is active (and the --json flag is not active), then Magika's normal output will be followed by a list of detected file types and their count.
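The counting and ordering behind the summary can be illustrated with a short sketch. This is not the Rust code in rust/cli/src/main.rs; it is a Python illustration that assumes the ordering described under Implementation (descending count, then alphabetical). The function name summarize and the sample labels are hypothetical.

```python
from collections import Counter


def summarize(labels):
    """Return (label, count) pairs, most frequent first, ties in alphabetical order.

    Hypothetical helper for illustration only; the PR implements this in Rust.
    """
    counts = Counter(labels)
    # Negating the count sorts larger counts first; the label breaks ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up detection results standing in for Magika's per-file output.
detected = ["python", "markdown", "python", "json", "markdown", "rust"]
summary = summarize(detected)
for label, count in summary:
    print(f"{label}: {count}")
```

Sorting on the tuple (-count, label) keeps the ordering rule in one place, so two file types with the same count always appear in a stable, alphabetical order.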
When the user runs Magika in Python, they can now use the scan_directory function, which takes two parameters: directory and recursive_scan. Passing a directory tells the function to scan all the files in that directory, and setting recursive_scan to True tells it to also scan all subdirectories recursively.
Implementation
Updated rust/cli/src/main.rs to keep track of how many times each file type was detected.
Configured the summary list to display in order of descending file count, then alphabetically.
Added a new scan_directory function to python/src/magika/magika.py.
Added a unit test to python/tests/test_magika_python_module.py.
Added relevant test files to tests_data/directory.
Screenshots
CLI: Output of cargo run -- --recursive --summary ../../tests_data
Python: Printing the list returned by magika.scan_directory(tests_data, True)
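For readers without the diff at hand, the directory-walking behaviour described for scan_directory can be sketched as below. This is an assumption-laden illustration, not the code added to python/src/magika/magika.py: the name scan_directory_sketch, the identify parameter, and the (path, label) return shape are all hypothetical, and the default identify is a stand-in that only looks at the file extension. In practice one would presumably pass a callable that wraps Magika's own detection (for example its identify_path method).

```python
import tempfile
from pathlib import Path


def scan_directory_sketch(directory, recursive_scan=False, identify=None):
    """Return a list of (path, label) pairs for the files in `directory`.

    Hypothetical sketch: `identify` is any callable mapping a Path to a label.
    The default is a placeholder based on the file suffix, not Magika's model.
    """
    if identify is None:
        identify = lambda path: path.suffix.lstrip(".") or "unknown"

    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")

    # rglob descends into subdirectories; iterdir stays at the top level.
    candidates = root.rglob("*") if recursive_scan else root.iterdir()
    files = sorted(p for p in candidates if p.is_file())
    return [(p, identify(p)) for p in files]


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "a.txt").write_text("hello")
    (base / "sub").mkdir()
    (base / "sub" / "b.json").write_text("{}")

    flat = [p.name for p, _ in scan_directory_sketch(base)]
    deep = [p.name for p, _ in scan_directory_sketch(base, recursive_scan=True)]

print(flat)
print(deep)
```

With recursive_scan left at False only top-level files are returned; setting it to True also picks up files in nested subdirectories, which matches the behaviour described above.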