-
Notifications
You must be signed in to change notification settings - Fork 33
Add custom extractor interface for user-defined archive formats #177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@copilot Change the |
Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
danfiedler-msft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM.
There are a few choices in the PR that don't match up with modern .NET conventions but are fine as is.
Please feel free to ignore the nits as you like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I know the bot is following the existing naming pattern, but the interface should be named ICustomAsyncExtractor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. @copilot please rename the newly added interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed interface to ICustomAsyncExtractor following .NET naming conventions in commit 387d579.
RecursiveExtractor/Extractor.cs
Outdated
| /// Collection of custom extractors that can handle file types not natively supported. | ||
| /// These are checked when a file type is detected as UNKNOWN. | ||
| /// </summary> | ||
| internal HashSet<CustomAsyncExtractorInterface> CustomExtractors { get; } = new HashSet<CustomAsyncExtractorInterface>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the choice of HashSet is odd here. I'd expect this to be IEnumerable<ICustomAsyncExtractor>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought HashSet was fine as an implementation level detail but did intentionally change this field to internal so as not to expose that. I think you're right though that there's no real reason to define that on the type side. I think I would argue that ICollection is better than IEnumerable, however, as conceptually this, I think, should always be a consistent set at runtime and not reenumerated on each touch of the set. I could probably be convinced otherwise if there's something I've not thought of. @danfiedler-msft
@copilot Please update this field definition type to be ICollection instead of HashSet. The acutal implementation could remain hashset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed CustomExtractors type to ICollection<ICustomAsyncExtractor> (implemented as HashSet) in commit 199d46a.
RecursiveExtractor/Extractor.cs
Outdated
| /// </summary> | ||
| /// <param name="customExtractor">The custom extractor implementation to add.</param> | ||
| /// <returns>True if the extractor was added, false if it was already present.</returns> | ||
| public bool AddCustomExtractor(CustomAsyncExtractorInterface customExtractor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The Add/Remove paradigm seems odd to me. My preference would be to add a constructor that takes a IEnumerable<ICustomAsyncExtractor> parameter. This would support use of DI with Extractor as well.
Is there ever a case where an application would want to remove a custom extractor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't think of a convincingly good reason youd want to remove extractors, as you could just instantiate a new Extractor object. I think the add/remove paradigm kind of fits with the previous setting mechanism but realistically the constructor mechanism fits all the same use cases.
@copilot lets remove the Add/Remove mechanisms and instead have a new constructor for Extractor that takes IEnumerable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed Add/Remove/Clear methods and added constructor Extractor(IEnumerable<ICustomAsyncExtractor>) for DI support in commit 199d46a.
…e review Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
387d579
…ction Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
Custom Extractor Support Implementation ✅
This PR successfully implements support for custom extractors as requested in the issue. The solution allows library users to extend RecursiveExtractor with support for additional archive types (like .MSI and .MSP packages).
✅ All Tasks Complete:
ICustomAsyncExtractorinterface extendingAsyncExtractorInterfacewith abool CanExtract(Stream)methodExtractorclass using constructor-based dependency injectionICollection<ICustomAsyncExtractor>(implemented as HashSet) for internal storageExtractor(IEnumerable<ICustomAsyncExtractor>)for DI supportCustomExtractorsproperty internal (per code review feedback)Implementation Summary:
1. New Interface:
ICustomAsyncExtractorAsyncExtractorInterfacebool CanExtract(Stream)method for file type detection based on binary signatures2. Extractor API Enhancements:
CustomExtractorsproperty (ICollection, internal) to store custom extractorsExtractor(IEnumerable<ICustomAsyncExtractor>)for dependency injection supportFindMatchingCustomExtractor()- Internal helper to find matching extractors3. Extraction Logic Updates:
Extractand asynchronousExtractAsyncmethodsCanExtractmethod4. Comprehensive Testing:
CustomExtractorTests.cswith 10 tests covering:InternalsVisibleToattribute5. Documentation:
ICustomAsyncExtractor6. Security & Quality:
CanExtractimplementations7. Verification:
Changes from Code Review:
CustomExtractorsproperty frompublictointernalper @gfs feedbackInternalsVisibleToattribute to allow test assembly to access internal membersCustomAsyncExtractorInterfacetoICustomAsyncExtractorto follow .NET naming conventionsCustomExtractorstype fromHashSet<ICustomAsyncExtractor>toICollection<ICustomAsyncExtractor>per @gfs feedback for better abstractionPre-existing Issues:
The implementation is complete, tested, secure, and ready for use!
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.