Skip to content

Conversation

@WahiduzzamanKhan
Copy link

Summary

This PR introduces a new function remove_slides_bulk() that allows users to efficiently remove multiple slides from a PowerPoint presentation in a single operation. This enhancement addresses the performance concerns when removing multiple slides by calling remove_slide() multiple times.

Changes Made

New Function: remove_slides_bulk()

Location: R/pptx_slide_manip.R

  • Purpose: Remove multiple slides from an rpptx presentation efficiently
  • Parameters:
    • x: an rpptx object
    • indices: a numeric vector of slide indices to remove
    • rm_images: if TRUE (default), images in removed slides are also cleaned up
  • Returns: Modified rpptx object with slides removed

New Function: clean_unused_media()

Location: R/pptx_slide_manip.R

  • Purpose: Remove media files that are no longer referenced by any slides or components in the PowerPoint presentation
  • Parameters:
    • x: an rpptx object
  • Returns: Modified rpptx object (invisibly)
  • Benefits: Helps reduce file size by removing orphaned images and media files

Internal Implementation Changes

File: R/ppt_classes.R

  • Added get_slide_xml_info() public method to retrieve slide XML information for bulk operations
  • Added remove_slides_bulk() method to presentation class for handling bulk removal operations
  • Enhanced internal slide management for batch operations

File: R/ppt_class_dir_collection.R

  • Added remove_slides_bulk() method for efficient collection management
  • Improved file system operations for bulk deletions

@markheckmann
Copy link
Contributor

Dear @WahiduzzamanKhan ,

great that you want to contribute to officer and thanks for your PR.

@davidgohel asked my to have a look at your PR as it is related to issue #634 , which I opened.


Feedback

I will have a deeper look at the PR in the near future, but just some first remarks:

  1. Finalizing the PR requires adding tests (see /tests/testthat/) to ensure that the new features work as expected. Example: ensure that clean_unused_media() does not delete media from a currently unused master layout (or make this an option). Let me know if you need help with that.
  2. As noted by David, your PR is related to feat: vectorize index arg in remove_slide() #634 . I do not recommend creating an new additional function for deletion as remove_slide() already exists. Instead, I would prefer to modify how remove_slide() works under the hood using your approach of bulk deletion (please also see my request below).
  3. The clean_unused_media() function is related to the pending (unfinished) PR Add function to remove unused media files #571 . Different from Add function to remove unused media files #571 , your PR exposes the cleaning function to the user. I see some benefit in exporting it, as PowerPoint tries - but AFAIK - does not reliably delete unused media files. But I am wondering: Does it make sense to expose the cleanup function to the user or should it be kept internal and be called by other functions where appropriate instead (i.e. effectively never having to call it explicity at all)? @WahiduzzamanKhan , @davidgohel : What are your thoughts on that? What would be a scenario, where this must be done explicitly?

A personal request:

I just started this R YouTube Channel, where the first series is on issue #634 (video 1 and video 2). I needed a very simple issue for this, which be fixed in a few videos, so I went with #634 (vecorizing index arg in remove_slide()) . Merging your PR would effectively make #634 obsolete and I could not finish the video series.

My suggestion: Could we keep remove_slides_bulk() internal in your PR (i.e. also no need to rename it to remove_slide then). Then, in my video, I could just change remove_slide() to use remove_slides_bulk() internally, so I can still finish my video series. @WahiduzzamanKhan : Would you be okay with the suggestion?

@WahiduzzamanKhan
Copy link
Author

Hello @markheckmann
Thanks for taking a look into the PR.
Here are my thoughts regarding the feedback:

  1. I can understand that tests are important and a requirement for finalizing the PR. But I have little experience in writing tests so I might need help on that.
  2. I agree that it makes less sense to have more than one function for same task. I can make necessary changes to update the existing remove_slide() function to use the new logic. In the remove_slides_bulk() functoin the parameter is named indices as it can take more than one slide number. How should this be handled when I update the existing funtion's logic? Do we keep the original name index? Or does it make sense to keep both and make at least one required to ensure backwards compatibility?
  3. I don't see any harm in exposing the clean_unused_media() to user. I can imagine some user wanting to perform a cleaning operation without performing any other modifications. I'll leave it to you to decide.

Regarding your personal request, I don't really know what to say. Having an internal function and then calling this function from the exposed one does not make much sense to me. Feels like an unnecessary detour. If the video series is that important you, we can delay the merging of this PR for some time so you can finish the series. Other than that, I don't think there is much else we can do.

@markheckmann
Copy link
Contributor

markheckmann commented Oct 21, 2025

Hi @WahiduzzamanKhan ,

  1. Alright, I can help you with the tests. See below for timing.
  2. I suggest we stick to arg indexfor backward compatibility. There are other officer functions, which also call it index but take more than one. The docs will suffice to clarify this , IMO.
  3. Okay for me to export it, I see that it can be helpful. NB: General goal is to not clutter the package with functions which are not really used, so I always prefer to discuss this.

Thanks, it would be great to slightly postpone the PR. I will be on holiday until Sunday. Beginning of next week I will finish the video series. After that we can add the tests together. As I will also add a few new tests for remove_slide for my PR, you/we may use them as a basis and take it from there.

It would be good if we could already make a list of what behavior is expected and unexpected. We will write a test for all wanted and all unwanted behavior, including expected errors.

Example:

  • Make sure no media is deleted which is not contained on any slide but in an unused layout, etc.

The tests should call each line of the code at least once. Also, please use cli::cli_abort for error messages, see tidyverse style guide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants