This repository provides an experiment on generating recaps from the BOOKSUM (Krysćiński et al. 2022) dataset.
Please clone the repository with the command git clone --recurse-submodules git@github.com:jecGrimm/Recap.git.
If the repository was cloned via git clone git@github.com:jecGrimm/Recap.git, please run git submodule update --init --recursive to fetch the required submodules.
We provide a conda environment which can be installed with the command conda env create -f environment.yml.
To compute the SPICE metric, please download Stanford CoreNLP 3.6.0 and place the files stanford-corenlp-3.6.0.jar and stanford-corenlp-3.6.0-models.jar in the directory CaptionMetrics/pycocoevalcap/spice/lib/.
We compare an NER model, SBERT, and Gemma-2-2b-it for generating recaps from BOOKSUM (Krysćiński et al. 2022). A recap is a summary of previous content that is relevant for upcoming content.
For the NER model and SBERT, we map each last chapter summary to the second-to-last chapter summaries. For NER, we extract the sentences from the second-to-last chapter summary that contain at least one named entity that also appears in the last chapter summary. For SBERT, we extract the sentences whose cosine similarity with the last chapter summary exceeds 0.1. Gemma-2-2b-it is prompted to generate recaps for the book titles in BOOKSUM (Krysćiński et al. 2022).
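The NER-based extraction can be sketched as follows. This is a minimal illustration, not the repository's implementation: a crude capitalized-word heuristic stands in for the real NER model, and the sentence splitter is a simple regex.

```python
import re

def extract_recap(previous_summary: str, next_summary: str) -> str:
    """Keep sentences from the previous summary that mention an entity
    that also appears in the next summary. A capitalized-word regex is
    used here as a crude stand-in for a real NER model."""
    def entities(text):
        # crude heuristic: any capitalized word counts as an "entity"
        return set(re.findall(r"\b[A-Z][a-z]+\b", text))

    next_ents = entities(next_summary)
    kept = []
    for sent in re.split(r"(?<=[.!?])\s+", previous_summary):
        if entities(sent) & next_ents:
            kept.append(sent)
    return " ".join(kept)
```

Sentences with no entity overlap (such as pure scene description) are dropped, which is the intended behavior: only content tied to recurring entities is carried into the recap.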
We evaluate the generated recaps with BLEU-1, ROUGE-L, and SPICE. For further analysis, we create figures that show the positions and the sources of the kept sentences.
This directory contains a modified version of the repository wangleihitcs/CaptionMetrics. It is used to compute the ROUGE-L and SPICE scores.
This directory contains the dataset files with the original summaries of BOOKSUM (Krysćiński et al. 2022). Each instance maps one last chapter summary to its second-to-last chapter summaries. Keys:
recap_id: ID of the instance, in the format <book_id>_<next_source>
bid: ID of the book
previous_summary_id: list of the second-to-last chapter summary ids
previous_summary: list of the second-to-last chapter summaries
previous_source: list of the second-to-last chapter summary sources
next_summary_id: last chapter summary id
next_summary: last chapter summary
next_source: source of the last chapter summary
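An instance might look like the following. All IDs, sources, and texts here are invented for illustration and are not taken from the dataset.

```python
# Illustrative instance only; field values are invented, not from BOOKSUM.
instance = {
    "recap_id": "bid123_sparknotes",            # <book_id>_<next_source>
    "bid": "bid123",
    "previous_summary_id": ["bid123-ch9-s1", "bid123-ch9-s2"],
    "previous_summary": ["In chapter nine, ...", "Chapter nine sees ..."],
    "previous_source": ["gradesaver", "cliffnotes"],
    "next_summary_id": "bid123-ch10-s1",
    "next_summary": "In the final chapter, ...",
    "next_source": "sparknotes",
}
```

Note that the three previous_* fields are parallel lists: the i-th entries describe the same second-to-last chapter summary.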
This directory contains files with the evaluation metrics from the performed experiments.
This directory contains the notebook for the LLM recap generation. Gemma-2-2b-it needs a GPU to run.
This directory contains the recaps generated by the examined approaches.
This directory contains the figures for the analysis of the kept sentences.
This script provides functions for the creation of the figures for the analysis.
This script contains the class RecapData, which maps the chapter summaries from BOOKSUM (Krysćiński et al. 2022) and creates the baseline and gold recaps. The last chapter summaries serve as gold references, the second-to-last summaries as baselines.
This script develops the thresholds for the NER model and for SBERT on the validation split.
This script provides functions to evaluate the generated recaps. We compute BLEU-1, ROUGE-L, and SPICE scores.
This script generates recaps for all examined approaches on the test split and evaluates them.
This script generates recaps with Gemma-2-2b-it. Please note that a GPU is needed to run the script!
This script contains the class NER which can generate extractive recaps via NER matching.
This script contains the class SentenceSimilarity which can generate extractive recaps via cosine similarity.
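The similarity-based extraction follows the same pattern as the NER variant, but keeps sentences above a cosine-similarity threshold. A minimal sketch with bag-of-words vectors; the actual SentenceSimilarity class uses SBERT embeddings, but the thresholding logic is analogous:

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for SBERT embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
           math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def keep_sentences(sentences, next_summary, threshold=0.1):
    # keep previous-summary sentences whose similarity to the
    # last chapter summary exceeds the threshold
    return [s for s in sentences if cosine_sim(s, next_summary) > threshold]
```

The 0.1 threshold mirrors the value used in the experiments; it was tuned on the validation split.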
Krysćiński, Wojciech, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, and Dragomir Radev. 2022. “BOOKSUM: A Collection of Datasets for Long-Form Narrative Summarization.” In Findings of the Association for Computational Linguistics: EMNLP 2022, edited by Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, 6536–58. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-emnlp.488.