obrizan/doi-parser

DOI Parser

A lightweight, automation-friendly tool for extracting DOIs from messy bibliographic references. The project uses an OpenAI Agent combined with a Crossref lookup tool to resolve dozens of references in parallel and produce a clean CSV containing:

  • original reference
  • DOI URL (if found)
  • Scopus URL (if found)
  • comment (a short note on the lookup result)
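
As an illustration of the Crossref step, a single lookup against the public Crossref REST API could be sketched roughly as follows. This is a minimal sketch, not the project's actual lookup tool: the function names build_crossref_url and lookup_doi are hypothetical, and it assumes the /works endpoint with its query.bibliographic and rows parameters. The top hit is not guaranteed to be the right work, which is presumably why an agent reviews the candidates.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

CROSSREF_WORKS = "https://api.crossref.org/works"


def build_crossref_url(reference: str, rows: int = 1) -> str:
    """Build a bibliographic search URL for a free-text reference (hypothetical helper)."""
    query = urllib.parse.urlencode({"query.bibliographic": reference, "rows": rows})
    return f"{CROSSREF_WORKS}?{query}"


def lookup_doi(reference: str) -> Optional[str]:
    """Return the DOI URL of the top Crossref hit, or None if there are no hits.

    Requires network access. The top hit may be a different work than the one
    cited, so the result should be checked against the original reference.
    """
    with urllib.request.urlopen(build_crossref_url(reference), timeout=30) as resp:
        payload = json.load(resp)
    items = payload.get("message", {}).get("items", [])
    if not items:
        return None
    return f"https://doi.org/{items[0]['DOI']}"
```

Calling lookup_doi with one line from references.txt would return a DOI URL in the same form as the second CSV column, or None when Crossref returns nothing.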

Installation

Install dependencies using uv:

uv sync

Configuration

Set up the environment variables:

  • SCOPUS_API_KEY
  • OPENAI_API_KEY

or use the example.env file.
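
Before a long run it can help to confirm that both keys are actually visible to the process. The check below is a minimal sketch and is not part of the project; the helper name missing_env_vars is hypothetical, and it only tests that the two variables named above are non-empty.

```python
import os

# The two variables named in the Configuration section.
REQUIRED_VARS = ("SCOPUS_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(environ=os.environ):
    """Return the names of required variables that are unset or empty (hypothetical helper)."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

An empty list from missing_env_vars() means both keys are set; otherwise the list names the ones still to be exported.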

Usage

Put text references into references.txt (one reference per line):

1. IEEE Standard for Integrated Circuit (IC) Open Library Architecture (OLA), in IEEE STD 1481-2009 , vol., no., pp.1-658, 11 March 2010.
2. Esmaieli, E.; Sedaghat, Y.; Peiravi, A. Fanout-Based Reliability Model for SER Estimation in Combinational Circuits, in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 72, no. 1, pp. 228-240, Jan. 2025.
3. Hwang, M.-E.; Jung, S.-O.; Roy, K. Slope Interconnect Effort: Gate-Interconnect Interdependent Delay Modeling for Early CMOS Circuit Simulation, in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 56, no. 7, pp. 1428-1441, July 2009.
...

Run the assistant:

python3 -m assistant

Check the output in results.csv:

"1. IEEE Standard for Integrated Circuit (IC) Open Library Architecture (OLA), in IEEE STD 1481-2009 , vol., no., pp.1-658, 11 March 2010.",https://doi.org/10.1109/ieeestd.2009.5430852,,DOI found for the IEEE Standard for Integrated Circuit (IC) Open Library Architecture (OLA).
"2. Esmaieli, E.; Sedaghat, Y.; Peiravi, A. Fanout-Based Reliability Model for SER Estimation in Combinational Circuits, in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 72, no. 1, pp. 228-240, Jan. 2025.",https://doi.org/10.1109/tcsi.2024.3458864,https://www.scopus.com/pages/publications/85204697423,The DOI for the article 'Fanout-Based Reliability Model for SER Estimation in Combinational Circuits' is 10.1109/tcsi.2024.3458864.
"3. Hwang, M.-E.; Jung, S.-O.; Roy, K. Slope Interconnect Effort: Gate-Interconnect Interdependent Delay Modeling for Early CMOS Circuit Simulation, in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 56, no. 7, pp. 1428-1441, July 2009.",https://doi.org/10.1109/tcsi.2008.2006217,https://www.scopus.com/pages/publications/67651156228,DOI found for the paper: Slope Interconnect Effort: Gate-Interconnect Interdependent Delay Modeling for Early CMOS Circuit Simulation.

The columns are:

  • original reference
  • DOI URL
  • Scopus URL
  • comment
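
To post-process the output, the four columns can be loaded with Python's standard csv module. The sketch below assumes results.csv has no header row and four fields per row, as in the sample output above; the field names in COLUMNS and the function read_results are hypothetical labels chosen for illustration.

```python
import csv

# Hypothetical field names for the four columns described above.
COLUMNS = ("reference", "doi_url", "scopus_url", "comment")


def read_results(lines):
    """Parse CSV lines into dicts keyed by COLUMNS; short rows are padded with ""."""
    rows = []
    for record in csv.reader(lines):
        if not record:
            continue  # skip blank lines
        padded = record + [""] * (len(COLUMNS) - len(record))
        rows.append(dict(zip(COLUMNS, padded)))
    return rows


# Usage (assumes results.csv is in the current directory):
# with open("results.csv", newline="", encoding="utf-8") as f:
#     rows = read_results(f)
#     without_scopus = [r["reference"] for r in rows if not r["scopus_url"]]
```

An empty doi_url or scopus_url field means that link was not found for that reference.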
