dadosfera/project-embrapii

Text-to-SQL-References

This repository maintains a curated list of references related to Text-to-SQL, including state-of-the-art work, other repositories, code samples, and innovative techniques.

Repository Structure

  • dados_datasus/: DataSUS ingestion, schema, and JDBC-based access (see its README for DB tunnel/test steps).
  • analises/: Exploratory data analysis notebooks.
  • assets/: Images and supporting assets.
  • paraphrase-benchmark/: Benchmark configs for paraphrase evaluation.
  • README.ipynb: Notebook version of this README.
  • ParaphraseEvaluator.py, generated_queries.py, sample.py: Supporting scripts.

Views

  • Nascimento, E.R., García, G., Izquierdo, Y.T. et al.
    LLM-Based Text-to-SQL for Real-World Databases.
    SN Computer Science, 6, 130 (2025).
    Paper | Summary

  • Nascimento, E., García, G., Feijó, L., Victorio, W., Izquierdo, Y., R. de Oliveira, A., Coelho, G., Lemos, M., Garcia, R., Leme, L., and Casanova, M.
    Text-to-SQL Meets the Real-World.
    In Proceedings of the 26th International Conference on Enterprise Information Systems (2024).
    Paper | Summary

  • Zeshun You, Jiebin Yao, Dong Cheng, Zhiwei Wen, Zhiliang Lu, and Xianyi Shen.
    V-SQL: A View-Based Two-Stage Text-to-SQL Framework.
    (2024).
    Paper | Summary

Surveys

  • Yuyu Luo, Guoliang Li, Ju Fan, Chengliang Chai, and Nan Tang.
    Natural Language to SQL: State of the Art and Open Problems.
    PVLDB, 18(12): 5466–5471, 2025.
    Paper | Summary

  • Ali Mohammadjafari, Anthony S. Maida, and Raju Gottumukkala.
    From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems.
    (2025).
    Paper | Summary

Data Lakes

  • Chen, Albert, et al.
    Text-to-SQL for Enterprise Data Analytics.
    arXiv preprint arXiv:2507.14372 (2025).
    Paper | Summary

AutoViz

  • Dibia, V.
    LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics Using Large Language Models.
    arXiv preprint arXiv:2303.02927 (2023).
    Paper | Summary

  • Zhang, R., & Elhamod, M.
    Data-to-Dashboard: Multi-Agent LLM Framework for Insightful Visualization in Enterprise Analytics.
    arXiv preprint arXiv:2505.23695 (2025).
    Paper | Summary

Others

  • Coelho, G.M.C. et al.
    Improving the Accuracy of Text-to-SQL Tools Based on Large Language Models for Real-World Relational Databases.
    In: Strauss, C., Amagasa, T., Manco, G., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. (2024).
    Paper | Summary

  • Catalina Dragusin, Katsiaryna Mirylenka, Christoph Miksovic Czasch, Michael Glass, Nahuel Defosse, Paolo Scotton, and Thomas Gschwind.
    Grounding LLMs for Database Exploration: Intent Scoping and Paraphrasing for Robust NL2SQL.
    VLDB 2025 Workshop.
    Paper | Summary

  • Haoyang Li, Jing Zhang, Hanbing Liu, Ju Fan, Xiaokang Zhang, Jun Zhu, Renjie Wei, Hongyan Pan, Cuiping Li, and Hong Chen.
    CodeS: Towards Building Open-source Language Models for Text-to-SQL.
    (2024).
    Paper | Summary

  • Cao, Zhenbiao, et al.
    RSL-SQL: Robust Schema Linking in Text-to-SQL Generation.
    arXiv preprint arXiv:2411.00073 (2024).
    Paper | Summary

  • Uber.
    QueryGPT – Natural Language to SQL Using Generative AI.
    (2024).
    Post | Summary

  • Biswal, Asim, et al.
    Text2SQL is Not Enough: Unifying AI and Databases with TAG.
    arXiv preprint arXiv:2408.14717 (2024).
    Paper | Summary

About

Project containing all artifacts and assets produced in the Embrapii project, a collaboration between UFMG, UFAM, and Dadosfera.
