RAG with PDF #37

@navr32

Hi Leon, very nice app.
I gave it a try locally in a conda env with Python 3.10.12.
Everything starts and runs OK, I think.
If I load llava and an image, it works: I can ask questions about the image.
But with PDFs the results are very bad.
I tried with the hover.pdf you provided for testing. I browsed to it; the PDF loads and shows in the left panel, so I assume it was ingested into the database correctly.
When I ask about this file in the chat, the PDF's filename in the left panel is reset. Why? This is odd, because with an image the file stays registered in the list.
The answer about the file's subject is also bad, and if I ask for precise information from the file, the retrieval is very poor.
I also tried a PDF containing a table of numbers with dates, and the problem is the same: the retrieval only sees a very small part of the dates and numbers.
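To put a number on "only a very small part", here is a rough sketch of how the retrieved text could be checked against values known to be in the table. The function name `coverage` and both of its arguments are hypothetical and not part of the app; it uses plain substring matching, so a short value can match inside a longer one.

```python
def coverage(expected_values, retrieved_chunks):
    """Return (fraction_found, missing_values) for a retrieval result.

    expected_values: strings known to be in the PDF table (hypothetical input).
    retrieved_chunks: text chunks the retriever returned (hypothetical input).
    """
    if not expected_values:
        return 1.0, []
    # A value counts as found if it appears verbatim in at least one chunk.
    missing = [
        value
        for value in expected_values
        if not any(value in chunk for chunk in retrieved_chunks)
    ]
    found = len(expected_values) - len(missing)
    return found / len(expected_values), missing


if __name__ == "__main__":
    dates = ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]
    chunks = ["2024-01-01 12.5\n2024-01-02 13.0"]
    print(coverage(dates, chunks))
```

A low fraction here would suggest that most of the table never reaches the model, rather than the model misreading it.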
Have you looked at Tika? I think it would be a better fit for a RAG system. Tika is able to index many, many file types.
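For illustration, a minimal sketch of pulling text out of a PDF with Tika before ingestion. This assumes the third-party `tika` Python package (`pip install tika`), which needs Java and runs a local Tika server; it is not how the app currently works. The `extract_text` helper and its `parse` parameter are hypothetical names, added only so the function can be tried without a Tika server.

```python
def extract_text(path, parse=None):
    """Return the plain text of a document, or "" if nothing was extracted.

    parse: optional callable taking a path and returning a dict with a
    "content" key (hypothetical hook). Defaults to tika's parser.from_file.
    """
    if parse is None:
        # Imported lazily so the module loads even without tika installed.
        from tika import parser
        parse = parser.from_file
    parsed = parse(path)
    # "content" can be None when no text could be extracted from the file.
    return (parsed.get("content") or "").strip()


if __name__ == "__main__":
    # Stand-in parser used here instead of a real Tika server.
    fake = lambda p: {"content": "  2024-01-01 12.5\n", "metadata": {}}
    print(extract_text("hover.pdf", parse=fake))
```

Comparing this output with what the app stores for the same PDF could show whether the missing dates and numbers are lost at extraction or later at retrieval.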
Thanks, have a nice day.
