Feature Request: Return Structured Resume Analysis Results #28

@avanish-garg

Issue Title

Feature Request: Return Structured Resume Analysis Results


Issue Description

Problem

Currently, the Solo AI server returns a generic free-text response in a single output field when analyzing resumes. For example:

{
  "output": "I'm a helpful AI assistant named SmolLM, designed to help users analyze and interpret resumes..."
}

This format is not suitable for applications that require structured data, such as extracting specific fields (e.g., skills, education, experience) and validating the resume for anomalies.

Proposed Solution

Update the Solo AI server to return structured resume analysis results in the following format:

{
  "isValid": true,
  "metadata": {
    "skills": ["JavaScript", "React", "Node.js"],
    "education": [
      { "degree": "Bachelor of Science", "institution": "XYZ University", "year": 2020 }
    ],
    "experience": [
      { "title": "Software Engineer", "company": "ABC Corp", "duration": "2 years" }
    ]
  }
}
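A minimal sketch of how this response shape could be modeled on the server side, assuming Python. The class names (Education, Experience, Metadata, ResumeAnalysis) are hypothetical and not part of Solo AI; only the field names come from the format above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Education:
    degree: str
    institution: str
    year: int


@dataclass
class Experience:
    title: str
    company: str
    duration: str


@dataclass
class Metadata:
    skills: List[str] = field(default_factory=list)
    education: List[Education] = field(default_factory=list)
    experience: List[Experience] = field(default_factory=list)


@dataclass
class ResumeAnalysis:
    # Field name kept in camelCase to match the proposed JSON key.
    isValid: bool
    metadata: Metadata

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists of them.
        return json.dumps(asdict(self))


example = ResumeAnalysis(
    isValid=True,
    metadata=Metadata(
        skills=["JavaScript", "React", "Node.js"],
        education=[Education("Bachelor of Science", "XYZ University", 2020)],
        experience=[Experience("Software Engineer", "ABC Corp", "2 years")],
    ),
)
print(example.to_json())
```

Typed containers like these make it harder to return a response that is missing a key, since every field must be supplied or defaulted when the object is built.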

Key Features

  1. Structured Metadata:

    • Extract and return specific fields from the resume, such as:
      • Skills: A list of skills mentioned in the resume.
      • Education: A list of educational qualifications (degree, institution, year).
      • Experience: A list of work experience (title, company, duration).
  2. Validation:

    • Return an isValid flag to indicate whether the resume is valid (e.g., no anomalies or inconsistencies detected).
  3. Customizable Analysis:

    • Allow users to specify the type of analysis they need (e.g., skills extraction, education extraction, anomaly detection).
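One way the customizable analysis could work is to let the caller name the metadata sections it wants and return only those. The sketch below assumes a hypothetical list of analysis names that mirror the metadata keys; the issue does not define how the selection would be passed in the request.

```python
from typing import Any, Dict, Iterable

# Hypothetical analysis names, chosen to match the proposed metadata keys.
SUPPORTED_ANALYSES = {"skills", "education", "experience"}


def select_analyses(full_result: Dict[str, Any], analyses: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of full_result that keeps only the requested metadata sections."""
    requested = list(analyses)
    unknown = sorted(set(requested) - SUPPORTED_ANALYSES)
    if unknown:
        raise ValueError(f"unsupported analyses: {unknown}")
    metadata = {name: full_result["metadata"][name] for name in requested}
    return {"isValid": full_result["isValid"], "metadata": metadata}


full = {
    "isValid": True,
    "metadata": {
        "skills": ["JavaScript", "React", "Node.js"],
        "education": [
            {"degree": "Bachelor of Science", "institution": "XYZ University", "year": 2020}
        ],
        "experience": [
            {"title": "Software Engineer", "company": "ABC Corp", "duration": "2 years"}
        ],
    },
}
skills_only = select_analyses(full, ["skills"])
```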

Example Use Case

In a decentralized resume verification system (e.g., TrustTag), the backend sends a resume to Solo AI for analysis. Solo AI processes the resume and returns structured metadata, which is then stored in a database and on the blockchain. The structured data is used for:

  • Resume validation.
  • Skill matching for job applications.
  • Generating insights for employers.

Implementation Details

  1. Resume Parsing:

    • Use a text-extraction library (e.g., PyPDF2 for PDF, python-docx for DOCX) or a dedicated resume parser (e.g., pyresparser) to extract text from resumes.
    • Use NLP techniques to identify and extract structured data (e.g., skills, education, experience).
  2. Anomaly Detection:

    • Implement logic to detect inconsistencies or anomalies in the resume (e.g., mismatched dates, fake institutions).
  3. API Response:

    • Update the /predict endpoint to return structured data in the proposed format.
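The anomaly detection step could start as a handful of rule-based checks over the extracted metadata, with isValid derived from whether any rule fires. This is only a sketch: the function names and the specific rules (no graduation year in the future, no empty required fields) are assumptions for illustration, not behavior Solo AI defines.

```python
from datetime import date
from typing import Any, Dict, List, Optional


def find_anomalies(metadata: Dict[str, Any], today: Optional[date] = None) -> List[str]:
    """Return human-readable descriptions of the inconsistencies found."""
    current_year = (today or date.today()).year
    problems: List[str] = []

    for entry in metadata.get("education", []):
        year = entry.get("year")
        if not isinstance(year, int):
            problems.append("education entry is missing a numeric year")
        elif year > current_year:
            # Assumed rule: a completed degree cannot be dated in the future.
            problems.append(f"education year {year} is in the future")
        if not entry.get("institution"):
            problems.append("education entry is missing an institution")

    for entry in metadata.get("experience", []):
        if not entry.get("title") or not entry.get("company"):
            problems.append("experience entry is missing a title or company")

    return problems


def is_valid(metadata: Dict[str, Any], today: Optional[date] = None) -> bool:
    # The resume counts as valid only when no rule reported a problem.
    return not find_anomalies(metadata, today)
```

Checks such as detecting fake institutions would need an external reference list and are left out here.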

Example API Request

{
  "file_content": "<hex-encoded resume file content>",
  "prompt": "analyze_resume"
}
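On the server side, the hex-encoded file_content would need to be decoded back to bytes before parsing. A minimal sketch, assuming the format is detected from the file's leading bytes (PDF files begin with %PDF-, and DOCX files are ZIP archives that begin with PK\x03\x04); the function name decode_resume is hypothetical.

```python
from typing import Tuple


def decode_resume(file_content_hex: str) -> Tuple[bytes, str]:
    """Decode the hex payload and guess the resume format from its magic bytes."""
    try:
        raw = bytes.fromhex(file_content_hex)
    except ValueError as exc:
        raise ValueError("file_content is not valid hex") from exc

    if raw.startswith(b"%PDF-"):
        return raw, "pdf"
    if raw.startswith(b"PK\x03\x04"):
        # Any ZIP archive starts this way; a real check would also inspect
        # the archive contents before treating the file as DOCX.
        return raw, "docx"
    raise ValueError("unsupported resume format")
```

A client would produce the payload with the inverse call, for example open(path, "rb").read().hex().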

Example API Response

{
  "isValid": true,
  "metadata": {
    "skills": ["JavaScript", "React", "Node.js"],
    "education": [
      { "degree": "Bachelor of Science", "institution": "XYZ University", "year": 2020 }
    ],
    "experience": [
      { "title": "Software Engineer", "company": "ABC Corp", "duration": "2 years" }
    ]
  }
}

Additional Context

  • Current Behavior: Solo AI returns a generic free-text response (output) that applications cannot consume without additional parsing.
  • Expected Behavior: Solo AI should return structured data that can be directly used by applications for further processing (e.g., database storage, blockchain integration).

Why This is Important

Structured resume analysis results are essential for applications that require:

  • Automated resume validation.
  • Skill matching for job applications.
  • Generating insights for employers.
  • Storing verified resume data on the blockchain.

Without structured data, developers have to manually parse and process the generic response, which is error-prone and inefficient.


Proposed Changes

  1. Update the Solo AI server to extract structured data from resumes.
  2. Add an isValid flag to indicate whether the resume is valid.
  3. Update the /predict endpoint to return structured data in the proposed format.

Acceptance Criteria

  • The /predict endpoint should return structured resume analysis results in the proposed format.
  • The response should include:
    • An isValid flag.
    • Structured metadata (skills, education, experience).
  • The implementation should support common resume formats (PDF, DOCX).
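The response-shape criteria above could be checked in a test with a small validator like the following. It is a sketch under the assumption that the response has already been parsed from JSON into a dict; check_response_shape is a hypothetical helper, not an existing Solo AI function.

```python
from typing import Any, Dict, List

# Keys each entry must carry, taken from the proposed format.
REQUIRED_ENTRY_KEYS = {
    "education": {"degree", "institution", "year"},
    "experience": {"title", "company", "duration"},
}


def check_response_shape(response: Dict[str, Any]) -> List[str]:
    """Return a list of shape violations; an empty list means the response conforms."""
    errors: List[str] = []

    if not isinstance(response.get("isValid"), bool):
        errors.append("isValid must be a boolean")

    metadata = response.get("metadata")
    if not isinstance(metadata, dict):
        errors.append("metadata must be an object")
        return errors

    skills = metadata.get("skills")
    if not isinstance(skills, list) or not all(isinstance(s, str) for s in skills):
        errors.append("metadata.skills must be a list of strings")

    for section, required in REQUIRED_ENTRY_KEYS.items():
        entries = metadata.get(section)
        if not isinstance(entries, list):
            errors.append(f"metadata.{section} must be a list")
            continue
        for index, entry in enumerate(entries):
            if not isinstance(entry, dict):
                errors.append(f"metadata.{section}[{index}] must be an object")
                continue
            missing = sorted(required - set(entry))
            if missing:
                errors.append(f"metadata.{section}[{index}] is missing {missing}")

    return errors
```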

Additional Notes

  • If this feature is implemented, it would greatly enhance the usability of Solo AI for applications like TrustTag.
  • I’m happy to provide more details or assist with testing if needed.

Labels

  • feature-request
  • enhancement
  • resume-analysis
