
Implement NL2Code training for Hearthstone dataset #1

@rshin

Description

  • Finish token generation mechanism
  • Create encoder
    • Copying tokens
    • Attention
  • Create code to read data
  • Bahdanau attention (see the attention sketch after this list)
  • Generate vocabularies for source and target (see the vocabulary sketch after this list)
    Depending on the model, we need different vocabularies
    Look at OpenNMT
    Look at tensor2tensor
  • Blacklist some elements in Python grammar ("ctx" fields; see the ast_to_tree sketch after this list)
  • Add optimizers to the registry (see the registry sketch after this list)
  • Improve the registry to avoid config
  • Set up the model that connects everything together (enc2dec)
  • Figure out how to specify vocabulary, grammar, etc. to models
  • Figure out what to do when a type has only a single derivation:
    Nothing special is needed when unary closure is not applied
  • Masking of actions (see the masking sketch after this list)
    • At training time: no masking applies, since the gold action is grammatical by construction
  • Deal with the types of constants: Num -> object should become Num -> int and Num -> float
  • Collapse Name -> identifier -> str to Name -> str
  • Variational dropout
    This is complicated to do in PyTorch (see the LockedDropout sketch after this list)
  • Ensure that all loss components have positive sign
  • Adjust grammar based on empirical observations throughout the data
  • Implement unary closures (see the closure sketch after this list)
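
A minimal sketch of the count-and-threshold vocabulary construction for the vocabulary item above; the function name, the min_freq default, and the special tokens are assumptions, not anything fixed in this issue (OpenNMT and tensor2tensor both follow roughly this pattern):

```python
import collections

# Hypothetical helper; names and defaults are illustrative only.
def build_vocab(token_sequences, min_freq=2, specials=('<unk>', '<s>', '</s>')):
    counts = collections.Counter(
        token for sequence in token_sequences for token in sequence)
    tokens = list(specials) + sorted(
        token for token, count in counts.items() if count >= min_freq)
    return {token: index for index, token in enumerate(tokens)}

# Separate vocabularies for the source (NL) and target (code) sides:
# src_vocab = build_vocab(nl_token_sequences)
# tgt_vocab = build_vocab(code_token_sequences)
```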
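A minimal PyTorch sketch of additive (Bahdanau) attention for the attention item; the layer sizes and the mask convention are assumptions. The weights it produces over source tokens are also the natural starting point for the "Copying tokens" sub-item:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BahdanauAttention(nn.Module):
    """Additive attention (Bahdanau et al., 2015)."""

    def __init__(self, query_size, key_size, hidden_size):
        super().__init__()
        self.query_proj = nn.Linear(query_size, hidden_size, bias=False)
        self.key_proj = nn.Linear(key_size, hidden_size, bias=False)
        self.score = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, query, keys, mask=None):
        # query: (batch, query_size); keys: (batch, seq_len, key_size)
        scores = self.score(torch.tanh(
            self.query_proj(query).unsqueeze(1) + self.key_proj(keys)
        )).squeeze(-1)                       # (batch, seq_len)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))
        weights = F.softmax(scores, dim=-1)  # attention distribution
        context = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)
        return context, weights
```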
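On the "ctx" blacklist: Python's ast module attaches a Load/Store/Del context to every expression node, and that field is fully determined by the node's position, so it only bloats the grammar. A sketch of dropping such fields while converting an AST to a nested tree (ast_to_tree and FIELD_BLACKLIST are hypothetical names):

```python
import ast

FIELD_BLACKLIST = {'ctx'}  # Load/Store/Del carries no signal for generation

def ast_to_tree(node):
    """Recursively convert an ast node into nested dicts/lists,
    skipping blacklisted fields."""
    if isinstance(node, ast.AST):
        return {
            '_type': type(node).__name__,
            **{field: ast_to_tree(value)
               for field, value in ast.iter_fields(node)
               if field not in FIELD_BLACKLIST},
        }
    if isinstance(node, list):
        return [ast_to_tree(item) for item in node]
    return node  # leaf value: number, identifier string, etc.

# ast_to_tree(ast.parse('x = 1')) yields a tree with no 'ctx' entries.
```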
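One possible shape for the two registry items; the decorator-based design and the (kind, name) key are assumptions, not the repo's actual API:

```python
import torch

_REGISTRY = {}

def register(kind, name):
    """Decorator that files a class under (kind, name)."""
    def decorator(obj):
        _REGISTRY[kind, name] = obj
        return obj
    return decorator

def construct(kind, config, **kwargs):
    """Instantiate a registered class from a config dict
    like {'name': 'adam', 'lr': 1e-3}."""
    config = dict(config)
    cls = _REGISTRY[kind, config.pop('name')]
    return cls(**config, **kwargs)

register('optimizer', 'adam')(torch.optim.Adam)
register('optimizer', 'sgd')(torch.optim.SGD)

# optimizer = construct('optimizer', {'name': 'adam', 'lr': 1e-3},
#                       params=model.parameters())
```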
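For action masking at decoding time, the usual trick is to add -inf to the scores of grammar-invalid actions before the softmax; a sketch follows (the function name and shape conventions are hypothetical). At training time the gold action is fed in, so, as noted above, no masking applies:

```python
import torch
import torch.nn.functional as F

def masked_action_log_probs(logits, valid_action_ids):
    """logits: (num_actions,) raw scores over the full action inventory.
    valid_action_ids: indices of actions whose head matches the node
    type currently being expanded."""
    mask = torch.full_like(logits, float('-inf'))
    mask[valid_action_ids] = 0.0
    # Invalid actions keep -inf and get exactly zero probability.
    return F.log_softmax(logits + mask, dim=-1)
```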
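On why variational dropout is complicated in PyTorch: the fused nn.LSTM does not expose its recurrent connections, so the mask-per-sequence scheme of Gal & Ghahramani (2016) cannot be pushed inside it. A common partial workaround is a "locked" dropout applied between layers, sampling one mask per sequence and reusing it across time steps; a sketch assuming batch-first (batch, seq_len, hidden) inputs:

```python
import torch.nn as nn

class LockedDropout(nn.Module):
    """Dropout with a single mask shared across all time steps."""

    def __init__(self, p=0.5):
        super().__init__()
        self.p = p

    def forward(self, x):
        # x: (batch, seq_len, hidden)
        if not self.training or self.p == 0.0:
            return x
        # One Bernoulli mask per example, broadcast over the time dimension.
        mask = x.new_empty(x.size(0), 1, x.size(2)).bernoulli_(1 - self.p)
        return x * mask / (1 - self.p)
```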
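For the unary closure item, a sketch of splicing out chains of single-derivation nodes so that, e.g., Name -> identifier -> str becomes Name -> str. The tree encoding (nested dicts with a '_type' key, as in the ast_to_tree sketch above) and the single_child_types argument, the set of types known to have exactly one derivation, are assumptions:

```python
def apply_unary_closure(tree, single_child_types):
    """Collapse nodes whose type admits exactly one derivation and that
    hold exactly one field, splicing the child into the parent's slot."""
    while (isinstance(tree, dict) and tree['_type'] in single_child_types
           and len(tree) == 2):  # '_type' plus exactly one field
        (tree,) = (value for key, value in tree.items() if key != '_type')
    if isinstance(tree, dict):
        return {key: value if key == '_type'
                else apply_unary_closure(value, single_child_types)
                for key, value in tree.items()}
    if isinstance(tree, list):
        return [apply_unary_closure(item, single_child_types) for item in tree]
    return tree
```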

Others

  • Save training progress more permanently
  • Batching + constructing training instances in separate processes
  • Introduce abstract base classes (see the sketch after this list)
    • preproc
    • model
    • encoder
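
A possible shape for the first of these base classes; the method names are guesses at what a preproc interface might need, not a spec (the model and encoder ABCs would follow the same pattern):

```python
import abc

class AbstractPreproc(abc.ABC):
    """Hypothetical interface for dataset preprocessing."""

    @abc.abstractmethod
    def add_item(self, item, section):
        """Validate one raw example and accumulate it under `section`
        (e.g. 'train' or 'val')."""

    @abc.abstractmethod
    def save(self):
        """Write vocabularies and preprocessed items to disk."""

    @abc.abstractmethod
    def load(self):
        """Read back the artifacts produced by save()."""
```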
