Reads files stored on a remote server using SFTP.
embulk-input-sftp v0.3.0+ requires Embulk v0.9.12+
- Plugin type: file input
- Resume supported: yes
- Cleanup supported: yes
- host: (string, required)
- port: (string, default: 22)
- user: (string, required)
- password: (string, default: null)
- secret_key_file: (string, default: null). OpenSSH format is required.
- secret_key_passphrase: (string, default: "")
- user_directory_is_root: (boolean, default: true)
- timeout: SFTP connection timeout in seconds (integer, default: 600)
- path_prefix: prefix of the paths of the files to read (string, required)
- incremental: enables incremental loading (boolean, optional, default: true). If incremental loading is enabled, the config diff for the next execution will include the last_path parameter so that the next execution skips files before that path. Otherwise, last_path will not be included.
- path_match_pattern: regexp to match file paths. If a file path doesn't match this pattern, the file will be skipped (regexp string, optional)
- total_file_count_limit: maximum number of files to read (integer, optional)
- min_task_size (experimental): minimum size of a task. If this is larger than 0, one task includes multiple input files. This is useful when too many tasks degrade the performance of output or executor plugins. (integer, optional)
- proxy:
  - type: (string(http | socks | stream), required, default: null)
    - http: use HTTP Proxy
    - socks: use SOCKS Proxy
    - stream: Connects to the SFTP server through a remote host reached by SSH
  - host: (string, required)
  - port: (int, default: 22)
  - user: (string, optional)
  - password: (string, optional, default: null)
  - command: (string, optional)

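The options listed above include password authentication, a file count limit, and a SOCKS proxy, none of which appear in the examples. A minimal sketch combining them might look like the following. The host names, credentials, path prefix, and numeric values are placeholders chosen for illustration and are assumptions, not values taken from this plugin.

```yaml
in:
  type: sftp
  host: sftp.example.com        # hypothetical host
  port: 22
  user: embulk
  password: secret_pass         # password authentication instead of secret_key_file
  path_prefix: /data/sftp/logs- # hypothetical prefix
  path_match_pattern: \.csv$    # skip files whose path does not end in .csv
  incremental: true             # the default, shown here for clarity
  total_file_count_limit: 100   # arbitrary illustrative limit
  proxy:
    type: socks
    host: proxy.example.com     # hypothetical proxy host
    port: 1080                  # set explicitly, since the documented default is 22
```

With incremental left enabled, the config diff produced by a run like this would carry last_path, so a later execution skips files before that path.
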
in:
  type: sftp
  host: 127.0.0.1
  port: 22
  user: embulk
  secret_key_file: /Users/embulk/.ssh/id_rsa
  secret_key_passphrase: secret_pass
  user_directory_is_root: false
  timeout: 600
  path_prefix: /data/sftp

To filter files using regexp:
in:
  type: sftp
  path_prefix: logs/csv-
  ...
  path_match_pattern: \.csv$   # a file will be skipped if its path doesn't match with this pattern
  ## some examples of regexp:
  #path_match_pattern: /archive/         # match files in .../archive/... directory
  #path_match_pattern: /data1/|/data2/   # match files in .../data1/... or .../data2/... directory
  #path_match_pattern: \.csv$|\.csv\.gz$    # match files whose suffix is .csv or .csv.gz

With proxy:
in:
  type: sftp
  host: 127.0.0.1
  port: 22
  user: embulk
  secret_key_file: /Users/embulk/.ssh/id_rsa
  secret_key_passphrase: secret_pass
  user_directory_is_root: false
  timeout: 600
  path_prefix: /data/sftp
  proxy:
    type: http
    host: proxy_host
    port: 8080
    user: proxy_user
    password: proxy_secret_pass
    command:

in:
  type: sftp
  host: 127.0.0.1
  port: 22
  user: embulk
  secret_key_file: /Users/embulk/.ssh/id_rsa
  secret_key_passphrase: secret_pass
  user_directory_is_root: false
  timeout: 600
  path_prefix: /data/sftp

Please set the path of secret_key_file as follows.
in:
  type: sftp
  ...
  secret_key_file: /path/to/id_rsa
  ...

You can also embed the contents of secret_key_file in config.yml.
in:
  type: sftp
  ...
  secret_key_file:
    content: |
      -----BEGIN RSA PRIVATE KEY-----
      ABCDEFG...
      HIJKLMN...
      OPQRSTU...
      -----END RSA PRIVATE KEY-----
  ...

$ ./gradlew gem  # -t to watch change of files and rebuild continuously
$ ./gradlew bintrayUpload # release embulk-input-sftp to Bintray maven repo
$ ./gradlew test  # -t to watch change of files and rebuild continuously