KING

KING (Kinship-based Inference for GWAS) is a relationship inference tool that estimates kinship coefficients for all pairs of samples. Unrelated pairs can be precisely separated from close relatives with no false positives, with accuracy up to 3rd- or 4th-degree (depending on whether the data come from an array or WGS) for --related and --ibdseg analyses, and up to 2nd-degree for --kinship analysis.
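As a rough illustration of how a kinship coefficient relates to a degree of relatedness, the sketch below applies the inference ranges given in the KING manual. The function name `infer_degree` is hypothetical and is not part of KING or of this workflow.

```python
def infer_degree(kinship: float) -> str:
    """Map a kinship coefficient to a relationship label.

    Cutoffs follow the ranges listed in the KING manual:
    > 0.354 duplicate/MZ twin, 0.177-0.354 1st-degree,
    0.0884-0.177 2nd-degree, 0.0442-0.0884 3rd-degree.
    """
    if kinship > 0.354:
        return "duplicate/MZ twin"
    if kinship > 0.177:
        return "1st-degree"
    if kinship > 0.0884:
        return "2nd-degree"
    if kinship > 0.0442:
        return "3rd-degree"
    return "unrelated or more distant"


# Parent-offspring and full-sibling pairs are expected near 0.25.
print(infer_degree(0.25))
```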

The KING orchestration workflow estimates kinship coefficients from VCF files. At least two samples must be used as input to the workflow. Samples for analysis can either be in multiple single-sample VCFs, a single joint VCF, or multiple joint VCFs that do not have overlapping samples. There are four options for running KING: "ibdseg", "kinship", "related", and "duplicate". This workflow will run with the "related" option by default. Each option uses a different algorithm or specifications to estimate kinship. The output file types vary for each flag that can be run with KING. Refer to the original KING manual for more information on the differences between the options for running KING.

Note: The "ibdseg" option will fail to produce an output if no IBD segments are found. It can also fail if there are fewer than 10 samples in the analysis, in which case the "kinship" option is better suited.
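The guidance above could be applied before submitting a run. The following is only a sketch: `choose_run_type` is a hypothetical helper, and the workflow itself is not known to perform this fallback automatically.

```python
VALID_RUN_TYPES = {"ibdseg", "kinship", "related", "duplicate"}


def choose_run_type(num_samples: int, requested: str = "related") -> str:
    """Return a run type that suits the sample count.

    Hypothetical pre-flight check: falls back from "ibdseg" to "kinship"
    when fewer than 10 samples are available, as the note recommends.
    """
    if requested not in VALID_RUN_TYPES:
        raise ValueError(f"unknown run type: {requested!r}")
    if num_samples < 2:
        raise ValueError("at least two samples are required")
    if requested == "ibdseg" and num_samples < 10:
        return "kinship"
    return requested


print(choose_run_type(6, "ibdseg"))
```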

KING Orchestration Input Parameters

A few example JSON input files for running KING are provided in the examples folder of this repo. The inputs within the JSON files are dummy paths and are not meant to be used as is. Descriptions of each input are outlined below:

Type	Name	Req'd	Description	Default Value
Array[File]	input_vcfs	Yes	VCFs for identifying related individuals; VCFs within the array must not have overlapping samples and should be gzipped
Array[File]	input_vcfs_idx	Yes	Index files for input_vcfs
File	input_bed	No	BED file for filtering joint VCFs; Use a BED file to increase efficiency of merging dataset VCFs
String	output_basename	Yes	Basename for file outputs
String	run_type	No	Type of flag to be used for running KING; Either "ibdseg", "kinship", "related", or "duplicate"	"related"
Int	degree	No	The maximum degree of relatedness to include in KING output	3
Boolean	missing_to_ref	No	If `true`, all missing variant calls will be converted to reference genotypes (0/0) when merging VCFs	false
String	bcftools_docker_image	No	Docker image for bcftools	"us-central1-docker.pkg.dev/mgb-lmm-gcp-infrast-1651079146/mgbpmbiofx/bcftools:1.17"
String	king_docker_iamge	No	Docker image with KING tools	"uwgac/topmed-master@sha256:0bb7f98d6b9182d4e4a6b82c98c04a244d766707875ddfd8a48005a9f5c5481e"
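For orientation, an inputs JSON could be assembled along these lines. This is a sketch under assumptions: the workflow name `KingWorkflow` and all bucket paths are placeholders, so substitute the workflow name declared in the WDL and real file locations. Only the input names come from the table above.

```python
import json

# Hypothetical workflow name; WDL input keys take the form "<workflow>.<input>".
WORKFLOW = "KingWorkflow"

inputs = {
    # Placeholder paths; the two arrays must line up element by element.
    f"{WORKFLOW}.input_vcfs": [
        "gs://my-bucket/cohort_a.vcf.gz",
        "gs://my-bucket/cohort_b.vcf.gz",
    ],
    f"{WORKFLOW}.input_vcfs_idx": [
        "gs://my-bucket/cohort_a.vcf.gz.tbi",
        "gs://my-bucket/cohort_b.vcf.gz.tbi",
    ],
    f"{WORKFLOW}.output_basename": "my_cohort",
    # Optional inputs, shown here with their documented defaults.
    f"{WORKFLOW}.run_type": "related",
    f"{WORKFLOW}.degree": 3,
    f"{WORKFLOW}.missing_to_ref": False,
}

print(json.dumps(inputs, indent=2))
```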

KING Orchestration Output Parameters

Type	Name	When	Description
File	kin_output	When either the --kinship or --related flag is used	.kin file that contains kinship coefficients of individuals
File	kin0_output	When either the --kinship or --related flag is used	.kin0 file that contains kinship coefficients for between-family pairs
File	seg_output	When the --ibdseg flag is used	.seg file that contains kinship coefficients and inferred relationships of samples
File	con_output	When the --duplicate flag is used	.con file that contains only duplicate individuals
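The table above can be summarised as a lookup from run type to the outputs a caller should expect. The names `EXPECTED_OUTPUTS` and `expected_outputs` are hypothetical and exist only for this illustration.

```python
# Workflow outputs expected for each run type, per the table above.
EXPECTED_OUTPUTS = {
    "kinship": ["kin_output", "kin0_output"],
    "related": ["kin_output", "kin0_output"],
    "ibdseg": ["seg_output"],
    "duplicate": ["con_output"],
}


def expected_outputs(run_type: str) -> list:
    """Return the output names documented for the given run type."""
    try:
        return EXPECTED_OUTPUTS[run_type]
    except KeyError:
        raise ValueError(f"unknown run type: {run_type!r}") from None


print(expected_outputs("related"))
```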

KING WDL Tasks

The WDL tasks used by the KING Orchestration workflow are contained within the KingTasks.wdl document within this repo. This includes tasks that manipulate VCFs and tasks that will run KING. Below are the inputs and outputs for each task:

FilterVcfTask

This task will filter the input VCF file to contain only regions within the input BED file. It will then count the number of SNPs and samples in the resulting VCF. If no input BED is given, the task will simply count the number of SNPs and samples in the input VCF.
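The behaviour described above might correspond to bcftools calls like the ones built below. This is a sketch, not the task's actual command block: the helper names and the output file suffix are assumptions, and the real task may count SNPs differently.

```python
import shlex
from typing import List, Optional


def build_filter_command(
    input_vcf: str, output_basename: str, input_bed: Optional[str] = None
) -> Optional[List[str]]:
    """Build a `bcftools view` call restricted to BED regions.

    Returns None when no BED file is given, since no filtering is needed.
    """
    if input_bed is None:
        return None
    return [
        "bcftools", "view",
        "-R", input_bed,                      # keep only regions in the BED file
        "-Oz", "-o", f"{output_basename}.vcf.gz",
        input_vcf,
    ]


def build_count_commands(vcf: str) -> dict:
    """Shell pipelines that count samples and SNP records in a VCF."""
    quoted = shlex.quote(vcf)
    return {
        "num_samples": f"bcftools query -l {quoted} | wc -l",
        "num_snps": f"bcftools view -H -v snps {quoted} | wc -l",
    }


print(build_filter_command("in.vcf.gz", "out", "regions.bed"))
```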

Input Parameters

Type	Name	Req'd	Description	Default Value
File	input_vcf	Yes	VCF or VCF gz to filter
File	input_vcf_idx	Yes	Index file corresponding to input VCF
File	input_bed	No	BED file containing regions for filtering
String	output_basename	No	Basename for output filtered VCF	Defaults to basename of input VCF
Int	addldisk	No	Additional disk space to add to the final runtime disk space in GB	10
Int	preemptible	No	Number of retries for VM	1

Output Parameters

Type	Name	When	Description
File	output_vcf_gz	If an input BED is supplied	VCF filtered to regions in the input BED file
Int	num_snps	Always	Number of SNPs in the resulting VCF (the input VCF if no BED file is given)
Int	num_samples	Always	Number of samples in the output VCF

MergeVcfsTask

This task will merge all the input VCF files into a single VCF. The input VCF files must not have any overlapping samples. If a BED file is supplied, it will simultaneously filter the VCFs to the regions within the BED file. When merging the VCFs, there is an option to convert missing variant calls for any samples to reference calls. Finally, the task will count the number of SNPs and the number of samples in the resulting merged VCF.
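A minimal sketch of the merge step follows, assuming it is built on `bcftools merge`. The helper names and output suffix are hypothetical. The overlap check is an illustration of the non-overlapping-samples requirement, not something the task is known to run.

```python
from collections import Counter
from typing import List, Optional


def find_overlapping_samples(sample_lists: List[List[str]]) -> List[str]:
    """Return sample IDs that appear in more than one VCF."""
    counts = Counter(s for samples in sample_lists for s in set(samples))
    return sorted(s for s, n in counts.items() if n > 1)


def build_merge_command(
    input_vcfs: List[str],
    output_basename: str,
    input_bed: Optional[str] = None,
    missing_to_ref: bool = False,
) -> List[str]:
    """Build a `bcftools merge` call for VCFs with disjoint samples."""
    cmd = ["bcftools", "merge"]
    if missing_to_ref:
        cmd.append("--missing-to-ref")   # treat missing genotypes as 0/0
    if input_bed is not None:
        cmd += ["-R", input_bed]         # restrict to BED regions while merging
    cmd += ["-Oz", "-o", f"{output_basename}.vcf.gz", *input_vcfs]
    return cmd


print(find_overlapping_samples([["S1", "S2"], ["S3", "S1"]]))
print(build_merge_command(["a.vcf.gz", "b.vcf.gz"], "merged", missing_to_ref=True))
```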

Input Parameters

Type	Name	Req'd	Description	Default Value
Array[File]	input_vcfs	Yes	VCFs with non-overlapping samples to merge into one VCF
Array[File]	input_vcfs_idx	No	Index files corresponding to input VCFs; must be in the same order as the input VCF array
File	input_bed	No	BED file with regions for filtering
Boolean	missing_to_ref	No	If `true`, all missing variant calls will be converted to reference genotypes (0/0)	false
String	output_basename	Yes	Basename for output files
String	docker_image	Yes	Docker image for bcftools
Int	addldisk	No	Additional disk space to add to the final runtime disk space in GB	10
Int	mem_size	No	Memory for runtime	Defaults to 4; If the size of input VCFs is greater than 10, defaults to 8
Int	preemptible	No	Number of retries for VM	2

Output Parameters

Type	Name	When	Description
File	merged_vcf	Always	Merged VCF of all input VCFs, filtered to regions in the input BED file if given
Int	num_snps	Always	Number of SNPs in the merged VCF
Int	num_samples	Always	Number of samples in the output VCF

Vcf2BedTask

This task will convert a VCF to PLINK bed, bim, and fam files for use with KING.

Input Parameters

Type	Name	Req'd	Description	Default Value
File	input_vcf	Yes	VCF to convert to PLINK BED
String	output_basename	No	Basename for output files	Defaults to basename of input VCF
String	docker_image	Yes	Docker image that contains PLINK
Int	addldisk	No	Additional disk space to add to the final runtime disk space in GB	10
Int	plink_mem	No	Memory to use for PLINK in GB; Actual runtime memory will be twice the size of the input PLINK memory	4
Int	preemptible	No	Number of retries for VM	1
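The conversion and the memory rule from the table could look roughly like this. It is a sketch under assumptions: the helper names are hypothetical, the PLINK version used by the task is not stated, and the GB-to-MB conversion factor of 1024 is a guess. `--vcf`, `--make-bed`, `--out`, and `--memory` (in MB) are standard PLINK options.

```python
from typing import List


def build_plink_command(input_vcf: str, output_basename: str, plink_mem_gb: int = 4) -> List[str]:
    """Build a PLINK call that writes <basename>.bed/.bim/.fam from a VCF."""
    return [
        "plink",
        "--vcf", input_vcf,
        "--make-bed",
        "--out", output_basename,
        "--memory", str(plink_mem_gb * 1024),  # PLINK expects MB; factor is assumed
    ]


def runtime_memory_gb(plink_mem_gb: int = 4) -> int:
    """Runtime memory is twice the PLINK memory, per the plink_mem description."""
    return 2 * plink_mem_gb


print(build_plink_command("merged.vcf.gz", "merged"))
print(runtime_memory_gb())
```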

Output Parameters

Type	Name	When	Description
File	bed_file	Always	PLINK BED from VCF
File	bim_file	Always	BIM file corresponding to output PLINK BED
File	fam_file	Always	FAM file corresponding to output PLINK BED

RunKingTask

This task will run KING, a kinship estimation tool. KING has several flags that run different relationship inferences, each using a different algorithm or specification to estimate kinship. The output file types vary with the flag used. Refer to the KING manual for a further description of each flag.
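For illustration, a KING invocation for each flag might be assembled as below. The helper `build_king_command` is hypothetical, and omitting `--degree` for the "duplicate" flag is a choice made for this sketch rather than documented task behaviour. `-b`, `--degree`, `--prefix`, and `--cpus` are KING command-line options.

```python
from typing import List

VALID_FLAGS = {"ibdseg", "related", "kinship", "duplicate"}


def build_king_command(
    bed_file: str, flag: str, output_basename: str, degree: int = 3, cpu: int = 2
) -> List[str]:
    """Build a KING call for one relationship-inference flag."""
    if flag not in VALID_FLAGS:
        raise ValueError(f"unknown KING flag: {flag!r}")
    cmd = ["king", "-b", bed_file, f"--{flag}"]
    if flag != "duplicate":
        # Assumption of this sketch: degree only matters for kinship inference.
        cmd += ["--degree", str(degree)]
    cmd += ["--prefix", output_basename, "--cpus", str(cpu)]
    return cmd


print(" ".join(build_king_command("merged.bed", "related", "my_cohort")))
```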

Input Parameters

Type	Name	Req'd	Description	Default Value
File	bed_file	Yes	PLINK BED file from converting input VCF to BED
File	fam_file	Yes	PLINK FAM file corresponding to input BED
File	bim_file	Yes	PLINK BIM file corresponding to input BED
Int	degree	No	Largest degree of relatedness allowed for KING relationships	3
String	flag	Yes	Flag to run a specified KING algorithm; either "ibdseg", "related", "kinship" or "duplicate"
String	output_basename	Yes	Basename for output files
String	docker_image	Yes	Docker image for running KING
Int	addldisk	No	Additional disk space to add to the final runtime disk space in GB	10
Int	cpu	No	CPU for runtime	2
Int	mem_size	No	Memory for runtime	4
Int	preemptible	No	Number of retries for VM	2

Output Parameters

Type	Name	When	Description
File	kin_output	When either the --kinship or --related flag is used	.kin file that contains kinship coefficients of individuals
File	kin0_output	When either the --kinship or --related flag is used	.kin0 file that contains kinship coefficients for between-family pairs
File	seg_output	When the --ibdseg flag is used	.seg file that contains kinship coefficients and inferred relationships of samples
File	con_output	When the --duplicate flag is used	.con file that contains only duplicate individuals

References

Original KING paper: Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM (2010) Robust relationship inference in genome-wide association studies. Bioinformatics 26(22):2867-2873

KING tutorial and manual: https://www.kingrelatedness.com/manual.shtml
