Annotating VCF/BCF files
Set the ID column to “.” and remove the INFO/DP and FORMAT/DP annotations
bcftools annotate -x ID,INFO/DP,FORMAT/DP file.vcf.gz
Remove all INFO fields and all FORMAT fields except for GT and PL
bcftools annotate -x INFO,^FORMAT/GT,FORMAT/PL file.vcf
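To see what a removal leaves behind, one option is to run it on a small file and look at the remaining columns. The sketch below is illustrative only: the file name `demo.vcf`, its record and the sample name `smpl1` are made up for this example, and the bcftools step is skipped when bcftools is not on the PATH.

```shell
# Build a tiny, hypothetical VCF with INFO/DP and FORMAT/GT, PL and DP
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=1,length=1000>\n'
  printf '##INFO=<ID=DP,Number=1,Type=Integer,Description="Read depth">\n'
  printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
  printf '##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Genotype likelihoods">\n'
  printf '##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsmpl1\n'
  printf '1\t100\trs1\tA\tC\t.\t.\tDP=10\tGT:PL:DP\t0/1:10,0,20:10\n'
} > demo.vcf

if command -v bcftools >/dev/null 2>&1; then
  # Drop all INFO fields and every FORMAT field except GT and PL,
  # then print the column header and data line for inspection
  bcftools annotate -x INFO,^FORMAT/GT,FORMAT/PL demo.vcf | grep -v '^##'
else
  echo "bcftools not found; skipping the annotate step" >&2
fi
```

Comparing the printed INFO and FORMAT columns with the input record shows which fields were dropped.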
# rename INFO tag `OldName` to `NewName`
bcftools annotate -c INFO/NewName:=INFO/OldName file.vcf

# rename FILTER `OldName` to `NewName`
bcftools annotate -c FILTER/NewName:=FILTER/OldName file.vcf
Populate the columns ID, QUAL and the INFO/TAG annotation
# do not replace TAG if already present
bcftools annotate -a src.bcf -c ID,QUAL,+TAG dst.bcf

# overwrite existing TAG annotations
bcftools annotate -a src.bcf -c ID,QUAL,TAG dst.bcf
Carry over all INFO and FORMAT annotations except FORMAT/GT
bcftools annotate -a src.bcf -c INFO,^FORMAT/GT dst.bcf
Carry over FILTER column
bcftools annotate -a src.bcf -c FILTER dst.bcf
The following commands transfer values from a tab-delimited file into a new
INFO/TAG annotation. Note that if TAG is not defined in the VCF header, a header
fragment with the definition must be provided via the -h option.
# Annotate from a tab-delimited file with six columns (the fifth is ignored),
# first indexing with tabix. The coordinates in the text file are 1-based, same
# as the coordinates in the VCF
tabix -s1 -b2 -e2 annots.tab.gz
bcftools annotate -a annots.tab.gz -h annots.hdr -c CHROM,POS,REF,ALT,-,TAG file.vcf

# Annotate from a tab-delimited file with regions (1-based coordinates, inclusive)
tabix -s1 -b2 -e3 annots.tab.gz
bcftools annotate -a annots.tab.gz -h annots.hdr -c CHROM,FROM,TO,TAG input.vcf

# Annotate from a bed file (0-based coordinates, half-closed, half-open intervals)
bcftools annotate -a annots.bed.gz -h annots.hdr -c CHROM,FROM,TO,TAG input.vcf
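The commands above assume that `annots.tab.gz` and `annots.hdr` already exist. A minimal sketch of how they might be prepared for the six-column case follows. The records, the contents of the ignored fifth column, and the tag's `Number`, `Type` and `Description` are invented for illustration, and the compression and indexing steps run only if `bgzip` and `tabix` are installed.

```shell
# Hypothetical six-column annotation file: CHROM, POS, REF, ALT, ignored, TAG
# (tab-delimited, sorted by chromosome and position, 1-based coordinates)
{
  printf '1\t100\tA\tC\tnote1\t5\n'
  printf '1\t200\tG\tT\tnote2\t7\n'
} > annots.tab

# Header fragment defining INFO/TAG; passed to bcftools annotate via -h
printf '##INFO=<ID=TAG,Number=1,Type=Integer,Description="Example annotation">\n' > annots.hdr

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
  # tabix needs bgzip-compressed input
  bgzip -c annots.tab > annots.tab.gz
  # Column 1 is the sequence name, column 2 is both start and end
  tabix -f -s1 -b2 -e2 annots.tab.gz
else
  echo "bgzip/tabix not found; skipping compression and indexing" >&2
fi
```

The `Type` and `Number` in the header fragment should match the values stored in the TAG column; an Integer with a single value per record is assumed here.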
Modifiers that control what to do with missing values:
| -c TAG | Add TAG if the source value is not missing (“.”). If TAG exists in the target file, it will be overwritten. |
| -c +TAG | Add TAG if the source value is not missing and TAG is not present in the target file. |
| -c .TAG | Add TAG even if the source value is missing. This can overwrite non-missing values with a missing value and can create empty VCF fields (TAG=.). |
| -c .+TAG | Add TAG even if the source value is missing but only if TAG does not exist in the target file; existing tags will not be overwritten. |
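One way to compare the modifiers is to run each of them on the same pair of files. The following is a sketch rather than a reference: `src.vcf`, `dst.vcf`, their records and the Integer tag `TAG` are hypothetical, the helper `make_vcf` exists only in this example, and the comparison runs only when bcftools is available. No expected output is shown; the printed values are meant to be checked against the descriptions above.

```shell
# Write a minimal VCF: $1 = output file, remaining arguments = INFO column
# of consecutive records at positions 100, 200, 300, ...
make_vcf() {
  out=$1
  shift
  {
    printf '##fileformat=VCFv4.2\n'
    printf '##contig=<ID=1,length=1000>\n'
    printf '##INFO=<ID=TAG,Number=1,Type=Integer,Description="Example tag">\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    pos=100
    for info in "$@"; do
      printf '1\t%s\t.\tA\tC\t.\t.\t%s\n' "$pos" "$info"
      pos=$((pos + 100))
    done
  } > "$out"
}

# Source: TAG has a value at the first two sites and is missing at the last two
make_vcf src.vcf 'TAG=5' 'TAG=5' 'TAG=.' 'TAG=.'
# Target: TAG alternates between absent and already set
make_vcf dst.vcf '.' 'TAG=2' '.' 'TAG=2'

if command -v bcftools >/dev/null 2>&1; then
  # The annotation file given to -a must be compressed and indexed
  bcftools view -Oz -o src.vcf.gz src.vcf
  bcftools index -f src.vcf.gz
  for col in TAG +TAG .TAG .+TAG; do
    printf '== -c %s\n' "$col"
    # Print position and the resulting INFO/TAG value for each record
    bcftools annotate -a src.vcf.gz -c "$col" dst.vcf | bcftools query -f '%POS\t%INFO/TAG\n' -
  done
else
  echo "bcftools not found; skipping the comparison" >&2
fi
```

The four records cover each combination of source value (present or missing) and target tag (absent or set), so every row of the list above is exercised once per modifier.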
# transfer FILTER column to INFO tag NewTag; notice that the -a option is not present, therefore
# B.bcf/FILTER is the source annotation
bcftools annotate -c INFO/NewTag:=FILTER B.bcf

# transfer FILTER column from A.bcf to INFO/NewTag in B.bcf; notice that the -a option is present,
# therefore A.bcf/FILTER is the source annotation
bcftools annotate -c INFO/NewTag:=FILTER -a A.bcf B.bcf

# transfer B.bcf/FILTER column to INFO tag NewTag; notice that although the -a option is present,
# the notation "./FILTER" tells the program to transfer the annotation locally and make B.bcf/FILTER
# the source annotation
bcftools annotate -c INFO/NewTag:=./FILTER -a A.bcf B.bcf

# transfer B.bcf/FILTER column to INFO/NewTag, then A.bcf/FILTER to B.bcf/FILTER
bcftools annotate -c INFO/NewTag:=./FILTER,FILTER -a A.bcf B.bcf

# transfer A.bcf/FILTER to B.bcf/FILTER, then the new B.bcf/FILTER value to INFO/NewTag; notice
# that due to the order of operation, FILTER and INFO/NewTag will be identical
bcftools annotate -c FILTER,INFO/NewTag:=./FILTER -a A.bcf B.bcf
Imagine you need to transfer the INFO/DP annotation to FORMAT/DP. This is currently not possible
using a single bcftools annotate command, but can be done easily in multiple steps.
This is a complete example that can be copied and pasted as is:
# Create a test VCF
echo -e '##fileformat=VCFv4.3' > test.vcf
echo -e '##INFO=<ID=DP,Number=1,Type=Integer,Description="Read depth">' >> test.vcf
echo -e '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">' >> test.vcf
echo -e '##contig=<ID=1,length=248956422,assembly=hg38>' >> test.vcf
echo -e '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsmpl1\tsmpl2' >> test.vcf
echo -e '1\t16648016\t.\tG\t.\t.\t.\tDP=10\tGT\t0/0\t0/0' >> test.vcf

# Extract INFO/DP into a tab-delimited annotation file
bcftools query -f '%CHROM\t%POS\t%DP\n' test.vcf | bgzip -c > annot.txt.gz

# Index the file with tabix
tabix -s1 -b2 -e2 annot.txt.gz

# Create a header line for the new annotation ('>' rather than '>>', so that
# rerunning the example does not append a duplicate definition)
echo -e '##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">' > hdr.txt

# Transfer the annotation to sample 'smpl1'
bcftools annotate -s smpl1 -a annot.txt.gz -h hdr.txt -c CHROM,POS,FORMAT/DP test.vcf