Plugin fill-tags
This plugin can be used to compute and fill various INFO and FORMAT tags:
INFO/AC Number:A Type:Integer .. Allele count in genotypes
INFO/AC_Hom Number:A Type:Integer .. Allele counts in homozygous genotypes
INFO/AC_Het Number:A Type:Integer .. Allele counts in heterozygous genotypes
INFO/AC_Hemi Number:A Type:Integer .. Allele counts in hemizygous genotypes
INFO/AF Number:A Type:Float .. Allele frequency from FMT/GT or AC,AN if FMT/GT is not present
INFO/AN Number:1 Type:Integer .. Total number of alleles in called genotypes
INFO/ExcHet Number:A Type:Float .. Test excess heterozygosity; 1=good, 0=bad
INFO/END Number:1 Type:Integer .. End position of the variant
INFO/F_MISSING Number:1 Type:Float .. Fraction of missing genotypes (all samples, experimental)
INFO/HWE Number:A Type:Float .. HWE test (PMID:15789306); 1=good, 0=bad
INFO/MAF Number:1 Type:Float .. Frequency of the second most common allele
INFO/NS Number:1 Type:Integer .. Number of samples with data
INFO/TYPE Number:. Type:String .. The record type (REF,SNP,MNP,INDEL,etc)
FORMAT/VAF Number:A Type:Float .. The fraction of reads with the alternate allele, requires FORMAT/AD or ADF+ADR
FORMAT/VAF1 Number:1 Type:Float .. The same as FORMAT/VAF but for all alternate alleles cumulatively
TAG:Number=Type(EXPR) .. Experimental support for user expressions such as DP:1=int(sum(DP))
If Number and Type are not given (e.g. DP=sum(DP)), variable number (Number=.) of floating point
values (Type=Float) will be used.
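To make the relationship between AN, AC and AF concrete, here is a minimal Python sketch that follows the tag descriptions listed above. It is an illustration only, not the plugin's implementation: the function name `allele_counts` and the plain-string genotype input are hypothetical, and counting the called allele of a half-missing genotype such as "./1" is an assumption of this sketch.

```python
from collections import Counter


def allele_counts(genotypes, n_alt):
    """Return (AN, AC, AF) for a single site.

    genotypes: GT strings such as "0/1", "1|1" or "./." (hypothetical input form)
    n_alt:     number of ALT alleles at the site
    """
    counts = Counter()
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") genotypes the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":          # skip missing alleles
                counts[int(allele)] += 1

    an = sum(counts.values())                        # alleles in called genotypes
    ac = [counts[i] for i in range(1, n_alt + 1)]    # one count per ALT allele
    # Assumption: AF is reported as None when no allele was called.
    af = [c / an if an else None for c in ac]
    return an, ac, af


an, ac, af = allele_counts(["0/0", "0/1", "1/1", "./.", "1/2"], n_alt=2)
print(an, ac, af)
```

Both AC and AF carry one value per ALT allele, which is what Number:A means in the list above, while AN is a single value (Number:1).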
By default, the values are calculated across all samples, but per-population values can be calculated as well. To do this, provide a file that lists the samples in the first column and a comma-separated list of populations in the second column. For example, the file can look like this:
Sample1   Group1
Sample2   Group1,Group2
Sample3   Group2
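The sample file format can be read as a mapping from each population to its member samples, with a sample allowed to belong to several populations. The following Python sketch shows that reading. The helper name `read_sample_groups` is hypothetical, and splitting columns on whitespace and skipping lines with fewer than two columns are assumptions of this sketch, not documented behaviour of the plugin.

```python
def read_sample_groups(lines):
    """Map each population name to the list of samples assigned to it."""
    pop_to_samples = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            # Assumption: ignore blank lines and lines without a population column.
            continue
        sample, pops = fields[0], fields[1]
        for pop in pops.split(","):
            pop_to_samples.setdefault(pop, []).append(sample)
    return pop_to_samples


groups = read_sample_groups([
    "Sample1 Group1",
    "Sample2 Group1,Group2",
    "Sample3 Group2",
])
print(groups)
```

With the example file, Sample2 contributes to the values of both Group1 and Group2.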
The list of plugin-specific options can be obtained by running
bcftools +fill-tags -h, which will print the following usage page:
About: Set INFO tags AF, AC, AC_Hemi, AC_Hom, AC_Het, AN, ExcHet, HWE, MAF, NS
FORMAT tag VAF, custom INFO/TAG=func(FMT/TAG).
See examples below, run with -l for detailed description.
Usage: bcftools +fill-tags [General Options] -- [Plugin Options]
Options:
run "bcftools plugin" for a list of common options
Plugin options:
-d, --drop-missing do not count half-missing genotypes "./1" as hemizygous
-l, --list-tags list available tags with description
-t, --tags LIST list of output tags. By default, all tags are filled.
-S, --samples-file FILE list of samples (first column) and comma-separated list of populations (second column)
Examples:
# Print a detailed list of available tags
bcftools +fill-tags -- -l
# Fill INFO/AN and INFO/AC
bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t AN,AC
# Fill (almost) all available tags
bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t all
# Calculate HWE for sample groups (possibly multiple) read from a file
bcftools +fill-tags in.bcf -Ob -o out.bcf -- -S sample-group.txt -t HWE
# Calculate total read depth (INFO/DP) from per-sample depths (FORMAT/DP)
bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t 'DP:1=int(sum(FORMAT/DP))'
# Calculate per-sample read depth (FORMAT/DP) from per-sample allelic depths (FORMAT/AD)
bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t 'FORMAT/DP:1=int(smpl_sum(FORMAT/AD))'
# Add number of samples which pass (INFO/good) and fail (INFO/bad) a binomial test
bcftools +fill-tags in.bcf -- -t 'good=N_PASS(binom(FMT/AD[:0],FMT/AD[:1])>=1e-5)','bad=N_PASS(binom(FMT/AD[:0],FMT/AD[:1])<1e-5)'
# Annotate with phred-scaled p-value of fisher exact test, use the DP4 or ADF,ADR tags
bcftools +fill-tags in.bcf -- -t 'FMT/FT:1=phred(fisher(FMT/DP4))'
bcftools +fill-tags in.bcf -- -t 'FMT/FT:1=phred(fisher(FMT/ADF,FMT/ADR))'
# Annotate with allelic fraction
bcftools +fill-tags in.bcf -Ob -o out.bcf -- -t FORMAT/VAF
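The descriptions of FORMAT/VAF and FORMAT/VAF1 can be illustrated with a short Python sketch for a single sample. It assumes that the denominator is the sum of all FORMAT/AD values (reference plus alternates) and that a sample without reads gets missing values. Both points, and the function name `vaf_from_ad`, are assumptions made for illustration rather than a statement of how the plugin computes these tags.

```python
def vaf_from_ad(ad):
    """Return (VAF, VAF1) for one sample.

    ad: allelic depths [ref, alt1, alt2, ...], as in FORMAT/AD
    VAF  has one value per ALT allele (Number:A)
    VAF1 is a single value for all ALT alleles together (Number:1)
    """
    total = sum(ad)
    if total == 0:
        # Assumption: no reads means the fractions are undefined.
        return [None] * (len(ad) - 1), None
    vaf = [depth / total for depth in ad[1:]]
    vaf1 = sum(ad[1:]) / total
    return vaf, vaf1


print(vaf_from_ad([30, 10, 0]))
```

When only ADF and ADR are available, the same calculation could be applied to their element-wise sum.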