Plugin trio-dnm2
This plugin can be used to screen variants for possible de-novo mutations in trios (i.e. in samples with parental data available).
The program adds the following annotations:
- FORMAT/DNM: posterior probability of the variant being a DNM (see the --dnm-tag option)
- FORMAT/VA: the variant allele given as a 0-based index to REF,ALT alleles (see the --va option)
- FORMAT/VAF: the fraction of reads supporting the de novo allele (see the --vaf option)
Three calling models are available:
- Naive model
  This simply looks at sample genotypes (FORMAT/GT) and identifies sites that violate Mendelian inheritance, taking into account sex inheritance patterns on sex chromosomes and in pseudo-autosomal regions. This model is activated as
    bcftools +trio-dnm2 -P samples.ped --use-NAIVE
- DeNovoGear model
  The original DeNovoGear model with bugs fixed (--with-pPL) or with bugs left as is (--use-DNG). This model is activated as
    bcftools +trio-dnm2 -P samples.ped --with-pPL
    bcftools +trio-dnm2 -P samples.ped --use-DNG
- Trio-DNM model
  A new calling model which results in a cleaner callset at the cost of decreased sensitivity to parental mosaics. This model is executed by default
    bcftools +trio-dnm2 -P samples.ped
For more information and math notes see http://samtools.github.io/bcftools/trio-dnm.pdf
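The Mendelian check at the heart of the naive model can be sketched in a few lines of shell. This is only an illustration under simplifying assumptions, not the plugin's implementation: the helper is_mendelian_violation is hypothetical, it handles diploid autosomal genotypes only, and it ignores the sex-chromosome and pseudo-autosomal rules that the plugin takes into account.

```shell
# Hypothetical helper, not part of bcftools.
# Usage: is_mendelian_violation CHILD_GT FATHER_GT MOTHER_GT
# Prints "yes", "no", or "unknown" (missing or non-diploid genotypes).
is_mendelian_violation() {
    # Treat phased (|) and unphased (/) genotypes alike
    child=$(printf '%s' "$1" | tr '|' '/')
    father=$(printf '%s' "$2" | tr '|' '/')
    mother=$(printf '%s' "$3" | tr '|' '/')

    for gt in "$child" "$father" "$mother"; do
        case "$gt" in
            *.*)   echo unknown; return 0 ;;  # missing allele
            */*/*) echo unknown; return 0 ;;  # more than two alleles
            ?*/?*) ;;                         # diploid genotype, keep going
            *)     echo unknown; return 0 ;;  # haploid or malformed
        esac
    done

    # Split each genotype into its two allele indices
    c1=${child%/*};  c2=${child#*/}
    f1=${father%/*}; f2=${father#*/}
    m1=${mother%/*}; m2=${mother#*/}

    # A consistent child genotype takes one allele from the father
    # and the other allele from the mother, in either order.
    if { [ "$c1" = "$f1" ] || [ "$c1" = "$f2" ]; } &&
       { [ "$c2" = "$m1" ] || [ "$c2" = "$m2" ]; }; then
        echo no; return 0
    fi
    if { [ "$c2" = "$f1" ] || [ "$c2" = "$f2" ]; } &&
       { [ "$c1" = "$m1" ] || [ "$c1" = "$m2" ]; }; then
        echo no; return 0
    fi
    echo yes
}

is_mendelian_violation 0/1 0/0 0/0   # prints "yes"
is_mendelian_violation 0/1 0/1 0/0   # prints "no"
is_mendelian_violation 0/1 ./. 0/0   # prints "unknown"
```

Genotypes with missing alleles are reported as unknown rather than as violations, so that incomplete trios are not mistaken for de novo candidates in this sketch.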
The list of plugin-specific options can be obtained by running bcftools +trio-dnm2, which will print the following usage page:
About: Screen variants for possible de-novo mutations in trios
Usage: bcftools +trio-dnm2 [OPTIONS]
Common options:
   -e, --exclude EXPR              Exclude trios for which the expression is true (one matching sample invalidates a trio)
   -i, --include EXPR              Include trios for which the expression is true (one failing samples invalidates a trio)
   -o, --output FILE               Output file name [stdout]
   -O, --output-type u|b|v|z[0-9]  u/b: un/compressed BCF, v/z: un/compressed VCF, 0-9: compression level [v]
   -r, --regions REG               Restrict to comma-separated list of regions
   -R, --regions-file FILE         Restrict to regions listed in a file
       --regions-overlap 0|1|2     Include if POS in the region (0), record overlaps (1), variant overlaps (2) [1]
   -t, --targets REG               Similar to -r but streams rather than index-jumps
   -T, --targets-file FILE         Similar to -R but streams rather than index-jumps
       --targets-overlap 0|1|2     Include if POS in the region (0), record overlaps (1), variant overlaps (2) [0]
       --no-version                Do not append version and command line to the header
General options:
   -m, --min-score NUM             Do not add FMT/DNM annotation if the score is smaller than NUM
   -p, --pfm [1X:|2X:]P,F,M        Sample names of child (the proband), father, mother; "1X:" for male pattern of chrX inheritance [2X:]
   -P, --ped FILE                  PED file with the columns: <ignored>,proband,father,mother,sex(1:male,2:female)
   -X, --chrX LIST                 List of regions with chrX inheritance pattern or one of the presets: [GRCh37]
                                      GRCh37 .. X:1-60000,chrX:1-60000,X:2699521-154931043,chrX:2699521-154931043
                                      GRCh38 .. X:1-9999,chrX:1-9999,X:2781480-155701381,chrX:2781480-155701381
       --dnm-tag TAG[:type]        Output tag with DNM quality score and its type [DNM:log]
                                      log   .. log-scaled quality (-inf,0; float)
                                      flag  .. is a DNM, implies --use-NAIVE (1; int)
                                      phred .. phred quality (0-255; int)
                                      prob  .. probability (0-1; float)
       --force-AD                  Calculate VAF even if the number of FMT/AD fields is incorrect. Use at your own risk!
       --va TAG                    Output tag name for the variant allele [VA]
       --vaf TAG                   Output tag name for variant allele fraction [VAF]
Model options:
       --dng-priors                Use the original DeNovoGear priors (including bugs in prior assignment, but with chrX bugs fixed)
       --mrate NUM                 Mutation rate [1e-8]
       --pn FRAC[,NUM]             Tolerance to parental noise or mosaicity, given as fraction of QS or number of reads [0.005,0]
       --pns FRAC[,NUM]            Same as --pn but is not applied to alleles observed in both parents (fewer FPs, more FNs) [0.045,0]
       --use-DNG                   The original DeNovoGear model, implies --dng-priors
       --use-NAIVE                 A naive calling model which uses only FMT/GT to determine DNMs
       --with-pAD                  Do not use FMT/QS but parental FMT/AD
       --with-pPL                  Do not use FMT/QS but parental FMT/PL. Equals to DNG with bugs fixed (more FPs, fewer FNs)
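The PED file expected by -P, --ped can be prepared and sanity-checked before running the plugin. The sketch below follows the column order given in the usage text and assumes whitespace-separated columns. The sample names, the temporary file location, and the helper check_ped are all hypothetical and not part of bcftools.

```shell
# Hypothetical sample names. The first column is ignored by the plugin
# according to the usage text, so it holds a free-form family label here.
workdir=$(mktemp -d)
ped_file="$workdir/samples.ped"
cat > "$ped_file" <<'EOF'
fam1 proband1 father1 mother1 1
fam2 proband2 father2 mother2 2
EOF

# Hypothetical helper: succeed only if every line has five columns and the
# sex column is 1 (male) or 2 (female), as described in the usage text.
check_ped() {
    awk '
        NF != 5            { printf "line %d: expected 5 columns, found %d\n", NR, NF; bad = 1; next }
        $5 != 1 && $5 != 2 { printf "line %d: sex should be 1 or 2, found %s\n", NR, $5; bad = 1 }
        END                { exit bad + 0 }
    ' "$1"
}

check_ped "$ped_file" && echo "PED file looks consistent"
```

The sample names in the PED file should match the sample names in the VCF/BCF header. The check above does not verify this.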
Example:
# Annotate VCF with FORMAT/DNM, run for a single trio
bcftools +trio-dnm2 -p proband,father,mother file.bcf
# Same as above, but read the trio(s) from a PED file
bcftools +trio-dnm2 -P file.ped file.bcf
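# Same as above plus extract a list of significant DNMs using the bcftools query command.
# The DNM>10 threshold assumes a phred-scaled score, requested here with --dnm-tag DNM:phred;
# the default log-scaled score is never positive
bcftools +trio-dnm2 -P file.ped --dnm-tag DNM:phred file.bcf -Ou | bcftools query -i'DNM>10' -f'[%CHROM:%POS %SAMPLE %DNM\n]'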
# A complete example with a variant calling step. Note that this is one long
# command and should be on a single line. Also note that a filtering step is
# recommended, e.g. by depth and VAF (not shown here):
bcftools mpileup -a AD,QS -f ref.fa -Ou proband.bam father.bam mother.bam |
bcftools call -mv -Ou |
bcftools +trio-dnm2 -p proband,father,mother -Oz -o output.vcf.gz
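The list printed by the query example above has one line per candidate in the form "CHROM:POS SAMPLE SCORE", which lends itself to a quick per-sample tally. The following is a minimal sketch: count_dnms_per_sample is a hypothetical helper, and the input rows are made up for illustration rather than taken from real data.

```shell
# Hypothetical helper: count candidate DNMs per sample from lines of the
# form "CHROM:POS SAMPLE SCORE" read on standard input. Lines with a
# missing score (".") or an unexpected number of fields are skipped.
count_dnms_per_sample() {
    awk 'NF == 3 && $3 != "." { n[$2]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Made-up rows in the format printed by the query example
printf '%s\n' \
    'chr1:12345 proband1 25' \
    'chr2:67890 proband1 40' \
    'chrX:13579 proband2 18' | count_dnms_per_sample
# prints:
# proband1 2
# proband2 1
```

In practice the function would read the output of the bcftools query command directly through a pipe.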