Sometimes there is the need to create a consensus sequence for an individual
where the sequence incorporates variants typed for this individual.
This is possible using the
We need the reference sequence
reference.fa in the fasta format
and an indexed VCF with the variants
calls.bcf. The command is:
cat reference.fa | bcftools consensus calls.vcf.gz > consensus.fa
This assumes we have already made the calls, normalized indels and filtered. There is another page which goes deeper and is devoted just to this, but in brief, the variant calling command in its simplest form is:
# call variants bcftools mpileup -Ou -f reference.fa alignments.bam | bcftools call -mv -Oz -o calls.vcf.gz bcftools index calls.vcf.gz # normalize indels bcftools norm -f reference.fa calls.vcf.gz -Ob -o calls.norm.bcf # filter adjacent indels within 5bp bcftools filter --IndelGap 5 calls.norm.bcf -Ob -o calls.norm.flt-indels.bcf # apply variants to create consensus sequence cat reference.fa | bcftools consensus calls.vcf.gz > consensus.fa
Note that in order to keep the program relatively simple,
consensus merely applies variants from a VCF.
If more more advanced logic
is required, the VCF has to be preprocessed with other tools, such as the
(see an example here).