Geneconv: detect gene conversion

Geneconv is a program for detecting gene conversion between aligned DNA sequences. It can also search for gene conversion fragments originating from outside the alignment. The results are ranked by P-value and presented in a spreadsheet-like format. The data are valuable for bioinformatics studies and papers that deal with evolution.

Geneconv is available at SBo; the current version there is 1.81a. The SlackBuild bundles some example materials and extensive documentation into the package, placed in /usr/share/geneconv and /usr/doc/geneconv-1.81a respectively. This post is by no means a hand-holding tutorial; it does not even cover the basics that well. It is merely a quick reference for my future use.

The program is run in the following format:

geneconv input-alignment.fasta output.frags /fl -nolog

In the example above:

  • input-alignment.fasta is the file containing the multiple sequence alignment (MSA). It should be in a supported format: NEXUS, Pearson/FASTA, NBRF/PIR, CLUSTAL, ASF or PHYLIP interleaved.
  • output.frags is the output file. If you do not specify a name, the output file is named after the input alignment file, but with the file extension frags.
  • /fl (or -flags if the long syntax is used) stands for the program settings and output options you specify. The list of available flags is long, so see the section below for a few examples.
  • -nolog is needed if you get a “Segmentation fault” when the program writes to its log file (e.g. input-alignment.sum).
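The naming rule and the argument order described above can be sketched as a small shell wrapper. This is only an illustration: the function names default_frags_name and run_geneconv are made up for this post, and the wrapper assumes that geneconv is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of geneconv): derive the default output
# name as described above, i.e. the input name with the extension
# replaced by "frags".
default_frags_name() {
    case "${1##*/}" in
        *.*) printf '%s.frags\n' "${1%.*}" ;;  # strip the last extension
        *)   printf '%s.frags\n' "$1" ;;       # no extension to strip
    esac
}

# Hypothetical wrapper: input file first, then the output file,
# then any flags, then -nolog, in the same order as the command above.
run_geneconv() {
    input=$1
    shift
    output=$(default_frags_name "$input")
    if command -v geneconv >/dev/null 2>&1; then
        geneconv "$input" "$output" "$@" -nolog
    else
        echo "geneconv not found; would run: geneconv $input $output $* -nolog" >&2
        return 127
    fi
}
```

For instance, run_geneconv input-alignment.fasta /lp would write to input-alignment.frags. Passing the output name explicitly is redundant here, but it keeps the command line self-documenting.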

Some possible flags to use are:

  • /w123 will seed the random number generator of geneconv with that value (here 123). This guarantees that the output is the same, regardless of the computer or the time at which the program is run.
  • /lp will tell geneconv to list pairwise significant fragments in addition to global lists.
  • /sp will write polymorphisms and their offsets to the output file.
  • /sb will tell the program to write the formulas for the BLAST-like weighted global scores to the output file.
  • /g1 allows mismatches within fragments with a mismatch penalty of 1.
  • /r tells geneconv that the alignment is from a DNA coding region, so the program will use codon polymorphisms instead of site polymorphisms. The alignment should actually be a codon alignment, generated for example by PAL2NAL.
  • /f makes a fancier output.
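Since /r only makes sense for codon alignments, it can help to assemble the flag list in one place and add /r conditionally. The snippet below is a minimal sketch under that assumption; geneconv_flags is a made-up helper name, and it only uses the flags listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of geneconv): print the flag set from the
# list above for a given seed. The second argument says whether the
# alignment is a codon alignment ("yes") or not (anything else).
geneconv_flags() {
    seed=$1
    coding=${2:-no}
    # /w<seed>: fixed seed, /lp: pairwise lists, /sp: polymorphisms,
    # /sb: score formulas, /g1: mismatch penalty of 1
    flags="/w$seed /lp /sp /sb /g1"
    if [ "$coding" = yes ]; then
        flags="$flags /r"   # codon polymorphisms, coding regions only
    fi
    printf '%s /f\n' "$flags"   # /f: fancier output
}

# Possible use (left as a comment because it needs geneconv and an input file;
# $(...) is unquoted on purpose so the flags split into separate arguments):
#   geneconv input-alignment.fasta output.frags $(geneconv_flags 123 yes) -nolog
```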

As an example, let’s take a look at the E. coli 6-phosphogluconate dehydrogenase coding region DNA. The MSA is found here: /usr/share/geneconv/gnd7.asf. Copy it to your working directory, then run the program with the options from above:

geneconv gnd7.asf gnd7-output.frags /w123 /lp /sp /sb /g1 /r /f -nolog

Open gnd7-output.frags in a text editor to inspect the results. The program finds 3 global inner (GI) fragments and 5 pairwise inner (PI) fragments. You can compare the output with the bundled examples /usr/share/geneconv/gnd7.frags and especially /usr/share/geneconv/gnd7g1.frags.
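If you prefer the command line to a text editor, the fragment counts can be tallied with grep. This is a rough sketch that assumes each fragment line in the .frags file starts with its two-letter code (GI, PI and so on) followed by whitespace; check your own output file before relying on it. The helper name count_fragments is made up for this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of geneconv): count the lines of a .frags
# file that begin with a given fragment code followed by whitespace.
# ASSUMPTION: fragment lines start with their code, e.g. "GI" or "PI".
count_fragments() {
    code=$1
    file=$2
    grep -c "^${code}[[:space:]]" "$file"
}

# Possible use (left as comments because it needs the files from above):
#   count_fragments GI gnd7-output.frags
#   count_fragments PI gnd7-output.frags
#   diff gnd7-output.frags /usr/share/geneconv/gnd7g1.frags
```

The counts can then be checked against the 3 GI and 5 PI fragments mentioned above, and diff shows where your run differs from the bundled example.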

The documentation that comes with the program is highly recommended reading.

3 responses to “Geneconv: detect gene conversion”

  1. Great writeup, thanks. I’ve run geneconv on a FASTA file containing the aligned nucleotide sequences of 10 orthologous genes from 10 genomes, i.e. 10 orthologs (~100 amino acids long) aligned with MUSCLE. My output file (.frags) says “No outer-sequence fragments listed” and “No inner fragments listed”. Can I interpret this as indicating no recombination events between the genes in my orthologous group?
