FASconCAT-G: extensive functions for multiple sequence alignment preparations concerning phylogenetic studies

Background Phylogenetic and population genetic studies often deal with multiple sequence alignments that require manipulation or processing steps such as sequence concatenation, sequence renaming, sequence translation or consensus sequence generation. In recent years phylogenetic data sets have expanded from single genes to genome-wide markers comprising hundreds to thousands of loci. Processing these large phylogenomic data sets is impracticable without automated process pipelines. Currently no stand-alone or pipeline-compatible program exists that offers a broad range of manipulation and processing steps for multiple sequence alignments in a single process run.
Results Here we present FASconCAT-G, a system-independent editor, which offers various processing options for multiple sequence alignments. The software provides a wide range of possibilities to edit and concatenate multiple nucleotide, amino acid, and structure sequence alignment files for phylogenetic and population genetic purposes. The main options include sequence renaming, file format conversion, sequence translation between nucleotide and amino acid states, consensus generation of specific sequence blocks, sequence concatenation, model selection of amino acid replacement with ProtTest, two types of RY coding as well as site exclusions and extraction of parsimony-informative sites. Conveniently, most options can be invoked in combination and performed during a single process run. Additionally, FASconCAT-G prints useful information regarding alignment characteristics and editing processes such as base compositions of single in- and outfiles, sequence areas in a concatenated supermatrix, as well as paired stem and loop regions in secondary structure sequence strings.
Conclusions FASconCAT-G is a command-line driven Perl program that delivers computationally fast and user-friendly processing of multiple sequence alignments for phylogenetic and population genetic applications and is well suited for incorporation into analysis pipelines. Electronic supplementary material The online version of this article (doi:10.1186/s12983-014-0081-x) contains supplementary material, which is available to authorized users.


Introduction
FASconCAT-G offers a wide range of possibilities to edit and concatenate multiple nucleotide, amino acid, and structure sequence alignment files for phylogenetic and population genetic purposes. The main options include sequence renaming, file format conversion, sequence translation, consensus generation of predefined sequence blocks, and RY coding as well as site exclusions in nucleotide sequences. The process options implemented in FASconCAT-G can be invoked in any combination and performed during a single process run. FASconCAT-G can also read in and handle different file formats (FASTA, CLUSTAL, and PHYLIP) in a single run. The program is designed to handle multiple sequence alignments; sequences must therefore have equal length within each file. FASconCAT-G extracts the gene or structure sequences associated with each taxon from the given input files and links them into one string. Missing taxon sequences in single files are replaced by 'N', 'X' or '.' (dots), depending on the data type of the file (nucleotide, amino acid or "dot-bracket" structures). It is possible to concatenate nucleotide and amino acid files into one supermatrix file. FASconCAT-G can read sequences in interleaved and non-interleaved format. FASconCAT-G was written on Linux and runs on Windows, Mac OS and Linux operating systems. The program can be started directly via the command line or through interactive menu options. Input files with Windows (CRLF) line endings should be converted to Unix (LF) line endings. This can be done in several editors (e.g., Bioedit or Notepad++). FASconCAT-G usually recognizes and converts these files, but not in every instance. Ambiguities, indels ('-'), and missing characters ('?') are allowed; any other character in sequences that is not covered by the universal DNA/RNA or amino acid code will lead to an error prompt.
Structure information in ribosomal sequences is also recognized, analyzed and concatenated by FASconCAT-G if it is present once in each file and associated with matching sample names. Otherwise, FASconCAT-G will stop with a specific error prompt. Besides the concatenation and sequence conversion processes, FASconCAT-G delivers additional sequence information about each input file and the new concatenated supermatrix. The extent of information depends on the chosen information output setting. The default information output file ("FcC info.xls") includes information about the concatenated areas of each sequence fragment in the concatenated supermatrix, a list of all concatenated sequences, and several other sequence characteristics. In addition to FASTA format, FASconCAT-G can also output concatenated and/or converted sequence files in NEXUS or PHYLIP format. NEXUS outfiles can either be written with embedded MrBayes commands for direct execution in MrBayes or as plain NEXUS blocks without any program-specific commands (e.g., for loading into PAUP). PHYLIP files are always output in non-interleaved format, but taxon names can be printed in relaxed or strict format (i.e., with an unlimited or limited number of characters). The default information file "FcC info.xls" also reports base compositions for nucleotide data of single and supermatrix files, while an additional info file "FcC structure.txt" lists single loop as well as stem pairing positions if structure sequences in dot-bracket format are present. Concatenated and/or single infile structure sequences are printed to this file too. For more details about FASconCAT-G usage and options, see the 'Usage/Options' section.
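The concatenation logic described above can be illustrated with a minimal Python sketch. It is not FASconCAT-G's own Perl code; the function name `concatenate`, the data type labels and the input layout are assumptions chosen for illustration. It fills missing taxa with 'N', 'X' or '.' and records the column range each infile occupies in the supermatrix.

```python
# Hypothetical sketch of supermatrix concatenation, not FASconCAT-G's own code.
# Filler characters follow the manual: 'N' (nucleotide), 'X' (amino acid), '.' (structure).
FILLERS = {"nucleotide": "N", "amino_acid": "X", "structure": "."}


def concatenate(alignments):
    """alignments: list of (file_name, data_type, {taxon: aligned sequence})."""
    taxa = sorted({taxon for _, _, seqs in alignments for taxon in seqs})
    supermatrix = {taxon: "" for taxon in taxa}
    ranges = []  # (file_name, first_column, last_column), 1-based
    start = 1
    for file_name, data_type, seqs in alignments:
        lengths = {len(seq) for seq in seqs.values()}
        if len(lengths) != 1:
            raise ValueError(f"{file_name}: sequences differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this file gets a run of filler characters.
            supermatrix[taxon] += seqs.get(taxon, FILLERS[data_type] * length)
        ranges.append((file_name, start, start + length - 1))
        start += length
    return supermatrix, ranges
```

For two infiles in which taxon "B" is missing from the second, the second fragment of "B" consists of filler characters only, and `ranges` lists where each infile ends up in the supermatrix.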

Usage/Options
To run FASconCAT-G, open the terminal of your operating system and change directories to the folder where FASconCAT-G and your input files are located. Type the name of your FASconCAT-G version, followed by a blank and your desired options on one line, to start FASconCAT-G directly, or type the name alone and press <enter> to open the FASconCAT-G menu. Note that all input files have to be located in the FASconCAT-G home folder. To execute FASconCAT-G, a Perl interpreter must be installed on the system. Linux and Mac systems typically come with an interpreter pre-installed, while Windows users will have to install one. We recommend the ActivePerl interpreter, which can be downloaded for free here: • http://activeperl.softonic.de/

Open the menu under Windows
Open a command prompt (DOS terminal) on your Windows system and navigate to the folder where FASconCAT-G and your files are located <cd your path...>. Then open FASconCAT-G: • C:\FASconCAT-G Folder> FASconCAT-G v1.0.pl <enter>

Open the menu under Linux/Mac
Open a terminal and navigate to the folder where FASconCAT-G and your files are located <cd your path...>. Then open FASconCAT-G: • user@user:~/FASconCAT-G Folder> perl FASconCAT-G v1.0.pl <enter> Note: in some cases Linux/Mac users get a path error report from Perl. In this case, the Perl executable and all of its associated files are located in a directory other than /usr/bin/ (e.g. /usr/local/bin/). To start FASconCAT-G without adding the full directory path to the perl start command (e.g. /usr/local/bin/perl FASconCAT-G v1.0.pl), users can change the default path to the perl executable by opening FASconCAT-G with a text editor and changing the first line of the script: • line 1: #!/usr/bin/perl → #!/usr/local/bin/perl

Menu handling
The main menu of FASconCAT-G is subdivided into two parts separated by a dashed line. The upper part displays all possible options and their associated commands. The lower part shows the current parameter settings of FASconCAT-G.

Command Options
To change the default parameter setting, type the command associated with the option into the command line and press <enter>. The new setting configuration will be displayed in the lower part of the menu. After finishing the parameter configuration, FASconCAT-G can be started by typing "s" and pressing <enter>. To get help, type "h" and press <enter>; to return to FASconCAT-G, press <enter>; to quit the program, type "q" and press <enter>.

Start FASconCAT-G via single command line
FASconCAT-G can be started directly by single line commands, which simplifies the integration of FASconCAT-G into complex analysis pipelines. Change directories to the folder where FASconCAT-G and your input files are located and then type the name of the FASconCAT-G version followed by a blank and the desired command options, each with a minus (-) sign in front. Then press <enter>. Make sure that each option is written correctly as a minus sign immediately followed by the option letter (e.g., "-i"). Otherwise FASconCAT-G will not start processing but will open the menu instead.
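For pipeline use, the single command line can be assembled programmatically. The following Python sketch is only an illustration: the helper names `build_command` and `run`, the script file name `FASconCAT-G_v1.0.pl` and the rule of always appending "-s" (inferred from the example commands in this manual) are assumptions, not part of FASconCAT-G.

```python
import subprocess


def build_command(options, script="FASconCAT-G_v1.0.pl"):
    """Return an argument list such as ['perl', script, '-e', '-s'].

    The script name is an assumption; use the file name of your installed version.
    """
    flags = []
    for option in options:
        option = option.strip()
        # Every option needs a leading minus sign with no blank in between.
        if not option.startswith("-"):
            option = "-" + option
        flags.append(option)
    # The example commands in the manual all end with the start command "-s".
    if "-s" not in flags:
        flags.append("-s")
    return ["perl", script] + flags


def run(options, workdir="."):
    """Run the script in the folder that holds the script and all infiles."""
    return subprocess.run(build_command(options), cwd=workdir, check=True)
```

Passing the options as a list avoids shell quoting problems, and `cwd` reflects the requirement that the infiles are located in the FASconCAT-G home folder.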

Options
FASconCAT-G has many options that can either be invoked in the program menu or via a single command line. Note: When invoking options in the single command line, use a minus sign before the command (e.g., "-i"). While in the working menu, simply type the command (e.g., "i") and hit <enter>.

File Processing (-o option)
The file processing option (-o option) is the main option in FASconCAT-G. With this option the user specifies whether single alignment infiles are converted, concatenated, or converted and concatenated by FASconCAT-G. Note: single alignment infiles do not need to be in the same file format, as FASconCAT-G can handle any combination of acceptable infile formats. Under the default file processing setting (Supermatrix), FASconCAT-G generates a concatenated supermatrix from the given infiles in the selected output format(s) (e.g., FcC_supermatrix.fas). If only one infile is defined under the 'Supermatrix' option, FASconCAT-G converts the selected infile to the selected output format. To convert multiple infiles to different output formats without sequence concatenation, use the "-o" command to change the file processing option to "Convert" (File Processing: Convert).

Sequence Translation (-e option)
FASconCAT-G can translate standard nucleotide sequence states to amino acid characters and vice versa. If sequence translation is defined, the translation process will be the first of all defined process steps conducted by FASconCAT-G. FASconCAT-G can recognize and handle amino acid and nucleotide data sets in a single process run. Therefore, FASconCAT-G only translates sequences of infiles which are suitable for the defined translation process. This makes it very easy to concatenate a mixture of different infile sequence types into one specific supermatrix sequence type (Figure 5). By default, the translation option is set to 'none'.

Figure 5: Example usage 1 of sequence translation. Concatenation of a mixture of different infile sequence types to one specific supermatrix sequence type (NUC: nucleotide data, AA: amino acid data). In the example, the infiles AA_infile_1, AA_infile_2, NUC_infile_3 and NUC_infile_4 are processed with "-e -s" (NUC to AA) and concatenated into one supermatrix (AA).
To translate nucleotide triplet states to amino acid characters use the "-e" command. FASconCAT-G does not recognize start or stop codons of nucleotide data. It recodes triplets starting at the first sequence position until the last sequence position is reached. For sequence translation of nucleotide data FASconCAT-G uses the standard IUPAC triplet codes for amino acid characters. FASconCAT-G will print an info text on the terminal if sequence lengths are not a multiple of three, but will not abort the translation process. Instead, FASconCAT-G will translate incomplete nucleotide triplets to '?'. FASconCAT-G translates nucleotide triplets even if they contain ambiguity codes, provided that the triplets are still assignable to one specific amino acid character (e.g. 'YTR' → Leucine/L). Otherwise, unspecific triplets are translated to '?' (e.g. 'RCT' → ?).
• perl FASconCAT-G v1.0.pl -e -s <enter> → translating nucleotide character states to amino acid character states under default options.
To translate amino acid characters to nucleotide triplet states use the "-e -e" command. Amino acid characters are translated to nucleotide triplets using the compressed IUPAC triplet code for a given amino acid character (e.g. Phenylalanine/F → 'TTY'). Unrecognized sequence characters in amino acid sequences like '-' or 'X' are translated to '???'. Note: stop codon symbols are not allowed in FASconCAT-G and have to be excluded before file processing! FASconCAT-G does not check the correctness of given reading frames!
• perl FASconCAT-G v1.0.pl -e -e -s <enter> → translating amino acid character states to nucleotide character states under default options.
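The translation of nucleotide triplets with ambiguity codes ("-e" command) can be sketched in Python as follows. This is a minimal illustration, not FASconCAT-G's own implementation: the function names are hypothetical, and returning '?' for triplets that contain gaps or resolve to a stop codon is an assumption.

```python
from itertools import product

# Standard genetic code, codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# IUPAC nucleotide codes and the bases they stand for (U is read as T).
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Return one amino acid, or '?' if the triplet is incomplete or unspecific."""
    if len(codon) != 3 or any(base not in AMBIGUITY for base in codon):
        return "?"  # incomplete triplet or gap/missing character (assumption)
    resolved = product(*(AMBIGUITY[base] for base in codon))
    amino_acids = {CODON_TABLE["".join(triplet)] for triplet in resolved}
    if len(amino_acids) == 1 and "*" not in amino_acids:
        return amino_acids.pop()
    return "?"


def translate(sequence):
    """Translate from the first position onwards, without searching for a start codon."""
    sequence = sequence.upper()
    return "".join(translate_codon(sequence[i:i + 3]) for i in range(0, len(sequence), 3))
```

With this sketch, 'YTR' resolves to CTA, CTG, TTA and TTG, which all code for leucine, whereas 'RCT' resolves to ACT (threonine) and GCT (alanine) and therefore yields '?'.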

Renaming Sequence Names (-k option)
FASconCAT-G can rename defined sequence names prior to file processing. The user must provide an extra info file named "new seq names.txt" in the FASconCAT-G home folder, which in each row lists the old name delimited from the new name by a tabstop (Table 2). Otherwise, FASconCAT-G will abort with an error prompt. Sequences which are not listed in "new seq names.txt" are left unchanged.
FASconCAT-G will print additional information of the sequence renaming process to a new outfile named "FcC rename control.txt". Note: the new defined sequence names in "new seq names.txt" must be unique. Otherwise, FASconCAT-G will abort with an error prompt. Table 2: Example format of user defined infile ("new seq names.txt") for sequence renaming.

Given Sequence Names               New Defined Sequence Names
Given Name sequence 1 <tabstop> New Name sequence 1
Given Name sequence 2 <tabstop> New Name sequence 2
Given Name sequence 3 <tabstop> New Name sequence 3
Given Name sequence n <tabstop> New Name sequence n
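The renaming rules above (tab-delimited pairs, unique new names, unlisted sequences left unchanged) can be sketched in Python as follows. The function names are hypothetical and the error messages are illustrative, not those printed by FASconCAT-G.

```python
def read_rename_table(lines):
    """Parse 'old name<TAB>new name' rows into a dictionary."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip empty rows
        old, separator, new = line.partition("\t")
        if not separator or not new.strip():
            raise ValueError(f"expected 'old<TAB>new', got: {line!r}")
        mapping[old.strip()] = new.strip()
    # The manual requires the newly defined names to be unique.
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("new sequence names must be unique")
    return mapping


def rename(sequences, mapping):
    """Rename listed sequences; names not listed in the table stay unchanged."""
    return {mapping.get(name, name): seq for name, seq in sequences.items()}
```

In practice the rows would be read from the rename info file, for example with `open(path)` passed directly as `lines`.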

Rejection of 3rd Nucleotide Codon Positions (-d option)
To reject each third codon position in nucleotide data use the "-d" command. FASconCAT-G will reject each third sequence position of the given nucleotide infiles and proceed with the reduced sequences with respect to the other defined FASconCAT-G options. Amino acid sequences will not be reduced under the "-d" command. To reduce amino acid sequences as well, use the sequence translation option ("-e -e" command) together with the "-d" command. FASconCAT-G will translate amino acid states to the corresponding nucleotide characters using compressed IUPAC codes and afterwards exclude each third position in the translated sequences (Figure 7). In the example of Figure 7, amino acid sequences of given FASTA infiles are first internally translated to nucleotide triplets using the "-e -e" command. Afterwards, each third sequence position of the given infiles is rejected by FASconCAT-G ("-d" command). Finally, the reduced sequences of the given infiles are converted and printed as PHYLIP (strict) formatted outfiles ("-a -p" command) and, due to the "-o -o" command, concatenated and printed as a PHYLIP (strict) formatted supermatrix outfile as well.
Note: concatenation of different infile sequence types (e.g., nucleotide and amino acid sequences) without the "-e" command will lead to a supermatrix with different ranges of nucleotide and amino acid character states (Figure 8).
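The rejection of third codon positions amounts to dropping every third character, counted from the first sequence position. A minimal Python sketch (the function name is hypothetical):

```python
def reject_third_positions(sequence):
    """Drop positions 3, 6, 9, ... of a nucleotide sequence (1-based counting)."""
    return "".join(base for index, base in enumerate(sequence) if index % 3 != 2)
```

As in the manual, no reading frame check is made: the first character of the sequence is treated as the first codon position.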

RY Coding (-b option)
RY coding can either be applied to each third nucleotide sequence position ("-b" command) or to complete nucleotide sequences ("-b -b" command). The R code is used for purine states (A or G → R) while the Y code is used for pyrimidines (C or T|U → Y). Amino acid sequences are left unchanged, except when the "-e -e" option has been chosen (see section 2.3.3 and Figure 7).
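Both RY coding modes can be sketched in a few lines of Python. The function name and the `third_only` parameter are hypothetical; leaving gaps, missing data and ambiguity codes unchanged is an assumption.

```python
# Purines (A, G) become R, pyrimidines (C, T, U) become Y.
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"}


def ry_code(sequence, third_only=False):
    """RY-code a whole sequence, or only every third position if third_only is set."""
    recoded = []
    for index, base in enumerate(sequence.upper()):
        if third_only and index % 3 != 2:
            recoded.append(base)  # first and second codon positions stay as they are
        else:
            recoded.append(RY.get(base, base))  # other characters are left unchanged
    return "".join(recoded)
```

Here `third_only=True` mirrors the "-b" command and the default mirrors "-b -b".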

Create Consensus Sequences (-c option)
FASconCAT-G can create consensus sequences for matching defined sequence blocks within given infiles (see Figure 9, 10). Sequence blocks are matched by identical alphanumeric characters before the first underscore (case sensitive) in sequence names. If no underscore is present, the complete sequence name will be used for sequence block identification. Consensus sequences of defined sequence blocks are named by the common sequence name prefix and an additional name suffix (' consensus') (see Table 3). In sequence concatenation processes, missing sequence blocks in given infiles are replaced by the corresponding replacement variables for the sequence type ('N' for nucleotide data, 'X' for amino acid data) in the supermatrix out file. Three different consensus types are available in FASconCAT-G to process consensus sequences of defined sequence blocks: 'Most Frequent Consensus' ("-c" command), 'Majority Rule Consensus' ("-c -c" command), and 'Strict Consensus' ("-c -c -c" command).
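The assignment of sequences to blocks by their name prefix can be sketched as follows. This is an illustrative Python sketch with a hypothetical function name; it only reproduces the matching rule stated above (case-sensitive prefix before the first underscore, or the complete name if no underscore is present).

```python
from collections import defaultdict


def group_blocks(names):
    """Group sequence names by the case-sensitive prefix before the first underscore."""
    blocks = defaultdict(list)
    for name in names:
        prefix = name.split("_", 1)[0]  # the whole name if there is no underscore
        blocks[prefix].append(name)
    return dict(blocks)
```

Because matching is case sensitive, "Apis_1" and "apis_3" end up in different blocks.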
'Most Frequent Consensus'. The 'Most Frequent Consensus' option considers the most frequent character state among defined sequence blocks as the consensus character state. If two or more character states are equally frequent, FASconCAT-G uses either the corresponding IUPAC ambiguity code as the consensus character state (nucleotide data) or '?' (amino acid data and nucleotide data) (Table 4).

Table 4 (column headings): Sequences of a Given Nucleotide Infile | Sequences of a Given Amino Acid Infile
'Majority Rule Consensus'. The 'Majority Rule Consensus' option considers a character state which occurs at a given site position in more than 50% of the sequences of a defined sequence block as the consensus character state. Otherwise, FASconCAT-G uses a '?' as the consensus character state (Table 5).

Table 5 (column headings): Sequences of a Given Nucleotide Infile | Sequences of a Given Amino Acid Infile
'Strict Consensus'. The 'Strict Consensus' option considers all character states at a given site position to generate a strict consensus sequence for a defined sequence block, using IUPAC ambiguity codes for nucleotide data and an 'X' for amino acid data (for nucleotide data, '-' and '?' are ignored as long as a nucleotide character state exists for a specific site position). However, if a specific site position of a defined sequence block lacks a specific character state and consists only of gap states ('-') and missing data states ('?'), FASconCAT-G will output a '?' as the consensus character state (Table 6).
• perl FASconCAT-G v1.0.pl -c -c -c -s <enter> → create 'Strict Consensus' sequences of defined sequence blocks under default options.
Table 6: Example of a strict consensus block. A consensus character will only be assigned when the character state at a given site is unanimous in all sequences of a defined sequence block. In all other cases, FASconCAT-G will assign the appropriate IUPAC ambiguity code for nucleotide states ('-' and '?' are not considered) or 'X' for amino acid data. If a site position consists of only gaps ('-') and missing data ('?'), FASconCAT-G will assign '?' as the consensus character state.
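The three consensus types can be compared with a small Python sketch for nucleotide data. It is a simplified illustration, not FASconCAT-G's own code: the function names are hypothetical, input sequences are assumed to contain only A, C, G, T, '-' and '?', and the treatment of gaps in the 'Most Frequent' and 'Majority Rule' cases (counted like any other state) is an assumption.

```python
from collections import Counter

# IUPAC ambiguity codes, looked up by the set of nucleotides they stand for.
IUPAC = {
    frozenset(states): code
    for code, states in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}


def most_frequent(column):
    """Most frequent state; ties give the IUPAC code of the tied states, else '?'."""
    counts = Counter(column)
    top = max(counts.values())
    tied = [state for state, n in counts.items() if n == top]
    return tied[0] if len(tied) == 1 else IUPAC.get(frozenset(tied), "?")


def majority_rule(column):
    """State found in more than 50% of the sequences, else '?'."""
    state, count = Counter(column).most_common(1)[0]
    return state if count > len(column) / 2 else "?"


def strict(column):
    """IUPAC code over all nucleotide states; '?' if only gaps or missing data."""
    states = frozenset(state for state in column if state not in "-?")
    return IUPAC.get(states, "?")


def consensus(sequences, column_rule):
    """Apply one of the rules above to every alignment column of a sequence block."""
    return "".join(column_rule(column) for column in zip(*sequences))
```

For the block "ACG-", "ACT-", "AGT?", the strict rule gives "ASK?", while the majority rule gives "ACT-" under the gap assumption named above.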

Handling of Defined Input Files (-f option)
When the defined input files option "-f" is invoked, FASconCAT-G requires the user to define infiles for processing. The program lists the files in FASTA (.fas), PHYLIP (.phy) and CLUSTAL (.aln) format currently found in its home folder, each with an associated list number (Table 7), so that the user can select specific files for the defined processing steps. Type the numbers of the selected files, separated by commas without blanks, on one line and press <enter>. Typing "b" and <enter> returns to the main menu. Note: the "-f" command is unsuitable for pipeline processes because it requires a user-specified list of infiles.
Figure 11: Example usage of the defined infile option "-f". Using the "-f" command, FASconCAT-G will print a list of potential infiles identified in the FASconCAT-G home folder. In this example, infiles 2, 3, and 4 have been selected from the FASconCAT-G infile list for the default sequence concatenation process.

Print OUT of Parsimony Informative Sites (-j option)
FASconCAT-G will print OUT additional information file(s) identifying parsimony-informative sites of the given infiles and/or the concatenated supermatrix (depending on the defined "-o" command) if the user invokes the "-j" command. A site is parsimony-informative if it contains at least two types of nucleotides (or amino acids), and at least two of them occur with a minimum frequency of two. The file format of parsimony-informative alignment files depends on the chosen output format(s) (Figure 13).
Figure 12: Example usage 1 of parsimony informative site print OUT command "-j". In this example, FASconCAT-G will print an additional outfile that contains parsimony-informative sites of the concatenated supermatrix outfile.
Figure 13: Example usage 2 of parsimony informative site print OUT command "-j". In this example, FASconCAT-G will print additional outfiles that contain parsimony-informative sites of both the concatenated supermatrix outfile and the FASTA converted infiles (combined with the "-o -o" option).
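The definition of a parsimony-informative site translates directly into a column test. The following Python sketch uses hypothetical function names; ignoring gaps ('-') and missing data ('?') when counting states is an assumption.

```python
from collections import Counter


def is_parsimony_informative(column, ignored="-?"):
    """True if at least two character states each occur at least twice."""
    counts = Counter(state for state in column if state not in ignored)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_sites(sequences):
    """Keep only the parsimony-informative columns of an alignment {name: sequence}."""
    columns = zip(*sequences.values())
    keep = [i for i, column in enumerate(columns) if is_parsimony_informative(column)]
    return {name: "".join(seq[i] for i in keep) for name, seq in sequences.items()}
```

A column such as A, A, C, C is informative, whereas A, A, A, C is not, because the state C occurs only once.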

Print OUT Partition File(s) (-l option)
With the "-l" command invoked, FASconCAT-G generates additional gene partition output files if the sequence concatenation option has been defined. If the user selects to output a supermatrix in FASTA and/or PHYLIP format, FASconCAT-G prints an associated gene partition file ("FcC supermatrix partition.txt") which can be used directly for a Maximum Likelihood analysis with RAxML. Without the ProtTest option ("-m"), FASconCAT-G chooses the LG substitution model as the default model for given or translated amino acid gene partitions. For a supermatrix output in NEXUS (block) format, FASconCAT-G prints an additional NEXUS file in which single gene partitions are defined as single sequence blocks ("FcC supermatrix partition.nex"). If the NEXUS (MrBayes) format is defined, FASconCAT-G prints an additional MrBayes executable file with defined partitions (also "FcC supermatrix partition.nex"). The parameters in the partitioned MrBayes executable NEXUS file differ from those of the non-partitioned version (Table 8). Note: The LG model fits in most cases, but it is not necessarily the best-fitting model for the given data! To identify the best-fitting amino acid substitution model for each gene partition we suggest using FASconCAT-G together with the external ProtTest software (see "-m option"). Table 8: Overview of the MrBayes parameters integrated into the NEXUS output under the "-n -n" option in combination with the "-l" option.
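A RAxML partition file assigns a model and a column range to each gene. The Python sketch below shows how such lines can be derived from the lengths of the concatenated fragments. It follows the general RAxML partition syntax ("model, name = start-end"); the function name and input layout are hypothetical, and the exact content of the file written by FASconCAT-G may differ.

```python
def raxml_partition_lines(genes, protein_model="LG"):
    """genes: list of (name, data_type, length) in supermatrix order.

    Amino acid partitions get protein_model, LG by default as described above.
    """
    lines = []
    start = 1
    for name, data_type, length in genes:
        model = "DNA" if data_type == "nucleotide" else protein_model
        end = start + length - 1
        lines.append(f"{model}, {name} = {start}-{end}")
        start = end + 1
    return lines
```

For a 658 bp nucleotide fragment followed by a 300 residue amino acid fragment, this yields the ranges 1-658 and 659-958.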

MrBayes Commands Set Up
Number

Using ProtTest for Amino Acid Model Specification (-m option)
FASconCAT-G offers the option to determine the best-fit protein model for each amino acid gene partition in RAxML partition formatted supermatrices using the external software ProtTest version 3.3 (Darriba et al. (2011), ProtTest 3: fast selection of best-fit models of protein evolution, Bioinformatics). ProtTest will only be executed for given or translated amino acid infiles if sequence concatenation has been chosen together with the partition print option ("-l"), but not for supermatrices in NEXUS format. FASconCAT-G uses the default settings of ProtTest version 3.3, and the best-fit model is selected by the ProtTest BIC criterion. Note: all outfiles from ProtTest version 3.3 are placed in the FASconCAT-G home directory.
FASconCAT-G will print a warning message to the output terminal if ProtTest could not be executed successfully. In that case, FASconCAT-G will assign the LG model to the corresponding gene partition in the RAxML gene partition file. ProtTest version 3.3 can be downloaded for free here: • http://www.mybiosoftware.com/phylogenetic-analysis/2854 The name of the implemented ProtTest version is "prottest-3.3.jar". FASconCAT-G can only start ProtTest executables with that name. To change the default ProtTest executable name, open FASconCAT-G with a text editor and change the ProtTest name in line 10. Be aware that the ProtTest name is surrounded by single quotation marks: • line 10: my $prottest = 'prottest-3.3.jar' ; → my $prottest = 'NEW PROTTEST NAME' ;

Switch Off Print OUT File(s) in FASTA format (-a option)
By default, FASconCAT-G prints OUT all defined outfile(s) in FASTA format (.fas). Use the "-a" command to switch off the FASTA format print OUT. Note: if the FASTA format print OUT has been switched off and no other output format has been defined, FASconCAT-G will abort with an error prompt. Therefore, we strongly recommend using the "-a" command via the terminal command line only in combination with another file format print OUT command (e.g. "-p" or "-n") (Figure 14). This note is especially important if FASconCAT-G is to be integrated into automated pipeline processes.
• COMMAND: -a -p -s <enter> → start FASconCAT-G under default with PHYLIP output format instead of FASTA output format.
Figure 14: Example usage of non-FASTA formatted print OUT command "-a" (infiles infile_1.fas and infile_2.fas, command "-a -p -s", concatenation to FcC_supermatrix.phy (Strict)). In this case, FASconCAT-G will print OUT the concatenated supermatrix in PHYLIP (strict) format. Be aware that FASconCAT-G will abort with an error prompt if no other outfile format has been chosen.
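In a pipeline, it can be useful to check the output format flags before FASconCAT-G is started. The following Python sketch is a hypothetical pre-flight check, not part of FASconCAT-G: the function name is invented, and treating more than two repetitions of "-n" or "-p" like two is an assumption.

```python
def resolve_output_formats(args):
    """Return the output formats implied by a list of command line flags."""
    formats = []
    if "-a" not in args:
        formats.append("FASTA")  # FASTA is printed unless switched off by "-a"
    nexus = args.count("-n")
    if nexus == 1:
        formats.append("NEXUS (block)")
    elif nexus >= 2:
        formats.append("NEXUS (MrBayes)")
    phylip = args.count("-p")
    if phylip == 1:
        formats.append("PHYLIP (strict)")
    elif phylip >= 2:
        formats.append("PHYLIP (relaxed)")
    if not formats:
        # Mirrors the abort described above: "-a" without any other format.
        raise ValueError("no output format selected: combine '-a' with '-p' or '-n'")
    return formats
```

Counting repeated flags reflects how options such as "-p -p" or "-n -n" select the second setting of an option.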

Print OUT File(s) in NEXUS format (-n option)
FASconCAT-G will output defined outfiles in NEXUS (block) format with the "-n" command invoked (Figure 15), which can be loaded directly into programs such as PAUP or MrBayes. Alternatively, FASconCAT-G can generate an executable NEXUS file for the Bayesian analysis software MrBayes with the "-n -n" command (Figure 16). The MrBayes executable is printed with a set of parameters that serve as a good starting point for Bayesian analyses, but these can easily be changed with any text editor. If a structure string in dot-bracket format is given, in which dots code unpaired (loop) positions and brackets code pairings (stems; "(" opens a stem, ")" closes a stem), FASconCAT-G compiles a partition set for MrBayes with one charset each for stem and loop regions. Table 9 gives an overview of the integrated set up for MrBayes.
• COMMAND: -n -s <enter> → start FASconCAT-G under default with additional supermatrix print OUT in NEXUS (block) format
• COMMAND: -n -n -s <enter> → start FASconCAT-G under default with additional supermatrix print OUT in NEXUS (MrBayes) format
Figure 15: Example usage 1 of NEXUS (block) formatted print OUT command "-n" (infiles infile_1.phy and infile_2.phy, command "-n -s", concatenation to FcC_supermatrix.fas and FcC_supermatrix.nex (Block)). In this example, FASconCAT-G will print a NEXUS (block) formatted version of the concatenated supermatrix outfile in addition to the FASTA supermatrix outfile (the default FASTA format print OUT has not been switched off).
Figure 16: Example usage 1 of NEXUS (MrBayes) formatted print OUT command "-n -n" (infiles infile_1.phy and infile_2.phy, command "-n -n -a -s", concatenation to FcC_supermatrix.nex (MrBayes)). In this example, FASconCAT-G will only print a NEXUS (MrBayes) formatted version of the concatenated supermatrix (the default FASTA format print OUT has been switched off by the "-a" command).

Print OUT File(s) in PHYLIP format (-p option)
With the "-p" or the "-p -p" command invoked FASconCAT-G generates interleaved PHYLIP (.phy) output files in strict format (taxon name restriction up to 10 signs) or relaxed format (no restriction in character number for taxon names), respectively. To switch off the FASTA formatted print OUT (default setting for print OUT), enter the "-a" command as well ( Figure 17) (see section 2.3.10).
• COMMAND: -p -s <enter> → start FASconCAT-G under default with additional supermatrix print OUT in PHYLIP (strict) format
Figure 17: Example usage 1 of PHYLIP (strict) formatted print OUT "-p" command. In this case, FASconCAT-G will only print OUT a concatenated supermatrix outfile in PHYLIP (strict) format because the "-a" command (switches off the default FASTA formatted print OUT of defined outfile) is also invoked.
Figure 18: Example usage 2 of PHYLIP (relaxed) formatted print OUT "-p -p" command. In this example, FASconCAT-G will convert the CLUSTAL infiles and print OUT outfiles in both PHYLIP (relaxed) and FASTA format.
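The difference between strict and relaxed taxon names can be illustrated with a short Python sketch that writes one line per taxon. The function name is hypothetical, and truncating long names to ten characters in strict format is an assumption about how the limit is enforced, not a statement about FASconCAT-G's behaviour.

```python
def phylip_lines(sequences, strict=True):
    """Return PHYLIP lines (header plus one line per taxon) for {name: sequence}."""
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must have equal length")
    lines = [f"{len(sequences)} {lengths.pop()}"]
    for name, seq in sequences.items():
        if strict:
            # Strict format: the name field is exactly 10 characters wide.
            lines.append(name[:10].ljust(10) + seq)
        else:
            # Relaxed format: full name, one blank, then the sequence.
            lines.append(f"{name} {seq}")
    return lines
```

Note that truncation can make two different names identical, so names should be checked for uniqueness within their first ten characters before a strict file is written this way.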

Reduction of Information Print OUT (-i option)
By default FASconCAT-G provides useful information about the supermatrix output file and all single input files (Table 10). However, the evaluation of this additional information often results in longer computation times depending on the data set. The user can therefore invoke the "-i" command to increase the overall computation speed by decreasing information sampling and print out a reduced information file (Table 11).
• COMMAND: -i -s <enter> → start FASconCAT-G under default with reduced information print OUT

Input/Output
FASconCAT-G is able to import three different file formats. The format of the output files depends on the user determined parameter settings. Table 12 shows possible input and output formats while

Secondary Structure Information
If one or more infiles contain a secondary structure string which consists only of dots ('.') and brackets ('(' and ')'), FASconCAT-G will print additional structure string information to an extra output file named FcC structure.txt. If sequence concatenation has been chosen via the default setting or the "-o -o" command, FASconCAT-G will print basic secondary structure information of the concatenated supermatrix string to the common info file as well (see Table 10). Table 16 lists all additional structure info printed to FcC structure.txt. Note: only one structure string is allowed per infile, and the name of the structure string must be identical in all infiles (names are case sensitive). Otherwise, FASconCAT-G will abort with an error prompt.
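Stem pairings and loop positions can be derived from a dot-bracket string with a simple stack. The following Python sketch is an illustration with a hypothetical function name; positions are reported 1-based, which is an assumption about the reporting convention.

```python
def parse_dot_bracket(structure):
    """Return (stem pairs, loop positions) of a dot-bracket string, 1-based."""
    open_positions = []  # stack of unmatched '(' positions
    pairs = []
    loops = []
    for position, char in enumerate(structure, start=1):
        if char == "(":
            open_positions.append(position)
        elif char == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {position}")
            # The closing bracket pairs with the most recent open bracket.
            pairs.append((open_positions.pop(), position))
        elif char == ".":
            loops.append(position)
        else:
            raise ValueError(f"unexpected character {char!r} at position {position}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs), loops
```

For the string "((..))", positions 1 and 6 as well as 2 and 5 form stem pairs, and positions 3 and 4 are loop positions.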

Hierarchical Order of File Processing Options
To avoid errors during single file processing steps, for example an exclusion of 3rd nucleotide site positions before sequence translation to amino acid character states, FASconCAT-G follows a hierarchical order of single file processing steps:
1. Read IN of infiles and file checking
2. Sequence renaming

Example Usage
The following examples should show that FASconCAT-G is a suitable and user-friendly tool for complex phylogenetic and population genetic data processing. All different options of FASconCAT-G are combinable.

Example 1
The following steps can be performed by FASconCAT-G in a single process run: 1

Example 2
The following steps can be performed by FASconCAT-G in a single process run: 1