Introduction to the Tutorial

The following is a brief tutorial on how to use SIMMAP. It does not cover every feature, but it should help establish the main approaches to selecting and applying features, which are common across the program interface regardless of the type of data being analyzed. It will also get you started on how to simulate mutational map(s) and begin exploring your own data. When keyboard shortcuts exist they are given in square brackets, e.g., [O] is the shortcut for opening a file by pressing the Command and O (the letter O) keys together.

Getting Started ⇒ A note about input files

The first step in using SIMMAP is loading a data file and a tree file into memory. Both files need to be in the Nexus format (go to the help pages on input files for more information about file formats). Files can be opened by selecting the Open Data... [O] and the Open Trees... [T] options under the File menu in the main menu bar.

Briefly, the data and trees can be combined into a single file so long as the begin trees block is placed after the begin data block. While the Nexus format is required, only certain commands are recognized by SIMMAP. The following is a list of mandatory Nexus elements used by SIMMAP in reading the data file:
  1. begin data block
  2. dimensions (ntax, nchar)
  3. format (datatype, interleave=no)
  4. matrix
The following is a list of mandatory Nexus elements used by SIMMAP in reading the trees file (or the trees block that follows the data block in a combined file):
  1. begin trees block
  2. translate (integer, name)
  3. branch lengths that are non-zero, or no branch lengths at all
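
As a rough illustration of the elements listed above, the following Python sketch assembles a minimal combined file with the data block first and the trees block second. The function name build_nexus, the taxon names, the character matrix, and the tree are all made up for illustration, and the output has not been checked against SIMMAP itself, so compare it with the sample files in the distribution before relying on it.

```python
def build_nexus(taxa, matrix, newick, datatype="standard"):
    """Return a minimal combined Nexus text: data block first, trees block second.

    taxa   : list of taxon names (hypothetical names in the example below)
    matrix : dict mapping each taxon name to a string of character states
    newick : tree string that refers to taxa by their 1-based translate numbers
    """
    nchar = len(matrix[taxa[0]])
    if any(len(matrix[name]) != nchar for name in taxa):
        raise ValueError("all rows of the matrix must have the same length")

    width = max(len(name) for name in taxa)
    lines = ["#NEXUS", "", "begin data;"]
    lines.append(f"  dimensions ntax={len(taxa)} nchar={nchar};")
    lines.append(f"  format datatype={datatype} interleave=no;")
    lines.append("  matrix")
    for name in taxa:
        lines.append(f"    {name.ljust(width)}  {matrix[name]}")
    lines.append("  ;")
    lines.append("end;")
    lines.append("")
    lines.append("begin trees;")
    lines.append("  translate")
    # translate pairs are (integer, name), separated by commas
    pairs = [f"    {i} {name}" for i, name in enumerate(taxa, start=1)]
    lines.append(",\n".join(pairs))
    lines.append("  ;")
    lines.append(f"  tree example = {newick};")
    lines.append("end;")
    return "\n".join(lines) + "\n"


# Hypothetical three-taxon data set with four integer-coded characters.
taxa = ["taxon_A", "taxon_B", "taxon_C"]
matrix = {"taxon_A": "0101", "taxon_B": "0111", "taxon_C": "1001"}
text = build_nexus(taxa, matrix, "((1:0.1,2:0.1):0.05,3:0.15)")
print(text)
```

Writing the returned text to a single file gives one input that carries both blocks in the required order.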

To get a feel for what the input files look like and what will work, take a look at the sample files included with the program distribution or go to the help pages on input files. A final note on these files: SIMMAP supports both the dna and rna datatypes and all of the IUPAC symbols for nucleotide residues. In addition, ?, ., and - are allowed (? and - are treated as unknown, N, during the analysis). Morphological character data sets have a few constraints. The datatype command must read standard, and neither alternate symbols nor trait names are supported; characters need to be coded as integers. SIMMAP supports analysis of up to 7 states per character, numbered starting from 0 (zero). Characters with more than 7 states are read and displayed but ignored for analysis purposes.
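
The morphological constraints above can be checked before a file is loaded. The sketch below is only an illustration: the function name check_characters and the example rows are hypothetical, and treating ? and - as missing values for morphological characters is an assumption carried over from the nucleotide case.

```python
MAX_STATES = 7               # the tutorial states that up to 7 states are analyzed
MISSING = {"?", "-"}         # assumption: ignored when counting states
DIGITS = set("0123456789")   # characters must be coded as integers


def check_characters(rows):
    """Report characters that SIMMAP would likely reject or ignore.

    rows : list of equal-length strings, one string of states per taxon.
    Returns a dict mapping 1-based character numbers to a short problem note.
    """
    problems = {}
    for i in range(len(rows[0])):
        observed = {row[i] for row in rows} - MISSING
        if any(state not in DIGITS for state in observed):
            problems[i + 1] = "non-integer state"
        elif len(observed) > MAX_STATES:
            problems[i + 1] = "more than 7 states"
    return problems


# Hypothetical matrix: 8 taxa, 3 characters.
rows = ["00a", "11b", "02?", "13-", "04a", "15b", "06a", "17b"]
report = check_characters(rows)
print(report)
```

Here character 2 has eight distinct states and character 3 uses letters, so both are flagged, while character 1 passes.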

Lesson 1 ⇒ Opening Files

Let's get started. This tutorial will be using files located in the Sample Files folder in the software distribution. Select Open Data... [O] in the File menu and select the file 'angiosperms.nex' located in the Sample Files folder in the SIMMAP Folder. This file contains 34 species of Angiosperms coded for 41 standard (morphological) characters and a single tree.

First, you should notice that after a progress window opens, a warning window appears. In analysing morphological data, unlike nucleotide data in which four states are known to exist, at least two character states must be assumed. Three of the characters in this data set are all zeros. What can we do? We have two options. First, we can assume that an additional state exists but is not observed in our taxa by selecting YES. Second, we can select NO and exclude these characters from analysis. Select YES and continue. This will need to be done individually for each character. Once the file has been read, two windows should open automatically: the Data View and the Tree View. Take a few moments to familiarize yourself with these windows.

The main features in the Tree View window we will be using in this tutorial are the Display Options, Clear Map, and Map buttons (highlighted in red). In addition, the State Colors, Coding and Deselect All buttons, and the character display table in the Data View window will be put to use (highlighted in red).

The remaining features are explained elsewhere in the help pages.

Lesson 2 ⇒ Changing Tree Display Options

Changing how the tree is displayed is aesthetic. It is useful if you wish to create images to be exported for seminars or publication.

Bring the Tree View window to the front, if it isn't already, by selecting it with the cursor. Click the Display Options button. A window should pop down displaying a number of options that can be changed in displaying the tree. This tutorial will focus on the highlighted buttons, but try a few of the others to see the results.

First, select a font size of 24 from the Labels size: pop-up menu. Press Apply to activate the change. Now, let's change the color of the taxon labels. Click the Label color: color well. A colors window will appear. Select a color and press Apply to activate the change. (Note: If simply selecting a color in the colors window doesn't change the color in the color well you can drag the color to the well, or reselect the color well with a mouse-click and select the color again. The color well should now have changed. Press Apply.) For the moment ignore the State Colors button. We will get to this in the next section. Change some of the other features to get a feel for what they do. (Note: Some of the features apply only to molecular data sets and when drawing a character map so you may not immediately see the changes.) Close the Display Options window.

Lesson 3 ⇒ Changing State Colors

State colors are aesthetic: they change how states are colored in the Data View and Tree View. They may nonetheless be useful for producing graphics for seminars and/or publications.

Bring the Data View window to the front. Click the State Colors button. A window should pop down (shown to the right). Notice the two areas: one for molecular data and one for morphological data. We will be working with the morphological panel, but the same applies to manipulating the display of molecular data. While exploring the Display Options window in the previous section, you probably noticed a button for State Colors. If you had pressed it, the same options window would have opened attached to the Tree View window. State colors apply simultaneously to both the data and the characters mapped on trees (as we will see in a bit). Select one of the character states using the pop-up menu (which should display zero when the window first opens). Select the associated color well and change the color as you did with the taxon labels. You should notice that the cells containing that state in the Data View window have now changed color. Close this window.

Lesson 4 ⇒ Configure Character Coding

Character coding is simply how the characters are interpreted by SIMMAP. Examples include analyzing nucleotide data as protein coding or setting standard (morphological) characters as ordered or unordered.

Click the Coding button or use the keyboard shortcut [4]. A new window should pop down with two areas; one for molecular data and one for morphological data (shown below). We will be working with the morphological panel (see the help pages on options by menu or options by window for information on setting the coding of molecular data). This feature allows us to change whether the morphological characters will be treated as unordered or ordered. Characters can be moved between the tables by either double-clicking or single-clicking to select a character and then using the arrow buttons to move it from one category to the other. Give it a try and then close this window.

Lesson 5 ⇒ Simulating A Single Mutational History (Map)

We are now ready to get a feel for how we go about simulating a character map. First we will explore how to visualize one character map at a time. Let's get started. Bring the Data View window to the front. We will need to first select a character to map. Scroll the data character table till character 17 is visible. Select the header of the character column with the cursor to select the character for analysis. This should highlight the column and now the character has been selected. You can change selections by simply clicking on another column. Columns can be un-selected by clicking the Deselect All button. An alternative approach to selecting a character is to select the Data menu and then Select Character To Display... [1] option. This approach brings up a menu which allows you to enter a character number or scroll down through those available.

Now for the fun! There are two ways to simulate a single map. The first is to use the Map button at the bottom right of the Tree View window. Let's try this first. Bring the Tree View window to the front and click on the Map button. You should see that the tree now displays a mutational map. The colors represent the states and should reflect your choices in selecting state colors above (or the defaults as shown above). If it isn't clear which states are represented by which colors, click the Display Options button to open the options window and select Show States. The tips of the tree should now be labelled with character states, and the last color along each branch is the color of that state.

There is an alternative way to generate a single map without mouse clicking on the Map button. You can use the Simulate History option in the Analysis menu (or [M]). Explore a couple of maps. When you are done use the Clear Map button to return to the default tree.

Lesson 6 ⇒ Simulate Multiple Mutational Histories (Maps)

While simulating single maps will give you a feel for the patterns of character change along the tree, for the most part you will be using the multiple mapping features. It is with these features that you can generate many maps over model parameters, trees, and priors (morphology only). When simulating a single map, no information about the character history is recorded outside the image, and the image itself is not stored; it persists only as long as the mapping in the Tree View window doesn't change. When performing multiple maps, information about the histories begins to be recorded and things get really interesting.

Designing an experiment

Let's begin. You may have noticed when exploring alternative ways to generate a single map an option called Simulate Histories...[M] in the Analysis menu. Go to the menu and select this option. A new window should appear with four tabs of options (an image of the window when each tab is selected is shown below).



The design of a multiple map experiment can be as complex or as simple as you desire. This tutorial will walk you through just the simplest design. You can explore the remainder of the Help Documents for additional features that will expand the complexity of your analysis.

The first tab, Sampling, sets the character(s) to analyze, the type of analysis (history or covariation), and the number of samples (or realizations) requested. It is important to remember that we want to sample the posterior of character histories and we want these to be good estimates. This may require a large number of realizations (and samples from the priors under morphology models). (Note: SIMMAP assumes that the trees used in a morphological analysis are derived from some other source, e.g., nucleotide data or a priori knowledge.) The last option is burnin. If we had read in multiple trees (and model parameters in the molecular case), this would allow us to restrict the analysis to only those samples desired. This option is primarily here if you wish to average your inferences over topologies sampled from some distribution (e.g., the posterior samples from MrBayes). Select character 17 from the Character history pop-up menu and enter 1000 in the Number of realizations for each tree, each site text field.

The second tab, Trees, allows us to select which trees to use in the analysis, along with some options to control displaying maps during the analysis, enforcing fixed ancestral states, and constraining the analysis to a single branch. With the data set we are using, because it only has a single tree, the first three options are equivalent. (Note: The current release of SIMMAP has not fully implemented collecting statistics for a clade defined by a constraint. You can, however, use constraints to calculate ancestral states.) If you choose to have the maps shown during the analysis (Show history on trees during simulation), be aware that this could increase the required computation time by 30-70%. For this tutorial, let's select the Show history on trees during simulation switch, the first time around anyway, as it is much more exciting.

We will ignore the third tab since it applies only to molecular data. For detailed information on these options see the help pages on options by window.

The fourth tab, Miscellaneous, provides many options. For the moment, let us just consider whether the statistics are saved to a file. In order to explore each map, the Save individual map statistics to file switch must be selected and an output file chosen. An output file can be chosen using the Save As button to the right of the switch, or through the Statistics menu using the Choose Mappings Save To File... option. Select this option and choose a file and location to save it to. Saving statistics, or any information, to a file is cumulative: new output is appended to the currently selected file. In addition, the statistics stored in the program are cumulative and therefore, if the design changes, they need to be cleared if it would be inappropriate to average runs together (how this is done will be shown in a bit). They can be displayed and saved at any time until they are cleared from memory. That being said, leave the summary switch off.

Running the experiment

Proceed back to the first tab. We are now ready to simulate the character maps. Press the Run button. You should see a progress indicator begin to display the percent of the analysis that has been completed, and the mutational histories should be visible in the Tree View window. Once the experiment is complete we should check a couple of things. First, check out the file you saved the individual mappings to (an explicit description of the output is presented elsewhere in the output files section). Shown below is an excerpt of the output file.


It shows the first few columns and the first five and last rows. (Note: Your values may differ because the initial seed is different. The seed is initialized from the time and date of your computer clock. It can be set manually to verify your results.) The third row contains a tab-delimited header that describes each column's values. Next, let's look at the summary of the data in the output file. Go back to the program and, under the Statistics menu, go to the Morphology sub-menu and select the Show Morphology Statistics... option. A new window should appear with a listing for each character in the data set. Characters that have been mapped and have data collected will appear with a dark triangle. Scroll down to character 17 and click on the dark triangle. You should see something like the window below.

The table displays a variety of summary statistics. Most of the output should be obvious. For example, the sample size should read 1000, the number of transformations (i.e., number of changes) will be something close to 7.4, etc. This window allows you to clear the statistics and also save the summary values to a file.
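
A summary value such as the mean number of changes can also be cross-checked from the saved per-map file. The following is a minimal sketch that relies only on what is stated above (a tab-delimited header on the third row); the column name "changes", the placeholder lines, and the sample values are hypothetical, and the sketch assumes every row after the header is a data row.

```python
import csv
import io
from statistics import mean


def column_mean(handle, column, header_row=3):
    """Average one numeric column of a tab-delimited file.

    handle     : open text file (or any iterator over lines)
    column     : header name of the column to average (hypothetical here)
    header_row : 1-based row number holding the tab-delimited header
    """
    for _ in range(header_row - 1):
        next(handle)  # skip the rows above the header
    reader = csv.DictReader(handle, delimiter="\t")
    return mean(float(row[column]) for row in reader)


# Made-up stand-in for an output file; real column names will differ.
sample = io.StringIO(
    "placeholder line 1\n"
    "placeholder line 2\n"
    "map\tchanges\n"
    "1\t7\n"
    "2\t8\n"
    "3\t6\n"
)
result = column_mean(sample, "changes")
print(result)  # prints 7.0
```

To use it on a real file, open the file you chose for the individual mappings, pass the handle in place of sample, and replace "changes" with the actual header text found on the third row.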

Now go back to the Trees tab and deselect the Show history on trees during simulation switch. Repeat the analysis for character 17 making sure you enter 1000 in the Number of realizations for each tree, each site text field.

You should have noticed that it went much faster than the first run because the character histories are not being displayed in the Tree View window. Explore the statistics again. Now you have a sample size of 2000 (if you didn't clear the previous run).

The output file should now have an additional 1000 maps appended after the first 1000.
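
The cumulative behavior described above can be pictured with a small accumulator. This is only a conceptual sketch, not SIMMAP's internal implementation; the class name RunningMean and the per-map values are invented for illustration.

```python
class RunningMean:
    """Pools values across runs until clear() is called."""

    def __init__(self):
        self.n = 0
        self.total = 0.0

    def add_run(self, values):
        # Each run is added on top of whatever is already stored.
        for value in values:
            self.n += 1
            self.total += value

    def clear(self):
        # Equivalent in spirit to clearing the statistics between designs.
        self.n = 0
        self.total = 0.0

    @property
    def mean(self):
        return self.total / self.n if self.n else float("nan")


stats = RunningMean()
stats.add_run([7] * 1000)   # first run: 1000 hypothetical maps
stats.add_run([8] * 1000)   # second run appended to the first
pooled_n, pooled_mean = stats.n, stats.mean
print(pooled_n, pooled_mean)  # prints 2000 7.5

stats.clear()               # start fresh before a different design
```

Pooling is harmless when both runs share the same design, as in this tutorial, but the averaged value would be misleading if the two runs used different settings.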

That wraps up this brief tutorial.

I hope the tutorial has given you a feel for how to use SIMMAP and what you can do with it. Look through the remainder of the help documents to get a better understanding of all of the remaining features.

Good luck!
Page Last Updated: 6 August 2008