Creating a Minimum Spanning Tree based on MLST data

This tutorial illustrates how to create a Minimum Spanning Tree (MST) based on MLST allele numbers. The same steps also apply to clustering other categorical character data sets, such as MLVA data.
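To make the idea concrete, the following minimal sketch builds an MST from a few allelic profiles. It is a generic illustration, not the procedure used in this tutorial's software: the strain names and allele numbers are made up, the distance is assumed to be the number of loci with differing allele numbers (a common choice for categorical data), and the tree is built with plain Prim's algorithm without any tie-breaking priority rules.

```python
# Hypothetical allelic profiles: strain name -> allele numbers at seven loci.
# All names and values are invented for illustration.
profiles = {
    "strain_A": (1, 3, 1, 1, 1, 1, 3),
    "strain_B": (1, 3, 1, 1, 1, 1, 4),
    "strain_C": (1, 3, 4, 1, 1, 1, 4),
    "strain_D": (2, 5, 4, 7, 8, 9, 6),
}


def allele_distance(p, q):
    """Count the loci at which two profiles carry different allele numbers."""
    return sum(a != b for a, b in zip(p, q))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph of strains.

    Returns a list of (distance, strain_in_tree, strain_added) edges.
    """
    names = list(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge connecting the tree to a strain outside it.
        best = min(
            (allele_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges


for distance, u, v in minimum_spanning_tree(profiles):
    print(f"{u} -- {v}: {distance} differing loci")
```

Allele numbers are treated as labels, not quantities: allele 4 is no "closer" to allele 3 than to allele 9, which is why the distance only counts mismatches.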

Characters

A character is a name-value pair whose value can be binary, multi-state, or continuous. Because this definition is so broad, a wide variety of data can be analyzed as character types (a character type is an array of characters). This includes morphological and biochemical features, commercial test panels (API®, Biolog®, Vitek®, etc.), antibiotic resistance profiles, fatty acid profiles, microarrays, SNP arrays, repeat numbers in MLVA, and allelic profiles in MLST.
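The name-value definition can be sketched as a small data structure. This is only an illustration of the concept, not how any particular software stores characters: the class name Character and all example names and values below are hypothetical, and the seven locus names are those of the standard Neisseria MLST scheme.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Character:
    """A single character: a name paired with a value."""
    name: str
    value: Union[bool, int, float]


# One example of each kind of value (names and values are invented).
binary = Character("oxidase_positive", True)       # binary
multi_state = Character("abcZ", 4)                  # multi-state (allele number)
continuous = Character("C16_0_percent", 23.5)       # continuous

# A character type is an array of characters, e.g. one MLST allelic profile.
loci = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")
alleles = (4, 10, 15, 9, 8, 11, 9)  # invented allele numbers
mlst_profile = [Character(locus, allele) for locus, allele in zip(loci, alleles)]
```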

Download sample data: 
Neisseria MLST data
MS Excel file containing MLST allelic profiles for 500 Neisseria meningitidis strains, including serogroup and strain information.