1. Separate the HCV sequences in HCV_2002-2004_dupnamesremoved.fasta by type using Read_Fasta_sepbytype.pl.
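
As a rough illustration of what splitting a FASTA file by type involves, here is a minimal Python sketch. It is not the Perl script itself. The function names are hypothetical, and the rule that the type is the last "_"-separated field of the sequence name is purely an assumption; the real header layout may differ.

```python
from collections import defaultdict

def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def type_from_name(name):
    # ASSUMPTION: the type is the last "_"-separated field of the name,
    # e.g. "s1_1" -> "1". Replace with the real header convention.
    return name.rsplit("_", 1)[-1]

def separate_by_type(records, get_type=type_from_name):
    """Group (name, sequence) records into a dict keyed by type."""
    by_type = defaultdict(list)
    for name, seq in records:
        by_type[get_type(name)].append((name, seq))
    return dict(by_type)
```

With a real file, the records could be read with `separate_by_type(read_fasta(open(path)))`, then each group written to its own .fasta file.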

2. For type 1, because there are so many sequences, use Unique String Frequency.vi to count duplicate sequences, append the count to the name of one copy, and delete the rest.
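
The counting step can be sketched in Python as follows. This is only an illustration of the idea, not the LabVIEW VI. The function name and the "_x<count>" suffix are assumptions; the actual VI may format names differently.

```python
def collapse_duplicates(records, sep="_x"):
    """Keep the first record of each distinct sequence and append
    the number of copies to its name (hypothetical "_x<count>" suffix)."""
    counts = {}      # sequence -> number of copies seen
    first_name = {}  # sequence -> name of the first copy
    for name, seq in records:
        if seq not in counts:
            first_name[seq] = name
            counts[seq] = 0
        counts[seq] += 1
    # dicts preserve insertion order, so output follows first appearance
    return [(f"{first_name[seq]}{sep}{n}", seq) for seq, n in counts.items()]
```

Sequences are compared as exact strings here, so differences in case or gap characters count as different sequences.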

3. Align sequences of each type using ClustalX (set the gap-opening and gap-extension penalties to 0, and realign selected residues where the alignment is bad).
	After aligning, search for duplicates using Unique String Frequency.vi.
	For type 1, search for duplicates using Unique String Frequency 2nd Run.vi, which adds the duplicate counts already appended to the names in step 2 together
	rather than reinitializing the count.
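
The difference between the first and second run can be sketched in Python as follows: on the second pass, a count already carried in the name is added to the total instead of each record counting as 1. The "_x<count>" suffix and the function name are assumptions for illustration; they are not taken from the VI.

```python
import re

# ASSUMPTION: an earlier pass appended counts as "_x<count>" to the names.
COUNT_SUFFIX = re.compile(r"^(.*)_x(\d+)$")

def merge_counted_duplicates(records):
    """Collapse duplicate sequences, summing counts carried in the names."""
    totals = {}      # sequence -> summed count
    first_name = {}  # sequence -> base name of the first copy
    for name, seq in records:
        match = COUNT_SUFFIX.match(name)
        if match:
            base, n = match.group(1), int(match.group(2))
        else:
            base, n = name, 1  # no count in the name: treat as a single copy
        if seq not in totals:
            first_name[seq] = base
            totals[seq] = 0
        totals[seq] += n
    return [(f"{first_name[seq]}_x{n}", seq) for seq, n in totals.items()]
```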

4. Remove sequences with ambiguous bases using RemoveAmbigSeqs.vi (ambiguous bases inflate the number of unique sequences and may indicate sequencing error).
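
A minimal Python sketch of this filter is shown below. It assumes that anything other than A, C, G, T, U, or the alignment gap character "-" counts as ambiguous. The function name and the allowed alphabet are assumptions; the VI may apply a different rule.

```python
# ASSUMPTION: only these characters are unambiguous; "-" is an alignment gap.
UNAMBIGUOUS = set("ACGTU-")

def remove_ambiguous(records):
    """Return only the (name, sequence) records with no ambiguous bases."""
    return [(name, seq) for name, seq in records
            if set(seq.upper()) <= UNAMBIGUOUS]
```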

5. Put the unique sequences of all types in the same .fasta file and check for duplicates (they should all be unique because they are of different types).
	This returns duplicates across types, which indicate an ARUP error in assigning types.
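
The cross-type check can be sketched in Python as follows, assuming the unique sequences are held as a dict that maps each type to its list of (name, sequence) records. That layout and the function name are hypothetical.

```python
from collections import defaultdict

def cross_type_duplicates(by_type):
    """Map each sequence found under more than one type to its sorted types."""
    seen = defaultdict(set)  # sequence -> set of types it appears under
    for seq_type, records in by_type.items():
        for _name, seq in records:
            seen[seq].add(seq_type)
    return {seq: sorted(types) for seq, types in seen.items() if len(types) > 1}
```

An empty result means every sequence is confined to a single type.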

6. Graph frequencies of unique sequences using Graph Unique Sequences.vi.

7. Create a position weight matrix (PWM) from the aligned sequences using Aln2PWM or UniqAln2PWM.pl.
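
For orientation, here is a minimal Python sketch of building a weight matrix from an alignment. It computes plain per-column frequencies; whether the Perl scripts use frequencies, log-odds scores, or something else is not stated here, so treat this as an assumption. The optional weights argument (for example, duplicate counts of unique sequences) and the function name are also hypothetical.

```python
def alignment_to_pwm(seqs, weights=None, alphabet="ACGT-"):
    """Return one dict per alignment column mapping each symbol
    to its (optionally weighted) frequency in that column."""
    if not seqs:
        raise ValueError("no sequences given")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    if weights is None:
        weights = [1] * len(seqs)  # every sequence counts once
    pwm = []
    for col in range(length):
        counts = dict.fromkeys(alphabet, 0.0)
        for seq, weight in zip(seqs, weights):
            symbol = seq[col].upper()
            if symbol in counts:  # symbols outside the alphabet are ignored
                counts[symbol] += weight
        total = sum(counts.values())
        pwm.append({s: (c / total if total else 0.0) for s, c in counts.items()})
    return pwm
```

Passing the duplicate counts as weights gives the same matrix as expanding each unique sequence back into its copies.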

8. Plot weight matrices with ProfilePlot.vi.