There are the following comparison pictures:
- Two basic comparison pictures. The first one shows the offsets of the first alignment relative to the second alignment. The second one shows the offsets of the second alignment relative to the first alignment.
- Two pictures (the 1st alignment versus the 2nd one, and the 2nd versus the 1st) that compare column values ordered from best to worst. The X axis shows the percentage of alignment columns, so these pictures are also useful for comparing alignments of very different lengths.
- Two pictures (the 1st alignment versus the 2nd one, and the 2nd versus the 1st) that compare column values ordered from best to worst. The X axis shows the number of columns of the 1st or 2nd alignment, respectively.
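The two "best to worst" picture types differ only in how the X axis is scaled. The following minimal sketch illustrates the idea; it is not the program's own code. The function name `ranked_curve` is hypothetical, and it assumes that a larger column value means a better column.

```python
def ranked_curve(column_values, as_percent=True):
    """Return (x, value) points for a best-to-worst plot of column values.

    Assumption: a larger value means a better column.
    """
    ranked = sorted(column_values, reverse=True)
    n = len(ranked)
    if as_percent:
        # X axis: percentage of alignment columns (comparable across lengths).
        xs = [100.0 * (i + 1) / n for i in range(n)]
    else:
        # X axis: plain number of columns of this alignment.
        xs = [i + 1 for i in range(n)]
    return list(zip(xs, ranked))
```

With the percentage scale, curves of a 100-column and a 1000-column alignment both end at X = 100, which is why that picture type suits alignments of very different lengths.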
All images have the same size (except with "AUTO" height, when the two alignments have different numbers of sequences). The width and height of the images may be set from 320 to 3000 and from 200 to 10000 pixels, respectively. The defaults are 765 and "AUTO". If the height is set to "AUTO", the program selects the value itself depending on the number of sequences in the alignment. The two alignments may have different numbers of sequences and a different order of sequences. Identical sequences need not have the same names, but they must have the same letter strings (case-insensitive).
The offset color scale can be set to "1 - max" or to "min - max". The second option takes effect only when both the smallest left offset and the smallest right offset are greater than 1.
Sequences with the same name that differ in at least one letter at the same position are mismatched sequences. By default (mode "No") such sequences are not compared. You can select "Yes" to compare them as though they matched. In either case, the first mismatched positions are marked in the picture (see the picture legends).
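The pairing rule can be sketched as follows: sequences are paired by their letter strings (gaps removed, case ignored), and a name present in both alignments with different letters is reported as mismatched. This is only an illustration under those assumptions; the helper names `letters` and `pair_sequences` are hypothetical, and the program's real matching procedure may differ.

```python
def letters(row):
    """Strip gap symbols and fold case, leaving only the letter string."""
    return row.replace("-", "").upper()


def pair_sequences(first, second):
    """Pair rows of two alignments given as dicts {name: gapped row}.

    Returns (matched, mismatched): matched is a list of (name1, name2)
    pairs with identical letter strings; mismatched lists names that
    occur in both alignments but carry different letters.
    """
    by_letters = {}
    for name, row in second.items():
        by_letters.setdefault(letters(row), []).append(name)

    matched, mismatched = [], []
    for name, row in first.items():
        partners = by_letters.get(letters(row))
        if partners:
            # Same letters: the names are allowed to differ.
            matched.append((name, partners.pop(0)))
        elif name in second:
            # Same name, different letters: a mismatched sequence.
            mismatched.append(name)
    return matched, mismatched
```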
An alignment can be presented in one or several graphical forms (picture types):
- all letter groups are colored;
- only the letters of the maximum group in every column are colored;
- only the letters of selected groups are colored;
- a few statistical functions (column values):
  - number of letters in the maximum group in the column,
  - number of identical letters in the column,
  - adjusted number of groups in the column,
  - weight of the column;
- the same functions in a different ordering, that is, ranked from best to worst along the X axis.
You can select the color group type, protein or nucleotide, or leave it to the program (type "AUTO") to decide which one should be used.
There are the following protein letter groups:
 1. A G
 2. C
 3. D E N Q B Z
 4. I L M V
 5. F W Y
 6. H
 7. K R
 8. P
 9. S T
10. Others
There are the following nucleotide letter groups:
1. A
2. C
3. G
4. T U
5. Others
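The group tables above can be turned into a small lookup, from which one of the column values, the number of letters in the maximum group, follows directly. This is a sketch only. The names `group_index` and `max_group_size` are hypothetical, a group's index is simply its position in the list, and the code assumes that gaps are not counted and that "Others" counts as an ordinary group; the program may treat both differently.

```python
from collections import Counter

# Letter groups as listed above; "Others" is everything not listed.
PROTEIN_GROUPS = ["AG", "C", "DENQBZ", "ILMV", "FWY", "H", "KR", "P", "ST"]
NUCLEOTIDE_GROUPS = ["A", "C", "G", "TU"]


def group_index(letter, groups):
    """Return the 1-based position of the group containing the letter."""
    letter = letter.upper()
    for index, members in enumerate(groups, start=1):
        if letter in members:
            return index
    return len(groups) + 1  # the "Others" group


def max_group_size(column, groups):
    """Number of letters in the largest letter group of one column.

    Assumption: gap symbols ('-') are ignored.
    """
    counts = Counter(group_index(c, groups) for c in column if c != "-")
    return max(counts.values(), default=0)
```

For example, the column `ILMV-AG` has four letters in the I L M V group and two in the A G group, so its maximum group size is 4.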
Images of both alignments have the same size (except with "AUTO" height, when the two alignments have different numbers of sequences). The width and height of the images may be set from 450 to 3000 and from 200 to 10000 pixels, respectively. The defaults are 765 and "AUTO". If the height is set to "AUTO", the program selects the value itself depending on the number of sequences in the alignment.
Every column can be colored evenly or according to its column weight: the lower the weight, the more space is left uncolored (sparse coloring).
Long alignments can be split into several parts. You can select the size of one part in the range of 50 to 500 letters.
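Splitting amounts to cutting every row at the same column positions. A minimal sketch, assuming the alignment is held as a dict of equal-length rows (the function name `split_alignment` and this data layout are illustrative, not the program's own):

```python
def split_alignment(rows, part_size):
    """Cut an alignment {name: gapped row} into parts of part_size columns.

    The last part may be shorter than part_size.
    """
    if not 50 <= part_size <= 500:
        raise ValueError("part size must be in the range 50 to 500")
    length = max((len(row) for row in rows.values()), default=0)
    return [
        {name: row[start:start + part_size] for name, row in rows.items()}
        for start in range(0, length, part_size)
    ]
```

A 120-column alignment split with a part size of 50 gives three parts of 50, 50 and 20 columns.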
Alignment
The query alignment should be written in one-letter code (lower or upper case) and can be divided into several parts. There must be a blank line between consecutive parts, and every part must repeat all the sequence names.
Example of two alignments to be compared:
First alignment:
.**.. ....+. .....+.. +.......++.. . +.... +.... ....*. ..
HCVPCP2 ( 1) EDGVNVHDVTVTTDKSFEQQV-GVIADKDKDLSGAVPSDLNTSELLTK-----AIDV-DW
TGVPCP2 ( 1) EDNVNHERVSVSFDKTYGEQLKGTVVIKDKDVTNQLPSAFDVGQKVIK-----AIDI-DW
MHVPCP2 ( 1) VDGVNFRSCCVAEGEVFGKTL-GSVFCDGINVTKVRCSAIYKGKVFFQysdLSEADLvAV
MHVPCP1 ( 1) REGIAEAKATVCAD--AVDACPDQVEA--FEIEKVEDSILDELQTELN----APADK-TY
HCVPCP1 ( 1) EEGGNDLSLPVMISEWPLSVQQAQQEATLPDIAEDVVDQVEEVNSIFD---IETVDV---
TGVPCP1 ( 1) SEG--------------AEGTSSQEEVETVEVADITSTD-EDVD-IVE---VSAKDD-PW
IBVPCP ( 1) EDGVKYRSIVLKPGDSLGQ--FGQVYAKNK-IV-FTADDVEDKEILYV----PTTDK-SI
....+..+...+..... ....++... .+....+..++***+++.++.+* ....+.
HCVPCP2 ( 54) VEFYGFKDAVTFATVDHSAF-AYESAV-VNGIRVLKTSDNNCWVNAVCIALQYSKPHFIS
TGVPCP2 ( 55) QAHYGFRDAAAFSASSHDAY-KFEVVT-HSNFIVHKQTDNNCWINAICLALQRLKPQWKF
MHVPCP2 ( 60) KDAFGFDEPQLLKYYTMLGMCKWPVVV-CGNYFAFKQSNNNCYINVACLMLQHLSLKFPK
MHVPCP1 ( 52) EDVLAFDAVCSEALSAFYAVPSDETHFkVCGFYSPAIERTNCWLRSTLIVMQSLPLEFKD
HCVPCP1 ( 55) -----KHDVSPF-------EMPFEELN---GLKILKQLDNNCWVNSVMLQIQLTGI----
TGVPCP1 ( 41) AAAVDVQEAEQF----NPSLPPFKTTN-LNGKIILKQGDNNCWINACCYQLQ----AFDF
IBVPCP ( 52) LEYYGL-DAQKYVIYLQTLAQKWNVQY-RDNFLILEWRDGNCWISSAIVLLQAAKIRFKG
....+*..*. *... **..++.......+. .*.. .+ .+. ....*.. . . . .
HCVPCP2 ( 112) QGLDAAWNKFVLGDVEIFVAFVYYVARLMKGDKGDAEDTLTKLSKYLANEAQV-QLEHYS
TGVPCP2 ( 113) PGVRGLWNEFLERKTQGFVHMLYHISGVKKGEPGDAELMLHKLGDLMDNDCEI-IVTHTT
MHVPCP2 ( 119) WQWQEAWNEFRSGKPLRFVSLVLAKGSFKFNEPSDSIDFMR--VVLREADLSGATCNLEF
MHVPCP1 ( 112) LEMQKLWLSYKAGYDQCFVDKL--VKSVPKSIILPQGGYVADFAYFFLSQCSF-KAYANW
HCVPCP1 ( 96) LDGDYAMQFFKMGRVAKMIERCYTAEQCIRGAMGDVGLCMYRL----LKDLHTGFMVMDY
TGVPCP1 ( 92) FN-NEAWEKFKKGDVMDFVNLCYAATTLARGHSGDAE-YLLEL---MLNDYSTAKIVLAA
IBVPCP ( 110) F-LTEAWAKLLGGDPTDFVAWCYASCTAKVGDFSDANWLLANLAEHFDADYTNAFLKKRV
.* .*... .. .++. ..+..... ........* .+.. ......+......+..
HCVPCP2 ( 171) SCVECDAKFK--NSVASINSAIVCASVKRDGVQVGYCVHGIK--YYSRVRSVRGRAIIVS
TGVPCP2 ( 172) ACDKC-------AKVEKFVGPVVAAPLAIHGTD-ETCVHGVS-VNV-KVTQIKGTVAITS
MHVPCP2 ( 177) VC-KCGVKQEQRKGVDA---VMHFGTLDKGDLVRGYNIACTCgSKLVHCTQFNVPFLICS
MHVPCP1 ( 169) RCLECDMELK-LQGLDA---MFFYGDVVSHMCK---CGNSMT------LLSADIPYTLHF
HCVPCP1 ( 152) KC-SC-----TSGRLEE-SGAVLFCTPTKKAFPYGTCLNCNA-PRMCTIRQLQGTIIFVQ
TGVPCP1 ( 147) KC-GCGEK---EIVLER---AVFKLTPLKESFNYGVCGDCMQ-VNTCRFLSVEGSGVFVH
IBVPCP ( 169) SC-NCGIKSYELRGLEACIQPVRATNLLHFKTQYSNCPTCGA-NNTDEVIEASLPYLLLF
. . . ....+....+.*....**+.. . .... ..*+ ..... . . ....
HCVPCP2 ( 227) VEQ-LEPCAQSRLLSGVAYTAFSGPVDKGHYTVYDTAKKS-MYDG---DRFV--KHD-LS
TGVPCP2 ( 222) LIG----PIIGEVLEATGYICYSGSNRNGHYTYYDNRNGL-VVDAEKAYHFNRDLLQVTT
MHVPCP2 ( 233) NTP--EGRKLPDDV--VAANIFTGG-SVGHYTHVKCKPKYqLYDACNVNKVSEAKGNFTD
MHVPCP1 ( 216) GVR-DDKFCAFYTPRKVFRAACAVDVNDCHSMAVVEGKQI---DGKVVTKFIGDKFDFMV
HCVPCP1 ( 204) QKP-EPVNPVSFVVKPVCSSIFRGAVSCGHYQTNIYSQNL-CVDGFGVNKIQPWTNDALN
TGVPCP1 ( 199) DILsKQTPEAMFVVKPVMHAVYTGTTQNGHYMVDDIEHGY-CVDGMGIKPLKK--RCYTS
IBVPCP ( 227) ATD--GPATVDCDEDAVGTVVFVGSTNSGHCYTQAAGQAF---DNLAKDRKFGKKSPYIT
.+....... ...... .. ..+... ++ ...
HCVPCP2 ( 279) LLSVTSVVM------VGGYVAPVNTVKPKPVINQ
TGVPCP2 ( 277) AIASNFVVKK-PQAEERPKNCAFNKVAASPKIVQ
MHVPCP2 ( 288) CLYLKNLKQTFSSVLTTFYLDDVKCVEYKPDLSQ
MHVPCP1 ( 272) GYGMTFSMSPFELAQLYGSCITPNVCFVK-----
HCVPCP1 ( 262) TICIKDADY---NAKVEISVTPIKN---------
TGVPCP1 ( 256) TLFINANVM--TRAEKPKQEFKVEKVEQQPIVEE
IBVPCP ( 282) AMYTRFAFK----NETSLPVAKQSKGKSKS-VKE
Second alignment:
HCVPCP2 EDGVNVHDVTVTTDKSF-EQQVGVIADKDKDLSGAVPSDLNTSELLTKAIDVDWVEFYGF
TGVPCP2 EDNVNHERVSVSFDKTYGEQLKGTVVIKDKDVTNQLPSAFDVGQKVIKAIDIDWQAHYGF
MHVPCP2 VDGVNFRSCCVAEGEVF-GKTLGSVFCDGINVTKVRCSAIYKGKVFFQYSDLSEADLVAV
TGVPCP1 --------SEGAEGTSS-QEEVETVEVADITST-----DEDVDIVEVSAKDDPWAAAVDV
HCVPCP1 -----------EEGGND--LSLPVMISEWPLSVQQAQQEATLPDIAEDVVDQVEEVNSIF
IBVPCP EDGVKYRSIVLKPGDSL--GQFGQVYAKNKIVFTAD-DVEDKEILYVPTTDKSILEYYGL
MHVPCP1 REGIAEAKATVCADAVD--ACPDQVEAFEIEKVEDSILDELQTELNAPA-DKTYEDVLAF
HCVPCP2 KDAVTFATVDHSAF-------AYESAVVNGIRVLKTSDNNCWVNAVCIALQYSKPHFISQ
TGVPCP2 RDAAAFSASSHDAY-------KFEVVTHSNFIVHKQTDNNCWINAICLALQRLKPQWKFP
MHVPCP2 KDAFGFDEPQLLKYYTMLGMCKWPVVVCGNYFAFKQSNNNCYINVACLMLQHLSLKFPKW
TGVPCP1 QEAEQF-NPSLPPF---------KTTNLNGKIILKQGDNNCWINACCYQLQAFD--FFN-
HCVPCP1 DIETVDVKHDVSPF-------EMPFEELNGLKILKQLDNNCWVNSVMLQIQLTG--ILDG
IBVPCP DAQKYVIYLQTLAQ-------KWNVQYRDNFLILEWRDGNCWISSAIVLLQAAKIRFKG-
MHVPCP1 DAVCSEALSAFYAVPS-----DETHFKVCGFYSPAIERTNCWLRSTLIVMQSLPLEFKDL
HCVPCP2 GLDAAWNKFVLGDVEIFVAFVYYVARLMKGDKGDAEDTLTKLSKYLAN---EAQVQLEHY
TGVPCP2 GVRGLWNEFLERKTQGFVHMLYHISGVKKGEPGDAELMLHKLGDLMDN---DCEIIVTHT
MHVPCP2 QWQEAWNEFRSGKPLRFVSLVLAKGSFKFNEPSDSIDFMRVVLREADLS--GATCNLEFV
TGVPCP1 --NEAWEKFKKGDVMDFVNLCYAATTLARGHSGDAEYLLELMLNDYST----AKIVLAAK
HCVPCP1 --DYAMQFFKMGRVAKMIERCYTAEQCIRGAMGDVGLCMYRLLKDLHT----GFMVMDYK
IBVPCP FLTEAWAKLLGGDPTDFVAWCYASCTAKVGDFSDANWLLANLAEHFDADYTNAFLKKRVS
MHVPCP1 EMQKLWLSYKAGYDQCFVDKLVKSVPKSIILP-QGGYVADFAYFFLSQ---CSFKAYANW
HCVPCP2 SSCVECDAKFKNSVASINSAIVCASVKRDGVQVGYCVHGIKYYSRVRSVRGRAIIVSVEQ
TGVPCP2 TACDKCAKVEKFVGPVVAAPLAIHGTD-ET-----CVHGVSVNVKVTQIKG---TVAITS
MHVPCP2 CKCGVKQEQRKGVDAVMHFGTLDKGDLVRGYN-IACTCGSKLVHCTQFNVP----FLICS
TGVPCP1 CGCGEKEIVLERAVFKLTPLKESFNYGVCG----DCMQVNTCRFLSVEGSG-VFVHDILS
HCVPCP1 CSCTSGRLEESGAVLFCTPTKKAFPYGTCLN----CNAPRMCTIRQLQGTI--IFVQQKP
IBVPCP CNCGIKSYELRGLEACIQPVRATNLLHFKTQYS-NCPTCGANNTDEVIEASLPYLLLFAT
MHVPCP1 R-CLECDMELKLQGLDAMFFYGDVVSHMCK-----CGNSMTLLSADIPYTL----HFGVR
HCVPCP2 LEPCAQSRLLSGVAYTAFSGPVDKGHYT---------VYDTAKKSMYDGDRFVKHDLSLL
TGVPCP2 LIGPIIGEVLEATGYICYSGSNRNGHYT---------YYDNRNGLVVDAEKAYHFNRDLL
MHVPCP2 NTPEGRKLPDDVVAANIFTGG-SVGHYTHVKCKPKYQLYDACNVNKVSEAKGNFTDCLYL
TGVPCP1 KQTPEAMFVVKPVMHAVYTGTTQNGHYM---------VDDIEHGYCVDGMGIKPLKKRCY
HCVPCP1 EPVNPVSFVVKPVCSSIFRGAVSCGHYQ---------TNIYSQNLCVDGFGVNKIQPWTN
IBVPCP DGPATVDCDEDAVGTVVFVGSTNSGHCYT-------QAAGQAFDNLAKDRKFGKKSPYIT
MHVPCP1 DDKFCAFYTPRKVFRAACAVDVNDCHSMA-------VVEGKQIDGKVVTKFIGDKFDFMV
HCVPCP2 S----VTSVVMVGGYVAP-----------VNTVKPKPVINQ--
TGVPCP2 Q----VTTAIASNFVVKKPQAEERPKNCAFNKVAASPKIVQ--
MHVPCP2 KNLKQTFSSVLTTFYLDD-----------VKCVEYKPDLSQ--
TGVPCP1 TSTLFINANVMTRAEKPKQ--E-----FKVEKVEQQPIVEE--
HCVPCP1 D---ALNTICIKDADYNA-----------KVEISVTPIKN---
IBVPCP AMYTRFAFKNETSLPVAK------------QSKGKSKSVKE--
MHVPCP1 G---YGMTFSMSPFELAQ-----------LYGSCITPNVCFVK
Most important alignment rules and typical errors:
1. Different sequences should have different names.
2. Whitespace inside a sequence name is not recommended.
3. Several spaces are recommended between a sequence name and its sequence.
4. Empty lines are mandatory between parts of the alignment.
5. Gap symbol must be '-'.
6. Leading and trailing gaps must be written explicitly.
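The rules can be checked mechanically before an alignment is submitted. The sketch below is not the program's parser. It assumes that parts are separated by blank lines, that each row is "name, optional position in parentheses, letters and '-'", and that any other line (for example a conservation line) is skipped; `parse_alignment` and `ROW` are hypothetical names.

```python
import re

# name, optional "( 12)" position, then letters and '-' only
ROW = re.compile(r"^(\S+)\s+(?:\(\s*\d+\)\s+)?([A-Za-z-]+)\s*$")


def parse_alignment(text):
    """Join the parts of a multi-part alignment into {name: full gapped row}."""
    rows = {}
    expected = None
    for part in re.split(r"\n\s*\n", text.strip()):
        seen = []
        for line in part.splitlines():
            match = ROW.match(line)
            if match is None:
                continue  # not a sequence row (e.g. a conservation line)
            name, chunk = match.groups()
            if name in seen:
                raise ValueError(f"name used twice in one part: {name}")
            seen.append(name)
            rows[name] = rows.get(name, "") + chunk
        if not seen:
            continue
        if expected is None:
            expected = set(seen)
        elif set(seen) != expected:
            raise ValueError("every part must repeat all sequence names")
    if len({len(row) for row in rows.values()}) > 1:
        raise ValueError("rows differ in length; write leading and trailing gaps as '-'")
    return rows
```

A row whose gaps are written with a symbol other than '-' does not match `ROW` and is skipped, so it surfaces as a missing name in its part rather than passing silently.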