INVESTIGATING THE RECOGNITION
AND INTERACTIONS OF NON-POLAR
α HELICES IN BIOLOGY
A thesis submitted to the University of Manchester
for the degree of Doctor of Philosophy
in the Faculty of Science and Engineering
2019
James A. Baker
School of Chemistry
Contents
Abstract
Lay Abstract
Declaration
Copyright Statement
Acknowledgements
1 Introduction
1.1 Transmembrane proteins
1.1.1 A brief history of the discovery and exploration of the transmembrane proteins
1.1.2 Transmembrane proteins in disease
1.1.3 The transmembrane protein problem
1.1.4 The transmembrane protein revolution
1.1.5 The role of bioinformatics in transmembrane biology
1.2 Biological membranes
1.2.1 Membrane lipids
1.2.2 Membrane potential
1.3 α helices in the membrane; structure and function
1.3.1 Transmembrane helix sequence composition
1.3.2 The hydrophobicity of transmembrane segments
1.3.3 Sequence complexity
1.4 Biogenesis of transmembrane proteins
1.4.1 An overview of translocation
1.4.2 Signal peptides
1.4.3 β sheets in the membrane
1.5 The aims of this thesis
2 The “negative-outside” rule
2.1 Summary
2.1.1 Background
2.1.2 Results
2.1.3 Conclusions
2.2 Introduction
2.3 Results
2.3.1 Acidic residues within and nearby transmembrane helix segments are rare
2.3.2 Amino acid residue distribution analysis reveals a “negative-not-inside/negative-outside” signal in single-pass transmembrane helix segments
2.3.3 Amino acid residue distribution analysis reveals a general negative-charge bias signal in outside flank of multi-pass transmembrane helix segments: the negative outside enrichment rule
2.3.4 Further significant sequence differences between single-pass and multi-pass helices: distribution of tryptophan, tyrosine, proline and cysteine
2.3.5 Hydrophobicity and leucine distribution in transmembrane helices in single- and multi-pass proteins
2.3.6 A negative-outside (or negative-non-inside) signal is present across many membrane types
2.3.7 Amino acid compositional skews in relation to transmembrane helix complexity and anchorage function
2.4 Discussion
2.5 Methods
2.5.1 Datasets
2.5.2 On the determination of flanking regions for transmembrane helices and the transmembrane helix alignment
2.5.3 Separating simple and complex single-pass helices
2.5.4 Distribution normalisation
2.5.5 Hydrophobicity calculations
2.5.6 Normalised net charge calculations
2.5.7 Statistics
3 Collation and analysis of tail-anchored protein transmembrane helices reveals subcellular variation in flanking charged residue distribution and the transmembrane helix core hydrophobicity
3.1 Abstract
3.2 Introduction
3.3 Methods
3.3.1 Building a list of tail-anchors
3.3.2 Calculating hydrophobicity
3.3.3 Calculating sequence information entropy
3.3.4 Statistics
3.3.5 Modelling cytochrome b5 and PTP1b
3.3.6 Availability of materials
3.4 Results and discussion
3.4.1 A comparison of up-to-date tail-anchored protein datasets
3.4.2 It is difficult to observe any hydrophobic variation of tail-anchored protein transmembrane helices from different species
3.4.3 There are biochemical differences between tail-anchored transmembrane helices from different organelles
3.4.4 More annotation is required to identify chaperone interaction factors of the transmembrane helix.
3.4.5 Spontaneous insertion may be achieved by polar strips in the transmembrane helix of tail-anchored proteins
3.5 Summary
4 Sequence analysis of polarity in transmembrane helices suggests that translocation of marginally hydrophobic helices could be facilitated by neighbouring typically hydrophobic helices
4.1 Abstract
4.2 Introduction
4.2.1 The ribosome-translocon complex in the biogenesis of membrane proteins
4.2.2 Cooperative transmembrane helix insertion by the translocon-ribosome complex
4.2.3 Aims
4.3 Methods
4.3.1 Datasets
4.3.2 Gene ontology
4.3.3 Complexity and hydrophobic estimation
4.3.4 Statistics
4.3.5 Availability of materials
4.4 Results and discussion
4.4.1 Large contrasts in transmembrane helix hydrophobicity occur in channels and receptors
4.4.2 GPCRs contain conserved relatively polar TMH7, which follows the typically hydrophobic TMH6
4.4.3 6TMH ion channels contain polar-hydrophobic transmembrane helix pairs/groups indicative of conserved cooperative insertion
4.4.4 The prevalence of the high hydrophobic discrepancy of transmembrane helices amongst other common ‘transporter’ transmembrane protein classes
4.5 Summary
5 Conclusions and outlook
Word count 47,705
List of Tables
2.1 Acidic residues are rarer in transmembrane helices of single-pass proteins than in transmembrane helices of multi-pass proteins.
2.2 Statistical significances for negative charge distribution skew on either side of the membrane in single-pass transmembrane helices.
2.3 Statistical significances for negative charge distribution skew on either side of the membrane in multi-pass transmembrane helices.
2.4 Leucines at the inner and outer leaflets of the membrane in transmembrane helices.
2.5 Simple transmembrane helices are less similar than complex transmembrane helices to transmembrane helices from multi-pass proteins in UniHuman.
2.6 Simple transmembrane helices are less similar than complex transmembrane helices to transmembrane helices from multi-pass proteins in ExpAll.
2.7 The experimental evidences of TOPDB.
2.8 Records with INTRAMEM and TRANSMEM flanking region overlap.
3.1 Hydrophobicity statistical comparisons between mouse and human, yeast, and plants in the SwissProt Filtered Dataset.
3.2 Hydrophobicity statistical comparisons between mouse and human, yeast, and plants in the UniProt Curated Dataset.
3.3 Statistical comparisons between TMH sequences from organelles in the UniProt Curated Dataset.
3.4 Statistical comparisons between transmembrane helix sequences from organelles in the SwissProt Filtered Dataset.
4.1 Dataset sizes of common transmembrane protein families of transporters and channels.
List of Figures
1.1 A selection of figures demonstrating important discoveries of membrane proteins and their environment.
1.2 The structure of SecYE in a nanodisc at near atomic resolution.
1.3 A comparison of a single structure held by commonly used structural transmembrane protein databases.
1.4 A cartoon showing the general components of the membrane and a typical transmembrane helix.
1.5 A cartoon of the topological clustering of positively-charged residues near transmembrane helices in relation to subcellular location.
1.6 The hydropathic index of rabbit cytochrome b5.
1.7 The hydrophobic-complexity continuum distinguishes between transmembrane helix anchors and those with function beyond anchoring.
1.8 A simplified schematic of the co-translational Sec pathway and the post-translational pathway.
1.9 The pore, the plug, and the lateral gate of the translocon.
1.10 The key components of a signal peptide.
1.11 Cartoons showing the structural differences of the outer membrane β-barrel proteins, transmembrane α-helix channels, and transmembrane α-helix signal transducers in the membrane.
1.12 A cartoon of the biogenesis of β barrel membrane proteins in mitochondria and Gram-negative bacteria.
2.1 Negatively-charged amino acids are amongst the rarest residues in transmembrane helices and ±5 flanking residues.
2.2 Relative percentage normalisation reveals a negative-outside bias in transmembrane helices from single-pass protein datasets.
2.3 Negative-outside bias is very subtle in transmembrane helices from multi-pass proteins.
2.4 The net charge across multi-pass and single-pass transmembrane helices shows a stronger positive inside charge in single-pass transmembrane helices than multi-pass transmembrane helices.
2.5 Relative percentage heat-maps from predictive and experimental datasets corroborate residue distribution differences between transmembrane helices from single-pass and multi-pass proteins.
2.6 There is a difference in the hydrophobic profiles of transmembrane helices from single-pass and multi-pass proteins.
2.7 There is a difference in the hydrophobic profiles of transmembrane helices from single-pass and multi-pass proteins.
2.8 Comparing charged amino acid distributions in transmembrane helices of multi-pass and single-pass proteins across different species and organelles.
2.9 Comparing the amino acid relative percentage distributions of simple and complex transmembrane helices from single-pass proteins and transmembrane helices from multi-pass proteins.
2.10 Residue distributions of transmembrane anchors. A view showing additional residue distribution features that transmembrane helices with an anchorage function display.
2.11 The lengths of flanks and transmembrane helices in multi-pass and single-pass proteins in the UniHuman and ExpAll dataset.
2.12 Relative percentage heatmaps from the predictive datasets calculated by fractions of the absolute maximum and by the relative percentage of a given amino acid type.
3.1 An overview of the biogenesis of tail-anchored proteins.
3.2 The sources, methods, and filters applied to the sequences in the tail-anchored protein datasets.
3.3 A Venn diagram showing tail-anchored protein UniProt ids present in each of the datasets as well as those present in multiple datasets.
3.4 Average values of species datasets from UniProt manually curated set and SwissProt automatically filtered dataset.
3.5 Average sequence-based biochemical values of organelle datasets from UniProt manually curated set and SwissProt automatically filtered dataset.
3.6 The normalised skews of each amino acids from tail-anchored proteins grouped by localisation from the SwissProt automatically filtered dataset.
3.7 The normalised skews of each amino acids from tail-anchored proteins grouped by localisation from the SwissProt automatically filtered dataset.
3.8 The profile of transmembrane helix and flanks hydrophobicity from tail-anchored protein groups stratified by chaperone interactors.
3.9 Structural biochemical analysis of a homology model of cytochrome b5.
3.10 Structural biochemical analysis of a homology model of PTP1b.
3.11 A cartoon of a potential method the cytochrome b5 and PTP1b transmembrane helix could integrate spontaneously into the membrane.
4.1 A cartoon showing the generally accepted schematic of sequential multi-pass transmembrane helix insertion into the membranes.
4.2 A cartoon of the ribosome in association with the translocon during insertion.
4.3 Pie charts of a non-redundant list of transmembrane proteins compared to a list of transmembrane proteins containing the most hydrophobically different transmembrane helix pairs.
4.4 The hydrophobicity and complexity of GPCR transmembrane helices.
4.5 The hydrophobic difference observed between TMH6 and TMH7 in GPCRs is not due to the choice of hydrophobic scale.
4.6 The hydrophobicity of transmembrane helices in GPCR subfamilies.
4.7 The hydrophobicity of transmembrane helices in ion channels.
4.8 The hydrophobic difference observed between TMH4 and the neighbouring transmembrane helices in 6TMH ion channels is not due to the choice of hydrophobic scale.
4.9 Sequence entropy is unsuitable for assessing function in TMH4 of ion channels.
4.10 High polarity discrepancy between sequentially adjacent transmembrane helices is not present in all transmembrane protein transporter families.
4.11 A cartoon of potential cooperative transmembrane helix insertion methods.
Acronyms
AP Arrest Peptide. 141–144, 163, 165
EM Electron Microscopy. 21, 22, 25–27, 43, 163, 165
EMC ER Membrane protein Complex. 23, 43, 44, 107, 109, 162, 165
ER Endoplasmic Reticulum. 30–32, 37, 40, 43, 54, 86, 87, 106–110, 112–114, 123–126, 128–131, 140, 144, 155
GPCR G protein-coupled receptor. 28, 46, 145–147, 149–155, 162, 163, 165
K-S Kolmogorov-Smirnov. 78–81, 103, 116
K-W Kruskal-Wallis. 63, 67, 78–81, 102, 103, 116
MD Molecular Dynamics. 27, 29, 35, 165
MOM Mitochondrial Outer Membrane. 46–48, 109, 130, 131, 135, 136
PDB Protein Data Bank. 22, 28, 32, 55
PM Plasma Membrane. 40, 43, 86, 87, 112, 114, 123–125, 127, 129, 130
POPC Palmitoyloleoylphosphatidylcholine. 38, 73
RNA Ribonucleic Acid. 41, 140–142
SNARE Soluble N-ethylmaleimide-sensitive factor attachment protein receptor. 105, 109, 164
SP Signal Peptide. 24, 26, 43–45, 47, 48, 112, 113, 131
SR Signal Recognition Particle Receptor. 41, 43, 44, 106, 140, 141
SRP Signal Recognition Particle. 41, 43, 44, 106–108, 115, 140, 141
TA Tail Anchor. 27, 37, 40, 43, 49, 105–110, 113–115, 118–125, 127, 129–133, 135–137, 164, 165
TM Transmembrane. 21, 26, 28, 34, 39, 41, 52–55, 69, 71, 78, 85, 98, 109
TMH Transmembrane Helix. 20, 23–46, 48, 49, 51–78, 81–91, 93, 94, 96–103, 105–110, 112–114, 116–118, 120–138, 140–145, 147–158, 160–166
TMP Transmembrane Protein. 20–27, 29, 31, 35, 40, 42–45, 48, 49, 51, 53, 55, 58, 59, 68, 71, 73, 82–87, 108, 121, 131, 140–143, 149, 150, 155, 162–166
TMS Transmembrane Segment. 24, 32, 101, 117, 143
TOM Translocase of the Outer Membrane. 47, 48, 109
The University of Manchester
James A. Baker
Doctor of Philosophy
Investigating the Recognition and Interactions of Non-Polar α Helices in
Biology
February 1, 2019
Abstract
Non-polar helices feature prominently in structural biology. Transmembrane
α-helix-containing proteins make up around a third of all proteins, represent around 40% of
drug targets, and include some of the proteins most critical to life as we know it.
Yet they are fundamentally difficult to study experimentally. This is in part
due to the very features that make them so biologically influential: their non-polar
transmembrane helical regions.
By leveraging large datasets of transmembrane proteins, this thesis characterises
features of transmembrane α helices en masse, particularly regarding
their topology, membrane–protein interactions, and intramembrane protein interactions.
In this study, we present statistical evidence demonstrating the ‘negative-outside’
rule in opposition to the ‘positive-inside’ rule. We also identify stabilising amino acid
distributions in anchoring transmembrane helices compared to transmembrane helices
with function beyond anchoring.
Tail-anchored proteins are a group of post-translationally inserting proteins. In this
thesis we show adaptations of hydrophobicity and residue distributions through the
transmembrane helices of tail-anchored proteins to different membrane environments
within the cell (the mitochondria, endoplasmic reticulum, the Golgi, and the plasma
membrane). However, we could not detect a hydrophobic difference between global
populations of tail-anchored proteins in different species (mammals, plants, and fungi).
A handful of these proteins are capable of integrating into the membrane without
the need for membrane integration proteins. Structural modelling of transmembrane
helices from PTP1b and cytochrome b5 reveals a 3D amphipathic arrangement of
residues. This structural feature may play a role in their spontaneous membrane
insertion.
Finally, we find a conserved pattern of typically hydrophobic transmembrane helices neighbouring marginally hydrophobic helices in some families of transmembrane
proteins. This feature corresponds to transmembrane helices that have the potential to cooperate in order to integrate the more polar, but functionally important,
transmembrane helix of the pair into the membrane.
Lay Abstract
The survival of each of our cells relies on a cellular barrier (called the membrane)
to separate the cell from the surrounding environment. The membrane works by
being chemically very different from both the outside environment and the inside of
the cell, which in both cases is mostly water. The membrane is fatty, so it repels water.
Proteins are the molecular machinery that forms much of the cell structure and
shape and carries out many of the cell’s routine tasks. Around a third of
our genome codes for membrane-embedded proteins. But because these membrane-proteins
are adapted for a life in the water-repelling cell membrane, they are very hard
to study in laboratories, which often rely on methods that hold proteins in water-based
solutions.
In this thesis, we focus particularly on the parts of the protein that are embedded in
the water-repelling membrane. We computationally analysed the biochemical make-up
of thousands of proteins from openly available biological databases.
This thesis demonstrates three features of membrane proteins:
• the radically different evolutionary story that membrane-bound regions have
compared to other proteins; the sacrifices they make for their stability in order to
maintain their function, and their optimisation through evolutionary timescales
to adapt to the membrane as best they can.
• a distinct sub-group of membrane-proteins with a radically different
membrane-insertion mechanism (tail-anchored proteins) has adaptations in
its membrane regions that depend on where the proteins are located in the cell.
• some types of membrane proteins may use several membrane elements to ensure
the least stable, but functionally important, elements are correctly inserted into
the membrane.
These results will go on to inform more specific studies about membrane proteins.
These findings will provide insight into the causes of some genetic diseases as well as
drug targets in the case of pathogenic infections and cancers.
Declaration
No portion of the work referred to in the thesis has been
submitted in support of an application for another degree
or qualification of this or any other university or other
institute of learning.
Copyright Statement
i. The author of this thesis (including any appendices and/or schedules to this thesis)
owns certain copyright or related rights in it (the “Copyright”) and s/he has given
The University of Manchester certain rights to use such Copyright, including for
administrative purposes.
ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic
copy, may be made only in accordance with the Copyright, Designs and Patents
Act 1988 (as amended) and regulations issued under it or, where appropriate, in
accordance with licensing agreements which the University has from time to time.
This page must form part of any such copies made.
iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright
works in the thesis, for example graphs and tables (“Reproductions”), which may
be described in this thesis, may not be owned by the author and may be owned by
third parties. Such Intellectual Property and Reproductions cannot and must not
be made available for use without the prior written permission of the owner(s) of
the relevant Intellectual Property and/or Reproductions.
iv. Further information on the conditions under which disclosure, publication and
commercialisation of this thesis, the Copyright and any Intellectual Property
and/or Reproductions described in it may take place is available in the University
IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in
any relevant Thesis restriction declarations deposited in the University Library,
The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s Policy on Presentation of Theses.
Acknowledgements
I would like to thank all members of both the Eisenhaber research group and
the Curtis and Warwicker research group for their interesting and engaging discussions throughout my PhD. In particular, I would like to acknowledge the patience,
direction, and exceptional supervision from Dr Jim Warwicker of the University of
Manchester along with Dr Frank Eisenhaber and Dr Birgit Eisenhaber from the Singapore Bioinformatics Institute. I also express my gratitude towards Dr Wing Cheong
Wong and Dr Max Hebditch for working with me on many fiddly problems. I would
also like to thank my advisor, Professor Stephen High, for his stimulating discussions
and guidance throughout the project.
I also thank the University of Manchester and the ARAP programme at
A*STAR for funding the project.
I thank my parents Janet and Martin, my brother Tim, and my partner Emily for
their unwavering support of me in undertaking a PhD based for two years on the other
side of the planet. Being so far from my family and loved ones was certainly the most
challenging part of this process.
Chapter 1
Introduction
Membrane biology is a huge and varied field that is ultimately the study of the interface between compartments of the cell. Transmembrane Proteins (TMPs)
include some of the proteins most critical to life, as well as a large number of drug targets. However, the experimental inaccessibility of the Transmembrane Helix (TMH)
has hampered their study compared to their globular structural analogues.
Despite progress over the last decade, the understanding of the relationship between
the sequence and function of a TMH is incomplete.
In this chapter we will place the TMH problem in context, discuss some of the tools
and methods that allow us to analyse the TMHs, describe the important biological
aspects of the TMH, and discuss other membrane spanning protein segments.
1.1 Transmembrane proteins
1.1.1 A brief history of the discovery and exploration of the transmembrane proteins
Because they segregate biochemical environments, the cellular barrier and
the resulting compartmentalisation have been described as one of the fundamental pillars
of life as we know it [1, 2].
The significance of the cellular barrier was first noted in 1665 with the dawn of
the microscope, when Robert Hooke described the cell wall of the cork plant (Figure
1.1A), with the clear distinction of the barrier giving rise to the term “cell” [3, 4].
Figure 1.1: A selection of figures demonstrating important discoveries of membrane
proteins and their environment. A) (Above/“Fig 1”) A diagram of the cork plant as viewed
through an optical microscope drawn by Hooke circa 1665. Note the observable cellular barrier
between what we now know to be the plant cells. (Below/“Fig 2”) The cork plant used in the
microscope observations. From Hooke, republished in 1961 [4]. B) The first near-atomic resolution
(7 Å resolution map) structure of a TMP acquired by Electron Microscopy (EM). This is a model
of the 7-Transmembrane (TM) bacteriorhodopsin single protein from the purple membrane viewed
roughly parallel to the plane of the membrane. From Henderson & Unwin, 1975 [5]. C) The electron
density map of the first crystal structure of a TMP [6]. The image shows 11 layers with contours
representing 1.2 Å between layers at 3 Å resolution. As in B, this structure is from phototrophic purple
bacteria. The chromophores are made up of cytochrome (which can be seen at the top of the image) and the
H-subunit (which can be seen at the bottom). From Deisenhofer et al., 1984. D) The first drawing
in which the fluid mosaic model was presented in the bilayer. From Singer & Nicolson, 1972 [7].
Throughout the early 20th century several theories were explored regarding the
composition of the membrane. In 1925, by scrutinising the concentration of chromocytes in the blood and the surface area of the cells in blood from a variety of
mammals, Gorter and Grendel concluded that: “chromocytes are covered by a layer
of fatty substances that is two molecules thick” [8]. This established the concept of
the lipid bilayer. Later, the popular Danielli and Davson model correctly identified a
phospholipid bilayer, but also incorrectly suggested that a lipoid space existed within
the phospholipid bilayer [9]. Eventually, as these ideas were explored, the fluid mosaic
model was established in 1972 (Figure 1.1D) [7]. In this model, the membrane has a fluid behaviour,
due to the gel-like nature of its composition, which allows proteins to move throughout
the membrane bilayer.
In order to fully understand the relationship between TMPs and the membrane,
molecular details would be needed. By the 1970s, reasonably detailed structures of
membrane-spanning segments of TMPs were available by EM [5] (Figure 1.1B). However, the real goal was atomic resolution. The advent of X-ray
crystallography through the first half of the 20th century showed that it was possible to solve atomically resolved structures of small organic molecules [10, 11] and
larger molecules such as a steroid [12], penicillin [13], and vitamin B12 [14]. Dorothy
Hodgkin was awarded the Nobel Prize in Chemistry in 1964 for the elucidation of
these complex structures [15]. Seven years after the creation of the Protein Data Bank
(PDB) [16], the first TMP was successfully solved by X-ray crystallography in 1984
[6] (Figure 1.1C). For this work, Johann Deisenhofer, Robert Huber, and
Hartmut Michel won the Nobel Prize in Chemistry in 1988 for determining the
3D structure of a photosynthetic reaction centre at atomic resolution [17]. In the following
decades, X-ray crystallography, despite the challenges therein, remained the
predominant method of elucidating the structures of TMPs [18]. Currently, we are in
a revolution of TMP structural acquisition, which will be discussed further in section
1.1.4.
1.1.2 Transmembrane proteins in disease
Membrane-bound proteins underpin almost every biological process, directly or indirectly, from photosynthesis to respiration. Integral TMPs are encoded by between a
third and a half of the genes in the human genome [19–21] and account for 40% of drug