Section 1: Proteins and Motifs: Homework section I

The copy of the homework on the web will have active links to various pages. If you find any dead links please email me immediately.

Homework1: Introduction

Use the Dropbox folder that we made for you to submit homework.  
Due Date : September 3, 2019

You need to acknowledge where you get information

The reference format for journal articles I would like you to use in this class is:
Last name, first initials for all authors (year) Full title of paper. Journal name vol, first page–last page.
For example, Zhang, J., Frerman, F.E. and Kim, J.J. (2006) Structure of electron transfer flavoprotein-ubiquinone oxidoreductase and electron transfer to the mitochondrial ubiquinone pool. Proc. Natl. Acad. Sci. U.S.A. 103, 16212–16217.
If you are using information from a web site then give me the full URL. 

1) Use Wikipedia, Google, or a biochemistry or cell biology textbook to write a brief definition of each term below. Give me the reference for each piece of information.
DEFINE
-a) Mitochondria and Chloroplast
-b) Heme and quinone
-c) van der Waals forces
-d) pKa of an acid
-e) protein backbone; protein side chain

2) Use: Oxidative Phosphorylation at the fin de siecle to answer questions a-e. Other reviews of mitochondrial proteins that may be of interest are from Rich-2010 (you can find this paper on Dropbox also as Rich-2010-Essays Biochem.pdf) (good for question d) and from Newmeyer (you can find this paper on Dropbox also as Newmeyer-2003-Cell.pdf) (more biological).
a) Write out the reference to the three papers using the form described above.
b) What is the overall reactant and product of oxidative phosphorylation?
c) Five proteins are described as Complexes I-V.
What are the products and reactants for each protein? The answer will show you that the name tells you the function.
An example of the desired answer is given for Complex I. Write out the answers for the other 4 complexes.
Complex I is known as the NADH: ubiquinone oxidoreductase – It takes electrons from NADH (oxidizing it) and gives them to ubiquinone (reducing that). This process is described by the name. Protons are pumped across the membrane when this happens. 
d) List the non-amino acid cofactors shown for each protein.
Complex I contains 1 flavin mononucleotide, seven or eight different FeS centers, covalently bound lipid and a quinone molecule. 
e) Use the membrane proteins of known structure site to find the Protein Data Bank 4-letter code for one structure of each of the 5 proteins. Which structure has the best resolution (lowest number in Å)? When was that structure solved?
For complex I: 14 structures are available from 6 different organisms (list the organisms). The best resolution is 3RKO at 3.00 Å. The most complete structure is 4HEA. (To get to complex I, first go to the membrane proteins of known structure site and go to TRANSMEMBRANE PROTEINS: ALPHA-HELICAL. You can see Electron Transport Chain Complexes: Complex I there.)
Look at one of the proteins in the Protein data bank. If you click on the 4 letter code it will take you to the file for the structure.

3) Ion channels and ion pumps: See Gadsby review
a) What is the difference between how an ion channel and an ion pump work?
b) Does a channel make a gradient or dissipate a gradient? Does a pump make a gradient or dissipate a gradient? 
c) What would happen if you put a channel and a pump together in the same membrane?

4) Use: The Biological Frontier of Physics to tell me how many moles of ATP you synthesize each day. (You can modify this based on your caloric intake.)
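One way to set up this estimate is as a unit conversion: the energy you eat per day, times the fraction of that energy captured as ATP, divided by the free energy needed to make one mole of ATP. The sketch below is a minimal version of that arithmetic. The function name and the default values for the capture efficiency (0.4) and the free energy of ATP synthesis (50 kJ/mol) are illustrative assumptions, not numbers taken from the paper; replace them with the values you find in the reading and cite them.

```python
KJ_PER_KCAL = 4.184  # 1 dietary Calorie (kcal) = 4.184 kJ


def moles_atp_per_day(kcal_per_day, efficiency=0.4, dg_atp_kj_per_mol=50.0):
    """Estimate moles of ATP synthesized per day from caloric intake.

    efficiency and dg_atp_kj_per_mol are assumed placeholder values;
    substitute the numbers from your own reference.
    """
    energy_kj = kcal_per_day * KJ_PER_KCAL          # daily energy intake in kJ
    return energy_kj * efficiency / dg_atp_kj_per_mol


if __name__ == "__main__":
    # Example with a 2000 kcal/day diet and the placeholder defaults.
    print(round(moles_atp_per_day(2000), 1))
```

Changing kcal_per_day to your own intake shows how strongly the answer depends on the two assumed constants.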

_____________EXTRA WEB SITES TO BROWSE__________________

How to read scientific papers – while this says it is for non-scientists it’s actually very good. I will assign mostly review articles for class reading. However, for your paper, you will need to read several primary research papers. You will need to go through many papers to decide which few you will spend the time to read very carefully.
Start with the web sites of interest page.
I suggest you download one of the molecular graphics packages (Chimera, PyMOL, VMD, RasMol, and Swiss PDB Viewer are all OK). I will use PyMOL in class. Chimera may be the easiest. You may also be able to do a fair amount of work using the Java Server at the RCSB site.
Look up amino acids in Wikipedia or Google.
Amino acid web sites: Indiana State (basic chemistry of Biomolecules page).
Look at the protein chapter at Indiana State (Protein structure and analysis).
Find a real cell biology or biochemistry textbook and look up amino acids and protein synthesis.

Papers that connect biology to chemistry
Can biological phenomena be understood by humans? link
The Biological Frontier of Physics. link

Why do cells use a proton gradient?
Review of the physics of proton gradients.
Why proton gradients may have been chosen by life.
Possible evolution of proton motive force



Homework 2: Lecture: Amino acid properties; peptide bonds

Use the Dropbox folder that we made for you to submit homework.  
Due Date : September 10, 2019

Make use of the Introduction to protein structure chapter for this assignment. 

This homework is repetitive – but I want you to begin to get a feel for the properties of the amino acids.
-Look at the amino acids at the RCSB.org site using their visualization tool.
-Download the figure and write your answers on it: figure of amino acids.
(1) Notice each amino acid has 3 labels: its full name (e.g. Glycine), its three-letter name (Gly), and its one-letter name (G).
– On the print out: Label which is S (small); M (medium); and L (large) in size (this is a bit subjective).
-Label which amino acids have acidic and basic side chains.

(2) Draw a valine amino acid (backbone and side chain). Draw a dipeptide Ala-Val. You need to combine 2 amino acids to form a peptide bond. Circle the peptide bond; circle each amino acid; and label each side chain.

(3) What distinguishes hydrophobic and hydrophilic amino acids? Give me the reference you used for the definition.

4a) It has been suggested that the genetic code (a 3-nucleotide word is translated to one amino acid) groups codons (the 3-nucleotide words) so that if a mistake is made in the third letter the mutation will not be so serious. Look at the genetic code – notice 6 amino acids are grouped with the same first two letters. List which amino acids are grouped together (e.g. Phe and Leu both use UU as the first 2 letters). In what ways are the pairs of amino acids using the same first 2 letters in their codon similar, and how are they dissimilar?
4b) Look at the BLOSUM62 matrix at Wikipedia.
The more positive the number along the diagonal the more likely this amino acid is to be conserved. 
List the most and least conserved amino acids and comment on what kinds of residues they are. 
The off-diagonal elements tell the likelihood of one amino acid being exchanged for another during evolution. Positives are more likely, negatives less so. 
– Choose 3 pairs of amino acids and describe what you think of the number. This can be a pair with a number that surprises you or a pair with a number that makes sense to you (e.g. Glu and Gln score +2, a fairly high positive value. They have the same shape, with one of the O atoms in Glu exchanged for an N in Gln. However, Glu is often negatively charged while Gln must be neutral).
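It may help to see where a number in the matrix comes from. BLOSUM scores are log-odds values in half-bit units: they compare how often a pair of amino acids is observed aligned in conserved sequence blocks with how often that pair would be expected by chance. The sketch below shows that calculation. The function name is my own and the frequencies in the example are made up for illustration; they are not taken from the real BLOSUM62 data.

```python
import math


def half_bit_score(observed_pair_freq, expected_pair_freq):
    """Log-odds score in half-bit units, rounded to the nearest integer.

    observed_pair_freq: how often the pair is seen aligned (hypothetical input)
    expected_pair_freq: how often it would be aligned by chance (hypothetical input)
    """
    return round(2 * math.log2(observed_pair_freq / expected_pair_freq))


if __name__ == "__main__":
    # Made-up frequencies: a pair seen twice as often as chance predicts,
    # and a pair seen half as often as chance predicts.
    print(half_bit_score(0.02, 0.01))
    print(half_bit_score(0.005, 0.01))
```

A positive score means the exchange is seen more often than chance predicts, zero means it is seen as often as chance predicts, and a negative score means it is seen less often.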

5a) Search for images of amino acids in Google. There are several figures showing all 20 amino acids. Many sort the amino acids differently. Find 3 different classifications. Record the URLs.
5b) Which residues have a consistent classification?
5c) List 3 amino acids that shift classification and their different classifications. You may want to look at the figure here for a Venn diagram.

6a) Give me a definition of the pK of an acid or base (and the reference you used).
6b) What amino acids have acidic side chains?
6c) Which ones have basic side chains?
6d) What is the pK of the side chain for these acidic and basic groups? (What reference did you use?) [Do not give me the N-terminal or C-terminal pK here.]
6e) Every amino acid has a backbone amino group (the N-terminus) and a carboxylic acid group (the C-terminus); in proline the amino group is a secondary amine. What is the pK of the N-terminus and the C-terminus?
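Once you have a pK, the Henderson-Hasselbalch relation tells you what fraction of a group has lost its proton at a given pH, and from that its average charge. The sketch below is a minimal version for a single isolated group. The function names are my own, and the pK values in the example (4.0 and 10.0) are round placeholder numbers, not the values for any particular amino acid; use the ones from your reference.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of a titratable group that has lost its proton at this pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def mean_charge(ph, pka, is_acid):
    """Average charge of one isolated group.

    An acid is neutral when protonated and -1 when deprotonated.
    A base is +1 when protonated and neutral when deprotonated.
    """
    f = fraction_deprotonated(ph, pka)
    return -f if is_acid else 1.0 - f


if __name__ == "__main__":
    # Placeholder pK values, chosen only to illustrate the calculation.
    print(mean_charge(7.0, 4.0, is_acid=True))
    print(mean_charge(7.0, 10.0, is_acid=False))
```

When the pH equals the pK the group is exactly half deprotonated, which is a quick check on any definition of pK you find.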

(7) Find a table of side chain physical properties (this can be found in some of the web references provided)
7a) What is the molecular volume of an Ala, a Lys, a Val? Is this value for just the side chain or for the whole amino acid? (what reference did you use?). Compare the molecular volume with the hydrophobicity of these residues. Use this hydrophobicity scale.
7b) Look at the Suggested amino acid substitutions. There are 4 groups circled. What properties do the amino acids in each circle have in common? 
7c) Make the following graphs. (1) Surface area vs solubility; (2) Volume vs solubility. Which has a better correlation? Use data at Rockefeller site. If you use another data source give me the reference. Is there a correlation if you look at all amino acids? What if you separate them into different categories?
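To put a number on "which has a better correlation", one common choice is the Pearson correlation coefficient, which runs from -1 to +1. The sketch below computes it for two lists of equal length. The function name is my own, and the numbers in the example are invented placeholders, not amino acid data; fill the lists with the values from your data source. Other measures (for example a rank correlation) would also be reasonable.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented placeholder numbers, not real amino acid properties.
    volume = [1.0, 2.0, 3.0, 4.0]
    solubility = [8.1, 6.2, 3.9, 2.0]
    print(pearson_r(volume, solubility))
```

Running it once on all amino acids and once per category (for example only the hydrophobic ones) gives a direct comparison for the last two questions.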

_____________EXTRA WEB SITES TO LOOK FOR AMINO ACIDS__________________

One source: individual properties of the 20 standard aa (including pKs).
A nice chatty, introduction to amino acid properties. This includes a short film for each one.
Look at the ionic equilibria review page
pH dependence of residue charge: isolated aa 
Membrane proteins. Look at the Experimentally determined hydrophobicity scale
Other pictures for secondary structures.
Helix; hydrogen bonds in helix, strand, and sheets.



Homework 3: Lecture: Our proteins

Use the Dropbox folder that we made for you to submit homework.  
Due Date : September 17, 2019

Papers describing the protein database are by Rose-2010 and Rose-2013
If you are interested in the history of PDB you can see papers by Berman and by Goodsell.
It’s a good time to get a molecular visualization program. Download Chimera or another visualization program. If you are using PyMOL, here is a manual.
Look at the protein databank. Look at the interactive tour and other links on the front page.

1) Find the right protein 
If you have a membrane protein, find your protein at the membrane proteins of known structure web site. (You need to open the site by clicking on the +++ tab.) If you have a motor protein you will need to find it directly in the protein databank.
What 4-letter codes are available for PDB files of your protein in the protein databank?

2) Find some good references
You can use these refs from lecture

–What is the main reference suggested by the membrane proteins of known structure web site?
–Give a reference to a review article for the protein (look at PubMed and search for reviews). Put the paper associated with the PDB file into PubMed and then search for related articles.
–You may need to find different papers for the structure and for the function. For some proteins, you may find a better review for a protein in your group of proteins rather than for your specific protein (e.g. a bacteriorhodopsin review will help you understand halorhodopsin; cytochrome c oxidase may help with quinol oxidase – these connections are noted in the handout sheet for your assigned proteins).
–Give me the URL for a web site that is dedicated to your protein.
Try using PubMed or Google Scholar (or, when on campus, ISI Web of Knowledge or Science Direct).

3) What does your protein do?
–What is the function of your protein? What are the reactants and the products? (This may be chemical A + electrons go to chemical B, or it may be an ion or protons moving from the cytoplasm to the periplasm, or it may be ATP hydrolysis driving protein motion in a motor.)
–Draw a sketch of the reaction (giving me the reference you are using).
–Does the reaction happen in several steps? If yes, what are the intermediates?
–Does the protein store energy in the membrane gradient, or does it use the membrane gradient as its energy source?
A nice review of electron transfer ‘cofactors’. Also useful: Oxidative Phosphorylation at the fin de siecle, and the reviews of mitochondrial proteins from Rich-2010 (on Dropbox as Rich-2010-Essays Biochem.pdf) and from Newmeyer (on Dropbox as Newmeyer-2003-Cell.pdf; more biological).

4) What is important, what is missing and what is extra in your PDB structure file?
–What are the non-amino acid groups (cofactors, reactants, products) needed for the function of your protein? Find this in the review articles. 
–What are the non-amino acids found in the crystal structure? This is in the SMALL MOLECULES: Ligand at the bottom section of the main page for your protein at the protein databank.
–Match up the cofactors/reactants/products needed for function and the groups found in the structure. Do you have extra groups or missing groups?
–How many polypeptide chains should your protein have (look in a review article)? How many do you find in the structure? [Different organisms have a different number of polypeptide subunits. Usually, mammalian proteins have more subunits. If you have a choice, use a smaller protein to start. However, some proteins are missing important subunits. Others have extra proteins attached. Look out for proteins that are co-crystallized with an antibody. You do not want to look at the antibody!]

This is a very important exercise for your project – you will use this information for your project.
(5) Go to The protein data bank (PDB) site.
1a) How many entries are currently in the protein data bank for all proteins?

You have each been given a protein. Find the structures for your protein. Each protein structure has a unique 4-letter code used in all of these sites. If you are unsure what protein I am pointing you to, please ask now! These proteins have multiple names and it can be confusing (thus cytochrome c is not the same as cytochrome c oxidase). 
Look up your protein at different sites. I would start with Wikipedia, then look at:
5a) membrane protein site. This should be the first site you go to. Use this to find PDB ID (4 letters) codes for your protein. There may be multiple copies of the same protein – from different organisms, with different mutations, or with different bound molecules. In the end, you will need to pick one file to work with. What file will you use?
How to choose your PDB file: 
–Resolution: smaller is better. Any structure under 2 Å will be fine, but many membrane protein structures can be 3 or even 4 Å. If that’s the best you can do, use it.
–Ligands: for your project, it is nice to have the right ligands bound. If you have an electron transfer protein there are non-amino acid pieces that are important to your reaction. They are not bound in all structures. You may have an inhibitor bound in the active site. This choice requires some knowledge and you can change the structure you are using later. If you are doing a channel you may find a structure with the ion bound. That is a good thing because this shows a place where the ion is interacting with the protein strongly.
–Number of subunits, too few: Some proteins have many polypeptides that come together to make them work. For example, many ion channels must have 4 polypeptide chains to make the channel, but some of the PDB structures have only one. This one must be combined with 3 others to make the full channel; you need the biological assembly, not the protein in the unit cell.
–Number of subunits, too many: Bacterial proteins often have many fewer subunits than their mammalian cousins. They both work and do the same job. Use the simplest structure possible. It will be much easier to manipulate and to see things.
–Addition of an immunoglobulin peptide: This is added to help crystallization. It is not your protein and should be removed or ignored.
–What’s a quick way to figure out if my structure has too much or too little?
 For many proteins, Wikipedia will have a picture of the working protein with information about the important bits. In the protein databank, there will be a link to a journal article. This will describe the good and bad of this particular structure. You really will need to read this paper carefully. (And you can ask me by email or in person). 

5b) What do you get when you search the protein databank using the name of your protein? How long is the list? Give a few examples of the entries you find. How long is the list if you search by the PDB ID?
5c) At PubMed, use the pull-down at the top of the page (default PubMed), go to Structure, and search by PDB ID.

–If you can’t find your protein in the database then report that to me right away.
–At this point, I’d suggest you bookmark these pages and look at them throughout the semester. They have some similar information and some unique information. And there are big differences in ease of finding things for different proteins.

You can answer these questions from The protein data bank (PDB) site. In each case just type your 4 letter code in and play around. 

For the PDB file for your protein you have chosen:
From the SUMMARY page. 

6a) What is the PDB id? 
6b) What is the ‘source’? What’s the common name for this organism?
6c) How many residues are in your protein?
6d) How many polypeptide chains (subunits) are there (use the sequence tab)?
6e) What is the resolution? Low values of resolution are better. Below 2 Å is not too bad. What does resolution mean?
6f) What non-amino acid groups are bound to this protein?
– List the ligand names and 3-letter codes and indicate if they are important for protein function (a cofactor, a substrate/reactant/inhibitor, or if you don’t know). This is important to get straight now. If you are using Chimera look at ‘select/residue/all nonstandard’ to see all the ligands.
On the Annotations Page
6g) What is the SCOP fold?
From the SEQUENCE Page
6h) How many helices are there? What percent is helical?
From the Sequence Similarity page
6i) How many proteins are 100% similar (identical)? Why might there be multiple copies of the same protein?
6j) How many proteins are on the list of 30% similarity? Tell me what some of them are.
6k) 3D View. Use Ligand View. You will see dashed lines for interesting contacts between the ligand and protein. List the hydrogen bonds (the atoms on the protein and the ligand), the hydrophobic contacts, the pi interactions, and the metal interactions (see bottom right to switch contacts on and off). [If you have many ligands pick one that should be important.]



Homework 4: Lecture: Protein folds and families 

Use the Dropbox folder that we made for you to submit homework.  
Due Date : September 24, 2019



1a) Compare the classification systems of SCOP and CATH (SCOP has Class -> Fold -> Superfamily -> Family; CATH has Class -> Architecture -> Topology -> Homologous superfamily). What levels correspond in the 2 systems? Define what each level describes.
1b) How many topologies are there in CATH? How many folds are in SCOP? 

2) Use the PDB file for your protein you started looking at in homework 3.
2a) Compare the classification of your protein in SCOP and CATH. 
2b) How many domains are found for each chain of your protein?
2c) What other kinds of proteins are found in the same SCOP superfamily?
2d) What other kinds of proteins are found in the same SCOP family?
If you don’t find your protein in the SCOP database then report that, and answer the questions for glycerol-3-phosphate dehydrogenase (2QCU) instead.

-3a) What is a protein domain? (Give a definition from ref A and from another source).
-3b) How do CATH and SCOP handle proteins with multiple domains?
-3c) What are the 2 alternative ways for evolution to occur described in ref A and which one does Chothia support?
-3d) What do protein families have in common? 
-3e) Have all the genomes discussed here in 1992 been sequenced? Give me the URL for the web site for one of them?
-3f) How many proteins are now available at the Protein Data Bank?
-3g) How many protein families did Chothia think there would be in 1992? How good was his prediction? (How many families are there in SCOP?)
-3h) How much does the structure tell us about the function? Give 2 examples described in ref D.

Using a molecular display program (Chimera, PyMOL, VMD), open protein 1YCC.pdb.
-4a) What protein is 1YCC?
-4b) What non-amino acid groups are present in the structure? Which are essential for function?
-4c) Make a picture showing just the backbone in a form that differentiates helix, sheets, and random coil. See if you can get the program to color each type of secondary structure differently.
-4d) Add the HEME group back into the picture. (HEME will have a 3-letter name in the file and you have to use that.) Show the HEME as spheres (or CPK).
-4e) Add the amino acids within 3 Å of the heme to the picture. Show them as sticks or lines. What amino acids are connected to the heme iron? What is the name of the iron in the file?
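If you are using PyMOL, the steps above can be typed at its command line. The sketch below is a small Python helper (my own, not part of PyMOL) that only builds the command text so you can see the pattern; it does not run PyMOL itself. The residue name "XXX" in the example is a deliberate placeholder: you need to find the real 3-letter heme name in the file. The colors are arbitrary choices, and Chimera and VMD use different syntax.

```python
def pymol_commands(pdb_id, heme_resn, cutoff=3.0):
    """Return PyMOL command-line text for a secondary-structure picture.

    heme_resn is the 3-letter residue name of the heme as written in the
    PDB file (the caller must look it up; nothing is assumed here).
    """
    # Whole residues of the protein with any atom within `cutoff` Å of the heme.
    near_heme = f"byres (polymer within {cutoff} of resn {heme_resn})"
    return [
        f"fetch {pdb_id}",                  # download the structure from the PDB
        "hide everything",
        "show cartoon",                     # backbone drawn as helices/strands/loops
        "color red, ss h",                  # helices
        "color yellow, ss s",               # strands (sheets)
        "color green, ss l+''",             # loops and unassigned residues
        f"show spheres, resn {heme_resn}",  # heme as space-filling spheres
        f"show sticks, {near_heme}",        # neighbors of the heme as sticks
    ]


if __name__ == "__main__":
    # "XXX" is a placeholder, not the real heme residue name.
    for line in pymol_commands("1YCC", "XXX"):
        print(line)
```

Paste the printed lines into the PyMOL command line one at a time, after replacing the placeholder, and check what each step changes in the picture.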

References for homework 4a:
A – Evolution of the Protein Repertoire Science (2003) 300 pg 1701
B- One thousand families for the molecular biologist Nature (1992) 357 pg 543-4 
C-CATH-a hierarchic classification of protein domain structures Structure (1997) 5 pg 1093-1108. A more recent paper Dessaily et al.
D-From protein sequence to function Curr Opin Struct Biol (1999) 9 pg 363-376.
E- SCOP JMB reference



6:  Membranes and the difference between membrane and soluble proteins. 

No homework


More information about membrane proteins from Steve White on membrane proteins.
A nice review by Engelman.


7: Powerpoint slide for the structure of your protein.
This is a template for the single slide you should give me. Please make a picture of your protein using Chimera or another program. If you use any pictures from the web or from a paper you MUST include the URL or paper reference in the slide.
Mail me a PowerPoint (.ppt) slide. If you use Keynote, save it in PowerPoint format.
If you don’t own PowerPoint I’d recommend OpenOffice: free Word/Excel/PowerPoint-style programs for Mac/PC/Linux. They work pretty well. Just make sure you export the slide to PowerPoint format.
It will help me if you name your slide LAST-NAME.ppt (i.e. use your last name).