MACROMOLECULAR CHARGE FLIPPING
The charge-flipping algorithm introduced by Oszlányi and Sütő for single crystals in 2004 has been adapted to handle protein crystal diffraction data with the computer program SUPERFLIP. A flow diagram of the procedure is given below.
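The basic cycle can be illustrated with a toy one-dimensional sketch. It is not SUPERFLIP's implementation: it uses a naive Fourier transform on a tiny grid, keeps F(0) as computed, and omits the special treatment of weak reflections. The function names and the grid are illustrative assumptions; `delta` is the flipping threshold.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform; fine for a toy-sized grid."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
        for h in range(n)
    ]


def idft(coefficients):
    """Inverse of dft()."""
    n = len(coefficients)
    return [
        sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coefficients)) / n
        for x in range(n)
    ]


def flip(density, delta):
    """Reverse the sign of every density value below the threshold delta."""
    return [-rho if rho < delta else rho for rho in density]


def charge_flip_cycle(density, observed_moduli, delta):
    """One cycle: flip low density, transform, impose observed moduli, transform back."""
    temporary = dft(flip(density, delta))
    # F(0) is kept as computed in this sketch; other conventions exist.
    constrained = [temporary[0]]
    for g, f_obs in zip(temporary[1:], observed_moduli[1:]):
        # Keep the current phase, replace the modulus by the observed one.
        constrained.append(cmath.rect(f_obs, cmath.phase(g)))
    # For a real input density the result is real up to rounding errors.
    return [value.real for value in idft(constrained)]
```

In practice the cycle is repeated from a random starting density until a convergence criterion is met; the sketch only shows what a single iteration does.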
Two main applications are described:
* ab initio procedure for the determination of protein crystal structures using diffraction data at atomic resolution;
* procedure for heavy atom or anomalous scatterers substructure determination from isomorphous or anomalous differences.
Recent changes:
14 December 2015:
References:
1/. Application of Charge flipping to protein crystallography:
Dumas, C. & van der Lee, A. (2008) «Macromolecular structure solution by charge flipping», Acta Cryst. D64, 864-873.
2/. SUPERFLIP program:
Palatinus, L. & Chapuis, G. (2007). «SUPERFLIP - a computer program for the solution of crystal structures by charge flipping in arbitrary dimensions»,
3/. Review on Charge flipping:
Oszlányi, G. & Sütő, A. (2008). «The charge flipping algorithm», Acta Cryst. A64, 123-134.
4/. Symmetry determination following structure solution in P1:
Palatinus, L. & van der Lee, A. (2008). «Symmetry determination following structure solution in P1», J. Appl. Cryst. 41, 975-984.
SUPERFLIP program and utilities:
We refer to the official SUPERFLIP site at the Department of Structure Analysis, Institute of Physics, Praha and the École Polytechnique Fédérale de Lausanne (EPFL) for source files, documentation, and license agreement.
Download source code or the appropriate binaries for your system => Current Version: 02/13/14 8:48
Source code, executables for MacOSX (Intel) or Windows, GNU-Linux x86 (32-bit statically linked) or GNU-Linux x86-64 (64-bit statically linked)
Uncompress the binary, rename it to superflip, make it executable (chmod +x superflip) and move it to a directory listed in your $PATH (/usr/local/bin or ~/bin are good places).
Macromolecular structures can be solved by SUPERFLIP in two ways:
* by setting up an input file to be used with a user-provided hkl-file and running the superflip program: $ superflip example.inflip
Two examples (input and log files) can be found here:
◊ heavy atom sub-structure solution:
◊ ab initio structure solution at atomic resolution protein.inflip protein.sflog
* by using C-shell scripts (recommended for Linux and Mac OSX): flipsub for heavy-atom substructure solution and fliprot for ab initio structure solution at atomic resolution.
These scripts create the SUPERFLIP input file on the fly using a limited number of command line options.
Install the fliprot and flipsub files in a directory listed in your $PATH and make them executable (chmod +x flipsub fliprot).
If neither csh nor tcsh is installed on your computer, install tcsh: sudo apt-get install tcsh (on Ubuntu) or yum install tcsh (on Fedora, Redhat, Centos).
Various application examples follow here.
Examples of applications and test data
Ab initio protein structure solution at atomic resolution (beyond 1.1 - 1.2 Å)
example 1:
Test data used: pdb code 1mfm
1152 non-H protein atoms, 283 waters & Cd/Cu/Zn atoms in the asymmetric unit, space group P212121, 1.03 Å resolution
Ab initio phasing of superoxide dismutase using charge flipping: C. Dumas & A. van der Lee, Acta Cryst. D64, 864-873
Download 1mfm-sf.cif and 1MFM.pdb from the PDB site and use the structure-factor file as input for the fliprot script.
Command: fliprot 1mfm-sf.cif name=mfm
The procedure optionally asks for the SG number (here SG=19) and the unit cell parameters (if not in the cif file):
CRYST1 from pdb file: 34.99 48.11 81.08 90.0 90.0 90.0
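If the cell parameters have to be typed in by hand, they can be read from the CRYST1 record of the PDB file. The following is a minimal sketch that relies on the fixed column layout of the PDB format; the function name and the small space-group lookup table (covering only the two examples on this page) are illustrative assumptions, and the sample record is rebuilt from the values quoted above rather than copied from the PDB entry.

```python
# Space-group numbers for the two examples on this page only (illustrative table).
SPACE_GROUP_NUMBERS = {"P 21 21 21": 19, "C 1 2 1": 5}


def parse_cryst1(line):
    """Extract cell parameters and space-group symbol from a PDB CRYST1 record."""
    if not line.startswith("CRYST1"):
        raise ValueError("not a CRYST1 record")
    # Fixed columns of the PDB format: a, b, c (9 characters each),
    # alpha, beta, gamma (7 characters each), then the space-group symbol.
    a, b, c = float(line[6:15]), float(line[15:24]), float(line[24:33])
    alpha, beta, gamma = float(line[33:40]), float(line[40:47]), float(line[47:54])
    symbol = line[55:66].strip()
    return {
        "cell": (a, b, c, alpha, beta, gamma),
        "space_group": symbol,
        "number": SPACE_GROUP_NUMBERS.get(symbol),  # None if not in the small table
    }


# Record rebuilt from the cell quoted above (the Z value is left out).
example = "CRYST1{:9.3f}{:9.3f}{:9.3f}{:7.2f}{:7.2f}{:7.2f} {:<11s}".format(
    34.99, 48.11, 81.08, 90.0, 90.0, 90.0, "P 21 21 21"
)
info = parse_cryst1(example)
```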
Annotated log file (typical CPU time 2 to 3 minutes on a 2.4 GHz Intel processor)
Then use the mfm.mtz file for automatic model building (ARP/wARP or Phenix.AutoBuild).
The quality of the phase determination by CFA can be evaluated by superimposing the resulting map on the reference model (1mfm.pdb), as shown in this figure.
Typically, the correct enantiomorph gives an overall correlation coefficient of about CC=0.8. Use the following PHENIX commands to compute it for both phase sets (PHIcf and PHIcfi):
phenix.get_cc_mtz_pdb mfm.mtz 1MFM.pdb any_offset=true labin="FP=Fobs PHIB=PHIcf"
phenix.get_cc_mtz_pdb mfm.mtz 1MFM.pdb any_offset=true labin="FP=Fobs PHIB=PHIcfi"
Display mfm.map and offset.pdb using COOT or CHIMERA.
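The enantiomorph test boils down to comparing two correlation coefficients. The sketch below shows that comparison on plain lists of density values sampled on the same grid; it is an illustration only and, unlike phenix.get_cc_mtz_pdb, does not search for an origin shift. The function names are assumptions; the labels PHIcf and PHIcfi are those used in the commands above.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient between two equally sampled maps."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("maps must have the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant map has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def pick_enantiomorph(reference, map_cf, map_cfi):
    """Return the phase label whose map correlates best with the reference."""
    cc_cf = correlation(reference, map_cf)
    cc_cfi = correlation(reference, map_cfi)
    return ("PHIcf", cc_cf) if cc_cf >= cc_cfi else ("PHIcfi", cc_cfi)
```

The phase set with the clearly higher coefficient is the one to carry forward into model building.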
example 2:
Test data used: pdb code 2anv [PubMed]
2385 non-H atoms, 517 waters & (Sm,I,Mg,SO4) atoms in the asymmetric unit, space group C2
Ab initio phasing of lysozyme from f22 bacteriophage using charge flipping: electron density map at 1.04 Å resolution (C. Dumas & A. van der Lee, Acta Cryst. D64, 864-873)
Download 2anv-sf.cif and 2ANV.pdb from the PDB site and use the structure-factor file as input for the fliprot script.
Command: fliprot 2anv-sf.cif name=anv
Annotated log file (typical CPU time 3 to 5 minutes on a 2.4 GHz Intel processor)
Then use the anv.mtz file for automatic model building (ARP/wARP or Phenix.AutoBuild).
The quality of the phase determination by CFA can be evaluated by superimposing the resulting map on the reference model (2anv.pdb), as shown in this figure.
Typically, the correct enantiomorph gives an overall correlation coefficient of about CC=0.8. Use the following PHENIX commands to compute it for both phase sets (PHIcf and PHIcfi):
phenix.get_cc_mtz_pdb anv.mtz 2ANV.pdb any_offset=true labin="FP=FP PHIB=PHIcf"
phenix.get_cc_mtz_pdb anv.mtz 2ANV.pdb any_offset=true labin="FP=FP PHIB=PHIcfi"
Display anv.map and offset.pdb using COOT or CHIMERA.
Heavy atom or anomalous scatterers substructure determination
This command is used to solve the heavy-atom substructure using anomalous diffraction data from the sad.mtz file (the anomalous differences correspond to the column labels DANO=label or F1=label1 F2=label2).
The generic-name is used to create the output files (pdb, map and log). The optional parameter 3A means that only data up to 3 Å resolution are used (the default uses all data).
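To make the effect of a resolution cutoff such as 3A concrete, the sketch below computes the d-spacing of a reflection and keeps only reflections at or below the chosen resolution limit. It is restricted to cells with all angles equal to 90° (the general triclinic formula is longer), the function names are illustrative, and the cell used in the usage note is the 1mfm cell quoted earlier on this page, chosen purely as an example.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """d-spacing in Å for a cell with alpha = beta = gamma = 90 degrees."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_squared == 0:
        return math.inf  # the (0, 0, 0) term has no finite d-spacing
    return 1.0 / math.sqrt(inv_d_squared)


def apply_resolution_cutoff(reflections, cell_lengths, d_min):
    """Keep the (h, k, l) triples whose d-spacing is at least d_min."""
    a, b, c = cell_lengths
    return [
        hkl for hkl in reflections
        if d_spacing_orthogonal(*hkl, a, b, c) >= d_min
    ]
```

For example, with cell lengths (34.99, 48.11, 81.08) and d_min=3.0, the reflection (10, 0, 0) at about 3.5 Å is kept, while (12, 0, 0) at about 2.9 Å is discarded.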
optional keywords (command flipsub -h):
name=HAtest ...... generic name for output files (default flipsub)
2.5A ...... high resolution cutoff (default all input reflections, no resolution cutoff)
conv=4.0 ...... convergence criterion threshold (default 2.5 for peakiness mode and 85.0 for symmetry mode)
norm=wilson ...... normalization of amplitude differences using the Wilson method (default norm=local)
ked=1.15 ...... coefficient for the delta flipping parameter (default 1.25)
weak=0.25 ...... weak reflection threshold (default 0.15)
trial=10 ...... number of repeated trials (default 5)
maxcycl=3000 ...... maximum number of cycles per trial (default 2000)
copies ...... number of NCS copies (used by phenix.autosol) (default copies=1)
seq_file=prot.fasta ...... protein sequence file (used by phenix.autosol). If not given, a poly-Ala backbone is built.
If necessary, in difficult cases, flipsub automatically explores various combinations of the ked {1.15, 1.2, 1.25} and weak {0.15, 0.2, 0.3} parameters, and also tries several resolution cutoffs. In this case, add the keyword full:
flipsub sad.mtz F1="F(+)" F2="F(-)" name=CFA4 full
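The full keyword runs this search automatically. To see what the parameter grid amounts to, or to run selected combinations by hand, the following sketch builds one flipsub command line per (ked, weak) pair using the name=, ked= and weak= keywords listed above. The run-name scheme and the function name are hypothetical, and the resolution cutoffs that full also varies are not included.

```python
import itertools
import shlex

# Parameter values quoted in the text above.
KED_VALUES = (1.15, 1.2, 1.25)
WEAK_VALUES = (0.15, 0.2, 0.3)


def grid_commands(mtz_file, base_name, column_args):
    """Build one shell-quoted flipsub command per (ked, weak) combination."""
    commands = []
    for ked, weak in itertools.product(KED_VALUES, WEAK_VALUES):
        run_name = f"{base_name}_k{ked}_w{weak}"  # hypothetical naming scheme
        args = ["flipsub", mtz_file, *column_args,
                f"name={run_name}", f"ked={ked}", f"weak={weak}"]
        # shlex.join quotes arguments such as F1=F(+) so the shell leaves them intact.
        commands.append(shlex.join(args))
    return commands


commands = grid_commands("sad.mtz", "CFA4", ["F1=F(+)", "F2=F(-)"])
```

Each string in commands can be pasted into a terminal or passed to a job scheduler; comparing the resulting log files shows which combination converged.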
SUPERFLIP log file ...... generic-name.sflog
flipsub log file ...... generic-name.log
Phenix.autosol log file ...... generic-name-autosol.log
directory for the Phenix.autosol wizard ...... AutoSol_run_#
The resulting coordinate file generic-name-au.pdb or generic-name-au.ha can be used as input file for your favourite phasing program (SHARP, PHENIX (Autosol/Phaser-EP), CCP4).
Typically, edit the xxx-au.pdb file (or the xxx-au.ha file, in fractional units) to select the appropriate number of heavy-atom sites in the asymmetric unit and remove non-significant sites.
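This editing step can also be scripted. The sketch below keeps the N strongest sites of a PDB-style file and leaves all other records untouched. It assumes, without confirmation from the flipsub documentation, that the sites are written as standard fixed-column ATOM/HETATM records and that the occupancy column reflects the relative strength of each site; check your own xxx-au.pdb file before relying on it. The helper format_site only builds sample records for the illustration.

```python
def format_site(serial, element, x, y, z, occupancy, b_factor=20.0):
    """Build a fixed-column HETATM record (sample data for this sketch only)."""
    return (f"HETATM{serial:5d} {element:<4s} {'HAT':>3s} A{serial:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{b_factor:6.2f}")


def keep_strongest_sites(pdb_lines, n_sites):
    """Keep the n_sites records with the highest occupancy; pass other lines through."""
    site_indices = [i for i, line in enumerate(pdb_lines)
                    if line.startswith(("ATOM  ", "HETATM"))]
    # Occupancy occupies columns 55-60 of an ATOM/HETATM record.
    ranked = sorted(site_indices,
                    key=lambda i: float(pdb_lines[i][54:60]), reverse=True)
    kept = set(ranked[:n_sites])
    all_sites = set(site_indices)
    return [line for i, line in enumerate(pdb_lines)
            if i in kept or i not in all_sites]


sample = [
    "REMARK sample substructure (made-up coordinates)",
    format_site(1, "SE", 10.0, 12.5, 3.2, 1.00),
    format_site(2, "SE", 4.1, 30.0, 18.7, 0.20),
    format_site(3, "SE", 22.3, 8.8, 40.1, 0.85),
]
filtered = keep_strongest_sites(sample, 2)
```

The number of sites to keep is still a judgment call: it should match what is expected in the asymmetric unit, and weak sites close to the noise level are the ones to drop.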
Various test datasets for MAD, SAD phasing are available here:
Download sfdata-cynsemet.tgz (AUTOSTRUCT / CCP4 site), untar the archive and use the cynsemet.mtz file as input data for the flipsub script.
Download the SAD dataset 4F7O.mtz (mtz format, 2.6 Å resolution) as input data for the flipsub script:
Command: flipsub 4F7O.mtz DANO="DANO_x1" name=CSN5
The substructure is solved in P1 by SUPERFLIP, using the symmetry score to detect convergence (see log file).
The averaged HA map (best densities # 1, 6, 7 and 9) was used to extract heavy atom sites.
Using Phenix.autosol in flipsub to solve the SAD phase problem and build a model.
Download the CSN5 protein sequence csn5.fasta and restart flipsub using parameters for phenix.autosol: 22 heavy atom sites (sites=22) predicted for two molecules in the asymmetric unit (copies=2), wavelength in Å (lambda=0.9887). The flag solve indicates that the heavy atom sites (bestxx-CSN5-au.pdb file) will be used for SAD phasing and model building (see directory Autosol_x).
Command: flipsub 4F7O.mtz DANO="DANO_x1" name=CSN5p sites=22 atnam=Se copies=2 seq_file=csn5.fasta lambda=0.9887 solve
Other links:
RCSB Protein Data Bank: atomic coordinate files and structure factors of biological macromolecules;
CCP4: software suite for macromolecular crystallography;
SHARP: software suite for experimental phasing of macromolecular crystal structures;
Phenix: software suite for the automated determination of macromolecular crystal structures;
SHELX: software suite for crystal structure determination from single-crystal diffraction data;
Chimera: visualisation of electron density maps;
Coot: visualisation of electron density maps and model building;
Uppsala Software Factory: software for macromolecular crystallography;
ARP/wARP: interpretation of electron density maps and automatic construction of macromolecular models.
Contact information Christian.Dumas @ cbs.cnrs.fr or avderlee @ univ-montp2.fr
Last modifications: December 23, 2015