easYPipe ‘prep’

Important

This is the mandatory first step, in which the data are prepared.

Usage

easypipe.py data prep [-h]

Example:

$ easypipe.py PROCESSED_DATA prep

How should the data be organized?

The data folder (whatever its name) must contain only dataset folders.

Within each dataset folder, the processed data can be organized in several ways:

  • an mtz file directly in the dataset folder
  • an mtz file in a sub-folder, or in a sub-sub-folder and so on, of the dataset folder
  • several processing results for one dataset, provided that they are in different sub-folders
  • if several mtz files are present in the same sub-folder, only one is treated, selected on the basis of templates (from ESRF EDNA processes)
_images/how-data-should-be.jpg
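
To check a data folder against the layout described above before running ‘prep’, a short script can list the mtz files found under each dataset folder. This is a minimal sketch only: the function name find_mtz_files is hypothetical, it is not part of easYPipe, and it does not reproduce the EDNA template selection.

```python
from pathlib import Path


def find_mtz_files(data_dir):
    """Map each dataset folder name to the mtz files found anywhere below it.

    Hypothetical helper for inspecting a data folder; not easYPipe code.
    """
    found = {}
    for dataset in sorted(Path(data_dir).iterdir()):
        if not dataset.is_dir():
            continue  # the data folder is expected to hold only dataset folders
        # rglob walks the dataset folder itself and every nested sub-folder
        found[dataset.name] = sorted(
            p for p in dataset.rglob("*")
            if p.is_file() and p.suffix.lower() == ".mtz"
        )
    return found
```

Calling find_mtz_files("PROCESSED_DATA") returns a dictionary keyed by dataset name. A dataset with an empty list has no mtz file, and several entries sharing the same parent folder point to the case where only one file will be treated.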

Note

Data downloaded with easYGet are already organized in the right tree.

What does it do?

‘prep’ creates an ‘easYPipe’ folder in the directory where it is executed and copies each processed mtz file into a sub-folder of its dataset, as follows:

  • creation of an ‘easYPipe’ treatment directory where it is run
  • creation of a ‘0_processed_datasets’ subdirectory, in which all the dataset folders are created
  • creation of a ‘data’ folder in each dataset folder, into which the processed mtz and log files are copied
  • if a folder contains several mtz files, search for the ‘EDNA’ treatment template and selection of the right mtz file
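
The directory layout described in the steps above can be illustrated with a few lines of Python. This is a sketch under stated assumptions, not the easYPipe implementation: the function name stage_dataset is hypothetical, and the log files are assumed to sit next to the mtz file and to end in .log.

```python
import shutil
from pathlib import Path


def stage_dataset(mtz_path, dataset_name, work_dir="."):
    """Copy one mtz file and neighbouring .log files into the described tree.

    Hypothetical illustration of the layout; not easYPipe code.
    """
    mtz_path = Path(mtz_path)
    # easYPipe/0_processed_datasets/<dataset>/data, as listed in the steps
    target = (
        Path(work_dir) / "easYPipe" / "0_processed_datasets"
        / dataset_name / "data"
    )
    target.mkdir(parents=True, exist_ok=True)
    # Assumption: the relevant log files are in the same folder as the mtz
    to_copy = [mtz_path] + sorted(mtz_path.parent.glob("*.log"))
    for src in to_copy:
        shutil.copy2(src, target / src.name)  # copy2 also keeps timestamps
    return target
```

The function returns the ‘data’ folder it filled, which makes it easy to check the resulting tree by hand.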

Then:

  • launch of xtriage [1] on each mtz file to get the resolution, completeness, space group and cell parameters
_images/processed_datasets_tree_New2.jpg
  • information on the mtz files to be treated is written to the ‘/easypipe/1a_prep/mtz_to_treat_ALL.csv’ file
_images/mtz-to-treat-ALL-csv_NEW.jpg

  • creation of a csv file ‘/easypipe/1c_ligands/ligands_for_datasets.csv’ for future ligand generation with eLBOW [2]

_images/csv-ligands-for-datasets.jpg

You have to fill in the ‘ligand name’ and ‘ligand smiles’ fields before running the easYPipe ‘ligands’ subcommand.

Caution

Save the modified csv file somewhere else, or under another name, if you don’t want it to be overwritten when you launch the ‘prep’ subcommand again.
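
One way to follow this advice is to take a timestamped copy of the edited csv file before running ‘prep’ again. The helper below is a minimal sketch: the name backup_csv and the naming scheme of the copy are assumptions made for illustration, and easYPipe itself provides no such function.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_csv(csv_path, backup_dir=None):
    """Copy csv_path to a timestamped file and return the path of the copy.

    Hypothetical helper; the copy goes next to the original unless
    backup_dir is given.
    """
    csv_path = Path(csv_path)
    dest_dir = Path(backup_dir) if backup_dir else csv_path.parent
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # e.g. ligands_for_datasets_20240131-101500.csv
    dest = dest_dir / f"{csv_path.stem}_{stamp}{csv_path.suffix}"
    shutil.copy2(csv_path, dest)
    return dest
```

Call it with the path to your edited ligands_for_datasets.csv. The original file is left in place, so the copy can be restored by hand if ‘prep’ regenerates the csv.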

You can also run the easYPipe ‘reindex’ subcommand if some mtz files should be in a higher-symmetry space group.

If you are not interested in ligand placement or reindexing, you can directly run the easYPipe ‘launch’ subcommand.