3.6. All the GP parameters

class Frog.class_GP.GlobalParameter[source]

Bases: object

The GlobalParameter object is supposed to hold very basic information for the run. This object will be copied/passed to every thread if a parallel run is performed.

property IS_layer_selection

Type [bool]

Set to True if a layer geometry selection is needed, either for a diagram or for a QM selection. By default set to False.

Note

This attribute is set by Frog and should not be defined by the user.

MD_check_GP()[source]: MD part of the global parameter object (GP). The MD trajectory is opened and some general properties are stored in the GP object.

property MD_convertion_to_angstrom

Type [float]

The conversion factor to apply to express the distances of the MD trajectory in Angstrom. If not provided, Frog assumes that the unit of the MD is already Angstrom.

Example

If you want to rescale your MD trajectory by 0.529177, use:

GP.MD_convertion_to_angstrom = 0.529177

Position used by frog = (Position MD) x (GP.MD_convertion_to_angstrom)

Note

Use a float, not an integer: 1.0 instead of 1, for instance.

Note

There is no strict need to use Angstrom. You just have to be consistent with the values defined everywhere in Frog and in the Molecular library file. Be aware that Angstrom is expected in the current procedure with Dalton.
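
The rescaling rule above can be sketched as follows. The helper name `to_angstrom` is hypothetical and is not part of Frog; it only illustrates the multiplication applied to each coordinate.

```python
def to_angstrom(position_md, conversion=1.0):
    """Scale one MD position (x, y, z) by the conversion factor.

    Hypothetical helper: Frog applies the factor internally, this function
    only mirrors "Position used by Frog = (Position MD) x (conversion)".
    """
    return [coordinate * conversion for coordinate in position_md]


# Same factor as in the example above; the default (1.0) leaves positions unchanged.
position = to_angstrom([1.0, 2.0, 0.0], conversion=0.529177)
```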

property MD_cut_trajectory

Type [boolean]

If set to True, the trajectory is cut into several files so that each file holds one time step. This option reduces the RAM needed. The pieces of trajectory are saved in the GP.dir_mol_times directory. By default True.

Example

GP.MD_cut_trajectory = True

Note

If you deal with a very small trajectory, it may not be worth cutting it: set GP.MD_cut_trajectory to False. Otherwise, cutting is a good safeguard to avoid impacting the initial trajectory. Indeed, the initial trajectory is read at the very beginning of the Frog run to perform the cut, and is not opened again afterwards.

property MD_file_name_topology

Type [string]

The name of the molecular dynamics topology file, used to define which atoms are part of the same molecule. Apart from that, the molecule names, groups, masses and other information are not taken into account. FROG is not designed to handle changes of the topology in time. More information about the supported topology files is available here. Even if the file is located in the working directory, we strongly recommend using a full path:

GP.MD_file_name_topology = '/home/user/MD/mytopology.data'

Warning

Mandatory parameter

property MD_file_name_traj

Type [string]

The name of the molecular dynamics trajectory file. This file should respect the topology defined in GlobalParameter.MD_file_name_topology. More information about the supported trajectory files is available here. Even if the file is located in the working directory, we strongly recommend using a full path:

GP.MD_file_name_traj = '/home/user/MD/mytrajectory.dcd'

Warning

Mandatory parameter

property MD_file_type

Type [string]

The type of topology and trajectory used as input, in GlobalParameter.MD_file_name_topology and GlobalParameter.MD_file_name_traj respectively. The type has to be compatible with the MDAnalysis Python module. More information about the supported MD outputs is available here.

Example

For a LAMMPS output, use:

GP.MD_file_name_topology = '/home/user/MD/mytopology.data'
GP.MD_file_name_traj = '/home/user/MD/mytrajectory.dcd'
GP.MD_file_type = 'LAMMPS'

If you have trouble with these parameters in Frog, you can try to open your topology/trajectory yourself. To open the MD trajectory with MDAnalysis, Frog uses:

u = MDAnalysis.Universe(GP.MD_file_name_topology, GP.MD_file_name_traj, format=GP.MD_file_type)

Note

Do not hesitate to try out topology-trajectory combinations which are not yet presented in the doc, and to send us feedback about any working combination.

Warning

Mandatory parameter

property box_size

Type [float, float, float]

The MD box size at the first time step. Used to print some information to the user.

Should not be defined by the user – would be overwritten in any case.

Note

In Frog, the MD box size is updated for every frame. This variable is only used to print user-friendly information.

check_GP()[source]

This function tests whether the parameters set in the input file are consistent. It explicitly prints the values understood by the software. It also checks that files can be written/opened in several directories.

First it checks how the paths are set by creating a file and/or a directory at the given location. This is done to avoid unpleasant surprises after hours of calculation… Depending on the general options/parameters requested, some attributes/values have to be given or not.

check_all_MT_initialized(L_moleculetype)[source]: Check that every MT has been properly initialized: at least, all the functions defining the key objects have been called in the input file!

property command_launch_job

Type [str]

The command used to launch a submission script in your cluster waiting queue. For instance ‘sbatch’ or ‘qsub’.

cut_trajectory()[source]

property dalton_run_option

Type [str]

Options to add to the dalton command. Default = “” (no option). Please define this attribute only once and write all the options you need at once.

Example

GP.dalton_run_option = "-noarch"
GP.dalton_run_option = "-D"
GP.dalton_run_option = "-t"
GP.dalton_run_option = "-noarch -D -t"

property dir_mol_times

Type [string]

The directory used to store the molecular properties for each time step. These data are not intended to be human-readable. They can become heavy depending on the number of molecules, the number of time steps and the number of analyses required. The path is updated using the GP.general_path attribute. The default value is “Molecule_times”.

Example

If no GP.dir_mol_times is declared, the results are saved at GP.general_path/Molecule_times.

If GP.dir_mol_times = 'Datas/Molecule_in_time', the results are saved at: GP.general_path/Datas/Molecule_in_time

property dir_submission_file

Type [string]

The directory used to store the scripts needed to launch the QM runs on a cluster. A large number of files can be created if many QM simulations are required, but they should not occupy a large amount of space. These scripts are human-readable. The path is updated using the GP.general_path attribute. The default value is “Submission_script”.

Example

If no GP.dir_submission_file is declared, the results are saved at GP.general_path/Submission_script.

If GP.dir_submission_file = 'Dir_sge/', the results are saved at: GP.general_path/Dir_sge

property dir_torun_QM

Type [string]

The directory used to store the QM directories: the input scripts and the results. A large number of directories can be created if many QM simulations are required, and they can thus occupy a large amount of space even though each file (input file or result) is small. These data are human-readable and we strongly recommend having a look at them from time to time. The path is updated using the GP.general_path attribute. The default value is “QM_Simulations”.

Example

If no GP.dir_torun_QM is declared, the results are saved at GP.general_path/QM_Simulations.

If you want to store the QM data at '/scratch/myname/Datas/Frog/Water_QM', set GP.dir_torun_QM = '/scratch/myname/Datas/Frog/Water_QM'.

Note

If possible, we recommend setting this directory to a place where writing and opening files is efficient.

property env_authorised_pbc_condition

Type [list]

Defines which directions can be used to build the environment if needed. By default all 3 directions are assumed to be periodic (PBC).

Example

GP.env_authorised_pbc_condition = [0, 1]

In this case, only the X and Y directions are treated as periodic.

Warning

FROG has been tested for 3D and 2D PBC. If you are using no PBC, you may want to be careful…
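
As an illustration of what restricting the periodic directions means, here is a minimal sketch of a minimum-image wrap applied only along the authorised axes. How Frog actually builds the environment is not described in this section, so the function `minimum_image` and its signature are assumptions made for this example.

```python
def minimum_image(displacement, box_size, pbc_axes=(0, 1, 2)):
    """Wrap a displacement vector only along the axes listed in pbc_axes.

    Hypothetical helper: pbc_axes plays the role of
    GP.env_authorised_pbc_condition (0 = X, 1 = Y, 2 = Z).
    """
    wrapped = list(displacement)
    for axis in pbc_axes:
        length = box_size[axis]
        # Shift by a whole number of box lengths to reach the nearest image.
        wrapped[axis] -= length * round(wrapped[axis] / length)
    return wrapped


# With [0, 1], X and Y are wrapped while Z is left untouched.
vector = minimum_image([9.0, -9.0, 9.0], [10.0, 10.0, 10.0], pbc_axes=[0, 1])
```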

property file_template_script_run_QM

Type [str]

The file used as a template to create every submission script. The lines relative to the QM run are added at the end of the script. Please note that the file name is updated with GP.general_path. The created submission files are written in the GP.dir_submission_file directory.

find_molecule_type(molecule_number)[source]: Return the molecule type name for the given molecule_number. Return an Exception if it was not implemented.

property general_path

Type [string]

The general path used to define all the other directory paths. By default, it is set to the working directory. You may define the directory with or without the final ‘/’:

GP.general_path = '/home/user/MD/My_systeme/Frog_Analysis'
GP.general_path = '/home/user/MD/My_systeme/Frog_Analysis/'

If you do not want this behaviour, for instance because you want to define every directory yourself, you can set GP.general_path to ‘/’. You may also define the other directory attributes with a leading ‘//’. For instance, if you want to store the QM data at /scratch/myname/Datas/Frog/Water_QM:

GP.general_path = '/home/user/MD/My_systeme/Whatever/'
GP.dir_torun_QM = '//scratch/myname/Datas/Frog/Water_QM'

Note

The general_path does not affect the location of the GP.MD_file_name_topology and GP.MD_file_name_traj.
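
The path rules described for GP.general_path can be sketched as below. `resolve_dir` is a hypothetical helper written for this illustration, not a Frog function, and Frog's own handling of edge cases may differ.

```python
import posixpath


def resolve_dir(general_path, directory):
    """Combine general_path and a directory attribute (hypothetical helper)."""
    if directory.startswith('//'):
        # A leading '//' marks a path that does not depend on general_path.
        return '/' + directory.strip('/')
    # Otherwise the directory is appended to general_path,
    # with or without a final '/' on either side.
    return posixpath.join(general_path, directory).rstrip('/')


default_dir = resolve_dir('/home/user/MD/My_systeme/Frog_Analysis', 'Molecule_times')
scratch_dir = resolve_dir('/home/user/MD/My_systeme/Whatever/',
                          '//scratch/myname/Datas/Frog/Water_QM')
```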

initializating_software_friendly_GP(L_moleculetype)[source]

This function updates the GP according to the given attributes/options.

MD analysis & Diagrams

Preparing the molecules set

Preparing the QM parts: Dalton inputs

property layer_nbr_max

Type [int]

If there are several layer geometric selections (for diagrams or QM selection), layer_nbr_max contains the maximal number of layers required. The aim is to perform the layer attribution only once, with this maximal layer number. If some diagrams use fewer layers (for instance 2 instead of 5), then the molecules in deeper layers are assigned to 0 (for instance, a molecule at layer 3 is assigned to 0).

Note

This attribute is set by Frog and should not be defined by the user.
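
The reassignment of deep layers can be pictured with the short sketch below. The function `reduce_layer` is hypothetical; the use of `abs()` assumes that layer indices may be signed, which this section does not state.

```python
def reduce_layer(layer, nbr_layer):
    """Map a layer index computed with layer_nbr_max to a coarser selection.

    Hypothetical helper: layers deeper than nbr_layer are assigned to 0.
    """
    return layer if abs(layer) <= nbr_layer else 0


# A diagram using 2 layers while the attribution was done with 5:
layers_for_diagram = [reduce_layer(layer, 2) for layer in [1, 2, 3, 4, 5]]
```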

layer_radii_MT_definition(L_moleculetype)[source]: Create a new topology file where the radii used by pytim to define the layers are set according to the MT values. The new topology file is saved as: GP.dir_mol_times/frog_topology.pqr

property layer_which_radii_MT

Type [str or dict]

Defines how to set the VDW radii used by pytim to attribute layers to the molecules. Precisely, the radii are passed as the optional argument ‘radii_dict’ of the function pytim.ITIM, see https://marcello-sega.github.io/pytim/guessing_radii.html

Possible values:

‘MD’: No values are set manually. Therefore, pytim tries to use the Gromos 43A1 force field to attribute radii.

‘MT’: default value, uses the values defined in the MT, using the function info_molecule_for_layer of the molecular library file. Note that if this option is used, GP.MD_cut_trajectory is set to True. Indeed, Frog needs to modify the topology file to make this option work.

your_dict: you can directly pass a dictionary, which will be used as pytim.ITIM(u, radii_dict = your_dict). The name of the dictionary does not matter: it just has to be a dictionary so that this option is detected by Frog.

property max_submission_QM

Type [int]

The number of jobs which can be prepared by the software to perform the QM simulations. This option avoids trouble when submitting a lot of jobs to a cluster, for instance sending 100 000 jobs at once…

Using this option, you might not be able to perform all the QM simulations at once. If so, wait until the QM calculations already sent have ended, then re-run the programme to treat the remaining QM simulations and submit them to the cluster.

Example

GP.max_submission_QM = 100

property nbr_job_parr_QM

Type [int]

The number of QM simulations which run at the same time on a server.

For example, set to 1 to have one QM simulation running for every job submitted to the cluster (for single-core CPUs).

Set to 8 to have 8 QM simulations running for every job submitted to the cluster (designed for multicore CPUs). Note that no memory sharing is needed to perform several QM simulations on a single server (OpenMPI is not mandatory), since every QM simulation can be performed independently.

The maximum number of QM simulations which can be launched simultaneously is: nbr_job_parr_QM*max_submission_QM.

Note

Please be aware that the memory needed for every QM simulation can be large: check the RAM available and where you write the temporary Dalton files. For instance, if 100 QM simulations are running at the same time on the cluster, a lot of file reading/writing will occur (temporary Dalton files) and may slow down the cluster. /tmp or a scratch directory should be used for these temporary files, see the scratch_dir variable.

Note

The template used to define how to send jobs to the cluster (GP.file_template_script_run_QM) must match this nbr_job_parr_QM. If the template asks for only 4 cores while nbr_job_parr_QM = 8, you may have some trouble. See the Tutorials.

Example

GP.nbr_job_parr_QM = 16

property nbr_mpi_dalton

Type [int]

The number of MPI processes for each dalton calculation.

Note

Please be aware that the memory needed for every QM simulation can be large: check the RAM available and where you write the temporary Dalton files.

Note

The template used to define how to send jobs to the cluster (GP.file_template_script_run_QM) must match nbr_mpi_dalton. If the template asks for only 4 MPI processes while nbr_mpi_dalton = 8, you may have some trouble. The number of MPI processes requested from the job manager for the job should be (nbr_mpi_dalton * nbr_job_parr_QM).

For SLURM, if one requests one node with 8 MPI processes, the number of MPI tasks per node is given by #SBATCH --ntasks-per-node=8 and the number of nodes is requested with #SBATCH --nodes=1

Alternatively, the total number of tasks can be requested with #SBATCH --ntasks=8. This second possibility WAS NOT TESTED YET!

Example

GP.nbr_mpi_dalton = 4
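
To check that a submission template is consistent with these two attributes, the number of MPI tasks to request can be computed as in the sketch below. The helper `slurm_lines` is hypothetical and only covers the single-node SLURM case mentioned in the note above.

```python
def slurm_lines(nbr_mpi_dalton, nbr_job_parr_QM):
    """Return the #SBATCH lines matching the GP values (hypothetical helper)."""
    # Each of the nbr_job_parr_QM simultaneous Dalton runs uses nbr_mpi_dalton processes.
    tasks = nbr_mpi_dalton * nbr_job_parr_QM
    return [
        "#SBATCH --nodes=1",
        f"#SBATCH --ntasks-per-node={tasks}",
    ]


# 2 simultaneous QM simulations with 4 MPI processes each.
for line in slurm_lines(nbr_mpi_dalton=4, nbr_job_parr_QM=2):
    print(line)
```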

property nbr_parra

Type [integer]

Maximal number of cores on which the first and third parts of the run are parallelized. This number is NOT related to the parallelization of the QM runs. OpenMP is required to perform this parallelization. The default value is 1.

Example

GP.nbr_time_step = 80
GP.nbr_parra = 2

The total number of frames treated is 80, and 2 cores are used. Each core treats 40 frames.

GP.nbr_time_step = 5
GP.nbr_parra = 2

The total number of frames treated is 5, and 2 cores are used. One core treats 3 frames, the other 2.

Note

If GP.nbr_parra > GP.nbr_time_step, the GP.nbr_parra is set to GP.nbr_time_step.

Note

Be aware that the RAM usage may become large, so do not overload your (personal) computer and try a small GP.nbr_parra to start with.

Note

If you use parallelization, it is recommended to cut the MD trajectory, both to avoid overloading the RAM and for the safety of the MD trajectory file (otherwise it would be read by several cores at the same time). Set GP.MD_cut_trajectory to True (default value).

property nbr_repetition_QM_perMT

Type list of [int] of length GP.nbr_type_molecule, ie one int for each MT

The number of repetitions of the bunches of nbr_job_parr_QM QM simulations run at the same time on the server. The parallel command is used to run nbr_job_parr_QM simulations at the same time.

For example, set to 1 to have one bunch of nbr_job_parr_QM QM simulations running for every job submitted to the cluster.

Set to 10 to have (10 * nbr_job_parr_QM) QM simulations in each QM_todo file, i.e. for every job submitted to the cluster.

The maximum number of QM simulations which can be launched simultaneously is NOT affected by nbr_repetition_QM_perMT! It remains nbr_job_parr_QM*max_submission_QM. But the TIME NEEDED for one job to finish is multiplied by nbr_repetition_QM_perMT.

This is particularly useful if the QM calculation times of the available MTs are quite different. Using this attribute, you can send jobs to the cluster which last about the same time, by asking for more QM calculations per job for the small MT and fewer for the large one.

Example

GP.nbr_repetition_QM_perMT = [10, 1]

This requests 10 repetitions for the MT numbered 0 and 1 repetition for the MT numbered 1.
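
The bookkeeping implied by this attribute can be sketched as follows. Both helpers are hypothetical and only restate the arithmetic given above; the number of pending calculations (1000) is an arbitrary value chosen for the illustration.

```python
import math


def qm_per_job(nbr_job_parr_QM, nbr_repetition_QM_perMT):
    """Number of QM calculations packed in one job, for each MT."""
    return [nbr_job_parr_QM * repetition for repetition in nbr_repetition_QM_perMT]


def jobs_needed(nbr_qm_todo, per_job):
    """Number of jobs required to treat nbr_qm_todo calculations."""
    return math.ceil(nbr_qm_todo / per_job)


per_job = qm_per_job(nbr_job_parr_QM=8, nbr_repetition_QM_perMT=[10, 1])
jobs_first_mt = jobs_needed(1000, per_job[0])
```

The number of simulations running simultaneously in one job stays equal to nbr_job_parr_QM; only the length of each job changes.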

property nbr_time_step

Type [integer]

Defines the total number of frames to treat. If set to 0, Frog treats the maximal number of frames possible, depending on the total number of frames available in the MD trajectory and the value of GlobalParameter.trotter_step. The default value is 1.

Example

GP.nbr_time_step = 80
GP.trotter_step = 4

Frog will treat one MD snapshot every 4 available frames up to 80, i.e. frames number 1, 5, 9, …

GP.nbr_time_step = 0
GP.trotter_step = 10

If the MD trajectory contains 100 frames, FROG will treat 10 snapshots separated by 10 frames, i.e. frames 1, 11, …, 91
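
The frame selection can be sketched as below. This is an interpretation rather than Frog's code: it assumes that frames are numbered from 1 and that a non-zero nbr_time_step counts the treated snapshots. The helper `selected_frames` is hypothetical.

```python
def selected_frames(total_frames, nbr_time_step, trotter_step=1):
    """List the frame numbers treated (hypothetical helper, 1-based numbering)."""
    # Keep one frame every trotter_step frames, starting from the first one.
    frames = list(range(1, total_frames + 1, trotter_step))
    if nbr_time_step > 0:
        # A non-zero nbr_time_step limits how many snapshots are kept.
        frames = frames[:nbr_time_step]
    return frames


# nbr_time_step = 0: use as many frames as the trajectory allows.
all_frames = selected_frames(total_frames=100, nbr_time_step=0, trotter_step=10)
```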

property nbr_time_step_core

Type [list]

A list of integers (size of the list = GP.nbr_parra). Each component defines the number of time steps to perform on the corresponding core during a parallelized run (first and third parts).
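
A distribution consistent with the GP.nbr_parra examples (80 frames on 2 cores, 5 frames on 2 cores) can be sketched as follows. `split_frames` is a hypothetical helper; the exact order in which Frog assigns the remaining frames is an assumption. It expects nbr_time_step to be at least 1.

```python
def split_frames(nbr_time_step, nbr_parra):
    """Number of frames treated by each core (hypothetical helper)."""
    # Never use more cores than there are frames.
    nbr_parra = min(nbr_parra, nbr_time_step)
    base, extra = divmod(nbr_time_step, nbr_parra)
    # The first `extra` cores receive one additional frame.
    return [base + 1 if core < extra else base for core in range(nbr_parra)]


nbr_time_step_core = split_frames(nbr_time_step=5, nbr_parra=2)
```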

property nbr_type_molecule

Type [int]

The number of Molecule Type defined by the user – found in the list L_moleculetype.

property pass_first_part

Type [bool]

Defines whether the first part should be performed or not. This attribute has been created to deal with QM calculations.

Default value False: Frog performs the first part and overwrites the previous results.

If set to True, the first part of the run is skipped if possible, i.e. if all the moleculetype files are found in the dir_mol_times directory for every time step required.

Note

Frog does not check whether the input file of the previous calculation used for the restart is the same as the one provided (which has launched this calculation). The MT objects are the ones defined in the previous calculation, while the GP is built from the new parameters. This makes it possible to change the attributes relative to the QM management in the GP parameters.

property preference_functional

Type [list]

Defines which functional should be used when several molecules or different MTs are merged into one QM calculation.

Example

GP.preference_functional = ['FunctionalA', 'FunctionalB']

In this case, if one molecule has ‘FunctionalA’ as functional in its QMParameter and the other has ‘FunctionalB’, the functional used for the group of the 2 molecules will be ‘FunctionalA’.
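
The selection rule of the example can be sketched as below. `pick_functional` is a hypothetical helper; in particular, raising an error when no functional of the group appears in the preference list is an assumption, since this case is not described here.

```python
def pick_functional(functionals, preference_functional):
    """Return the preferred functional among those of the merged molecules."""
    for name in preference_functional:
        # The first entry of the preference list found in the group wins.
        if name in functionals:
            return name
    raise ValueError("no functional of the group is listed in preference_functional")


chosen = pick_functional(['FunctionalB', 'FunctionalA'],
                         ['FunctionalA', 'FunctionalB'])
```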

property redo_QM

Type [string]

If set to ‘redo’, the QM calculation inputs are written in the first part for every molecule which needs a QM calculation. The QM calculations should then be done in the second part, to be read in the third part. This is the default.

If set to ‘do_not_redo’, Frog checks whether the QM calculation of a molecule has already been performed. For a molecule whose QM calculation has to be performed, it tries to open the expected Dalton result file and checks whether the result is readable, meaning the QM calculation ended. If it is, the QM input for this configuration is not written, and the QM calculation is considered as already performed. The results read in the third part are the ones written in this file.

Warning

Frog does not check whether the QM parameters for the target molecule (like the functional used) or the neighborhood are the same in the already available file and in the input parameters. If in doubt, use redo_QM = 'redo'.

Example

GP.redo_QM = 'redo'
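
The decision described above can be summarised by the sketch below. The function name and the `is_finished` callback are hypothetical: how Frog tests that a Dalton result file is readable is not detailed in this section, so that test is left to the caller.

```python
import os


def should_write_qm_input(redo_QM, result_file, is_finished):
    """Tell whether the QM input has to be (re)written (hypothetical helper).

    is_finished is a caller-supplied function taking the result file path and
    returning True if the QM calculation ended properly.
    """
    if redo_QM == 'redo':
        return True
    if redo_QM == 'do_not_redo':
        # Skip the input only if a readable result is already available.
        return not (os.path.isfile(result_file) and is_finished(result_file))
    raise ValueError("redo_QM should be 'redo' or 'do_not_redo'")
```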

property scratch_dir

Type [str]

Defines the directory where the temporary Dalton files are written. Note that this directory is NOT the one where the results of the QM simulations are stored. By default, if the QM simulation ended without an error, these temporary files are deleted.

However, if many QM calculations are running at the same time, a large amount of disk space can be used. Therefore, we recommend using a /tmp or a /scratch directory to perform these calculations.

Currently, the chosen implementation writes the following line in the submission script: < ‘export DALTON_TMPDIR=’ + scratch_dir >.

Therefore, we recommend using scratch_dir = ‘$SCRATCH_DIR’ and defining the variable “SCRATCH_DIR” in the GP.file_template_script_run_QM. This way, you can define very precisely where the temporary files should be written within your (cluster) submission file. You can for instance define an automatic selection within the submission file to choose which scratch directory to use depending on the cluster/node it is run on.
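
The line added to the submission script can be reproduced with the short sketch below. The helper name `dalton_tmpdir_line` is hypothetical; only the string concatenation comes from the description above.

```python
def dalton_tmpdir_line(scratch_dir):
    """Build the export line written in the submission script."""
    return 'export DALTON_TMPDIR=' + scratch_dir


# With the recommended value, the shell variable is expanded when the job runs.
line = dalton_tmpdir_line('$SCRATCH_DIR')
```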

property submit_array_maxjobs

Type [str]

TODO!

property submit_job_array

Type [str]

The command used to launch a submission script in your cluster waiting queue. For instance ‘sbatch’ or ‘qsub’.

property total_number_molecule

Type [integer]

The total number of molecules read in the topology file.

Should not be defined by the user – would be overwritten in any case.

Note

The number of molecules and atoms SHALL remain the same throughout the whole MD trajectory. There is no safeguard to prevent Frog from making mistakes if it is not the case!!!

property total_time_step

Type [integer]

The total number of time steps read in the topology file. To define it, use GP.nbr_time_step and GP.trotter_step.

Should not be defined by the user – would be overwritten in any case.

property trotter_step

Type [integer]

The step between two treated frames (1 means every frame is treated, 2 means only one frame out of 2, etc.). The default is 1. See GlobalParameter.nbr_time_step for more information.