3.6. All the GP parameters
- class Frog.class_GP.GlobalParameter
Bases: object
The GlobalParameter object is meant to hold very basic information about the run. This object is copied/passed to every thread if a parallel run is performed.
- property IS_layer_selection
Type [bool]
Set to True if a layer geometry selection is needed, either for a diagram or for a QM selection. Set to False by default.
Note
This attribute is set by Frog and should not be defined by the user.
- MD_check_GP()
MD part of the global parameter object (GP). The MD trajectory is opened and some general properties are stored in the GP object.
- property MD_convertion_to_angstrom
Type [float]
The conversion factor to apply to express the distances of the MD trajectory in Angstrom. If not provided, Frog assumes that the MD unit is already Angstrom.
Example
If you want to rescale your MD trajectory by 0.529177, use:
GP.MD_convertion_to_angstrom = 0.529177
Position used by Frog = (Position MD) x (GP.MD_convertion_to_angstrom)
Note
Use float not integer – 1.0 instead of 1 for instance.
Note
There is no real need to use Angstrom units. You just have to be consistent with the values defined everywhere in Frog and in the Molecular library file. Be aware that Angstrom units are expected in the current procedure with Dalton.
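The rescaling rule of this property can be sketched in plain Python. The helper name `to_angstrom` and the sample coordinates are illustrative assumptions and are not part of Frog:

```python
# Hypothetical helper illustrating:
# Position used by Frog = (Position MD) x (conversion factor).
BOHR_TO_ANGSTROM = 0.529177  # factor quoted in the example of this property


def to_angstrom(position_md, conversion=1.0):
    """Rescale one (x, y, z) MD position by the conversion factor."""
    return tuple(coordinate * conversion for coordinate in position_md)


# A made-up position stored in Bohr, rescaled to Angstrom.
position_frog = to_angstrom((2.0, 0.0, 4.0), BOHR_TO_ANGSTROM)
print(position_frog)
```

With the default factor of 1.0 the positions are left untouched, which mirrors the case where the MD trajectory is already in Angstrom.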
- property MD_cut_trajectory
Type [boolean]
If set to True, the trajectory is cut into several files so that each file contains one time step. This option is meant to reduce the RAM needed. The pieces of trajectory are saved in the GP.dir_mol_times directory. By default True.
Example
GP.MD_cut_trajectory = True
Note
If you deal with a very small trajectory, it may not be worth cutting it: set GP.MD_cut_trajectory to False. Otherwise, cutting is a good safeguard to avoid impacting the initial trajectory. Indeed, the initial trajectory is read at the very beginning of the Frog run to perform the cut, and is not opened again afterwards.
- property MD_file_name_topology
Type [string]
The name of the molecular dynamics topology file, used to define which atoms are part of the same molecule. Apart from that, the molecule names, groups, masses or other information are not taken into account. FROG is not designed to handle topology files that change over time. More information about the supported topology files is available here. Even if the file is located in the working directory, we strongly recommend using a full (absolute) path:
GP.MD_file_name_topology = '/home/user/MD/mytopology.data'
Warning
Mandatory parameter
- property MD_file_name_traj
Type [string]
The name of the molecular dynamics trajectory file. This file should respect the topology defined in GlobalParameter.MD_file_name_topology. More information about the supported trajectory files is available here. Even if the file is located in the working directory, we strongly recommend using a full (absolute) path:
GP.MD_file_name_traj = '/home/user/MD/mytrajectory.dcd'
Warning
Mandatory parameter
- property MD_file_type
Type [string]
The type of topology and trajectory used as input, in GlobalParameter.MD_file_name_topology and GlobalParameter.MD_file_name_traj respectively. The type used has to be compatible with the MDAnalysis Python module. More information about the supported MD outputs is available here.
Example
For a LAMMPS output, use:
GP.MD_file_name_topology = '/home/user/MD/mytopology.data'
GP.MD_file_name_traj = '/home/user/MD/mytrajectory.dcd'
GP.MD_file_type = 'LAMMPS'
If you have trouble with these parameters in Frog, you can try to open your topology/trajectory yourself. To open the MD trajectory with MDAnalysis, Frog uses:
u = MDAnalysis.Universe(GP.MD_file_name_topology, GP.MD_file_name_traj, format=GP.MD_file_type)
Note
Do not hesitate to try out topology-trajectory combinations that are not yet presented in the documentation, and to send us feedback about any working combination.
Warning
Mandatory parameter
- property box_size
Type [float, float, float]
The first time step MD box size. Used to print some information to the user.
Should not be defined by the user – would be overwritten in any case.
Note
In Frog, the MD box size is updated for every frame. This variable is only used to print user-friendly information.
- check_GP()
This function tests whether the parameters set in the input file are coherent. It explicitly prints the values understood by the software. It also checks that files can be written/opened in several directories.
First, it checks how the paths are set by creating a file and/or a directory at the given location. This is done to avoid unpleasant surprises after hours of calculation… Depending on the general options/parameters requested, some attributes/values have to be given or not.
- check_all_MT_initialized(L_moleculetype)
Check that the MTs have been properly initialized: at least all the functions defining the key objects have been called in the input file!
- property command_launch_job
Type [str]
The command used to launch a submission script in your cluster waiting queue. For instance ‘sbatch’ or ‘qsub’.
- cut_trajectory()
- property dalton_run_option
Type [str]
Options to add to the dalton command. Default = “” (no option). Please define this attribute only once and write all the options you need at once.
Example
GP.dalton_run_option = "-noarch"
GP.dalton_run_option = "-D"
GP.dalton_run_option = "-t"
GP.dalton_run_option = "-noarch -D -t"
- property dir_mol_times
Type [string]
The directory used to store the molecular properties for each time step. These data are not intended to be human-readable. They can become heavy depending on the number of molecules, the number of time steps and the number of analyses required. The path is updated using the GP.general_path attribute. The default value is “Molecule_times”.
Example
If no GP.dir_mol_times is declared, the results are saved at GP.general_path/Molecule_times.
If GP.dir_mol_times = 'Datas/Molecule_in_time', the results are saved at GP.general_path/Datas/Molecule_in_time.
- property dir_submission_file
Type [string]
The directory used to store the scripts needed to launch the QM runs on a cluster. A large number of files can be created if many QM simulations are required, but they should not occupy a large amount of space. These scripts are human-readable. The path is updated using the GP.general_path attribute. The default value is “Submission_script”.
Example
If no GP.dir_submission_file is declared, the results are saved at GP.general_path/Submission_script.
If GP.dir_submission_file = 'Dir_sge/', the results are saved at GP.general_path/Dir_sge.
- property dir_torun_QM
Type [string]
The directory used to store the QM directories: the input scripts and the results. A large number of directories can be created if many QM simulations are required, and they can thus occupy a large amount of space even though each file (input file or result) is small. These data are human-readable and we strongly recommend having a look at them from time to time. The path is updated using the GP.general_path attribute. The default value is “QM_Simulations”.
Example
If no GP.dir_torun_QM is declared, the results are saved at GP.general_path/QM_Simulations.
If you want to store the QM data at “/scratch/myname/Datas/Frog/Water_QM”, you have to set GP.dir_torun_QM = '/scratch/myname/Datas/Frog/Water_QM'.
Note
If possible, we recommend setting this directory to a place where writing/opening files is efficient.
- property env_authorised_pbc_condition
Type [list]
Defines which directions can be used to build the environment if needed. By default, all 3 directions are assumed to be periodic (PBC).
Example
GP.env_authorised_pbc_condition = [0, 1]
In this case, only the X and Y directions are treated as periodic.
Warning
FROG has been tested for 3D and 2D PBC. If you are not using any PBC, you may want to be careful…
- property file_template_script_run_QM
Type [str]
The file used as a template to create every submission script. The lines relative to the QM run are added at the end of the script. Please note that the file name is updated with GP.general_path. The created submission files are written in the GP.dir_submission_file directory.
- find_molecule_type(molecule_number)
Return the molecule type name for the given molecule_number. Raise an exception if it was not implemented.
- property general_path
Type [string]
The general path used to define all the other directory paths. By default, it is set to the working directory. You may define the directory with or without the final ‘/’:
GP.general_path = '/home/user/MD/My_systeme/Frog_Analysis'
GP.general_path = '/home/user/MD/My_systeme/Frog_Analysis/'
If you do not want this behaviour, for instance because you want to define every directory yourself, you can set GP.general_path to ‘/’. You may also define the other directory attributes with a leading ‘//’. For instance, if you want to store the QM data at /scratch/myname/Datas/Frog/Water_QM:
GP.general_path = '/home/user/MD/My_systeme/Whatever/'
GP.dir_torun_QM = '//scratch/myname/Datas/Frog/Water_QM'
Note
The general_path does not affect the location of the GP.MD_file_name_topology and GP.MD_file_name_traj.
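The way a directory attribute is combined with GP.general_path can be sketched as follows. This is a minimal illustration of the rules described for this property, not Frog's actual code: the function name `resolve_dir` is hypothetical, and the handling of the leading ‘//’ is an assumption based on the example of this property.

```python
import posixpath


def resolve_dir(general_path, directory):
    """Sketch: combine a directory attribute with general_path."""
    if directory.startswith('//'):
        # Assumption: a leading '//' marks a path that is NOT prefixed
        # by general_path and is used as an absolute path instead.
        return posixpath.normpath('/' + directory.lstrip('/'))
    # Otherwise the directory is appended to general_path, whether or not
    # general_path ends with a '/'.
    return posixpath.normpath(posixpath.join(general_path, directory))


print(resolve_dir('/home/user/MD/My_systeme/Frog_Analysis', 'Molecule_times'))
print(resolve_dir('/home/user/MD/My_systeme/Whatever/',
                  '//scratch/myname/Datas/Frog/Water_QM'))
```

In this sketch, setting general_path to ‘/’ simply turns every directory attribute into an absolute path.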
- initializating_software_friendly_GP(L_moleculetype)
This function updates the GP according to the given attributes/options.
It covers three steps: MD analysis and diagrams, preparation of the molecule set, and preparation of the QM parts (Dalton inputs).
- property layer_nbr_max
Type [int]
If there are several layer geometric selections (for diagrams or QM selection), layer_nbr_max contains the maximal number of layers required. The aim is to do the layer attribution only once, with this maximal layer number. If some diagrams use fewer layers (for instance 2 instead of 5), then the molecules in deeper layers are assigned to 0 (for instance, a molecule in layer 3 is assigned to 0).
Note
This attribute is set by Frog and should not be defined by the user.
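The reassignment of deep layers described for layer_nbr_max can be illustrated with a small sketch. The function name `layer_for_diagram` is hypothetical, and the sketch assumes that layers are numbered with positive integers starting at 1; Frog's internal numbering may differ.

```python
def layer_for_diagram(layer, nbr_layer_diagram):
    """Map a layer attributed once with layer_nbr_max onto a diagram
    that uses fewer layers (assumes positive layer numbers)."""
    # Layers deeper than what the diagram asks for are assigned to 0.
    return layer if layer <= nbr_layer_diagram else 0


# Layers attributed with layer_nbr_max = 5, seen by a diagram using 2 layers.
print([layer_for_diagram(layer, 2) for layer in (1, 2, 3, 5)])
```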
- layer_radii_MT_definition(L_moleculetype)
Create a new topology file where the radii used by pytim to define the layers are set according to the MT values. The new topology file is saved as: GP.dir_mol_times/frog_topology.pqr
- property layer_which_radii_MT
Type [str or dict]
Defines how to set the VDW radii used by pytim to attribute layers to the molecules. Precisely, the radii are passed as the optional argument ‘radii_dict’ of the function pytim.ITIM, see https://marcello-sega.github.io/pytim/guessing_radii.html
Possible values:
‘MD’: No values are set manually. Therefore, pytim tries to use the Gromos 43A1 force field to attribute radii.
‘MT’: default value, uses the values defined in the MT, using the function info_molecule_for_layer of the molecular library file. Note that if this option is used, GP.MD_cut_trajectory is set to True. Indeed, Frog needs to modify the topology file in order to make this option work.
your_dict: you can directly pass a dictionary, which will be used as pytim.ITIM(u, radii_dict = your_dict). The name of the dictionary does not matter: it just has to be a dictionary so that this option is detected by Frog.
- property max_submission_QM
Type [int]
The number of jobs which can be prepared by the software to perform the QM simulations. This option is meant to avoid trouble when submitting a lot of jobs to a cluster (for instance sending 100 000 jobs to a cluster…).
Using this option, you might not be able to perform all the QM simulations at the same time. If so, you should wait until the QM calculations already sent have ended, then re-run the program in order to treat the remaining QM simulations and resubmit them to the cluster.
Example
GP.max_submission_QM = 100
- property nbr_job_parr_QM
Type [int]
The number of QM simulations which run at the same time on a server.
For example, set to 1 to have one QM simulation running for every job submitted to the cluster (for single-core CPUs).
Set to 8 to have 8 QM simulations running for every job submitted to the cluster (designed to run on multicore CPUs). Note that no memory sharing is needed to perform several QM simulations on a single server (OpenMPI is not mandatory) since every QM simulation can be performed independently.
The maximum number of QM simulations which can be launched simultaneously is: nbr_job_parr_QM*max_submission_QM.
Note
Please be aware that the memory needed for every QM simulation can be large: check the RAM available and where you write the temporary Dalton files. For instance, if 100 QM simulations are running at the same time on the cluster, a lot of file reading/writing will occur (temporary Dalton files) and may slow down the cluster. /tmp or a scratch directory should be used for these temporary files, see the scratch_dir variable.
Note
The template used to define how to send jobs to the cluster (GP.file_template_script_run_QM) must match this nbr_job_parr_QM. If the template asks for only 4 cores while nbr_job_parr_QM = 8, you may have some trouble. See the Tutorials.
Example
GP.nbr_job_parr_QM = 16
- property nbr_mpi_dalton
Type [int]
The number of MPI processes for each dalton calculation.
Note
Please be aware that the memory needed for every QM simulation can be large: check the RAM available and where you write the temporary Dalton files.
Note
The template used to define how to send jobs to the cluster (GP.file_template_script_run_QM) must match nbr_mpi_dalton. If the template asks for only 4 MPI processes while nbr_mpi_dalton = 8, you may have some trouble. The number of MPI processes requested from the job manager for the job should be (nbr_mpi_dalton * nbr_job_parr_QM).
For SLURM, if one requests one node using 8 MPI processes, the number of MPI tasks per node is given by #SBATCH --ntasks-per-node=8 and the number of nodes is requested with #SBATCH --nodes=1
Alternatively, the total number of tasks can be requested with #SBATCH --ntasks=8 This second possibility WAS NOT TESTED YET!
Example
GP.nbr_mpi_dalton = 4
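To check that a submission template is consistent with nbr_mpi_dalton and nbr_job_parr_QM, the number of MPI tasks to request can be computed by hand. The sketch below assumes a single-node SLURM job; the function name `slurm_task_lines` is hypothetical and only prints the two directives mentioned in the note of this property.

```python
def slurm_task_lines(nbr_mpi_dalton, nbr_job_parr_QM):
    """Return SLURM directives requesting enough MPI tasks on one node."""
    # Total MPI processes for one job = nbr_mpi_dalton * nbr_job_parr_QM.
    total_tasks = nbr_mpi_dalton * nbr_job_parr_QM
    return [
        "#SBATCH --nodes=1",
        f"#SBATCH --ntasks-per-node={total_tasks}",
    ]


# 2 QM simulations at the same time, each using 4 MPI processes.
for line in slurm_task_lines(nbr_mpi_dalton=4, nbr_job_parr_QM=2):
    print(line)
```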
- property nbr_parra
Type [integer]
Maximal number of cores on which the first and third parts of the run are parallelized. This number is NOT related to the parallelization of the QM runs. OpenMP is required to perform this parallelization. The default value is 1.
Example
GP.nbr_time_step = 80
GP.nbr_parra = 2
The total number of frames treated is 80, and 2 cores are used. Each core treats 40 frames.
GP.nbr_time_step = 5
GP.nbr_parra = 2
The total number of frames treated is 5, and 2 cores are used. One core treats 3 frames, the other 2.
Note
If GP.nbr_parra > GP.nbr_time_step, the GP.nbr_parra is set to GP.nbr_time_step.
Note
Be aware that the RAM usage may become large, so do not overload your (personal) computer and try small GP.nbr_parra values to start with.
Note
If you use parallelization, it is recommended to cut the MD trajectory in order not to overload the RAM and for safety (of the MD trajectory file, which would otherwise be read by several cores at the same time). Set GP.MD_cut_trajectory to True (default value).
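The distribution of frames over the cores given in the examples of this property can be reproduced with a short sketch. The function name `split_time_steps` is hypothetical, and the sketch assumes nbr_time_step is at least 1 (the frame count is already resolved); the way Frog orders the cores may differ.

```python
def split_time_steps(nbr_time_step, nbr_parra):
    """Return the number of frames treated by each core."""
    # nbr_parra is reduced to nbr_time_step when more cores than frames are asked.
    nbr_parra = min(nbr_parra, nbr_time_step)
    base, extra = divmod(nbr_time_step, nbr_parra)
    # The first `extra` cores treat one additional frame.
    return [base + 1 if core < extra else base for core in range(nbr_parra)]


print(split_time_steps(80, 2))
print(split_time_steps(5, 2))
```

The returned list has one entry per core, which is the shape described for GP.nbr_time_step_core.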
- property nbr_repetition_QM_perMT
Type list of [int] of length GP.nbr_type_molecule, i.e. one int for each MT
The number of repetitions of the batches of (nbr_job_parr_QM) QM simulations run at the same time on the server. The parallel command is used to run nbr_job_parr_QM simulations at the same time.
For example, set to 1 to have one batch of nbr_job_parr_QM QM simulations running for every job submitted to the cluster.
Set to 10 to have (10 * nbr_job_parr_QM) QM simulations in each QM_todo file, for every job submitted to the cluster.
The maximum number of QM simulations which can be launched simultaneously is NOT affected by nbr_repetition_QM_perMT! It remains nbr_job_parr_QM*max_submission_QM. But the TIME NEEDED for 1 job to finish is multiplied by nbr_repetition_QM_perMT.
This is particularly useful if the QM calculation times for the available MTs are quite different. Using this attribute, you can send jobs to the cluster which last about the same time, by asking for more QM calculations per job for the small MT and fewer for the large one.
Example
GP.nbr_repetition_QM_perMT = [10,1] if one wants 10 repetitions for the MT numbered 0, and 1 repetition for the MT numbered 1
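The relations between nbr_job_parr_QM, max_submission_QM and nbr_repetition_QM_perMT can be checked with a few lines of arithmetic. The helper names below are hypothetical and the numbers are only illustrative:

```python
def max_simultaneous_qm(nbr_job_parr_QM, max_submission_QM):
    """QM simulations that can run at once; independent of the repetitions."""
    return nbr_job_parr_QM * max_submission_QM


def qm_per_job(nbr_job_parr_QM, nbr_repetition_QM_perMT):
    """QM simulations listed in one job, for each MT."""
    return [repetition * nbr_job_parr_QM for repetition in nbr_repetition_QM_perMT]


print(max_simultaneous_qm(nbr_job_parr_QM=8, max_submission_QM=100))
print(qm_per_job(nbr_job_parr_QM=8, nbr_repetition_QM_perMT=[10, 1]))
```

Here a job for the MT numbered 0 contains ten times more QM simulations than a job for the MT numbered 1, so it is expected to last about ten times longer for an equal cost per simulation.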
- property nbr_time_step
Type [integer]
Defines the total number of frames to treat. If set to 0, the maximal possible number of frames is treated, depending on the total number of frames available in the MD trajectory and the value of GlobalParameter.trotter_step. The default value is 1.
Example
GP.nbr_time_step = 80
GP.trotter_step = 4
Frog will treat one MD snapshot every 4 available frames, up to 80 snapshots, i.e. frame numbers 1, 5, 9, …
GP.nbr_time_step = 0
GP.trotter_step = 10
If the MD trajectory contains 100 frames, FROG will treat 10 snapshots separated by 10 frames, i.e. frames 1, 11, …, 91
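The frame selection driven by nbr_time_step and trotter_step can be sketched as follows. The function name `frames_to_treat` is hypothetical, frames are assumed to be numbered from 1, and truncating to the frames actually available is an assumption of this sketch rather than documented Frog behaviour.

```python
def frames_to_treat(total_frames_md, nbr_time_step, trotter_step=1):
    """Return the MD frame numbers selected for the analysis."""
    # One frame every `trotter_step` frames, starting from frame 1.
    available = list(range(1, total_frames_md + 1, trotter_step))
    if nbr_time_step == 0:
        # 0 means: treat as many frames as the trajectory allows.
        return available
    return available[:nbr_time_step]


selected = frames_to_treat(total_frames_md=100, nbr_time_step=0, trotter_step=10)
print(len(selected), selected[:3])
```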
- property nbr_time_step_core
Type [list]
A list of integers (size of the list = GP.nbr_parra). Every component defines the number of time steps to perform for each core during a parallelized run (during the first and third parts).
- property nbr_type_molecule
Type [int]
The number of Molecule Type defined by the user – found in the list L_moleculetype.
- property pass_first_part
Type [bool]
Defines whether the first part should be performed or not. This attribute has been created to deal with QM calculations.
Default value False: Frog performs the first part and overwrites the previous results.
If set to True, the first part of the run is skipped if possible, i.e. if all the moleculetype files are found in the dir_mol_times directory for every time step required.
Note
Frog does not check whether the input file provided (which launched this calculation) is the same as the one of the previous calculation used for the restart. The MT objects are the ones defined in the previous calculation, while the GP is the one of the new parameters. This has been done to make it possible to change the attributes relative to the QM management in the GP parameters.
- property preference_functional
Type [list]
Defines which functional should be used in the case where several molecules or different MTs are merged into one QM calculation.
Example
GP.preference_functional = ['FunctionalA', 'FunctionalB']
In this case, if one molecule has ‘FunctionalA’ as functional in its QMParameter and the other has ‘FunctionalB’, the final functional used for the group of the 2 molecules will be ‘FunctionalA’.
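The selection rule can be sketched as a lookup in the preference list. The function name `choose_functional` is hypothetical, and raising an error when no functional is listed is an assumption of this sketch, not documented Frog behaviour.

```python
def choose_functional(preference_functional, functionals_of_molecules):
    """Return the first functional of the preference list used by a molecule."""
    for functional in preference_functional:
        if functional in functionals_of_molecules:
            return functional
    # Assumption: none of the functionals appears in the preference list.
    raise ValueError("No functional of the merged molecules is in the preference list")


preference = ['FunctionalA', 'FunctionalB']
print(choose_functional(preference, ['FunctionalB', 'FunctionalA']))
```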
- property redo_QM
Type [string]
If set to ‘redo’, the QM calculation inputs are written in the first part for every molecule which needs a QM calculation. The QM calculations should then be done in the second part, to be read in the third part. This is the default.
If set to ‘do_not_redo’, Frog tries to check whether the QM calculation for a molecule has already been performed. For a molecule whose QM calculation has to be performed, it tries to open the expected Dalton result file. It checks whether the result is readable, meaning that the QM calculation ended. If it is, the QM input for this configuration is not written, and the QM calculation is considered as already performed. The results read in the third part are the ones written in this file.
Warning
Frog does not check whether the QM parameters for the target molecule (like the functional used) or the neighborhood are the same in the already available file and in the input parameters. If you have a doubt, you should use redo_QM = ‘redo’.
Example
GP.redo_QM = 'redo'
- property scratch_dir
Type [str]
Defines the directory where the temporary Dalton files are written. Note that this directory is NOT the one where the results of the QM simulations are stored. By default, if the QM simulation ended without an error, these temporary files are deleted.
However, if many QM calculations are running at the same time, a large amount of disk space can be used. Therefore, we recommend using a /tmp or a /scratch directory to perform these calculations.
Currently, the chosen implementation is to write in the submission script the line: < ‘export DALTON_TMPDIR=’ + scratch_dir >.
Therefore, we recommend using: scratch_dir = ‘$SCRATCH_DIR’, and defining the variable “SCRATCH_DIR” in the GP.file_template_script_run_QM. This way, you can define very precisely where the temporary files should be written within your (cluster) submission file. You can for instance define an automatic selection within the submission file to choose which scratch directory to use depending on the cluster/node it is run on.
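The line added to the submission script can be previewed with the string concatenation quoted in the description of this property. The function name `dalton_tmpdir_line` is hypothetical; only the concatenation itself comes from the text.

```python
def dalton_tmpdir_line(scratch_dir):
    """Build the line written in the submission script for the Dalton scratch."""
    return 'export DALTON_TMPDIR=' + scratch_dir


# The shell variable is expanded on the cluster node, not by Python.
print(dalton_tmpdir_line('$SCRATCH_DIR'))
```

Since `$SCRATCH_DIR` is written literally, it has to be defined earlier in the submission template so that the shell can expand it when the job starts.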
- property submit_array_maxjobs
Type [str]
TODO!
- property submit_job_array
Type [str]
The command used to launch a submission script in your cluster waiting queue. For instance ‘sbatch’ or ‘qsub’.
- property total_number_molecule
Type [integer]
The total number of molecules read in the topology file.
Should not be defined by the user – would be overwritten in any case.
Note
The number of molecules and atoms SHALL remain the same throughout the whole MD trajectory. There is no safeguard to prevent Frog from making mistakes if it is not the case!!!
- property total_time_step
Type [integer]
The total number of time steps read in the topology file. To define it, use GP.nbr_time_step and GP.trotter_step.
Should not be defined by the user – would be overwritten in any case.
- property trotter_step
Type [integer]
Step between two treated frames: 1 means every frame is treated, 2 means only one frame out of 2, etc. The default is 1. See GlobalParameter.nbr_time_step for more information.