Utils#

Mail: alejandro.martinezleon@uni-saarland.de, ale94mleon@gmail.com
Affiliation: Jochen Hub’s Biophysics Group
Affiliation: Faculty of NS, University of Saarland, Saarbrücken, Germany

class palmiche.utils.tools.CHECK(logfile, lisfile=None)[source]#

This class tests whether a GROMACS simulation ended normally, using the log file. E.g., suppose we have the md.log of a simulation:

# check = CHECK('md.log')
# if check.performance:
#     # The simulation ended normally.
# else:
#     # The simulation crashed.

__init__(logfile, lisfile=None)[source]#
daysdiff(startdatetime=None)[source]#

Compute the difference between the estimated end time read from lisfile and startdatetime.

Parameters:
  • lisfile (path) – [description]

  • startdatetime (datetime) – A date time.

Returns:

Days’ difference.

Return type:

float

estimated_end_datetime()[source]#
Read an md.lis file, looking for the last occurrence of the string that reports the estimated end time, for example:

“step 24940000, will finish Tue Oct 5 10:38:53 2021”

Parameters:

self.lisfile (path) – md.lis, the output of the GROMACS execution.

Returns:

The estimated end of the simulation.

Return type:

datetime
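The parsing step described above can be sketched as follows. This is only an illustration, not palmiche's implementation: the helper name `last_estimated_end`, the regular expression and the `strptime` format are assumptions derived from the sample line shown above, and English day and month names are assumed. The days difference at the end uses an assumed sign convention (estimated end minus start).

```python
import re
from datetime import datetime

# Assumed pattern, derived from the sample line
# "step 24940000, will finish Tue Oct 5 10:38:53 2021".
FINISH_RE = re.compile(
    r"will finish (\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})"
)


def last_estimated_end(text):
    """Return the last estimated end time found in text, or None (hypothetical helper)."""
    matches = FINISH_RE.findall(text)
    if not matches:
        return None
    # Whitespace in the format string matches one or more whitespace characters.
    return datetime.strptime(matches[-1], "%a %b %d %H:%M:%S %Y")


lis_text = (
    "step 100, will finish Mon Oct 4 09:00:00 2021\n"
    "step 24940000, will finish Tue Oct 5 10:38:53 2021\n"
)
end = last_estimated_end(lis_text)

# Difference in days with respect to an arbitrary start time (assumed convention).
start = datetime(2021, 10, 4, 10, 38, 53)
days = (end - start).total_seconds() / 86400
```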

class palmiche.utils.tools.CTE[source]#

Some important physical constants

palmiche.utils.tools.KbT(absolute_temperature)[source]#

Return the value of Kb*T in kJ/mol

Parameters:

absolute_temperature (float) – The absolute temperature in kelvin
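As a rough cross-check of the expected magnitude, Kb*T per mole equals R*T, with R the molar gas constant. The sketch below is not the library's code; the function name `kbt_sketch` and the constant value used (CODATA 2018) are assumptions, and the value stored in CTE may differ in its last digits.

```python
# Molar gas constant R = N_A * k_B in kJ/(mol*K) (CODATA 2018 value, assumed here).
R_KJ_PER_MOL_K = 8.314462618e-3


def kbt_sketch(absolute_temperature):
    """Return R*T in kJ/mol for a temperature in kelvin (hypothetical helper)."""
    return R_KJ_PER_MOL_K * absolute_temperature


value_at_room_temperature = kbt_sketch(298.15)  # roughly 2.48 kJ/mol
```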

class palmiche.utils.tools.Mol2(mol2_path: PathLike | str | bytes)[source]#

Read a mol2 molecule and store the corresponding RDKit molecule in the attribute ‘mol’.

Example

In [1]: from palmiche.utils.tools import Mol2, get_palmiche_data

In [2]: import os

In [3]: get_palmiche_data(file="samples/mol2.tar.gz", out_dir='.')

In [4]: molecule = Mol2('mol2/test.mol2')
The mol was created succesfully!

In [5]: for atom in molecule.mol.GetAtoms():
   ...:     print(atom.GetProp('atom_type'), atom.GetDoubleProp('charge'))
   ...: 
C -0.0681
C -0.0681
H 0.0227
H 0.0227
H 0.0227
H 0.0227
H 0.0227
H 0.0227
AddConformer(ref_mol: Mol) None[source]#

Add the conformation of ref_mol to self.mol. The molecules do not need to have the same atom indexes; palmiche.utils.tools.replace_conformer is called internally.

Parameters:

ref_mol (Chem.rdchem.Mol) – The reference molecule with a conformational state.

ChangeName(mol_name: str) None[source]#

Change the name of the molecule.

Parameters:

mol_name (str) – Code name for the molecule. It is wise to use a code no longer than 3 characters.

__init__(mol2_path: PathLike | str | bytes) None[source]#
read() None[source]#

This function reads the file, creates the RDKit molecule and stores it in the attribute mol.

write(out_file: PathLike | str | bytes) None[source]#

Write the molecule in mol2 format.

Parameters:

out_file (PathLike) – The path to the file.

palmiche.utils.tools.RSS(data, AlignPoint, ref_index_data=0, NumbPoints=None, InterpolationKind='cubic')[source]#

This function will return a list of Residues Squared Sums.

Parameters:
  • data (list) – A list of np.arrays built such that, for reference = data[0], reference[:,0] gives the independent variable and reference[:,1] the dependent one. The same holds for the other entries. The first entry of this list must be the reference.

  • AlignPoint (_type_) – _description_

  • NumbPoints (int, optional) – _description_. Defaults to 1000.

  • InterpolationKind (str, optional) – _description_. Defaults to ‘cubic’.

Raises:

ValueError – _description_

Returns:

_description_

Return type:

_type_

palmiche.utils.tools.RUS(data, ref_index_data=0, NumbPoints=None, InterpolationKind='cubic')[source]#

RUS stands for Residues Unsigned Sums:

RUS = sum_{i}^{n} (M_i - R_i)

where M_i is the error value of magnitude i for the model and R_i the one for the reference.

  • RUS < 0: the model presents lower errors than the reference (in general).

  • RUS = 0: the model has the same errors as the reference, or they cancel out.

  • RUS > 0: the model presents worse errors than the reference.
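The sum itself can be illustrated with a few lines of Python. This sketch only covers the final summation and assumes the model and reference errors are already sampled at the same points; the real function takes a `data` list and interpolation options that are not reproduced here, and the name `rus_sketch` is hypothetical.

```python
def rus_sketch(model_errors, reference_errors):
    """Sum of signed differences M_i - R_i (hypothetical helper)."""
    if len(model_errors) != len(reference_errors):
        raise ValueError("Both sequences must have the same length.")
    return sum(m - r for m, r in zip(model_errors, reference_errors))


# Differences of -1, 0 and +1 cancel out, so the sum is 0.
cancelled = rus_sketch([1, 2, 3], [2, 2, 2])
# The model has lower errors everywhere, so the sum is negative.
better = rus_sketch([1, 1], [2, 2])
```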

palmiche.utils.tools.angle_between(v1, v2)[source]#

Returns the angle in radians between vectors ‘v1’ and ‘v2’:

# >>> angle_between((1, 0, 0), (0, 1, 0))
# 1.5707963267948966
# >>> angle_between((1, 0, 0), (1, 0, 0))
# 0.0
# >>> angle_between((1, 0, 0), (-1, 0, 0))
# 3.141592653589793
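The documented results can be reproduced with a small pure-Python sketch. It is not necessarily how palmiche computes the angle; the names `unit_vector_sketch` and `angle_between_sketch` are hypothetical, and clamping the dot product before `acos` is a design choice made here to guard against rounding errors.

```python
import math


def unit_vector_sketch(vector):
    """Return the vector scaled to length 1 (hypothetical helper)."""
    norm = math.sqrt(sum(c * c for c in vector))
    return tuple(c / norm for c in vector)


def angle_between_sketch(v1, v2):
    """Return the angle in radians between v1 and v2 (hypothetical helper)."""
    u1 = unit_vector_sketch(v1)
    u2 = unit_vector_sketch(v2)
    dot = sum(a * b for a, b in zip(u1, u2))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot)))


right_angle = angle_between_sketch((1, 0, 0), (0, 1, 0))
same_direction = angle_between_sketch((1, 0, 0), (1, 0, 0))
opposite = angle_between_sketch((1, 0, 0), (-1, 0, 0))
```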
palmiche.utils.tools.aovec(vectors, round=None)[source]#

Return the average oriented vector. Calculate all the angles with respect to the Cartesian axes, average the angles, and then return the unit vector:

ang(vec, x) = mean(ang(vectors_i, x))
ang(vec, y) = mean(ang(vectors_i, y))
ang(vec, z) = mean(ang(vectors_i, z))
vec = (cos(ang(vec, x)), cos(ang(vec, y)), cos(ang(vec, z)))

Parameters:
  • vectors ([type]) – [description]

  • round ([type]) – [description]

Returns:

[description]

Return type:

[type]
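The averaging of angles described above can be sketched in pure Python. This is an illustration under assumptions: the name `aovec_sketch` is hypothetical, the `round` option is omitted, and the final normalization is one reading of "return the unit vector", not a confirmed detail of the library.

```python
import math


def aovec_sketch(vectors):
    """Average the angles to the x, y and z axes and rebuild a unit vector."""
    mean_angles = []
    for axis in range(3):
        angles = []
        for v in vectors:
            norm = math.sqrt(sum(c * c for c in v))
            # Angle between v and the Cartesian axis: acos(v_axis / |v|).
            angles.append(math.acos(max(-1.0, min(1.0, v[axis] / norm))))
        mean_angles.append(sum(angles) / len(angles))
    vec = [math.cos(a) for a in mean_angles]
    # Assumed final step: normalize so the result has length 1.
    norm = math.sqrt(sum(c * c for c in vec))
    return tuple(c / norm for c in vec)


along_x = aovec_sketch([(1, 0, 0), (2, 0, 0)])
diagonal = aovec_sketch([(1, 0, 0), (0, 1, 0)])
```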

palmiche.utils.tools.backoff(file_path)[source]#
Parameters:

file_path (str) – The name or the path of the specific file.

Returns:

None. If the file already exists, a backup is made to ./#{file}.{str(i)}#, where i is an integer.
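A GROMACS-style backup of this kind can be sketched as below. Several details are assumptions, not taken from the source: the name `backoff_sketch`, starting the counter at 1, renaming (rather than copying) the file, and placing the backup next to the original file.

```python
import os
import tempfile


def backoff_sketch(file_path):
    """Rename an existing file to #name.i#, using the first free integer i."""
    if not os.path.exists(file_path):
        return None
    directory, name = os.path.split(file_path)
    i = 1  # assumed starting index
    while os.path.exists(os.path.join(directory, f"#{name}.{i}#")):
        i += 1
    os.rename(file_path, os.path.join(directory, f"#{name}.{i}#"))
    return None


# Demonstration in a temporary directory: back up the same file name twice.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "md.log")
    for _ in range(2):
        open(path, "w").close()
        backoff_sketch(path)
    backups = sorted(os.listdir(tmp))
```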

palmiche.utils.tools.checkrun(user='$USER', partition='deflt')[source]#

Check for the running processes on the cluster.

Returns:

The integers that identify the running processes on the cluster.

Return type:

list of integers

palmiche.utils.tools.cp(src, dest, r=False)[source]#

This function makes use of multiple CPUs of the machine, when available.

Parameters:
  • src (str) – Source path to a directory or a file. Regular expressions (glob-style patterns) are accepted; the glob library is used for that.

  • dest (str) – Destination path.

  • r (bool, optional) – The default is False. If True, directories will also be copied. If False and a directory is given as src, an exception is raised. Another exception is raised if the destination path doesn’t exist.

Return type:

None.

palmiche.utils.tools.get_atom_index(file_path, H_atoms=True)[source]#
Parameters:
  • file_path (str) – The file with the atoms. It can be an itp, gro or pdb file.

  • H_atoms (bool, optional) – The default is True. If True, return the list of all atoms. If False, return only the heavy atoms (non-hydrogens).

Returns:

atom_index – DESCRIPTION.

Return type:

TYPE

palmiche.utils.tools.get_palmiche_data(file: PathLike | str | bytes, out_dir: PathLike | str | bytes = '.') None[source]#

Get data from the data directory of Palmiche

Parameters:
  • file (PathLike) – The path relative to the data directory of Palmiche. For example, to get the amber99sb-star-ildn force field, you must provide GROMACS.ff/amber99sb-star-ildn.ff.tar.gz

  • out_dir (PathLike, optional) – Where the file will be decompressed, by default ‘.’

palmiche.utils.tools.get_top_sections(topology, dictionary=False)[source]#

The dictionary flag is available because, if the topology contains two sections with the same name, only the last one will be saved. In the case that

Parameters:
  • topology (path to a GROMACS topology file) – DESCRIPTION.

  • dictionary (bool, optional) – The default is False. If True, return a dict whose keys are the section names instead of a list. If False, return a list with elements [key, info], …

Returns:

sections – The info of the topology file.

Return type:

list or dict, depending on dictionary
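The difference between the two return forms can be seen with a minimal parser. This sketch is not the library's code: the name `top_sections_sketch` is hypothetical, it takes the topology text rather than a path, it discards lines before the first section, and it does no comment or include handling.

```python
def top_sections_sketch(text, dictionary=False):
    """Split topology text into [key, info] pairs, or a dict keyed by section."""
    sections = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header such as "[ atoms ]".
            sections.append([stripped[1:-1].strip(), []])
        elif sections:
            sections[-1][1].append(line)
    if dictionary:
        # Repeated section names overwrite each other: only the last survives.
        return {key: info for key, info in sections}
    return sections


top_text = "[ atoms ]\n1 C\n[ bonds ]\n1 2\n[ atoms ]\n2 H\n"
as_list = top_sections_sketch(top_text)
as_dict = top_sections_sketch(top_text, dictionary=True)
```

With a repeated `[ atoms ]` section, the list keeps both occurrences while the dict keeps only the second one.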

palmiche.utils.tools.get_vec_COM(conf, tpr, ndx, group1, group2, round=None)[source]#

Take two groups of an index file and return the vector formed by the coordinates of group2 and group1, in this order.

Parameters:
  • conf (path) – the configuration file [gro, pdb, xtc, etc..]

  • tpr (path) – tpr file, or gro, pdb, etc…

  • ndx (path) – index file, need to have the defined groups

  • group1 (str) – name of the first group in the index file.

  • group2 (str) – name of the second group in the index file.

  • round (int or None) – the specification to round.

Returns:

The unit vector formed by group2, group1.

Return type:

np.array

palmiche.utils.tools.job_launch(shell='sbatch', script_name='job.sh')[source]#
Parameters:
  • shell (str, optional) – The default is “sbatch”.

  • script_name (str, optional) – The default is “job.sh”. If a regular expression is provided, the function will execute the first match in alphabetical order. E.g., if job.* is provided and both job.sh and job.bash exist, it will use job.bash.

Returns:

JOBIDs – The IDs of the launched jobs, in case sbatch was used as shell.

Return type:

list of integers.
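The script selection rule can be sketched with glob and sorted. Only the selection is shown, nothing is submitted; the name `pick_script_sketch` is hypothetical and the use of glob-style matching is an assumption based on how patterns are handled elsewhere in this module.

```python
import glob
import os
import tempfile


def pick_script_sketch(script_name="job.sh"):
    """Return the first match of the pattern in alphabetical order, or None."""
    matches = sorted(glob.glob(script_name))
    return matches[0] if matches else None


# Demonstration: with job.sh and job.bash present, job.* selects job.bash.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("job.sh", "job.bash"):
        open(os.path.join(tmp, name), "w").close()
    chosen = os.path.basename(pick_script_sketch(os.path.join(tmp, "job.*")))
```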

palmiche.utils.tools.job_launch_list(job_path_list, shell='sbatch')[source]#

Same as job_launch, but the paths to the jobs are provided.

Parameters:
  • job_path_list (_type_) – _description_

  • shell (str, optional) – _description_. Defaults to “sbatch”.

Returns:

_description_

Return type:

_type_

palmiche.utils.tools.multi_run(commands, nPar, shell=True, executable='/bin/bash')[source]#

Run the given commands, keeping up to nPar of them running at the same time.

Parameters:
  • commands (list) – A list of strings to be run in the specified shell.

  • nPar (int) – How many processes run simultaneously.

  • shell (bool, optional) – Passed to run(). Defaults to True.

  • executable (str, optional) – Passed to run(). Defaults to ‘/bin/bash’.

  • Popen (bool, optional) – belongs to run(). Defaults to False.
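One way to keep a bounded number of shell commands running at once is a thread pool around subprocess.run. This is a sketch of the general technique, not palmiche's implementation; the name `multi_run_sketch` is hypothetical, and capturing the output is a choice made here for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def multi_run_sketch(commands, nPar, shell=True, executable="/bin/bash"):
    """Run commands with at most nPar of them active at the same time."""

    def run_one(command):
        return subprocess.run(
            command,
            shell=shell,
            executable=executable,
            capture_output=True,
            text=True,
        )

    # The pool size limits how many commands run simultaneously;
    # map() returns the results in the order of the input commands.
    with ThreadPoolExecutor(max_workers=nPar) as pool:
        return list(pool.map(run_one, commands))
```

For example, `multi_run_sketch(["echo a", "echo b", "echo c"], nPar=2)` would run at most two of the three commands concurrently and return three `CompletedProcess` objects.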

palmiche.utils.tools.my_guess_types(names)[source]#

Guess the types of the atoms by simply removing the digits from each name. Useful for guessing the types of ligand atoms.

Parameters:

names (list) – a list of atoms names
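Removing the digits from each name can be sketched with a regular expression. The name `guess_types_sketch` is hypothetical, and returning a plain list is an assumption; the actual return type is not documented.

```python
import re


def guess_types_sketch(names):
    """Strip every digit from each atom name (hypothetical helper)."""
    return [re.sub(r"\d", "", name) for name in names]


guessed = guess_types_sketch(["C1", "H12", "CL3", "O"])
```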

palmiche.utils.tools.replace_conformer(mol: Mol, ref_mol: Mol, inplace: bool = True)[source]#

Replace the conformational state of mol with the conformational state of ref_mol, preserving the atom order of mol.

Parameters:
  • mol (Chem.rdchem.Mol) – The molecule to add the conformation

  • ref_mol (Chem.rdchem.Mol) – The molecule with the reference conformation

  • inplace (bool, optional) – If True, mol will be modified in place; if False, a new instance will be returned. By default True.

Returns:

The new instance of mol only if inplace = False; otherwise None is returned.

Return type:

Chem.rdchem.Mol

Raises:

ValueError – If ref_mol does not have a valid conformational state.

palmiche.utils.tools.rm(pattern, r=False)[source]#
Parameters:
  • pattern (str) – Input as for the Unix rm command.

  • r (bool) – The default is False. If True, directories are also deleted.

Return type:

None.

palmiche.utils.tools.sign_change_index(values: list) int[source]#

This function checks whether the sign of the values changes. For example, for the list [-3,-2,0,1,2], the return value will be 2: the sign changes at the value “0”.

Parameters:

values (list) – Iterable of numerical values

Returns:

The index at which the sign changes. If the sign does not change, or values has fewer than 2 elements, None is returned.

Return type:

int
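The documented behaviour can be reproduced with the sketch below. It is one possible reading, not the library's code: zero is treated as having its own sign (as numpy's sign function does), consecutive elements are compared, and the name `sign_change_index_sketch` is hypothetical.

```python
def sign_change_index_sketch(values):
    """Return the first index whose sign differs from the previous element."""

    def sign(x):
        # -1 for negative, 0 for zero, 1 for positive.
        return (x > 0) - (x < 0)

    for i in range(1, len(values)):
        if sign(values[i]) != sign(values[i - 1]):
            return i
    return None


documented_case = sign_change_index_sketch([-3, -2, 0, 1, 2])
no_change = sign_change_index_sketch([1, 2, 3])
too_short = sign_change_index_sketch([5])
```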

palmiche.utils.tools.unit_vector(vector)[source]#

Returns the unit vector of the vector.

palmiche.utils.gmxtrjconv.gmxtrjconv(tpr, conf, xtc, trans=[0, 0, 0], dt=0, vmd=False)[source]#

This function tries to create a nice visualization of the MD simulation. The idea is to test several translations until the desired part of the system is as far as possible from the box edges, so that it is not broken across the periodic boundaries. This “breaking” happens for systems with more than one molecule (e.g. a pentamer) when one of them crosses to the “other side” of the periodic boundary. A single molecule by itself will not break.

Parameters:
  • tpr (str path-like) – tpr file

  • conf (str path-like) – configuration file, pdb or gro extensions

  • xtc (str path-like) – trajectory file

  • trans (list, optional) – the x, y, z translation amounts. Defaults to [0,0,0].

  • dt (int, optional) – The interval of time to be considered. For initial testing it is wise to use a high value, which gives only a few frames of the trajectory. Defaults to 0.

  • vmd (bool, optional) – Try to open vmd so that you can visualize the system during the testing phase. Defaults to False.

palmiche.utils.gmxtrjconv.main()[source]#

This is just the main function of the script, see __doc__ for details

class palmiche.utils.mdp.MDP(type='production', **user_keywords)[source]#

This is a wrapper around the mdp file of GROMACS. The units are those used by GROMACS: time in ps, distance in nm, etc.

__init__(type='production', **user_keywords) None[source]#
annealing1(NumbGroups, temp, heat_fraction=0.25, nstenergy_same_as_nstcalcenergy=False)[source]#

This is useful for AleWeights. Generate the annealing section of the MDP. The annealing strategy is the following: heat the system during heat_fraction*self.time at the beginning, and afterwards cool the system until the end of the simulation. All groups use ‘single’ as the annealing type and the same temperatures. E.g.:

NumbGroups = 2
temp = [200,210,220]
self.time = 25000 ps
heat_fraction = 0.25

OUT:

annealing = ‘single single’
annealing_npoints = ‘5 5’
annealing_time = ‘0 3125 6250 15625 25000 0 3125 6250 15625 25000’
annealing_temp = ‘200 210 220 210 200 200 210 220 210 200’

Parameters:
  • NumbGroups (int) – The number of temperature-coupled groups to be used in the annealing.

  • temp (list) – List of temperatures to use.

  • heat_fraction (float, optional) – How much of the simulation time will be spent in the heating process. Defaults to 0.25.
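The documented example output can be reproduced with the sketch below. It is a reconstruction from that example only: the function name `annealing_single_sketch`, passing the total time as an argument, and the evenly spaced time points within the heating and cooling phases are assumptions, not the method's actual code.

```python
def annealing_single_sketch(numb_groups, temp, total_time, heat_fraction=0.25):
    """Build the annealing strings for a heat-then-cool schedule."""
    n = len(temp)  # needs at least two temperatures
    heat_end = heat_fraction * total_time
    # n evenly spaced points while heating up through temp.
    heating = [heat_end * i / (n - 1) for i in range(n)]
    # n - 1 further points while cooling back down to temp[0].
    cooling = [heat_end + (total_time - heat_end) * i / (n - 1) for i in range(1, n)]
    times = heating + cooling
    temps = list(temp) + list(temp[-2::-1])

    def fmt(x):
        # Print whole numbers without a trailing ".0".
        return str(int(x)) if float(x).is_integer() else str(x)

    one_time = " ".join(fmt(t) for t in times)
    one_temp = " ".join(fmt(t) for t in temps)
    return {
        "annealing": " ".join(["single"] * numb_groups),
        "annealing_npoints": " ".join([str(len(times))] * numb_groups),
        "annealing_time": " ".join([one_time] * numb_groups),
        "annealing_temp": " ".join([one_temp] * numb_groups),
    }


section = annealing_single_sketch(2, [200, 210, 220], 25000, heat_fraction=0.25)
```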

annealing2(GroupBool, temp_list, heating_frac=0.25, cooling_frac=0.5)[source]#

In this case we are able to specify which groups will change the temperature. Generate the annealing section of the MDP. The annealing strategy is the following: heat the system during heating_frac*self.time at the beginning, and afterwards cool the system until cooling_frac*self.time. Then maintain the target temperature. This is useful for equilibration. All groups use ‘single’ as the annealing type and the same temperatures.

To do: improve the example. E.g.:

NumbGroups = 2
temp = [200,210,220]
self.time = 25000 ps
heat_fraction = 0.25

OUT:

annealing = ‘single single’
annealing_npoints = ‘5 5’
annealing_time = ‘0 3125 6250 15625 25000 0 3125 6250 15625 25000’
annealing_temp = ‘200 210 220 210 200 200 210 220 210 200’

Parameters:
  • NumbGroups (int) – The number of temperature-coupled groups to be used in the annealing.

  • temp (list) – List of temperatures to use.

  • heat_fraction (float, optional) – How much of the simulation time will be spent in the heating process. Defaults to 0.25.

annealing3(NumbOfGroups, temp, heating_time=100, constant_temp_time=2000)[source]#

Generate the annealing section of the MDP. The annealing strategy is the following:

# T3            _____
#              |
# T2       ____|
#         |
# T1  ____|

This is used as input of the PandeWeights method. Heat the system during heat_fraction*self.time at the beginning, and afterwards cool the system until the end of the simulation. All groups use ‘single’ as the annealing type and the same temperatures.

To do: improve the example and rename the variable temp to temperatures. E.g.:

NumbGroups = 2
temp = [200,210,220]
self.time = 25000 ps
heat_fraction = 0.25

OUT:

annealing = ‘single single’
annealing_npoints = ‘5 5’
annealing_time = ‘0 3125 6250 15625 25000 0 3125 6250 15625 25000’
annealing_temp = ‘200 210 220 210 200 200 210 220 210 200’

Parameters:
  • NumbGroups (int) – The number of temperature-coupled groups to be used in the annealing.

  • temp (list) – List of temperatures to use.

  • heat_fraction (float, optional) – How much of the simulation time will be spent in the heating process. Defaults to 0.25.

pull(dist2pull, ligands2pull, flat_bottom_init=0.4, flat_bottom_k=500, orient_restrain_init=45, orient_restrain=True, orient_restrain_k=8274.41, orient_restrain_vec=(0, 0, -1), **pullkeywords)[source]#

This generates the pull section with a cylinder flat-bottom restraint and an orientation restraint.

orient_restrain_init = 0.349 # 20 degrees, 20/180*3.141
orient_restrain_k = 1308.36 # in kJ/(mol*rad^2), leads to sigma = 2.5 degrees

Parameters:
  • dist2pull (float) – How far we want to pull, in nm.

  • ligands2pull (list of strings) – The list of names used in the index file for the ligands to pull if you just want to use the function to modify mdp options

  • *args (tuple) – Any possible tuple of two members: (key, value)

Return type:

None.
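The numbers quoted in the description of pull can be cross-checked with a short calculation. The sketch assumes a harmonic orientation restraint V = k*theta^2/2, for which thermal fluctuations have sigma = sqrt(kT/k); the temperature of 300 K and the gas constant value are assumptions, and the name `restraint_sigma_degrees` is hypothetical.

```python
import math

# Molar gas constant in kJ/(mol*K) (CODATA 2018 value, assumed here).
R_KJ_PER_MOL_K = 8.314462618e-3


def restraint_sigma_degrees(k, temperature=300.0):
    """Angular spread in degrees for a harmonic restraint with force constant k.

    k is given in kJ/(mol*rad^2); the temperature in kelvin is an assumption.
    """
    return math.degrees(math.sqrt(R_KJ_PER_MOL_K * temperature / k))


sigma_quoted = restraint_sigma_degrees(1308.36)   # close to 2.5 degrees
sigma_default = restraint_sigma_degrees(8274.41)  # close to 1 degree
init_in_radians = math.radians(20)                # close to 0.349 rad
```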