Many quantum chemistry projects have reached a point where setup and analysis consume more human time than CPU time; for example, it can take all day to set up enough input files to keep the computer busy overnight. Many people use scripts to automatically extract and analyse the data, but few use scripts to generate the coordinates.
Here I show how this can easily be done using the RDKit toolkit. The following Python script adds F and OH substituents to benzene, making all 91 possible combinations of mono- through hexa-substituted molecules.
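The figure of 91 can be checked without RDKit. The sketch below is a brute-force count and not part of the original script: it treats each of the six ring positions as carrying H, F or OH, and counts patterns that differ only by a rotation or reflection of the ring once. The function name count_substitution_patterns is made up for this illustration.

```python
from itertools import product


def count_substitution_patterns(n_sites=6, n_states=3):
    """Count ring decorations that are distinct under rotation and reflection."""
    seen = set()
    for pattern in product(range(n_states), repeat=n_sites):
        variants = []
        for k in range(n_sites):
            rotated = pattern[k:] + pattern[:k]
            variants.append(rotated)        # the six rotations
            variants.append(rotated[::-1])  # the six reflections
        # The smallest variant serves as a canonical label for the pattern
        seen.add(min(variants))
    return len(seen)


total = count_substitution_patterns()  # includes unsubstituted benzene
substituted = total - 1                # drop the all-H pattern
print(total, substituted)
```

The count comes out as 92 patterns in total, i.e. 91 substituted molecules plus benzene itself, which the script also keeps as its first molecule.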
However, the script can easily be changed to do something else by changing parent_smiles and rxn_smarts_list near the top of the script. If you are not familiar with SMILES, start here; there are also plenty of GUIs, such as this, that generate SMILES.
To use Reaction SMARTS you have to learn SMARTS, which can be a bit tricky, but it is a very powerful tool. For example, if you change [cX3;H1:1]>>[*:1]F to [*;H1:1]>>[*:1]F then the program will add F to any atom with one H, including the O of an OH group, which creates the OF substituent. So if you set substitutions = 2, you'll get mono-substituted Ph-OF in addition to mono- and di-substituted Ph-F and Ph-OH.
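One way to see the effect of such a change is to run both rule sets and compare the results. The following is a minimal sketch, assuming a reasonably recent RDKit; the helper enumerate_products is a hypothetical wrapper around the same loop as the script, and the Chem.SanitizeMol call is an addition made here so that the SMILES used for de-duplication are canonical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def enumerate_products(parent_smiles, rxn_smarts_list, substitutions):
    """Return {level: sorted unique SMILES} after applying the rules repeatedly."""
    levels = {0: [Chem.MolToSmiles(Chem.MolFromSmiles(parent_smiles))]}
    reactions = [AllChem.ReactionFromSmarts(s) for s in rxn_smarts_list]
    for i in range(1, substitutions + 1):
        found = set()
        for smiles in levels[i - 1]:
            mol = Chem.MolFromSmiles(smiles)
            for rxn in reactions:
                for products in rxn.RunReactants((mol,)):
                    product = products[0]
                    Chem.SanitizeMol(product)  # recompute valences and aromaticity
                    found.add(Chem.MolToSmiles(product))
        levels[i] = sorted(found)
    return levels


# Rules from the script, and the variant with the relaxed F rule
original = enumerate_products(
    'c1ccccc1', ['[cX3;H1:1]>>[*:1]F', '[cX3;H1:1]>>[*:1]O'], 2)
modified = enumerate_products(
    'c1ccccc1', ['[*;H1:1]>>[*:1]F', '[cX3;H1:1]>>[*:1]O'], 2)

# Molecules that only appear with the relaxed rule
extra = sorted(set(modified[2]) - set(original[2]))
print(len(original[2]), len(modified[2]), extra)
```

With the relaxed rule the second level should contain one additional molecule, the Ph-OF species, on top of the nine di-substituted benzenes.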
Similarly, if you add [cX3;H1:1][cX3;H1:2]>>[c:1]1[c:2]cccc1 to the list (and use substitutions = 2) you'll get un- and mono-substituted naphthalene as well as un-substituted anthracene and phenanthrene.
In my experience, the only thing that limits what I can build with this approach is my understanding of SMARTS. Hope this is of some use to you.

This work is licensed under a Creative Commons Attribution 4.0
from rdkit import Chem
from rdkit.Chem import AllChem

parent_smiles = 'c1ccccc1'
rxn_smarts_list = ['[cX3;H1:1]>>[*:1]F','[cX3;H1:1]>>[*:1]O']

# molecule_substitutions[i] holds the unique molecules with i substituents
molecule_substitutions = {}
molecules = []
mol = Chem.MolFromSmiles(parent_smiles)
molecule_substitutions[0] = [mol]
molecules.append(mol)

substitutions = 6
for i in range(1,substitutions+1):
    molecule_substitutions[i] = []
    smiles_list = []
    # Apply every rule to every molecule from the previous level
    for mol in molecule_substitutions[i-1]:
        for rxn_smarts in rxn_smarts_list:
            rxn = AllChem.ReactionFromSmarts(rxn_smarts)
            new_mols = rxn.RunReactants((mol,))
            for new_mol in new_mols:
                new_smiles = Chem.MolToSmiles(new_mol[0],isomericSmiles=True)
                # Keep a product only if its SMILES has not been seen at this level
                if new_smiles not in smiles_list:
                    smiles_list.append(new_smiles)
                    molecule_substitutions[i].append(Chem.MolFromSmiles(new_smiles))
    molecules += molecule_substitutions[i]

# Write 3D coordinates in sdf format
folder = "/Users/jan/Desktop/"
for i, mol in enumerate(molecules):
    mol = Chem.AddHs(mol)
    AllChem.EmbedMolecule(mol)
    file = folder+"comp"+str(i)+".sdf"
    writer = Chem.SDWriter(file)
    writer.write(mol)
    writer.close()  # flush the file to disk
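For larger libraries it may be worth checking that the 3D embedding actually succeeded before writing a file, since EmbedMolecule returns -1 when it cannot generate coordinates. The following is a sketch of one way to do that, not the original workflow: write_sdf_files, the fixed random seed and the temporary output folder are all choices made for this illustration.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem


def write_sdf_files(smiles_list, folder, seed=42):
    """Embed each molecule in 3D and write it to folder/comp<i>.sdf."""
    paths = []
    for i, smiles in enumerate(smiles_list):
        mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
        # A fixed seed makes the generated coordinates reproducible
        if AllChem.EmbedMolecule(mol, randomSeed=seed) == -1:
            continue  # embedding failed, so there is no geometry to write
        path = os.path.join(folder, "comp" + str(i) + ".sdf")
        writer = Chem.SDWriter(path)
        writer.write(mol)
        writer.close()
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()  # stand-in for a real output folder
written = write_sdf_files(["c1ccccc1", "Oc1ccccc1", "Fc1ccccc1"], out_dir)
print(written)
```

Molecules that fail to embed are skipped here, so the returned list of paths can be compared with the input list to see which ones need attention.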
