Coot-RDKit Integration
This skill provides guidance for using RDKit within Coot’s Python environment for molecular manipulation and visualization.
Key Integration Points
Module Import
Use coot_headless_api (NOT chapi):
import coot_headless_api
from rdkit import Chem
from rdkit.Chem import AllChem
Creating RDKit Molecules from Coot Monomers
import coot_headless_api
import base64
from rdkit import Chem
# Initialize molecules container
molecules = coot_headless_api.molecules_container_t(False) # False = not verbose
# Get monomer from Coot's library
imol = molecules.get_monomer("AMP") # or any other monomer code
# Get RDKit molecule as pickled base64
pickle_base64_str = molecules.get_rdkit_mol_pickle_base64("AMP", imol)
# Decode and create RDKit molecule
pickle_bytes = base64.b64decode(pickle_base64_str)
rdkit_mol = Chem.Mol(pickle_bytes) # Use Chem.Mol(), NOT pickle.loads()
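The decode step can be exercised without a running Coot session. The sketch below assumes the base64 payload wraps RDKit's own binary serialization (the format produced by `Mol.ToBinary()`), which is consistent with `Chem.Mol()` accepting the decoded bytes. The helper names `mol_to_base64` and `mol_from_base64` are hypothetical and are not part of Coot or RDKit.

```python
import base64
from rdkit import Chem

def mol_to_base64(mol):
    """Serialize an RDKit molecule to base64 text (a stand-in for Coot's output)."""
    return base64.b64encode(mol.ToBinary()).decode("ascii")

def mol_from_base64(pickle_base64_str):
    """Decode base64 text and rebuild the molecule with Chem.Mol()."""
    return Chem.Mol(base64.b64decode(pickle_base64_str))

# Round trip on a small molecule (ethanol) instead of a Coot monomer
original = Chem.MolFromSmiles("CCO")
restored = mol_from_base64(mol_to_base64(original))
same_smiles = Chem.MolToSmiles(restored) == Chem.MolToSmiles(original)
```

With a real session, `mol_from_base64` would take the string returned by `get_rdkit_mol_pickle_base64` in place of the stand-in encoder.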
Molecular Manipulation
Atom Substitution
from rdkit import Chem
# Make editable copy
mol_edit = Chem.RWMol(rdkit_mol)
# Replace atom (e.g., phosphorus to sulfur)
for atom in mol_edit.GetAtoms():
    if atom.GetSymbol() == 'P':
        atom.SetAtomicNum(16) # 16 = sulfur
        break
# Convert back to read-only molecule
modified_mol = mol_edit.GetMol()
Chem.SanitizeMol(modified_mol)
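The substitution can also be wrapped in a reusable function. This is a minimal sketch: `replace_first_atom` is a hypothetical helper name, and the dimethylphosphine SMILES is only a small stand-in for a molecule obtained from Coot.

```python
from rdkit import Chem

def replace_first_atom(mol, from_symbol, to_atomic_num):
    """Return a sanitized copy of mol with its first from_symbol atom changed."""
    mol_edit = Chem.RWMol(mol)  # editable copy; the input molecule is untouched
    for atom in mol_edit.GetAtoms():
        if atom.GetSymbol() == from_symbol:
            atom.SetAtomicNum(to_atomic_num)
            break
    else:
        # The loop finished without finding a matching atom
        raise ValueError(f"no {from_symbol} atom found")
    new_mol = mol_edit.GetMol()
    Chem.SanitizeMol(new_mol)  # raises if the edited molecule is chemically invalid
    return new_mol

phosphine = Chem.MolFromSmiles("CPC")
thioether = replace_first_atom(phosphine, "P", 16)  # 16 = sulfur
symbols = sorted(atom.GetSymbol() for atom in thioether.GetAtoms())
```

Raising when no matching atom exists makes a silent no-op visible, which the bare loop would otherwise hide.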
2D Structure Visualization
CRITICAL: Always Regenerate 2D Coordinates
When removing hydrogens or modifying structure, ALWAYS regenerate 2D coordinates:
from rdkit import Chem
from rdkit.Chem import AllChem
# Remove hydrogens (mol is any RDKit molecule, e.g. modified_mol from above)
mol_no_h = Chem.RemoveHs(mol)
# IMPORTANT: Regenerate 2D coords AFTER removing hydrogens
AllChem.Compute2DCoords(mol_no_h)
# Now generate visualization
Generating SVG Diagrams
from rdkit.Chem.Draw import rdMolDraw2D
drawer = rdMolDraw2D.MolDraw2DSVG(400, 400)
drawer.DrawMolecule(mol_no_h)
drawer.FinishDrawing()
svg_string = drawer.GetDrawingText()
# Save without displaying (see Data Handling below)
Data Handling Best Practices
NEVER Display Large String Data
Do NOT return or print large strings (SVG, base64, etc.) as this causes slow response times:
BAD:
svg_data # This displays all the text - SLOW!
GOOD:
# Just save directly without displaying
# (use len() to verify if needed, though even this may not return properly)
Efficient File Writing
Write files directly without displaying content:
From Coot Python:
svg_content = drawer.GetDrawingText()
# Don't display svg_content - just reference it
Then in bash or file creation:
# Use the variable directly without echoing/catting the content
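One way to write the file without echoing its content is a small helper that returns only a character count. This is a sketch using the standard library only; `save_text_quietly`, the output file name, and the placeholder SVG string are assumptions made for illustration.

```python
import os
import tempfile

def save_text_quietly(text, path):
    """Write text to path and return only the number of characters written."""
    with open(path, "w", encoding="utf-8") as handle:
        return handle.write(text)

# Stand-in for drawer.GetDrawingText(); a real SVG is far longer
svg_content = "<svg xmlns='http://www.w3.org/2000/svg'></svg>"
out_path = os.path.join(tempfile.gettempdir(), "ligand_sketch.svg")
chars_written = save_text_quietly(svg_content, out_path)
```

The return value is a short integer, so the large string itself never has to be displayed.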
Coot Python Limitations
Single-Line Return Values Only
Coot’s Python environment only returns values from single-line expressions:
Works:
Chem.MolToSmiles(mol) # Returns SMILES string
Doesn’t return properly:
x = 5
y = 10
x + y # Won't return the value
Workaround - Define function in one call, execute in next:
# Call 1: Define
def my_function():
    x = 5
    y = 10
    return x + y
# Call 2: Execute
my_function() # Now returns 15
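The same define-then-call pattern applies to RDKit work: put the multi-step logic in a function and have it return a short value. The sketch below is illustrative only; `describe` is a hypothetical name, and ethanol stands in for a molecule loaded from Coot.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

# Call 1: define the multi-step logic
def describe(mol):
    """Return a short one-line summary instead of a large data string."""
    formula = rdMolDescriptors.CalcMolFormula(mol)
    return f"{formula}, {mol.GetNumHeavyAtoms()} heavy atoms"

# Call 2: a single expression (assigned here so the result can be checked)
summary = describe(Chem.MolFromSmiles("CCO"))
```

In a Coot session the second call would simply be `describe(rdkit_mol)` on its own line.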
Common Workflows
Modify and Visualize Ligand
- Load monomer from Coot library
- Convert to RDKit molecule
- Make modifications (atom substitution, etc.)
- Remove hydrogens if desired for cleaner diagram
- Regenerate 2D coordinates (critical!)
- Generate SVG diagram
- Save to file without displaying
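The workflow above, from the modification step onward, can be sketched as one function. It assumes the molecule has already been loaded and converted; here a SMILES-derived ethanol with explicit hydrogens stands in for the Coot monomer, and an oxygen-to-sulfur swap replaces the phosphorus example so the sketch runs on a tiny molecule. `modify_and_draw` and the output path are hypothetical names.

```python
import os
import tempfile
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import rdMolDraw2D

def modify_and_draw(mol, svg_path, from_symbol, to_atomic_num):
    """Substitute one atom, strip hydrogens, redo 2D coordinates, save an SVG."""
    mol_edit = Chem.RWMol(mol)
    for atom in mol_edit.GetAtoms():
        if atom.GetSymbol() == from_symbol:
            atom.SetAtomicNum(to_atomic_num)
            break
    modified = mol_edit.GetMol()
    Chem.SanitizeMol(modified)
    mol_no_h = Chem.RemoveHs(modified)
    AllChem.Compute2DCoords(mol_no_h)  # regenerate AFTER removing hydrogens
    drawer = rdMolDraw2D.MolDraw2DSVG(400, 400)
    drawer.DrawMolecule(mol_no_h)
    drawer.FinishDrawing()
    with open(svg_path, "w", encoding="utf-8") as handle:
        handle.write(drawer.GetDrawingText())  # written to disk, never displayed
    return Chem.MolToSmiles(mol_no_h)  # short value that is safe to return

# Stand-in for a monomer fetched from Coot (explicit hydrogens included)
start = Chem.AddHs(Chem.MolFromSmiles("CCO"))
svg_path = os.path.join(tempfile.gettempdir(), "modified_ligand.svg")
smiles = modify_and_draw(start, svg_path, "O", 16)  # 16 = sulfur
```

Returning the SMILES rather than the SVG keeps the response small while still confirming that the modification took effect.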