• RE: Using Python API to AutoEdit my pandas DataFrame


    I don't know whether you can easily read your molecule directly from your pandas DataFrame. Check the supported file formats (https://downloads.ccdc.cam.ac.uk/documentation/API/descriptive_docs/io.html#id2).

    How do you create this dataframe?

    So the most complicated part should be creating a molecule object that the CCDC API understands. After that, it is relatively easy to use functions that mimic the auto-edit structure options and to export to mol2, using for instance:


    mol.add_hydrogens(mode): Add hydrogen atoms to the molecule.

    Parameters: mode – ‘all’ to generate all hydrogens (throws away existing hydrogens) or ‘missing’ to generate hydrogens deemed to be missing.
    Raises: RuntimeError if any heavy atom has no site.
    Raises: RuntimeError if any atoms are of unknown type.
    Raises: RuntimeError if any bonds are of unknown type.




    mol.assign_bond_types(which): Assign bond types to the molecule.

    Parameters: which – may be ‘all’ or ‘unknown’
    Raises: ValueError if an unrecognised which parameter is provided





  • RE: empty ccdc number was returned for some entries in Python API


    Same for me. The Python results should reflect what you would obtain with ConQuest or Mercury, and in both there is no CCDC number for your structures. I do not know why there is a number in WebCSD but not in the 2020.1 CSD (a database update problem?), and I have never tried with a previous version of the CSD.

    Maybe the CCDC staff could help with this question?



  • RE: Get molecules names from crystal.

  • RE: Chirality assessment


    +1, because I also have some issues with chirality assessment.


    For your structures BOBDON, BOSMOM and LAWCOC, I think the problem is related to disorder. If you look carefully at the molecules, you can see that two atoms occupy the same position. For instance, in BOBDON, look at the C3U and N1U atoms in the asymmetric unit (AU), which is half of one molecule: when you reconstruct the whole molecule, because of the disorder, C3 is generated at the position of N1 and N1 at the position of C3. Then, when the routine mol.assign_bond_types() is applied, I think it probably generates 4 bonds for these carbon atoms. You can check this with atom.bonds, e.g. for the C2 atom we have (Bond(Single Atom(Br2) Atom(C2)), Bond(Single Atom(C1) Atom(C2)), Bond(Single Atom(C2) Atom(C3)), Bond(Single Atom(N1B) Atom(C2))).

    Thus it is assigned as a chiral center... Maybe the developers could add an exception for this particular case (when two atoms have the same coordinates?).
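
    Until such an exception exists, one could screen a molecule for atoms sitting on the same coordinates before trusting the chirality result. The following is only a minimal sketch in plain Python: the helper name find_coincident_atoms, the tolerance value and the coordinates are hypothetical, and I assume the (label, coordinates) pairs would be built from the atoms of the molecule under test.

```python
from itertools import combinations
from math import dist


def find_coincident_atoms(atoms, tolerance=0.05):
    """Return pairs of atom labels whose positions lie within `tolerance` (in angstroms)."""
    clashes = []
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
        if dist(xyz_a, xyz_b) <= tolerance:
            clashes.append((label_a, label_b))
    return clashes


# Illustrative coordinates only; they are NOT taken from BOBDON.
atoms = [
    ("C3", (1.000, 2.000, 3.000)),
    ("N1B", (1.000, 2.000, 3.010)),  # almost on top of C3
    ("C2", (2.300, 2.100, 3.400)),
    ("Br2", (3.900, 1.500, 4.200)),
]

print(find_coincident_atoms(atoms))
```

    Any pair reported here would be a hint that the extra "bonds" come from overlapping disorder components rather than from real chemistry.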

    For HODKER, the results are the same in Python and Mercury. The difference you noticed is probably because in Mercury's auto-edit the default bond type assignment is set to "Unknown", whereas in your routine you assign "All" bonds. Compare with mol.assign_bond_types('Unknown').


    In my case, I noticed that many structures containing boron, phosphorus or nitrogen atoms that are detected as chiral in Mercury are not detected as chiral in the Python API, for instance XONPUO, XONMOF and YOWQIM. It seems to occur when mol.add_hydrogens() is applied (it seems to remove some H atoms! For instance, YOWQIM has only 8 hydrogens after mol.add_hydrogens(), instead of 10 without this function...).

    Why is there this difference in behavior between Mercury's "add Missing H" and Python's "mol.add_hydrogens()"?


    Best Regards, 




  • Best way to determine molecular point group?


    What would be the best procedure to determine the molecular point group from CSD data?

    I currently proceed as follows:


    Nevertheless, I feel it is not the best procedure, because the point group of many molecules is incorrectly determined (a lot of them fall into C1). I guess this is due to the coordinates of the molecules in the solid state. How can I add some tolerance on these coordinates?
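
    To make clear what I mean by tolerance, here is a rough sketch in plain Python. It is not a point group determination: it only tests one candidate operation (here inversion through the centroid) and accepts it if every atom lands within a chosen distance of an atom of the same element. The function names, the tolerance values and the coordinates are all hypothetical and not part of the CSD Python API.

```python
from math import dist


def centroid(atoms):
    """Unweighted geometric centre of a list of (element, (x, y, z)) atoms."""
    n = len(atoms)
    return tuple(sum(xyz[i] for _, xyz in atoms) / n for i in range(3))


def apply(matrix, point):
    """Multiply a 3x3 matrix (nested tuples) by a 3-vector."""
    return tuple(sum(matrix[r][c] * point[c] for c in range(3)) for r in range(3))


def is_symmetric(atoms, matrix, tolerance=0.1):
    """True if `matrix`, applied about the centroid, sends every atom to within
    `tolerance` of some atom of the same element."""
    cx, cy, cz = centroid(atoms)
    centred = [(el, (x - cx, y - cy, z - cz)) for el, (x, y, z) in atoms]
    for el, xyz in centred:
        image = apply(matrix, xyz)
        if not any(el == other and dist(image, pos) <= tolerance
                   for other, pos in centred):
            return False
    return True


INVERSION = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))

# A slightly distorted linear O-C-O fragment (illustrative numbers only).
atoms = [
    ("O", (-1.16, 0.00, 0.00)),
    ("C", (0.00, 0.00, 0.00)),
    ("O", (1.18, 0.03, 0.00)),
]

print(is_symmetric(atoms, INVERSION, tolerance=0.1))    # loose tolerance
print(is_symmetric(atoms, INVERSION, tolerance=0.001))  # strict tolerance
```

    With the loose tolerance the small solid-state distortion is ignored and the inversion centre is found; with the strict one it is lost, which is presumably why so many molecules end up in C1.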

    Best Regards, 

  • MolecularDescriptors.rmsd - How to use the "invert" keyword?


    I would like to have more information about the function MolecularDescriptors.rmsd.

    The documentation gives this description:

    static MolecularDescriptors.rmsd(mol1, mol2, atoms=None, overlay=False, exclude_hydrogens=True, with_symmetry=True)

    Return the RMSD of two molecules.

    Both molecules should have the same atoms if atoms is None.

    • atoms – a list of pairs ccdc.molecule.Atom or None
    • overlay – Whether to overlay the molecules before calculating RMSD
    • exclude_hydrogens – Whether all-atom or heavy atom RMSD should be calculated
    • with_symmetry – Whether to allow symmetrical matches
    • invert – if doing molecular overlay first, allow molecular inversion
    • rotate_torsions – if doing molecular overlay first, allow torsion driving to the same set of values



    My question is: how do I use the "invert" keyword? When I tried, I got a TypeError...
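
    One thing I notice is that the quoted signature does not list "invert" or "rotate_torsions", although the parameter list describes them. If the installed function really has that signature, a TypeError for an unexpected keyword argument would be the normal Python behavior. The sketch below shows how this could be checked with the standard inspect module. The function rmsd_stand_in is a hypothetical stand-in that only copies the quoted signature; it is not the real CCDC function, and I have not verified what the real one accepts.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with a keyword argument called `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-in with the signature quoted from the documentation (NOT the real function).
def rmsd_stand_in(mol1, mol2, atoms=None, overlay=False,
                  exclude_hydrogens=True, with_symmetry=True):
    return 0.0


print(accepts_keyword(rmsd_stand_in, "overlay"))
print(accepts_keyword(rmsd_stand_in, "invert"))

try:
    rmsd_stand_in(None, None, overlay=True, invert=True)
except TypeError as err:
    print("TypeError:", err)
```

    Running the same accepts_keyword check on the function from the installed package would show whether "invert" is available in that version.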





  • RE: How to handle errors in script?

    Hi Florian,

    Thank you very much. It is "crystal clear" now!

    I added the code and it's working. 

    Best Regards,

  • How to handle errors in script?


    I am pretty new to the Python language and I am trying to run a script on a large dataset (from a gcg file) using the functions:

    - mol.assign_bond_types('Unknown')

    - comp.normalise_atom_order()

    Sometimes I get the message "IndexError: list index out of range" or a RuntimeError (mainly due to the complexity of the structure, I guess).
    How can I handle it and move on to the next structure when the error occurs?

    Thank you !
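
    For reference, the general Python pattern for this is a try/except inside the loop. Below is a minimal sketch under stated assumptions: process_all and fake_process are hypothetical names, the six-letter identifiers are placeholders rather than real refcodes, and in a real script the `process` function would wrap the calls to mol.assign_bond_types('Unknown') and comp.normalise_atom_order().

```python
def process_all(items, process):
    """Apply `process` to each (name, item) pair, skipping items that raise
    IndexError or RuntimeError and recording why they failed."""
    results = {}
    failures = {}
    for name, item in items:
        try:
            results[name] = process(item)
        except (IndexError, RuntimeError) as err:
            # Record the problem and carry on with the next structure.
            failures[name] = f"{type(err).__name__}: {err}"
    return results, failures


def fake_process(value):
    """Stand-in for the real per-structure work (hypothetical)."""
    if value < 0:
        raise RuntimeError("structure too complex")
    return [10, 20, 30][value]  # raises IndexError when value > 2


entries = [("AAAAAA", 0), ("BBBBBB", -1), ("CCCCCC", 5), ("DDDDDD", 2)]
results, failures = process_all(entries, fake_process)
print(results)
print(failures)
```

    Catching only the two expected exception types, rather than a bare except, keeps genuine bugs in the script visible while still letting the loop reach the end of the dataset.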