What is a good method for extracting a list of interatomic distances between two elements from a folder full of CIFs? The CIFs in their current format do not contain a list of bond lengths, but opening and resaving in Mercury to generate such lists isn't an option since the distances are often longer than would be considered a bond (plus that would take ages).

I'm just wondering if anyone else has had to do this, and whether or not there might be programs/scripts that already exist for this purpose.

Many thanks,


Dear Jack,

I don't have an existing script to do this, but here is a sketch of how you might go about it.

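import glob
import itertools
import os

from ccdc import io, descriptors

directory = '/path/to/directory/of/cifs'

# The two elements of interest (placeholders; change to suit)
elements = sorted(['Cu', 'O'])

cif_files = glob.glob(os.path.join(directory, '*.cif'))

for cif_file in cif_files:
    with io.MoleculeReader(cif_file) as rdr:
        mol = rdr[0]  # first structure in the CIF
        # Visit each unordered pair of atoms once, skipping self-pairs
        for a1, a2 in itertools.combinations(mol.atoms, 2):
            # Keep only pairs made up of the two chosen elements
            if sorted([a1.atomic_symbol, a2.atomic_symbol]) != elements:
                continue
            distance = descriptors.MolecularDescriptors.atom_distance(a1, a2)
            print('{} {} {} {}'.format(cif_file, a1.label, a2.label, distance))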
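
If only pairs of two particular elements within some cutoff are wanted, the pair selection can be tried out in isolation first. The following is a minimal sketch in plain Python with no CSD dependency. The function name element_pair_distances, the (label, symbol, (x, y, z)) tuples and the max_distance cutoff are illustrative choices of mine, not part of the API, and the coordinates are made up; with the API you would fill these in from each atom instead.

```python
import itertools
import math


def element_pair_distances(atoms, element_a, element_b, max_distance=None):
    """Return (label1, label2, distance) for each unique pair of atoms whose
    elements are element_a and element_b, in either order.

    atoms is an iterable of (label, symbol, (x, y, z)) tuples with Cartesian
    coordinates in angstroms (a hypothetical stand-in for atoms read from a CIF).
    """
    wanted = sorted((element_a, element_b))
    results = []
    # combinations() visits each unordered pair once and never pairs an
    # atom with itself.
    for (l1, s1, xyz1), (l2, s2, xyz2) in itertools.combinations(atoms, 2):
        if sorted((s1, s2)) != wanted:
            continue
        d = math.dist(xyz1, xyz2)  # Euclidean distance (Python 3.8+)
        if max_distance is None or d <= max_distance:
            results.append((l1, l2, d))
    return results


# Made-up coordinates, purely for illustration.
example_atoms = [
    ('Cu1', 'Cu', (0.0, 0.0, 0.0)),
    ('O1', 'O', (0.0, 0.0, 2.0)),
    ('O2', 'O', (3.0, 4.0, 0.0)),
    ('C1', 'C', (1.0, 1.0, 1.0)),
]

for label1, label2, distance in element_pair_distances(
        example_atoms, 'Cu', 'O', max_distance=6.0):
    print('%s %s %.3f' % (label1, label2, distance))
```

Because the cutoff is just a number, it can be set well beyond a normal bond length, which is the case you describe.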

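It is worth noting that if you read the CIF file with an EntryReader, all of the CIF data items are available as attributes of the entry. This only helps for CIFs that already contain a _geom_bond loop, which yours currently do not, but for such files the tabulated distances can be read directly, for example,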

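with io.EntryReader(cif_file) as rdr:
    cif_entry = rdr[0]  # the attributes live on the entry, not on the reader
    labels_1 = cif_entry.attributes['_geom_bond_atom_site_label_1']
    labels_2 = cif_entry.attributes['_geom_bond_atom_site_label_2']
    lengths = cif_entry.attributes['_geom_bond_distance']

# Use distinct loop names so the lists above are not shadowed
for label_1, label_2, length in zip(labels_1, labels_2, lengths):
    print('{} {} {}'.format(label_1, label_2, length))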
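
One thing to watch for: numeric values in a CIF are commonly written with a standard uncertainty in parentheses, such as 1.523(4). I have not checked whether the attribute values come back as strings or as numbers, so the following is only a sketch of how such text could be converted if needed. The helper name parse_cif_number is my own and is not part of the API.

```python
import re

# A number, optionally followed by a standard uncertainty in parentheses,
# e.g. "1.523(4)".
_CIF_NUMBER = re.compile(
    r'^\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)(?:\(\d+\))?\s*$')


def parse_cif_number(text):
    """Return the value as a float, discarding any standard uncertainty.

    Returns None for CIF placeholders ('.' and '?') or anything else that
    does not look like a number.
    """
    match = _CIF_NUMBER.match(str(text))
    if match is None:
        return None
    return float(match.group(1))


for raw in ['1.523(4)', '2.01', '?']:
    print(raw, parse_cif_number(raw))
```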

Hope this is helpful.  Please come back with any questions you have about this, or indeed any other API matter.

Best wishes



