Yes, all the searches except TextNumericSearch will accept a list of identifiers, a molecule, a crystal, an entry or a file; by default they search the CSD. You will find it faster to run the TextNumericSearch first, since it is much faster than the substructure search.
It is not currently possible to make a combined search directly, though this is under consideration for a future release. Instead you can perform a second search on the results of the first search. For example:
text_hits = text_numeric_search.search()
# this gives 3234 hits
smarts_hits = s.search([h.identifier for h in text_hits])
# This gives 231 hits, of which 112 are different structures
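As a rough illustration of the two-stage idea, here is a self-contained sketch in plain Python. It does not use the CSD API: the records, the field names and both stage functions are hypothetical stand-ins, chosen only to show a cheap filter running first, the survivors being passed to the expensive search, and multiple hits per structure being collapsed to unique identifiers.

```python
# Hypothetical records standing in for database entries (not real CSD data).
records = [
    {"identifier": "HYPO01", "year": 2001, "fragments": ["C=O", "C=O"]},
    {"identifier": "HYPO02", "year": 1985, "fragments": ["C=O"]},
    {"identifier": "HYPO03", "year": 2005, "fragments": []},
]


def cheap_stage(candidates, min_year):
    """Fast first pass: keep records published in or after min_year."""
    return [r for r in candidates if r["year"] >= min_year]


def expensive_stage(candidates, fragment):
    """Slow second pass: one hit per matching fragment, so a single
    structure can contribute several hits."""
    return [
        r["identifier"]
        for r in candidates
        for f in r["fragments"]
        if f == fragment
    ]


first_pass = cheap_stage(records, 2000)
hits = expensive_stage(first_pass, "C=O")
# Collapse repeated identifiers while keeping first-seen order.
unique_structures = list(dict.fromkeys(hits))
```

Here the expensive stage only ever sees the records that survived the cheap one, which is where the time saving comes from.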
I hope this is helpful. Please get back in touch if anything is unclear.
You can generate molecules outside the unit cell with the translate parameter of the symmetric_molecule() method if you want explicit control of the symmetry operator, or you can generate expanded representations of the crystal through methods such as packing_shell() and molecular_shell(). So, for example, you could use the molecular shell:
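from ccdc.descriptors import MolecularDescriptors
mol = crystal.molecular_shell()
atoms_of_interest = [a for a in mol.atoms if a.label == 'Te1']
# Shortest distance between any two distinct Te1 atoms in the shell
min_dist = min(
    MolecularDescriptors.atom_distance(a, b)
    for a in atoms_of_interest for b in atoms_of_interest
    if a != b
)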
Or, by hand, you can calculate all the translated symmetric molecules and take the shortest distance between them:
expansions = [
    crystal.symmetric_molecule(symmop, (i, j, k))
    for symmop in crystal.symmetry_operators
    for i in range(-2, 3)
    for j in range(-2, 3)
    for k in range(-2, 3)
]
min_dist = min(
    MolecularDescriptors.atom_distance(expansions[i].atom('Te1'), expansions[j].atom('Te1'))
    for i in range(len(expansions)) for j in range(len(expansions))
    if i != j
)
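The geometry behind the by-hand approach can be sketched without the CSD API. The following is only an illustration under a strong assumption: the cell is orthogonal, so fractional coordinates are converted to Cartesian by scaling each axis. The function name, the coordinates and the cell lengths are all hypothetical; a real crystal needs its full cell matrix and its symmetry operators.

```python
import itertools
import math


def min_image_distance(frac_positions, cell_lengths, shell=2):
    """Shortest distance between any two distinct translated images of the
    given atoms. ASSUMPTION: orthogonal cell, so each fractional coordinate
    is simply scaled by the matching cell length."""
    a, b, c = cell_lengths
    images = []
    for x, y, z in frac_positions:
        # Translate by every lattice vector (i, j, k) with -shell <= i, j, k <= shell.
        for i, j, k in itertools.product(range(-shell, shell + 1), repeat=3):
            images.append(((x + i) * a, (y + j) * b, (z + k) * c))
    # Compare every unordered pair of distinct images.
    return min(math.dist(p, q) for p, q in itertools.combinations(images, 2))


# One atom in a hypothetical 5 x 7 x 9 cell: its nearest image is one cell along a.
single = min_image_distance([(0.1, 0.2, 0.3)], (5.0, 7.0, 9.0))
# Two atoms in a hypothetical 10 x 10 x 10 cell: the closest contact crosses the cell edge.
pair = min_image_distance([(0.0, 0.0, 0.0), (0.9, 0.0, 0.0)], (10.0, 10.0, 10.0))
```

The second case shows why the translations matter: inside one cell the two atoms are 9 apart, but an image in the neighbouring cell is much closer.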
Is either of these solutions what you are after?
For this I think you will need to know the symmetry operators of the crystal, and to know which of the symmetry operators you are interested in using. For example,
from ccdc import io, descriptors
csd = io.EntryReader('csd')
crystal = csd.crystal('AABHTZ')
base_mol = crystal.molecule
# Choose the operator you are interested in; index 1 here is only an illustration
symm_mol = crystal.symmetric_molecule(crystal.symmetry_operators[1])
print(descriptors.MolecularDescriptors.atom_distance(base_mol.atom('N2'), symm_mol.atom('N2')))
Hope this is helpful; please let me know if you would like any more help.
Thanks for the suggestions, Paul, I'll certainly consider them for the next release.
Firstly, you are right to spot that TextNumericSearch doesn't take a settings parameter. I shall fix this for the next release.
Secondly, you can exclude entries with specific elements using the search_settings class:
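search_settings.must_not_have_elements = [
    'Ar', 'K', ...
]
or, to build the list programmatically:
from ccdc.molecule import Atom
ats = []
for i in range(18, 93):
    ats.append(Atom('C'))  # placeholder element, overwritten on the next line
    ats[-1].atomic_number = i
search_settings.must_not_have_elements = [a.atomic_symbol for a in ats]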
This is slightly clumsy because there is no atomic_number keyword for Atom creation.
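If creating Atom objects just to obtain symbols feels too roundabout, one possible alternative is a plain lookup table. This sketch uses no CSD API calls at all; the SYMBOLS table and the symbols_for() helper are my own illustration, and the resulting list would then be assigned to must_not_have_elements as above.

```python
# Element symbols for atomic numbers 1 to 92, in order.
SYMBOLS = (
    "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co "
    "Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb "
    "Te I Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re "
    "Os Ir Pt Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U"
).split()


def symbols_for(first_z, last_z):
    """Symbols for atomic numbers first_z to last_z, both end points included."""
    if not 1 <= first_z <= last_z <= len(SYMBOLS):
        raise ValueError("atomic numbers must lie between 1 and %d" % len(SYMBOLS))
    return SYMBOLS[first_z - 1:last_z]


# Same span as range(18, 93) in the loop above: argon through uranium.
excluded = symbols_for(18, 92)
```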
Thirdly you don't have to use a bogus search to extract all entries of a database matching specific search criteria. You can set up the search_settings as above, then iterate over the csd:
from ccdc import io
csd = io.EntryReader('csd')
for e in csd:
    # Keep only the entries that satisfy the settings
    if search_settings.test(e):
        print(e.identifier)
Fourthly, the year range is a pair of numbers, interpreted as an inclusive range, rather than the list of values you have given. The first two values of your list have been used as the inclusive range, so you are getting hits from 1970-1971. The query should instead pass just the first and last year of the range as a pair.
It would have been helpful if the API had made this clear.
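To make the difference concrete, here is a small plain-Python sketch of the two readings. It is not the API's own code: the helper names and the example list of years are hypothetical, and it only mimics the behaviour described above, where the first two values of a longer list end up being used as the range.

```python
def as_year_range(years):
    """Return the (first, last) pair that covers every year in the list."""
    if not years:
        raise ValueError("at least one year is required")
    return (min(years), max(years))


def in_inclusive_range(year, bounds):
    """True if year lies within bounds, with both end points included."""
    first, last = bounds
    return first <= year <= last


wanted = [1970, 1971, 1972, 1973, 1974, 1975]  # hypothetical list of individual years
misread = tuple(wanted[:2])       # only the first two values used: 1970-1971
intended = as_year_range(wanted)  # the pair that covers the whole list
```

Passing the pair from as_year_range() rather than the full list avoids the silent truncation.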
Lastly, it is perhaps counter-intuitive that a TextNumericSearch with no criteria returns no hits. It would probably be better to raise an exception as the other classes do. I shall consider this for the next release.
Thank you for your questions; it is feedback like this that helps me to make the API better.
Thinking about your search requests, I realise that an SQL-derived database really is overengineering a solution to the problem. Since the ReducedCellSearch is very fast, and the number of hits returned is very small, a simple filtering of the hits is more than fast enough. I've attached a simple example script which performs a couple of queries of the sort you are describing.
I'm sure you'll have no difficulty adjusting the script for your purposes, but if you do, please raise the issue here.
Okey-doke, Dean, I'll rustle something up. Might take me a couple of days, so please be patient.
We don't have any methods to search by spacegroup symbol or formula, so iterating over hit structures would be the only way to do it. If you have to do many of these searches, it would be fairly simple to make an SQLite database containing terms of interest, then to join the results of a ReducedCellSearch with a query of this database.
If you like I can provide a prototype of how to go about this.
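In the meantime, the SQLite idea might look roughly like the sketch below. Everything in it is an assumption for illustration: the table layout, the function names, the identifiers and the spacegroup and formula values are made up, and the list of reduced-cell hit identifiers is hard-coded rather than coming from a ReducedCellSearch. An IN clause stands in for the join.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory table of (identifier, spacegroup, formula) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        "identifier TEXT PRIMARY KEY, spacegroup TEXT, formula TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def hits_in_spacegroup(conn, hit_identifiers, spacegroup):
    """Restrict a list of hit identifiers to those in the given spacegroup."""
    if not hit_identifiers:
        return []
    placeholders = ", ".join("?" for _ in hit_identifiers)
    sql = (
        "SELECT identifier FROM entries "
        "WHERE spacegroup = ? AND identifier IN (%s) "
        "ORDER BY identifier" % placeholders
    )
    return [row[0] for row in conn.execute(sql, [spacegroup] + list(hit_identifiers))]


# Hypothetical index contents and hypothetical reduced-cell hits.
conn = build_index([
    ("HYPO01", "P21/c", "C6 H6"),
    ("HYPO02", "P-1", "C6 H6"),
    ("HYPO03", "P21/c", "C2 H6 O1"),
])
reduced_cell_hits = ["HYPO01", "HYPO02"]
matches = hits_in_spacegroup(conn, reduced_cell_hits, "P21/c")
```

The index would be built once from the database and reused, so each later query only costs a reduced cell search plus one SQL lookup.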
I'm not entirely clear what you are trying to do here. Let me know if I've got the wrong end of the stick:
You run a reduced cell search on the CSD, or another database of structures, retrieving some hits. You then wish to filter these results according to further criteria, e.g. chemical formula, or space groups.
You can do a simple filter of the hits, assuming there are not too many of them, simply by iterating over the hits:
for h in hits:
    c = h.crystal
    if c.spacegroup_symbol == ...
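Spelled out a little further, the filtering loop could be wrapped in a helper like the one below. This is a sketch only: the hits here are stand-in objects built with SimpleNamespace so that the example runs on its own, the function name is hypothetical, and the spacegroup and formula values are made up. With real search hits the same attribute access on h.crystal would apply.

```python
from types import SimpleNamespace


def filter_by_crystal(hits, spacegroup_symbol=None, formula=None):
    """Keep the hits whose crystal matches every criterion that was supplied."""
    kept = []
    for h in hits:
        c = h.crystal
        if spacegroup_symbol is not None and c.spacegroup_symbol != spacegroup_symbol:
            continue
        if formula is not None and c.formula != formula:
            continue
        kept.append(h)
    return kept


def fake_hit(identifier, spacegroup_symbol, formula):
    """Stand-in for a search hit; real hits come from a search class."""
    crystal = SimpleNamespace(spacegroup_symbol=spacegroup_symbol, formula=formula)
    return SimpleNamespace(identifier=identifier, crystal=crystal)


hits = [
    fake_hit("HYPO01", "P21/c", "C6 H6"),
    fake_hit("HYPO02", "P-1", "C6 H6"),
    fake_hit("HYPO03", "P21/c", "C2 H6 O1"),
]
by_spacegroup = [h.identifier for h in filter_by_crystal(hits, spacegroup_symbol="P21/c")]
by_both = [h.identifier for h in filter_by_crystal(hits, "P21/c", "C6 H6")]
```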
Alternatively you can use any of the search classes except TextNumericSearch on an individual crystal structure.
Hope this is helpful; if not please ask again.