It is now possible to generate a 3D structure from a SMILES string in the CSD Python API. Here we’ll explain how it works.
I’m a Research and Applications Scientist on the Materials Science team at CCDC, and I recently taught a session at the Rigaku School for Practical Crystallography, which ran from June 7–18, 2021. The school focused on practical applications of software, techniques and technologies for crystallography. This blog contains links to my recorded modules as well as a self-assessment quiz you can use to check what you’ve learned.
Here we highlight a paper that used SMILES to mine the Cambridge Structural Database for hydrate-anhydrate pairs. This post is part of our series highlighting how scientists around the world use CCDC tools.
One of the major developments in the 2020.1 CSD Release is the addition of the CSD Pipeline Pilot component collection, which will allow you to build custom tools for analysing CSD structural data without writing code.
As well as allowing research to be done faster and more efficiently, this should remove barriers to entry and allow more people to create custom analyses.
Machine learning is a fast-growing area of research within structural science, and it is particularly effective in the crystallographic structural sciences because of the wealth of highly accurate structural data available. A key part of machine learning, though, is having effective molecular descriptors: ways of converting complex chemical information about molecules and structures into vectors of numbers that machine learning algorithms can readily interpret.
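To make the idea of a descriptor concrete, here is a minimal sketch in plain Python. It is a deliberately simple toy, not one of the descriptors discussed in the post and not part of any CCDC tool: the function name `element_count_descriptor` and the fixed element list are assumptions made purely for illustration. It turns a molecular formula into a fixed-length vector of element counts, so that every molecule is described by numbers in the same positions.

```python
import re
from collections import Counter

# Hypothetical, fixed element order: it defines what each vector position means.
ELEMENTS = ["C", "H", "N", "O", "S"]


def element_count_descriptor(formula):
    """Convert a simple formula such as 'C9H8O4' into a vector of element counts."""
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    # Elements absent from the formula contribute a zero.
    return [counts[element] for element in ELEMENTS]


aspirin = element_count_descriptor("C9H8O4")
caffeine = element_count_descriptor("C8H10N4O2")
print(aspirin)
print(caffeine)
```

Real descriptors capture far more than composition, such as connectivity, geometry and intermolecular interactions, but the principle is the same: each molecule or structure becomes a vector of consistent length that an algorithm can compare and learn from.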