Note: this post was originally published 6th December 2021, and has been updated to reflect new developments in these features.
Here we look at how you can use SMARTS and SMILES in Mercury and the CSD Python API to perform substructure searches and generate 3D molecules from strings to support your cheminformatics work. Using SMARTS and SMILES allows you to automate large numbers of queries or perform complex searches that may not be possible by other methods.
The CSD GitHub repository is the place to download, edit, and share Python scripts to perform chemistry tasks and analyses using the CSD Python API. Here we’ll explain how to access it.
I’m a Research and Applications Scientist on the Discovery Science team at CCDC. In this short blog and accompanying video, I walk through how to make the most of your CSD-Enterprise licence using the tools in CSD-Discovery. The video highlights the available software in the Discovery suite and how it might fit into a drug development workflow. I also present real-world examples of a variety of research applications, including identifying dynamic disorder in semiconductors, advancing COVID-19 research and understanding ALR2 inhibitors for the treatment of diabetes complications.
More advanced settings are now available when docking with GOLD programmatically through the API. Here’s what has changed in the 2021.1 release.
It is now possible to generate a 3D structure from a SMILES string in the CSD Python API. Here we’ll explain how it works.
I’m a Research and Applications Scientist on the Materials Science team at CCDC, and I recently taught a session at the Rigaku School for Practical Crystallography, which ran from June 7–18, 2021. The school focused on practical applications of software, techniques and technologies for crystallography. This blog contains links to my recorded modules as well as a self-assessment quiz you can use to check what you’ve learned.
Here we highlight a paper that used SMILES to mine the Cambridge Structural Database for hydrate-anhydrate pairs. This post is part of our series highlighting examples of CCDC tools in action by scientists around the world.
One of the major developments in the 2020.1 CSD Release is the addition of the CSD Pipeline Pilot component collection, which will allow you to build custom tools for analysing CSD structural data without writing code.
As well as allowing research to be done faster and more efficiently, this should remove barriers to entry and allow more people to create custom analyses.
Machine learning is a fast-growing area of active research within structural science, and it is particularly effective in the crystallographic structural sciences due to the wealth of highly accurate structural data available. A key part of machine learning, though, is having effective molecular descriptors that encode complex chemical information about molecules and structures as machine-interpretable vectors of numbers to feed into machine learning algorithms.
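To make the idea of a descriptor concrete, here is a minimal sketch in plain Python that turns a simple molecular formula into a fixed-length vector of element counts. It is only an illustration of the "molecule in, numbers out" pattern and does not use the CSD Python API; the function name `element_count_descriptor` and the element order in `ELEMENTS` are hypothetical choices made for this example, and real descriptors capture far more chemistry than composition.

```python
import re
from collections import Counter

# Hypothetical fixed element order for this illustration;
# every molecule is mapped onto the same positions in the vector.
ELEMENTS = ["C", "H", "N", "O", "S", "Cl"]


def element_count_descriptor(formula, elements=ELEMENTS):
    """Turn a simple formula such as 'C9H8O4' into a fixed-length vector.

    Assumes a flat formula with no brackets, charges or hydrates.
    """
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    # Elements not present in the molecule contribute a zero.
    return [counts[element] for element in elements]


aspirin = element_count_descriptor("C9H8O4")
paracetamol = element_count_descriptor("C8H9NO2")
print(aspirin)       # [9, 8, 0, 4, 0, 0]
print(paracetamol)   # [8, 9, 1, 2, 0, 0]
```

Because every molecule maps to a vector of the same length and ordering, the results can be stacked into a table and passed to a machine learning algorithm.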