Love it or hate it, Excel is used extensively in industrial settings. Often folk like to use Excel to analyse results generated by some script or workflow. The usual approach is to dump out a CSV file and load it into Excel, but this can be inefficient. (How many times have you dumped a CSV file, loaded it, re-filtered it and so on, only to discover that the output needs one extra field you should have added? You end up repeating the analysis run with a small script change and then going through all the steps of loading the file into Excel again. If you end up doing this a lot, it gets inefficient!)
Following the huge milestone for structural chemistry of the millionth structure being added to the CSD in June 2019, the 2020.0 CSD Release now contains 1,034,174 entries and 1,016,168 unique structures. That means an increase of more than 60,000 entries, and we are well on our way to the next million!
Completeness is an important measure of data integrity and is essential to capture all relevant information about an experiment. This also helps ensure research data is FAIR (Findable, Accessible, Interoperable and Re-usable). With this in mind, CCDC is investigating the completeness of the crystallographic data we hold in our archive. The aim of this investigation is to identify the trends in the information submitted to us, highlight where data is missing and work to enable the capture of any absent information during deposition to prevent the loss of valuable metadata in the future. This blog will highlight some of our initial findings.
The current licensing system for the CSD has served us well for over 20 years, but it is finally starting to show its age. It ties us to a yearly release cycle and limits the components that we can license individually. It comes from a time before virtual machines existed and is not fully compatible with the world of computers that we now live in. As such, we are excited to announce that a long-overdue licensing system upgrade will be rolled out as part of our 2020.0 CSD release this December.
In the year that the CSD hit one million structures, we wanted to highlight and thank some of the most prolific contributors to the database. The 10th and final person in this series is Brian W. Skelton. Brian is currently 1st in our annual CSD author statistics, so we wanted to thank him for his contributions by doing what we do best – searching the CSD!
With the Cambridge Structural Database (CSD) reaching a million structures earlier this year, there is now, more than ever, the opportunity to harness the power of this data through effective visualisation, analysis and extraction.
We are excited to announce that we will be launching H-bond Coordination Quick-View in Mercury as part of our upcoming 2020.0 release in December! This latest development will enable quick and easy hydrogen-bond likelihood analysis using coordination numbers for the observed structure.
In the year that the CSD hit one million structures, we wanted to highlight and thank some of the most prolific contributors to the database. The 9th person in this series is Allan H. White, who is currently 2nd in our annual CSD author statistics!
2019 has been an exciting year for CCDC. We've attended many conferences both in the UK and internationally, delivered workshops worldwide, formed a new scientific advisory board, and of course reached a million structures in the Cambridge Structural Database! We've also been growing our team both in the UK and the US to help us continue our mission of advancing structural science into 2020 and beyond. We'd like to introduce one of our newest team members, Rob Willacy, who joins us from GSK. He tells us about his passion for understanding solid-state chemical problems and how he hopes to apply his experience at CCDC to drive the centre's materials tools and services.
In the year that the CSD hit one million structures, we wanted to highlight and thank some of the most prolific contributors to the database. Our 8th CSD Hero in this series is Arnold L. Rheingold, who currently sits 3rd in our annual CSD author statistics!