Advancing Structural Science - predictions for the future from data leaders

The past year has pushed the importance and challenges of scientific discovery into the spotlight. What lessons can we learn, and what changes can scientific leaders make going forward? In a virtual "fireside chat", the leaders of the CCDC and the RCSB PDB explored this topic, and we share some key takeaways from the discussion here.

 

1) More fields will work with a "digital first" mindset

Computational chemistry and biology have been widely used in drug discovery for decades, but in the future more areas of research will make use of computational and cheminformatics techniques.

The PDB is now being used for diverse applications from bioengineering to understanding photosynthesis, as well as emerging areas of the pharmaceutical space like precision medicine.

The CSD has seen an increase in biotech users, and is developing further in the materials and particle science space. Growth in metal organic frameworks (MOFs) and covalent organic frameworks (COFs) has been huge in recent years, and is set to continue as more applications are found for these materials.

 

2) The importance of data quality and accessibility will be realised

The use of computational and informatics methods in all phases of scientific discovery and development makes access to high quality, reliable data essential.

The CSD and PDB have curated and managed molecular structure data for the public benefit for decades, but there are as many (or more) structures in inaccessible silos. Access to these confidential or proprietary structures could help to advance research further and faster.

Efforts to make use of the "data behind doors" must continue. Projects like D3R have begun to do this by sharing post-competitive data from various companies, but the work needs to happen on a larger scale to unlock the full potential of the data.

     

3) AI and the prediction of structures will continue to advance - but experimental results are still essential

The success of the DeepMind team in the recent CASP14 protein structure prediction challenge marked incredible progress - but there is still some way to go.

Small molecule binding and protein-protein interactions are emerging applications where AI has begun to show promise. Protein-ligand docking, to optimise affinity and selectivity, will be of special interest due to its applications for drug repurposing and drug discovery.

Ongoing blind testing through initiatives like CASP and the CSP Blind Test will keep pushing these methods forward.

 

4) We will carry lessons learnt on collaboration into the future

Efforts such as PostEra Moonshot, born in response to the COVID-19 pandemic, saw massive collaboration across scientific disciplines and geographical borders to address the problem.

This is how science should be done to solve those big issues. There's no room for nationalism - it needs to be fast, open, and collaborative.

We've learned what to do, and what not to do - we need to prepare broadly to meet future challenges with the right tools, in the right way.

 

About the CCDC and PDB

The CCDC and PDB have been the guardians of structural science data for a combined 106 years. They curate and distribute small molecule organic, metal-organic, and macromolecular structures, which have been experimentally determined by the scientific community. Both organisations realise the value and power that such data holds when it is managed and made available - and their two databases are used in global academic and industrial research.

The insights shared in this article are taken from a discussion webinar held in December 2020, which welcomed the leaders of these two organisations, Dr Juergen Harter and Dr Stephen K. Burley.

 

Watch the full conversation on-demand here.