BIOINFORMATICS IN DRUG DESIGNING

Shrikant Sharma

Research Associate

Role of Bioinformatics in Drug designing

Bioinformatics plays an important role in the design of new drug compounds.

Rational Drug Design (RDD). Rational drug design is a process used in the biopharmaceutical industry to discover and develop new drug compounds. RDD uses a variety of computational methods to identify novel compounds, design compounds for selectivity, efficacy and safety, and develop compounds into clinical trial candidates. These methods fall into several natural categories: structure-based drug design, ligand-based drug design, de novo design and homology modeling, depending on how much information is available about drug targets and potential drug compounds. We shall focus on structure-based drug design in this article and describe a few of its salient features.

Structure-Based Drug Design (SBDD). Structure-based drug design is one of several methods in the rational drug design toolbox. Drug targets are typically key molecules involved in a specific metabolic or cell signaling pathway that is known, or believed, to be related to a particular disease state. Drug targets are most often proteins, in particular enzymes, in these pathways. Drug compounds are designed to inhibit, restore or otherwise modify the structure and behavior of disease-related proteins and enzymes. SBDD uses the known 3D geometrical shape or structure of proteins to assist in the development of new drug compounds. The 3D structure of protein targets is most often derived from X-ray crystallography or nuclear magnetic resonance (NMR) techniques. X-ray and NMR methods can resolve the structure of proteins to a resolution of a few angstroms (one angstrom is roughly 500,000 times smaller than the diameter of a human hair). At this level of resolution, researchers can precisely examine the interactions between atoms in protein targets and atoms in potential drug compounds that bind to the proteins. This ability to work at high resolution with both proteins and drug compounds makes SBDD one of the most powerful methods in drug design.

SBDD methods have been used in designing drugs for a well-known cancer-related protein complex. Two protein targets that have been studied extensively in cancer research are p53 and MDM2. These two proteins form a single p53-MDM2 complex as part of a cell-signaling pathway that regulates cell division. Mutated forms of p53-MDM2 result in various forms of tumors and cancers. Several decades of research have been aimed at designing small-molecule compounds that restore the normal function of p53-MDM2, and consequently reduce or eliminate certain forms of cancer.

One well-known anticancer drug, nutlin, has been developed by Roche Pharmaceuticals to restore the normal functioning of MDM2. SBDD methods played an important role in this development. The beauty of the SBDD method is the extremely high level of detail that it reveals about how drug compounds and their protein targets interact. We can identify the exact location of all five nutlin compounds, their individual 3D orientations relative to MDM2 surface and interior amino acids, and how deeply embedded each nutlin compound is in the interior of MDM2. This information is useful in designing the 3D shape of the nutlin parent compound or various analogues of the drug. This information also assists researchers in designing drug compounds that bind selectively and tightly to MDM2, thus leading to more potent and safer cancer drugs.

Docking Ligands. One of the key benefits of SBDD methods is the exceptional capability they provide for docking putative drug compounds (ligands) in the active site of target proteins. Most proteins contain pockets, cavities, surface depressions and other geometrical regions where small-molecule compounds can easily bind. With high-resolution X-ray and NMR structures for proteins and ligands, researchers can show precisely how ligands orient themselves in protein active sites. Open source bioinformatic tools such as VMD and NAMD are available for this kind of work. Furthermore, it is well known that proteins are often flexible molecules that adjust their shape to accommodate bound ligands. Using molecular dynamics simulations, researchers can dock ligands into protein active sites and then visualize how much movement occurs in amino acid side chains during the docking process. In some cases, there is almost no movement at all (i.e., rigid-body docking); in other cases, such as with the HIV-1 protease enzyme, there is substantial movement. Flexible docking can have profound implications for designing small-molecule ligands, so this is an important feature in SBDD methods.
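As a rough illustration of what a rigid-body docking program evaluates, the sketch below scores three ligand placements against a toy pocket using a simple 12-6 Lennard-Jones-style term. The coordinates, the parameters (epsilon, r_min) and the function names are all hypothetical, and real docking tools use far richer scoring functions than this.

```python
import math

def lj_energy(r, epsilon=0.2, r_min=3.8):
    """12-6 Lennard-Jones-style energy for two atoms r angstroms apart."""
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2 * ratio ** 6)

def pose_score(pocket_atoms, ligand_atoms):
    """Sum pairwise energies; a lower (more negative) score means a better fit."""
    return sum(lj_energy(math.dist(p, l))
               for p in pocket_atoms for l in ligand_atoms)

def translate(atoms, shift):
    """Rigidly move every atom by the same (dx, dy, dz) shift."""
    return [tuple(a + s for a, s in zip(atom, shift)) for atom in atoms]

# Hypothetical pocket: four atoms at the corners of a 4 x 4 angstrom square.
pocket = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
# Hypothetical one-atom ligand, placed by translation only (no rotation).
ligand = [(0.0, 0.0, 0.0)]

poses = {
    "clash": (2.0, 2.0, 0.5),   # too close to the pocket atoms
    "snug": (2.0, 2.0, 2.6),    # near the optimal contact distance
    "far": (2.0, 2.0, 12.0),    # barely interacting
}
scores = {name: pose_score(pocket, translate(ligand, shift))
          for name, shift in poses.items()}
best_pose = min(scores, key=scores.get)
print(best_pose, scores)
```

The placement that sits near the optimal contact distance scores best, the clashing placement is penalised, and the distant one contributes almost nothing. Flexible docking would additionally let the pocket atoms move.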

LEAD OPTIMIZATION- After a number of lead compounds have been found, SBDD techniques are especially effective in refining their 3D structures to improve binding to protein active sites, a process known as lead optimization. In lead optimization, researchers systematically modify the structure of the lead compound, docking each specific configuration of a drug compound in a protein's active site, and then testing how well each configuration binds to the site. In a common lead optimization method known as bioisosteric replacement, specific functional groups in a ligand are replaced with other groups to improve the binding characteristics of the ligand. With SBDD, researchers can examine the various bioisosteres and their docking configurations, choosing only those that bind well in the active site.

COMPUTER-AIDED DRUG DESIGN (CADD) is a specialized discipline that uses computational methods to simulate drug-receptor interactions. CADD methods are heavily dependent on bioinformatics tools, applications and databases. As such, there is considerable overlap between CADD research and bioinformatics.

Virtual High-Throughput Screening (vHTS). Pharmaceutical companies are always searching for new leads to develop into drug compounds. One search method is virtual high-throughput screening. In vHTS, protein targets are screened against databases of small-molecule compounds to see which molecules bind strongly to the target. If there is a hit with a particular compound, it can be extracted from the database for further testing. With today's computational resources, several million compounds can be screened in a few days on sufficiently large clustered computers. Pursuing a handful of promising leads for further development can save researchers considerable time and expense. ZINC is a good example of a vHTS compound library.

Sequence Analysis. In CADD research, one often knows the genetic sequence of multiple organisms or the amino acid sequence of proteins from several species. It is very useful to determine how similar or dissimilar the organisms are based on gene or protein sequences. With this information one can infer the evolutionary relationships of the organisms, search for similar sequences in bioinformatic databases and find related species to those under investigation. There are many bioinformatic sequence analysis tools that can be used to determine the level of sequence similarity.
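To make the idea of sequence similarity concrete, here is a minimal sketch that globally aligns two sequences with the Needleman-Wunsch algorithm and reports percent identity. The scoring values (match 1, mismatch -1, gap -1) and the function names are assumptions chosen for illustration. Production tools use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def percent_identity(x, y):
    """Identical aligned positions as a percentage of alignment length."""
    aln_x, aln_y = needleman_wunsch(x, y)
    matches = sum(1 for p, q in zip(aln_x, aln_y) if p == q and p != "-")
    return 100.0 * matches / len(aln_x)

print(needleman_wunsch("ACGT", "ACT"))
print(percent_identity("ACGT", "ACT"))
```

Percent identity is only one possible measure. Dividing by the alignment length rather than by the shorter sequence length is a design choice, and different tools report it differently.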

HOMOLOGY MODELING - Another common challenge in CADD research is determining the 3-D structure of proteins. Most drug targets are proteins, so it's important to know their 3-D structure in detail. It's estimated that the human body has 500,000 to 1 million proteins. However, the 3-D structure is known for only a small fraction of these. Homology modeling is one method used to predict 3-D structure. In homology modeling, the amino acid sequence of a specific protein (target) is known, and the 3-D structures of proteins related to the target (templates) are known. Bioinformatics software tools are then used to predict the 3-D structure of the target based on the known 3-D structures of the templates. MODELLER is a well-known tool in homology modeling, and the SWISS-MODEL Repository is a database of protein structures created with homology modeling.

Similarity Searches. A common activity in biopharmaceutical companies is the search for drug analogues. Starting with a promising drug molecule, one can search for chemical compounds with similar structure or properties to a known compound. There are a variety of methods used in these searches, including sequence similarity, 2D and 3D shape similarity, substructure similarity, electrostatic similarity and others. A variety of bioinformatic tools and search engines are available for this work.
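One common way to put a number on 2D structural similarity is the Tanimoto coefficient computed on substructure fingerprints. The sketch below assumes each compound has already been reduced to a set of "on" fingerprint bits. The bit values and compound names are invented for illustration, and a real workflow would generate fingerprints with a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)

# Hypothetical fingerprints: each integer stands for one substructure bit.
query = {1, 4, 7, 9, 12}
library = {
    "analogue_1": {1, 4, 7, 9, 13},
    "analogue_2": {1, 4, 20},
    "unrelated": {2, 3, 5},
}

# Rank library compounds from most to least similar to the query.
ranked = sorted(library, key=lambda name: tanimoto(query, library[name]),
                reverse=True)
for name in ranked:
    print(name, round(tanimoto(query, library[name]), 3))
```

A similarity search then amounts to keeping the compounds whose coefficient exceeds some threshold, which is a choice left to the researcher.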

DRUG LEAD OPTIMIZATION- When a promising lead candidate has been found in a drug discovery program, the next step (a very long and expensive step!) is to optimize the structure and properties of the potential drug. This usually involves a series of modifications to the primary structure (scaffold) and secondary structure (moieties) of the compound. This process can be enhanced using software tools that explore related compounds (bioisosteres) to the lead candidate. OpenEye's WABE is one such tool. Lead optimization tools such as WABE offer a rational approach to drug design that can reduce the time and expense of searching for related compounds.

PHYSICOCHEMICAL MODELING- Drug-receptor interactions occur on atomic scales. To form a deep understanding of how and why drug compounds bind to protein targets, we must consider the biochemical and biophysical properties of both the drug itself and its target at an atomic level. Swiss-PdbViewer is an excellent tool for doing this. Swiss-PdbViewer can predict key physicochemical properties, such as hydrophobicity and polarity, that have a profound influence on how drugs bind to proteins.
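As a small example of a sequence-derived physicochemical property, the sketch below computes the average hydropathy of a peptide using the Kyte-Doolittle scale. This is not how any particular tool named above works internally. It is just one simple, widely used hydrophobicity measure, and the function name and example peptides are assumptions.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def average_hydropathy(sequence):
    """Mean Kyte-Doolittle value; positive suggests a hydrophobic stretch."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return sum(values) / len(values)

# Hypothetical peptides: one greasy, one highly charged.
for peptide in ("IIII", "RK", "AVLIKRDE"):
    print(peptide, round(average_hydropathy(peptide), 2))
```

Structure-based tools go further by mapping such properties onto the 3D surface of the protein, which is what matters for judging how a drug fits its binding site.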

DRUG BIOAVAILABILITY AND BIOACTIVITY- Many drug candidates fail in late-stage (Phase III) clinical trials after many years of research and after millions of dollars have been spent on them, and many of these failures are due to toxicity or problems with metabolism. The key characteristics for drugs are Absorption, Distribution, Metabolism, Excretion and Toxicity (ADMET) and efficacy, in other words bioavailability and bioactivity. Although these properties are usually measured in the lab, they can also be predicted in advance with bioinformatics software.
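A very simple example of predicting bioavailability in advance is Lipinski's rule of five, a heuristic for oral absorption that is not discussed above and is used here only as an illustration. The sketch assumes the four descriptors have already been computed by other software. The descriptor values and compound names are invented.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    mol_weight: float        # daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def rule_of_five_violations(d):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)

def likely_orally_available(d):
    """The usual reading of the rule allows at most one violation."""
    return rule_of_five_violations(d) <= 1

# Hypothetical candidates with made-up descriptor values.
candidates = {
    "candidate_a": Descriptors(350.4, 2.1, 2, 5),
    "candidate_b": Descriptors(712.9, 6.3, 6, 12),
}
passing = [name for name, d in candidates.items() if likely_orally_available(d)]
print(passing)
```

A filter like this addresses only part of the absorption question. Metabolism, excretion and toxicity need their own predictive models and, ultimately, laboratory measurement.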

BENEFITS OF CADD

CADD methods and bioinformatics tools offer significant benefits for drug discovery programs.

COST SAVINGS- The Tufts Report suggests that the cost of drug discovery and development has reached $800 million for each drug successfully brought to market. Many biopharmaceutical companies now use computational methods and bioinformatics tools to reduce this cost burden. Virtual screening, lead optimization and predictions of bioavailability and bioactivity can help guide experimental research. Researchers can then follow only the most promising experimental lines of inquiry and avoid experimental dead-ends early, based on the results of CADD simulations.

TIME-TO-MARKET. The predictive power of CADD can help drug research programs choose only the most promising drug candidates. By focusing drug research on specific lead candidates and avoiding potential 'dead-end' compounds, biopharmaceutical companies can get drugs to market more quickly.

INSIGHT- One of the non-quantifiable benefits of CADD and the use of bioinformatics tools is the deep insight that researchers acquire about drug-receptor interactions. Molecular models of drug compounds can reveal intricate, atomic scale binding properties that are difficult to envision in any other way. When we show researchers new molecular models of their putative drug compounds, their protein targets and how the two bind together, they often come up with new ideas on how to modify the drug compounds for improved fit. This is an intangible benefit that can help design research programs.

CADD and bioinformatics together are a powerful combination in drug research and development. An important challenge for us going forward is finding skilled, experienced people to manage all the bioinformatics tools available to us, which will be a topic for a future article.

DRUG DESIGN

Drug design is the approach of finding drugs by design, based on their biological targets. Typically a drug target is a key molecule involved in a particular metabolic or signalling pathway that is specific to a disease condition or pathology, or to the infectivity or survival of a microbial pathogen.

Some approaches attempt to stop the functioning of the pathway in the diseased state by causing a key molecule to stop functioning. Drugs may be designed that bind to the active region and inhibit this key molecule. However, these drugs would also have to be designed so as not to affect any other important molecules that may be similar in appearance to the key molecules. Sequence homologies are often used to identify such risks.

Other approaches may be to enhance the normal pathway by promoting specific molecules in the normal pathways that may have been affected in the diseased state.

The structure of the drug molecule that can specifically interact with the biomolecules can be modeled using computational tools. These tools can allow a drug molecule to be constructed within the biomolecule using knowledge of its structure and the nature of its active site. Construction of the drug molecule can be made inside out or outside in, depending on whether the core or the R-groups are chosen first. However, many of these approaches are plagued by the practical problems of chemical synthesis.

Newer approaches have also suggested the use of drug molecules that are large and proteinaceous in nature rather than small molecules. There have also been suggestions to make these using mRNA. Gene silencing may also have therapeutic applications.

COMPUTER-ASSISTED DRUG DESIGN

Computer-assisted drug design uses computational chemistry to discover, enhance, or study drugs and related biologically active molecules. Methods used can include simple molecular modeling, molecular mechanics, molecular dynamics, semi-empirical quantum chemistry methods, ab initio quantum chemistry methods and density functional theory. The purpose is to reduce the number of candidate compounds that have to be subjected to expensive and time-consuming synthesis and trialling.

In recent years, we have seen an explosion in the amount of biological information that is available. Various databases are doubling in size every 15 months and we now have the complete genome sequences of more than 100 organisms. It appears that the ability to generate vast quantities of data has surpassed the ability to use this data meaningfully. The pharmaceutical industry has embraced genomics as a source of drug targets. It also recognises that the field of bioinformatics is crucial for validating these potential drug targets and for determining which ones are the most suitable for entering the drug development pipeline.

All marketed drugs today target only about 500 gene products. The elucidation of the human genome, which has an estimated 30,000 to 40,000 genes, presents immense new opportunities for drug discovery and simultaneously creates a potential bottleneck regarding the choice of targets to support the drug discovery pipeline. The major advances in genomics and sequencing mean that finding an attractive target is no longer a problem, but finding the targets that are most likely to succeed has become the challenge. The focus of bioinformatics in the drug discovery process has therefore shifted from target identification to target validation.

Many factors concerning a candidate target need to be taken into account, drawing on a multitude of heterogeneous resources. The types of information that one needs to gather about potential targets include nucleotide and protein sequencing information, homologues, mapping information, function prediction, pathway information, disease associations, variants, structural information, gene and protein expression data and species/taxonomic distribution, among others. Different bioinformatics tools can be used to gather this information. The accumulation of this information into databases about potential targets means that pharmaceutical companies can save themselves much of the time, effort and expense that would otherwise go into bench work on targets that will ultimately fail. The information that is gathered helps to characterise the different targets into families and subfamilies. It also classifies the behaviour of the different molecules in a biochemical and cellular context. Decisions about which families provide the best potential targets are guided by a number of criteria. It is important that the potential target has a suitable structure for interacting with drug molecules. Structural genomics helps to prioritise the families in terms of their 3D structures.

Sometimes we want to develop broad spectrum drugs that are effective against a wide range of pathogenic species while at other times we want to develop narrow spectrum drugs that are highly specific to a particular organism. Comparative genomics helps to find protein families that are widely taxonomically dispersed and those that are unique to a particular organism.

Clustering algorithms are used to organise gene and protein expression data into different biologically relevant clusters. We can then compare the expression profiles from the diseased and healthy cells to help us understand the role our gene or protein plays in a disease process. All of these computational tools can help to compose a detailed picture about a protein family, its involvement in a disease process and its potential as a possible drug target.
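To show what clustering expression data can look like in practice, here is a minimal sketch of average-linkage hierarchical clustering that uses 1 minus the Pearson correlation as the distance between expression profiles. The gene names and expression values are invented (two healthy samples followed by two diseased samples), and this is only one of many clustering approaches in use.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def cluster_profiles(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        # Average linkage: mean correlation distance over all cross pairs.
        pairs = [(g1, g2) for g1 in c1 for g2 in c2]
        return sum(1 - pearson(profiles[g1], profiles[g2])
                   for g1, g2 in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        index_pairs = [(i, j) for i in range(len(clusters))
                       for j in range(i + 1, len(clusters))]
        i, j = min(index_pairs,
                   key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]

# Hypothetical expression levels: [healthy_1, healthy_2, diseased_1, diseased_2]
expression = {
    "gene_a": [1.0, 1.2, 5.0, 5.5],
    "gene_b": [0.8, 1.0, 4.6, 5.1],
    "gene_c": [6.0, 5.8, 1.1, 0.9],
    "gene_d": [5.5, 5.9, 1.4, 1.0],
}
groups = cluster_profiles(expression, n_clusters=2)
print(groups)
```

Here the genes that rise in the diseased samples end up in one cluster and those that fall end up in the other, which is the kind of grouping that can point towards a gene's role in a disease process.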

Following on from the genomics explosion and the huge increase in the number of potential drug targets, there has been a move from the classical linear approach of drug discovery to a non-linear and high-throughput approach. The field of bioinformatics has become a major part of the drug discovery pipeline, playing a key role in validating drug targets. By integrating data from many inter-related yet heterogeneous resources, bioinformatics can help in our understanding of complex biological processes and help improve drug discovery.

Drug Design based on Bioinformatics Tools

The process of designing a new drug using bioinformatics tools has opened a new area of research. Computational techniques assist in searching for drug targets and in designing drugs in silico, but the overall process still takes a long time and a lot of money. In order to design a new drug, one needs to follow the path below.

Identify Target Disease: One needs to know all about the disease and existing or traditional remedies. It is also important to look at very similar afflictions and their known treatments.

Target identification alone is not sufficient to achieve a successful treatment of a disease. A real drug needs to be developed. This drug must influence the target protein in such a way that it does not interfere with normal metabolism. One way to achieve this is to block the activity of the protein with a small molecule. Bioinformatics methods have been developed to virtually screen the target for compounds that bind and inhibit the protein. Another possibility is to find other proteins that regulate the activity of the target by binding and forming a complex.

Study Interesting Compounds: One needs to identify and study the lead compounds that have some activity against a disease. These may be only marginally useful and may have severe side effects. These compounds provide a starting point for refinement of the chemical structures.

Detect the Molecular Bases for Disease: If it is known that a drug must bind to a particular spot on a particular protein or nucleotide, then a drug can be tailor-made to bind at that site. This is often modeled computationally using any of several different techniques. Traditionally, the choice of which compounds to test computationally was guided by the researchers' understanding of molecular interactions. A second method is the brute-force testing of large numbers of compounds from a database of available structures.

Rational drug design techniques: These techniques attempt to build the researchers' understanding of how to choose likely compounds into a software package that is capable of modeling a very large number of compounds in an automated way. Many different algorithms have been used for this type of testing, many of which were adapted from artificial intelligence applications. The complexity of biological systems makes it very difficult to determine the structures of large biomolecules. Ideally, an experimentally determined (X-ray or NMR) structure is desired, but biomolecules are very difficult to crystallize.

Refinement of compounds: Once a number of lead compounds have been found, computational and laboratory techniques have been very successful in refining the molecular structures to give greater drug activity and fewer side effects. This is done both in the laboratory and computationally by examining the molecular structures to determine which aspects are responsible for both the drug activity and the side effects.

Quantitative Structure Activity Relationships (QSAR): This computational technique can be used to identify which functional groups and other features of a compound to modify in order to refine the drug. QSAR consists of computing every possible number that can describe a molecule and then doing an enormous curve fit to find out which aspects of the molecule correlate well with the drug activity or side effect severity. This information can then be used to suggest new chemical modifications for synthesis and testing.
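The curve fit at the heart of QSAR can be sketched on a very small scale. The example below ranks two molecular descriptors by how strongly they correlate with measured activity and then fits a straight line to the best one. The descriptor values, the activity values and the function names are all invented for illustration. Real QSAR studies use many more descriptors, many more compounds and careful validation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x

# Hypothetical descriptors for five analogues of a lead compound.
descriptors = {
    "logp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mol_weight": [310.0, 295.0, 330.0, 305.0, 320.0],
}
# Hypothetical measured activity for the same five analogues.
activity = [5.1, 6.0, 6.9, 8.1, 9.0]

# Rank descriptors by the strength of their correlation with activity.
ranked = sorted(descriptors,
                key=lambda name: abs(pearson(descriptors[name], activity)),
                reverse=True)
slope, intercept = fit_line(descriptors[ranked[0]], activity)
print(ranked, round(slope, 2), round(intercept, 2))
```

In this toy data set the first descriptor tracks activity closely while the second does not, so a chemist would look first at modifications that change the first property.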

Solubility of Molecule: One needs to check whether the candidate molecule is water soluble or readily soluble in fatty tissue, since this affects which part of the body it becomes concentrated in. The ability to get a drug to the correct part of the body is an important factor in its potency. Ideally there is a continual exchange of information between the researchers doing QSAR studies, synthesis and testing. These QSAR-based techniques are frequently used and often very successful, since they do not rely on knowing the biological basis of the disease, which can be very difficult to determine.

Drug Testing: Once a drug has been shown to be effective by an initial assay technique, much more testing must be done before it can be given to human patients. Animal testing is the primary type of testing at this stage. Eventually, the compounds, which are deemed suitable at this stage, are sent on to clinical trials. In the clinical trials, additional side effects may be found and human dosages are determined.
