Strength in numbers: more is better for drug safety/risk prediction

The re-use of data is an ever-present issue within the pharmaceutical industry. Publicly available data is an important resource for companies looking to develop new drugs and find new uses for current ones. Finding such data has just become easier with the publication of the DrugMatrix database by the National Toxicology Program.

DrugMatrix is a large molecular toxicology reference database and informatics system. It contains data on the effects of more than 600 therapeutic, industrial and environmental chemicals at a variety of doses and exposure times. For each of these compounds, relevant data curated from the literature is available, as well as assay results for inhibition of 132 protein targets. These targets were chosen for their importance in drug development, so they include drug-metabolizing enzymes and proteins involved in important toxicities.

The core of DrugMatrix is a set of highly standardised toxicological experiments performed in male Sprague-Dawley rats, resulting in a wealth of data on the histopathology, clinical chemistry and gene expression responses elicited by 638 compounds. The main strength of this database is that it provides a basis for linking macroscopic observations to alterations in genetic pathways. And the great news is that Safety Intelligence Program (SIP) users can now access this fantastic resource, with the addition of ~88,000 curated assertions with DrugMatrix evidence. The inclusion of the DrugMatrix data in SIP makes it possible to answer questions that could not previously be addressed through the DrugMatrix interface, as the data is now integrated with knowledge extracted from several other relevant sources such as Medline, DailyMed and FDA NDAs.

dmblog1

DrugMatrix interface showing the results of an experiment where the administration of 2mg/kg Cisplatin for 3 days caused a 1.4-fold increase in blood urea nitrogen.

To give an example, let’s look at the effects of Cisplatin in rats. According to DrugMatrix, this compound increases the blood urea nitrogen level, which is an important safety signal because it is an indicator of renal health. If the kidneys are not working properly and the glomerular filtration rate decreases, blood urea nitrogen will increase. Elevated blood urea nitrogen can also be associated with heart failure, dehydration, fever or a high-protein diet.
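As a rough illustration of what a "1.4-fold increase" means, the sketch below computes a fold change as the ratio of a treated-group mean to a control-group mean. This is an assumption about the arithmetic, not a description of how DrugMatrix itself derives its figures; the class name, method name and the two input values are hypothetical and chosen only so that the ratio comes out at 1.4.

```java
class FoldChange {

    /** Ratio of the treated-group mean to the control-group mean. */
    public static double foldChange(double treatedMean, double controlMean) {
        if (controlMean <= 0.0) {
            throw new IllegalArgumentException("control mean must be positive");
        }
        return treatedMean / controlMean;
    }

    public static void main(String[] args) {
        // Hypothetical blood urea nitrogen means in mg/dL (not DrugMatrix data).
        double control = 15.0;
        double treated = 21.0;
        System.out.println("Fold change: " + foldChange(treated, control));
    }
}
```

A value above 1 indicates an increase relative to control, and a value below 1 a decrease.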

dmblog2

SIP ToxPath knowledgebase summary matrix showing that Cisplatin-induced increases in blood urea nitrogen occur in different species and relevant datasources

The next thing we might be interested in knowing is whether the effect is replicated in other animal species. While DrugMatrix only includes rat information, a quick search in SIP will point to the answer. Firstly, we will see that apart from DrugMatrix, there are Medline records describing the same observation in Sprague-Dawley and Fischer 344 strains. Increases in blood urea nitrogen are also described in mouse and rabbit. More importantly, this finding is also seen in humans according to DailyMed and the Electronic Medicines Compendium.

But should we worry about kidney function because of this increase? We can search SIP to see if Cisplatin is known to be associated with kidney dysfunction in patients. Again, we can see that Cisplatin is linked to kidney disorder-related biomedical observations in 43 assertions from 6 different datasources. This makes sense, as Cisplatin is a well-known nephrotoxicant and kidney toxicity is dose-limiting in this type of chemotherapy.

dmblog4

Summary matrix of associations between Cisplatin and kidney disorder in humans, and the sources where the data has been obtained from.

Finally, we might also be interested in assessing whether compounds that cause an increase in blood urea nitrogen share a similar structure or protein target. While DrugMatrix is an excellent tool for this task, it only queries the 600+ compounds that are included in the dataset. By contrast, SIP currently contains 85,996 compounds and 22,036 proteins, which are part of 2,574,454 assertions. This allows users to expand the search and increases the likelihood of finding meaningful results.

This is a very good example of how, when it comes to toxicology and pathology data, there is certainly strength in numbers: when two powerful tools such as these are put together, their usefulness is greatly enhanced.

A small subset of curated SIP assertions linking Cisplatin to kidney disorder in humans.


Genome Browsing via SRS

From time to time, we are asked how genome browsing is supported in SRS.  It’s a very good question, because SRS’ foundation is the integration of life sciences searching, browsing, and launching of bioinformatics analyses.

I have prepared a screencast that illustrates one approach for genome browsing in SRS, as developed by Martin Hilbers.  Even if you don’t require genome browsing, this video could serve as a review of SRS searches and analyses, or as an introduction to how web applications can interface with SRS.

SRS_genome_browse

The importance of sharing data

I was concerned to read about the recent conviction of a former CRO researcher under the U.K.’s Good Laboratory Practice law. Steven Eaton was convicted of manipulating animal study data to show drugs were a success, when in fact they had failed. The Medicines and Healthcare Products Regulatory Agency says that he selectively reported drug concentration results, and an internal review found that Eaton had been meddling with data since 2003.

This caused a number of drugs to be delayed on their way to market, as hundreds of safety studies were reviewed, taking up valuable time and resources.  Human lives may also have been at risk, as these drugs go to clinical trials following successful animal studies.

An OmniViz screenshot of pre-clinical data: animals have been clustered according to clinical chemistry parameters. The intermediate dose group is shown in white. If more people looked at this data, there would be more chance of realising any significance behind the outlying points.

An OmniViz plot of pre-clinical data – animals have been clustered according to clinical chemistry profile. The intermediate dose group is shown in white, and a clear outlier is visible at the top of the image. Tools such as this give greater visibility over data, and make it easy to spot out-of-place results.

I guess what this goes to show is the importance of data visibility and sharing. If more people see data from a study, it takes the onus off one person to analyse the data and report results, and means there is a greater chance of discrepancies being noticed. In order to get to human trials, a drug must go through a bare minimum of 9 regulatory toxicology studies, including repeat-dose toxicity studies in 2 different species, chronic toxicity studies in 2 species, and a 24-month carcinogenicity study in 2 species. And that’s just if no safety issues are flagged, and doesn’t account for non-regulatory studies. If lots of people are allowed access to that data, no one person would be able to manipulate it, and it might also open the door to new uses for a drug.

SEND (the Standard for Exchange of Nonclinical Data) is the FDA’s preferred way to present non-clinical data. It is completely electronic, so the sharing potential is huge. In a few years’ time, all detailed, tabulated, non-clinical study data submitted to the FDA will need to be in this format, and lots of solutions are available to do this. This means that not only will data be highly shareable within a particular company, but any studies that are outsourced or otherwise done by someone else will be in the same format, and be accessible to lots of different eyes. Instem is a provider of key laboratory information management software (LIMS) and also of tools for integration and visualisation of non-clinical data.

How do you define a New Chemical Entity?

What’s the definition of a New Chemical Entity? A new chemical entity (NCE) is, according to the U.S. Food and Drug Administration, a drug that contains no active moiety that has been approved by the FDA in any other application submitted under section 505(b) of the Federal Food, Drug, and Cosmetic Act. This might seem pretty straightforward, but Keryx are currently finding that there are issues. Keryx (as you may have read in the news feeds) are having worries over the potential exclusivity of their new drug, Zerenex, as the active ingredient may not be distinct enough from Otsuka’s FerriSeltz, approved more than 15 years ago (according to IPD Analytics) – and this piqued my curiosity.

So, what is the definition of an “active moiety”?  Again, from the FDA website: “An active moiety means the molecule or ion, excluding those appended portions of the molecule that cause the drug to be an ester, salt (including a salt with hydrogen or coordination bonds), or other noncovalent derivative (such as a complex, chelate, or clathrate) of the molecule, responsible for the physiological or pharmacological action of the drug substance.”  Let’s look at the two compounds that are under scrutiny.

Zerenex_FerriSeltz

Fig 1: 2D chemical images of Zerenex and FerriSeltz active ingredients

Zerenex_FerriSeltz_BMO_matrix

Fig 2: Summary matrix from SIP ToxPath knowledgebase, of associations between either FerriSeltz (ferric ammonium citrate) or Zerenex (ferric citrate) and biomedical observations, systematically extracted from Medline, regulatory documents, and public domain toxicology databases, and formatted into assertional metadata, for rapid search and analysis.

If you look at the structures (Fig 1), you can see the potential similarity. However, the two indications are very different: FerriSeltz was approved for upper gastrointestinal enhancement during MRI, whereas Zerenex aims to lower blood levels of phosphorus in patients undergoing kidney dialysis. The two compounds also have quite different biological “fingerprints” (Fig 2), looking at the range of effects on various biomedical observations reported across current and historic key public domain data sources. For example, from the biological effects matrix, it can be seen that there are only a few effects that both compounds have in common (effects on ferritin biosynthesis, iron level, oxidative stress) and many effects where there is no overlap (e.g. ALT and AST levels). Looking at the tissue effects of the two active ingredients again gives a different summary matrix (data not shown). These data provide more insight into the two different compounds, and hint at possible paths for further investigation.

Meanwhile, I await the outcome of the FDA deliberations with interest, as, I am sure, do Keryx.

OmniViz: Raising the Legacy – A Geek’s Guide to Developing OmniViz 6.1

Some Notes on the Development of the New Version of OmniViz

Last year Instem Scientific became actively involved in planning and developing the new version of OmniViz, our advanced data analytics platform. In particular, we decided to address long-standing concerns about compatibility with the latest versions of Microsoft Windows, such as Windows 7 and Windows 8 (64-bit versions), compatibility with Linux and Mac OS X, as well as integration with Microsoft Office 2007 and 2010. This is the behind-the-scenes story…

When Instem acquired BioWisdom

When Instem acquired BioWisdom at the start of 2011 there was a period of review, during which we decided which products would undergo development. One of the products needing some care and attention was OmniViz, Instem Scientific’s tool for exploratory analysis, data projection, text analytics, and more.

OmniViz had seen over a decade of development, which used a ‘bespoke build system’, a term to strike fear into the heart of many a software developer. The build system was a mixture of Makefiles, shell scripts and native applications that were required to run on specifically configured machines with Cygwin and a ClearCase client installed. Then on top of this, there were also additional dependencies on specific network servers. All of this build system needed some attention.

In comparison, the source code had had a lucky break: there had been an effort to move the code and the build system into Subversion, so preparatory steps had already been taken to extract the code from the legacy build system. What came out was a mixture of scripts, Java code and pre-compiled binaries, but it did mean that the source code had survived.

A third key component of OmniViz is the installer. OmniViz users will know that it does not install as a single executable but depends on a whole supporting cast of configuration files and externally managed applications, one of which is the bundled Java virtual machine itself. However, the fate of the installer was not as fortunate as that of the code base, and so to simplify matters we decided to graft in a new installer.

Find a build system

An initial survey of the landscape showed quite quickly that a new build process needed to be found. Having successfully used the Maven build system on many projects, I decided to give it a try.

So into a clean Eclipse workspace I poured the OmniViz source code and all. Then, after picking out and smoothing over jar and file dependencies, I added a magic Maven pom and stirred over a low heat until I had some artefacts that would compile.

It turned out that Maven was a great choice for this project. As ever the winning feature of Maven was the dependency management, but this time the code base also had some special compilation requirements which were taken care of, and the modularisation helped to break down a mammoth compilation stage.

Find an install tool

So OmniViz was now compiled, but it still required an installer. OmniViz is quite demanding about its supporting files, and changing the code on this scale was not an option, so the demands on the installer were high. After some experimentation I settled on the open source IzPack.

IzPack is a fully Java-based installer with support for large amounts of customisation, the ability to run Ant scripts and lots of plug-in modules. To round this off, there is a Maven plug-in to support IzPack, so the whole installer-building process could be integrated into the build system without too much blood being spilt.

A welcome side effect of adopting a new installation tool was that the installer could take care of ensuring a valid version of the JRE was installed, while not requiring that the application come bundled with a fixed JRE version.
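To make the idea of "a valid version of the JRE" concrete, here is a minimal sketch of a version check based on the standard `java.specification.version` system property. It is not how IzPack or the OmniViz installer performs the check; the class name, method names and the minimum version of 6 are assumptions made for illustration.

```java
class JreVersionCheck {

    /**
     * Extracts the major version from a java.specification.version value.
     * Older JVMs report "1.6" or "1.7"; newer ones report "9", "11" and so on.
     */
    public static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True when the reported version is at least the required major version. */
    public static boolean isSupported(String specVersion, int minimumMajor) {
        return majorVersion(specVersion) >= minimumMajor;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        // 6 is a hypothetical minimum, matching the JRE 6 and 7 support described here.
        System.out.println("JRE " + running + " supported: " + isSupported(running, 6));
    }
}
```

Checking the installed JRE at install time, rather than bundling one, lets the application pick up newer system JREs without a new installer build.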

The first deployable

So now we finally had a deployable application, with optional components. Using IzPack, we also built a collection of separate importers that could be installed to expand the import features of the application. And now that OmniViz was using the system JRE, the code was updated to support both JRE 6 and 7, bringing it up to date with the current Java releases and taking advantage of better operating system integration.

At this point we were also able to run OmniViz properly on Windows 7 and even Windows 8 beta.

32 to 64

The next challenge was to break the 32-bit barrier. With computing power ever growing, the restrictions of 32-bit applications are getting tight. Users now demand more processing power as data volumes increase and more complex processing is required. But OmniViz had some limitations, as it was dependent on some native 32-bit libraries for features such as OpenGL rendering. So the goal was set to remove the 32-bit dependencies, which involved moving to full Java 2D support, avoiding the use of native libraries and replacing native library functionality with pure Java code.

Once these ties were cut we could fire up OmniViz with a 64-bit JRE. This was a big step forward, as we could start to access data sets that had previously been too big for OmniViz to handle, and our application scientists were starting to demand more powerful 64-bit machines with lots of memory to crunch data.
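For readers who want to see whether their own JVM is running in 64-bit mode and how much heap it can use, the sketch below reads the standard `os.arch` system property and `Runtime.maxMemory()`. It is not taken from the OmniViz code base; the class and method names are hypothetical, and treating any `os.arch` value containing "64" as 64-bit is a heuristic that will miss a few architecture names.

```java
class JvmMemoryReport {

    /** Heuristic: treats architecture names such as "amd64" or "x86_64" as 64-bit. */
    public static boolean is64BitArch(String osArch) {
        return osArch != null && osArch.contains("64");
    }

    /** Converts a byte count to whole megabytes. */
    public static long bytesToMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        // maxMemory() is the most heap the JVM will try to use (see the -Xmx option).
        long maxHeapMb = bytesToMegabytes(Runtime.getRuntime().maxMemory());
        System.out.println("Architecture: " + arch + " (64-bit: " + is64BitArch(arch) + ")");
        System.out.println("Maximum heap: " + maxHeapMb + " MB");
    }
}
```

A 32-bit JVM is limited to a few gigabytes of address space at most, which is why a 64-bit JRE makes larger data sets feasible.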

Linux and Mac

As I am not a Windows user through choice, and since the shackles of native executables had been cut, the challenge of alternative operating systems lay ahead. Thanks to the power of Java and the clever design of IzPack, after a short time OmniViz was up and running on 64-bit Linux with a good amount of memory to draw upon when required. After some further installer configuration we also had a version running in-house on the Mac. While we’re not currently supporting OmniViz on Linux and Mac, if there’s sufficient demand who knows, as the groundwork has been laid…

Let the trials begin

So, to cut a long story short, back in October 2012 we were able to let the new puppy out on a leash for beta testing. While OmniViz is used daily in-house it has many features that need testing and each user seems to want to use it for different things, and has their own preference for tools that they use regularly. Once beta-testing was complete and the final changes made, OmniViz 6.1 was released in November 2012.  As I have started to learn since dusting off the code base, there is a large user base out there who discovered OmniViz long before I did, and love the functionality and capabilities OmniViz offers. But there’s still room for at least one more. So if you would like to find out more about the tamed beast that is OmniViz get in touch.