Improve Drug Safety through Deep Learning

Pharmaceutical companies use ProQuest Dialog to support regulatory compliance. Information professionals at these “Big Pharma” companies construct complicated searches tailored to specific drugs and run them against medical databases such as Medline and Embase. The primary user need is to retrieve regulatory compliance information as quickly as possible while maintaining the precision of the results.

Once the search results are retrieved from ProQuest Dialog, the literature references are ingested into a specialized workflow tool called the Drug Safety Triager. Each reference is manually reviewed through a process known as literature review. Highly qualified and trained screeners review each reference against a specific set of criteria. If the reference meets any of these criteria, it is subjected to further review by drug safety specialists and ultimately reported to regulatory agencies such as the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA). Drug safety screening is critical to patient safety, but it is expensive because it requires highly experienced screeners and medical practitioners, and it is time-intensive because the work is manual.

The objective of this project is to leverage the power of machine learning to improve the literature review process. Using historical data generated by the manual review process, we will train multiple models to identify references that contain reportable drug safety information. We will look at identifying various features of these documents and at ways to streamline the literature review process. We will also build a simple UI workflow that suggests the criteria identified by the model to the literature screener.
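One way to check a trained model against the historical screening decisions is to compute precision and recall on references the model has not seen. The following is a minimal sketch under stated assumptions: the function name `precision_recall` and the toy decision lists are invented for illustration and do not come from the project's data.

```python
def precision_recall(predicted, actual):
    """Compare model flags with screener decisions.

    Both arguments are parallel lists of booleans, where True means
    "contains reportable drug safety information".
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if a and not p)
    # Guard against division by zero when nothing is flagged or reportable.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Invented toy data: what screeners decided vs. what a model predicted.
screener_decisions = [True, True, False, False, True, False]
model_predictions = [True, False, False, True, True, False]

precision, recall = precision_recall(model_predictions, screener_decisions)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Recall measures how many of the references the screeners marked as reportable were also flagged by the model, while precision measures how many of the model's flags were correct.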

As a starting point, we have several models developed in-house, including Convolutional Neural Networks (CNN) with LSTM, Linear Regressions, and Naive Bayes. We will attempt both to improve upon these models and to explore new and innovative approaches, starting with classic, basic machine learning models and moving toward more complex, cutting-edge techniques. Approaches we will consider may include (but will not be limited to):
  • Deep Learning
  • Natural Language Processing (NLP)
  • Convolutional Neural Networks (CNN)
  • Recurrent Neural Networks (RNN)
  • Bayesian Algorithms
  • Long Short Term Memory (LSTM)
  • Decision Trees
  • Supervised/Unsupervised Learning
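
As an illustration of the "classic, basic" end of this range, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written in plain Python. It is not the in-house model. The class name `NaiveBayes`, the crude tokenizer, the labels `reportable` and `not_reportable`, and the six training sentences are all assumptions invented for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic runs; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training are ignored for every class.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy training set; real training data would come from the
# historical manual review decisions.
TRAIN = [
    ("patient developed severe rash after taking the drug", "reportable"),
    ("case report of liver injury following treatment", "reportable"),
    ("overdose led to hospitalization of the patient", "reportable"),
    ("in vitro study of compound solubility", "not_reportable"),
    ("review of manufacturing methods for tablets", "not_reportable"),
    ("pharmacokinetic model in healthy rats", "not_reportable"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
print(model.predict("patient hospitalized with rash after treatment"))
print(model.predict("solubility study of tablets"))
```

A baseline like this trains in seconds and gives a reference point against which the more complex neural approaches can be compared.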

This project requires students to develop an understanding of a typical drug safety review process used by top pharmaceutical companies. They will have access to massive quantities of real-world, labeled data and to the powerful hardware (e.g., AWS GPU instances) required to process it. A successful project will result in a web UI that applies the machine learning models to a literature reference and makes suggestions to the user.
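
The suggestion step behind such a UI might look like the sketch below, assuming the models emit one score between 0 and 1 per screening criterion. The function `suggest_criteria`, the threshold, and the criterion names are all hypothetical; the actual review criteria are not listed in this description.

```python
def suggest_criteria(scores, threshold=0.5, max_suggestions=3):
    """Return (criterion, score) pairs at or above the threshold, best first.

    `scores` maps a criterion name to a model score in [0, 1].
    """
    flagged = [(name, score) for name, score in scores.items() if score >= threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged[:max_suggestions]


# Hypothetical criterion names and scores for a single literature reference.
example_scores = {
    "adverse_event": 0.91,
    "overdose": 0.12,
    "pregnancy_exposure": 0.64,
    "lack_of_efficacy": 0.30,
}

suggestions = suggest_criteria(example_scores)
for name, score in suggestions:
    print(f"{name}: {score:.2f}")
```

The screener would still make the final decision; the scores only determine which criteria are highlighted first.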


Students who successfully match to this project team will be required to sign the following two documents in January 2018:

  • Student IP Agreement
  • NDA

Project Features

  • Skill level: All levels
  • Students: 5-7 Students
  • Likely Majors: Any, CE, CS, ECE, EE, IOE, MATH, SI, STATS
  • Course Substitutions: A&D Elec, Honors, IOE Capstone, IOE Grad, Data Science, ECE Cognate, EECS 498, SI, SI PEP, EE MDE, CE MDE
  • IP & NDA Required? Yes
  • Summer Opportunity: See Complete Description for Details
  • Machine Learning and Data Science

    Strong (or intermediate) machine learning experience. Familiarity with algorithm creation and selection, and with training techniques.

    • Likely Majors: CE, CSE/CS-LSA, Data Science
  • Software Development

    Basic programming experience. Web programming experience using HTML and CSS. Web service creation and core HTTP concepts. JavaScript experience with tools like AngularJS or React.

    • Likely Majors: CE, EE, CSE/CS-LSA, Any
  • Human Factors Product Design

    Experience presenting complex data in an easy-to-understand way. Experience designing interactions based on structured data. Usability testing. Use case modeling and requirements gathering. Ability to code in HTML, CSS, and JavaScript a plus.

    • Likely Majors: SI, IOE
  • Algorithm and Data Analysis

    • Likely Majors: CSE/CS-LSA, MATH, STATS

Sponsor Mentor: Kevin Hastie
Lead Software Architect
Kevin earned his Computer Science degree at Washington University in St. Louis, then migrated to Chicago to work for a few startups and tech companies. But what brought him to Ann Arbor was actually the in-between. He wanted to be somewhere big enough to be challenged and not get bored—but somewhere small enough to see real innovation occur.
Faculty Mentor: Sugih Jamin 
Associate Professor, EECS
Sugih Jamin is an Associate Professor in the Department of Electrical Engineering and Computer Science at the University of Michigan. He received his Ph.D. in Computer Science from the University of Southern California, Los Angeles in 1996. He received the National Science Foundation (NSF) CAREER Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) in 1999, and the Alfred P. Sloan Research Fellowship in 2001. He co-founded a peer-to-peer live streaming company, Zattoo, in 2005.