The Statistics Online Computational Resource (SOCR) develops novel AI methods, computational tools, modeling apps and AI infrastructure for analyzing “Big Data.” The latter are very large, heterogeneous, time-varying, multisource, and incomplete datasets that are difficult to interpret and model in meaningful ways using classical probability, statistical, or algorithmic approaches. The SOCR team designs and disseminates educational materials, web-services, and advanced data science methods and tools in probability, statistics, machine learning, and health analytics. This research team will:

  1. Enhance the SOCR analysis toolbox and visualization components with an emphasis on Big Biomedical and Neuroscience Data. The toolbox will be designed to run in a web browser and enhance the visual presentation and interpretation of Big Data. The toolbox will allow many more researchers (including students) to learn about, appreciate, contribute to, and apply complex analytics in their work, making Big Data much easier to turn into “impactful results” and actionable “decision making.”
  2. Implement powerful, modern, and portable webapps (HTML5/JavaScript/R Shiny/R Markdown/Jupyter) that can be used to model various interesting processes, enable exploratory and quantitative data analyses, and facilitate the understanding of high-dimensional and complex information.
  3. Develop advanced AI/ML data analytics, e.g., generative foundational AI models and applications, compressive big data analytics, statistical obfuscation techniques, and Bayesian approaches to address specific biomedical, healthcare, neuroimaging-genetics, and other applications.
  4. Expand the novel Spacekime Analytics method for mathematical representation, statistical inference, and computational prediction of large longitudinal information.

More details are provided on the SOCR Research website.

Team Organization

Each SOCR sub-team is coached by an experienced student who reports to the SOCR faculty and the PI. Sub-teams mostly focus on developing the mathematical foundations, building particular algorithms, and designing statistical approaches for addressing applications. The sub-teams are flexibly structured to promote creativity, provide opportunity for student growth, and nurture team-science. We have the following project sub-teams: SOCRAT, CBDA, DataSifter, Data Analytics, Data Science Fundamentals, and Spacekime Analytics (see SOCR website). As students develop skills and build confidence, they should expect increasing responsibility on assignments spanning multiple parts of the SOCR Lab.

Highly motivated and self-driven students are encouraged to apply. First-year undergraduates through master’s students will be matched to ongoing and new SOCR R&D Projects. The most engaged students will be encouraged to stay on the team for more than the two-semester minimum and may be supported through summer SOCR/MDP Fellowships. Student leadership roles are available in the lab, and experienced students will be a natural fit for these positions as their knowledge grows over time.

Below are the skills needed for this project. Students with relevant skills and an interest in the project are encouraged to apply! Although the team consists of sub-teams, students apply to the project as a whole, rather than to individual roles on the team.

Programming (3 Students)

Preferred skills: HTML5, JavaScript, web-based functional development, intuitive UI/UX design; experience with Adobe Illustrator, Canvas, and/or R/Python is a plus

Likely majors: CS, SI, ANY

Analytics (3 Students)

Preferred skills: Amazon AWS Elastic Computing, Statistical modeling, high-throughput data analytics, machine learning, R/Python


Methods (DataSifter & CBDA) (4 Students)

Preferred skills: Technical math background, AI/ML, R-computing


Data Science Fundamentals (3 Students)

Preferred skills: Strong mathematics and physics background and significant computational R-programming skills. Strong motivation and interest in graduate-level data science fundamentals are necessary. Trainees will work directly with the PI. Students should be familiar with information measures, entropy, KL divergence, ODEs/PDEs, and Dirac’s bra-ket operators. Review the website.

Likely majors: PHYSICS, MATH or ENGR background 

Apprentice Researcher (4 Students)

Requirements: Interest in project material, willingness to develop skills. Open to first- and second-year undergraduate students ONLY.


Faculty Sponsor

Ivo Dinov

Professor, Computational Medicine and Bioinformatics, Health Behavior and Biological Sciences; and Associate Director, Michigan Precision Health, Education and Training Workgroup.

Dr. Dinov is a professor of Computational Medicine and Bioinformatics, Health Behavior and Biological Sciences at the University of Michigan, the director of the Statistics Online Computational Resource (SOCR), and an associate director at the Michigan Precision Health, Education and Training Workgroup. Dr. Dinov develops advanced mathematical models for representation, scientific computing, statistical analysis and interactive visualization of multi-dimensional, multimodal and informatics biomedical data (Big Data). With expertise in human brain imaging, statistical computing and high-throughput distributed data processing, he approaches biomedical and health science research from the perspective of team-science Big Data applications in informatics, multimodal biomedical image analysis, distributed genomics computing, complex-time representation of longitudinal data, spacekime analytics, and health analytics.

Weekly Meeting Time and Location: For MDP academic credit, MDP-SOCR R&D courses (e.g., ENGR 255/355, NURS 995, ENG 455/599) are hybrid: we mainly meet face-to-face twice a semester, coordinate synchronously very often via the SOCR Zoom channel, and coordinate asynchronously via Cloud services. The entire SOCR team gathers once each semester (in person, with Zoom support) and twice each month, typically on Fridays or Tuesdays at 8 am – 9 am ET, via the SOCR Zoom channel (distance synchronous communication). Smaller project-specific teams meet weekly. We coordinate all progress, challenges, developments, and interactions asynchronously via G-Drive. Each sub-team arranges a convenient time to meet and work together following university guidelines. Annual, two-term enrollment commitments begin each January.

Course Substitutions: Honors, CS-ENG/DS-ENG/EE/CE-ENGR 355 and higher can count toward Flex Tech

These substitutions/departmental courses are available for students in these respective majors.  MDP does not yet have a formal agreement with other departments for substitutions/departmental courses not listed.  Please reach out to your home department’s academic advisor about how you might apply MDP credits to your degree plan. 

Citizenship Requirements: This project is open to all students on campus

IP/NDA: Students who successfully match to this project team will be required to sign an Intellectual Property (IP) Agreement prior to participation in January 2024.

Summer Opportunity: Summer research fellowships may be available for qualifying students

More information is available at 

Learn more about the expectations for this type of MDP project