Photo captions provide context to images, facilitate better communication, and increase accessibility. The student team will use Azure machine learning services, in addition to student-developed functions, to build a tool, including an API, that generates captions for a given input image.
Photo captions go beyond merely labeling an image; they add depth, meaning, and context to visual content. They facilitate better communication, understanding, and emotional connections between the photographer or content creator and their audience.
A photo caption provides context and information about the subject, location, and event captured in the image. It helps viewers understand the photo’s purpose and significance and prevents misinterpretation. Good photo captioning can enhance engagement with the subject. In online contexts, descriptive captions contribute to search engine optimization (SEO) and improve the discoverability of images by making them more likely to appear in relevant search results. Captions increase accessibility for people using screen readers or other assistive reading software, and can also help individuals with slow internet connections or restricted data plans, where images might load slowly or not at all. In these cases, a caption provides a description of the image they cannot see.
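As a sketch of what "good captioning" can mean in practice, the hypothetical checker below flags two common pitfalls named in accessibility guidance (redundant "image of" prefixes and overly long descriptions). The specific rules and the 125-character limit are illustrative assumptions, not formal WCAG requirements:

```python
# Hypothetical caption/alt-text quality checker. The rules are a
# simplification of widely cited accessibility guidance (avoid redundant
# "image of" prefixes; keep descriptions concise) and are assumptions here.

REDUNDANT_PREFIXES = ("image of", "photo of", "picture of", "graphic of")

def caption_issues(caption: str, max_length: int = 125) -> list[str]:
    """Return a list of human-readable issues found in a caption."""
    issues = []
    text = caption.strip()
    if not text:
        issues.append("caption is empty")
        return issues
    if text.lower().startswith(REDUNDANT_PREFIXES):
        issues.append("starts with a redundant phrase like 'image of'")
    if len(text) > max_length:
        issues.append(
            f"longer than {max_length} characters; screen readers may truncate it"
        )
    return issues
```

A checker like this could run on both machine-generated and human-edited captions before publication.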
The student team will use Azure machine learning services to build an API that can create image captions when given an image input.
The team will then develop front-end tools that use this API. These might include:
- An Adobe InDesign plugin
- A Drupal/WordPress plugin (these are the primary CMSs used on campus)
- A Google Chrome plugin for Google Docs/Slides
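For illustration, a caption request to an Azure-style image-analysis endpoint might be assembled as below. The path, query parameters, and header name follow Azure's REST conventions, but they are assumptions in this sketch and should be verified against current Azure documentation before use:

```python
# Sketch of constructing a request to an Azure-style Image Analysis
# "caption" feature. Endpoint path, API version, and header names are
# assumptions modeled on Azure REST conventions, not a verified contract.

def build_caption_request(endpoint: str, api_key: str) -> tuple[str, dict]:
    """Return the (url, headers) pair for a caption request; the raw
    image bytes would then be sent as the POST body."""
    url = (
        f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze"
        "?features=caption&api-version=2023-10-01"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/octet-stream",
    }
    return url, headers
```

Each front-end plugin could share a thin client like this rather than duplicating request logic.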
Many photos in UM’s online presence (web, social media, etc.) have poorly written captions or no captions at all. This reduces the overall quality of communication and creates a barrier for those who rely on screen readers to engage with the online environment.
Minimum Viable Product Deliverable (Minimum level of success)
- Literature review of current technology for photo recognition and captioning, and of best practices in accessibility
- Conduct user interviews with key stakeholders within the University who would use the tool: staff in marketing, accessibility, IT infrastructure, etc.
- Develop a first-version end-to-end prototype demonstrating the feasibility of the concept, and collect feedback from key stakeholders
- The backend model will be chosen for the students, but students will have to use the Azure platform to build an API connection to the backend
- The team will need to properly document this API so that others can use it
- The team will then build a few front-end systems to utilize the API and provide image captions to marketing and other teams on campus
- InDesign, Drupal, and Google may be selected as options
Expected Final Deliverable (Expected level of success)
- Incorporate the feedback from key stakeholders
- Provide a second version of the prototype end-to-end system with additional functionality as determined by the student team
- Verify the extent to which the prototype meets the requirements of the project
Stretch Goal Opportunities (High level of success)
- Build at least 10-15 front end clients
- Track usage by University departments
- Track changes to captions made by humans
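One possible way to track human edits to machine captions (the last stretch goal above) is to record a similarity score and a diff for each revision. This sketch uses Python's standard `difflib`; the record format is an illustrative assumption, not a specified requirement:

```python
# Hypothetical edit-tracking helper for the stretch goal of recording
# how humans change machine-generated captions. Uses only the standard
# library; the returned record format is an assumption in this sketch.
import difflib

def caption_edit_record(machine_caption: str, human_caption: str) -> dict:
    """Summarize how much a human changed a machine-generated caption."""
    similarity = difflib.SequenceMatcher(
        None, machine_caption, human_caption
    ).ratio()
    diff = "\n".join(
        difflib.unified_diff(
            [machine_caption], [human_caption],
            fromfile="machine", tofile="human", lineterm="",
        )
    )
    return {"similarity": round(similarity, 3), "diff": diff}
```

Aggregating these records over time could show which departments edit captions most heavily, informing model improvements.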
Accessibility Design (1-2 Students)
Specific Skills: Practical knowledge of best practices in accessibility design and an interest in developing further skills in this area – experience with screen readers would be a big plus
Students must have basic coding or prototyping skills and be prepared to participate in technical development
Likely Majors: SI
Machine Learning (2-3 Students)
Specific Skills: General knowledge and skills in Machine Learning/Artificial Intelligence, experience incorporating ML and AI techniques into general programming front end/back end
Students should have completed at least one course in AI/ML/CV and EECS 281 (or equivalent). Experience with the Microsoft Azure platform is highly valued
Likely Majors: CS, DATA, MATH
Front End/Back End Programming (2-3 Students)
Specific Skills: General programming skills, good software engineering practice and design, willingness to quickly develop new tech stack skills
EECS 281 (or equivalent) is required
Likely Majors: CS, DATA
Additional Desired Skills/Knowledge/Experience
- Please include a description of your experience with the different elements of the tech stack in your Experience & Interest Form: Azure ML services, Azure API Manager, Python, OpenAI API, Hugging Face APIs
- Practical experience developing any of the following plugins: Drupal, WordPress, Adobe InDesign, Google Chrome, or Google Slides
- Experience working with Microsoft Azure Machine Learning Module
- The ML development will utilize Python; students should have experience in Python, or be prepared to quickly develop their skills
- Experience and enthusiasm for working in the evolving field of AI
- An interest in Design for Accessibility and in improving the accessibility of online resources; experience with best-practice implementation of accessibility standards
Sponsor and Faculty Mentor
Interim Director of Emerging Technology and AI Services at ITS
Don has 28 years of IT experience and has led numerous infrastructure projects. He has a particular interest in process improvement and planning the adoption of new IT services. On the weekends Don enjoys car repair and auto racing.
Project Meetings: During the winter 2024 semester, the U-M ITS team will meet in the Duderstadt Center on Mondays from 2:00 – 4:00 PM.
Work Location: Most of the work will take place on campus in Ann Arbor.
Course Substitutions: CE MDE, ChE Elective, CS Capstone/MDE, DS Capstone, EE MDE, CoE Honors, IOE Senior Design, SI Elective/Cognate
Citizenship Requirements: This project is open to all students on campus. International Students: CPT declaration (curricular practical training) is NOT required for this project because the sponsor (ITS Department) is part of the University.
IP/NDA: Students will sign standard University of Michigan IP/NDA documents.
Summer Project Activities: There will be no summer activity for this project.