Apple Computer Vision/Machine Learning Intern (Multi-modality LLM)
Qualifications
Minimum Qualifications
• M.S. or PhD in Electrical Engineering/Computer Science or a related field (mathematics, physics, or computer engineering), with a focus on computer vision and/or machine learning
• Extensive experience in video machine learning covering at least one of the following topics: Multi-Modal LLM / Video Quality Assessment / Post-Training
• Proven prototyping skills and proficiency in coding (C, C++, Python)
• Excellent written and verbal communication skills; comfortable presenting research to l…
Responsibilities
The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops multi-modality-based video quality assessment technologies for Apple platforms. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Multi-Modal LLM; Video Quality Assessment; Post-training
The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops on-device computer vision and machine perception technologies across Apple’s products. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Agentic AI; Multi-Modal LLM; Video Foundation Model; Video Generative Editing
The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops on-device computer vision and machine perception technologies across Apple’s products. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Object Detection and Segmentation; Multi-Sensor Fusion; Activity Recognition; Video Captioning

We are seeking students motivated to advance the state of the art in computer vision and generative AI. Our projects will broadly focus on image/video generation and understanding, and on 3D generation and reconstruction. Through collaboration, we aim to make significant product impact and publish seminal work at top-tier conferences.
Key Responsibilities:
1. Conduct research and development in computer vision and generative modeling, with a focus on image and video generation and editing.
2. Implement and experiment with state-of-the-art methods and models.
3. Collaborate closely with researchers and engineers to explore new research directions and contribute to impactful product solutions.
4. Contribute to research publications in top-tier venues.
Basic