XPeng Motors Research Intern (Multimodal)

Internship / Part-time; 2025-07-02; Location: Shenzhen | Beijing | Shanghai; Status: Hiring

Requirements

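1. Currently enrolled in a bachelor's degree program or above in computer science, electronic engineering, artificial intelligence, or a related field
2. Solid foundation in machine learning algorithms, with research experience in computer vision, natural language processing, graphics, or a related area
3. Proficient with deep learning frameworks such as PyTorch/TensorFlow, with strong coding skills
4. Good teamwork and communication skills
[Bonus points]
1. Have previously…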

Responsibilities

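1. Build industry-leading native multimodal large models and world models for embodied intelligence, with the potential to be applied to general-purpose humanoid robots and other embodied scenarios
2. Build technical influence and lead the development of the industry internationally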

Related Positions

Research Intern (Spatial Intelligence)

Internship

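We are committed to building the next generation of Spatial Intelligence systems, so that AI can not only "see and understand the world" but also understand spatial structure, reason about relationships between objects, plan action trajectories, and keep learning and evolving in virtual or real environments. Together with the team, you will: develop agent models capable of spatial understanding, object perception, trajectory prediction, and interaction planning; build systems that combine vision-language models (VLM) with world models to jointly model 3D scenes, depth, physics, and affordance; use game engines (Unreal / Unity / Isaac Sim) to build high-fidelity virtual environments for data generation and agent evaluation; build efficient multimodal data pipelines on vLLM / Ray for large-scale generation, automatic annotation, and validation; and drive the application of spatial intelligence in robotics and embodied intelligence.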

Updated 2025-10-27, Shenzhen

Computer Vision/Machine Learning Intern (Multi-modality LLM)

Internship; Machine

The computer vision algorithm intern will work in a dynamic team as part of the Video Engineering org, which develops multi-modality based video quality assessment technologies on Apple platforms. We balance research and product to deliver the highest quality, state-of-the-art experiences, innovating through the full stack, and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Multi-Modal LLM; Video Quality Assessment; Post-training

Updated 2025-11-04, Beijing

Computer Vision/Machine Learning Intern (Agentic AI)

Internship; Machine

The computer vision algorithm intern will work in a dynamic team as part of the Video Engineering org, which develops on-device computer vision and machine perception technologies across Apple’s products. We balance research and product to deliver the highest quality, state-of-the-art experiences, innovating through the full stack, and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Agentic AI; Multi-Modal LLM; Video Foundation Model; Video Generative Editing

Updated 2025-10-21, Beijing

3D Vision Algorithm Intern (3D Human Reconstruction & Tracking)

Internship; Algorithms

Location & Duration
Sydney Central; 6-12 months

Role Overview
You will participate in the research and development of human aesthetic enhancement and spatiotemporally consistent editing technologies at Meitu. You will work directly with real, product-scale datasets and state-of-the-art algorithms. Depending on the internship track, your work may include (but is not limited to):
· Fine-grained and controllable image / video aesthetic enhancement
· 2D / 3D human tracking and 3D reconstruction
· Regression, reconstruction, and structural constraints of digital human models (e.g., SMPL)
This role offers the opportunity to produce both production-ready technical outcomes and high-quality academic research results. It is a research-and-engineering-oriented internship, ideal for candidates with strong interest and capability in 3D vision fundamentals, human visual quality enhancement, video generation models, and 3D human modelling.

Key Responsibilities
· Research and implement algorithms related to depth estimation, multi-view generation, and 2D / 3D tracking with spatiotemporal reconstruction
· Follow state-of-the-art 3D vision papers and open-source projects; reproduce experiments and adapt methods to practical applications
· Collaborate with data teams to refine the 3D aesthetic development pipeline, improve data collection and quality evaluation, and establish foundations for high-quality scaling
· Explore the integration of human structure priors (Skeleton / SMPL / Mesh) with multi-modal cues such as depth, normals, and optical flow in reconstruction and generative models
· Assist in building data processing, evaluation, and visualization tools (e.g., immersive video aesthetic editing) to support rapid iteration
· Enable high-quality projection of 3D features into 2D visual outputs, with the goal of producing A-level or above academic publications

Updated 2026-01-21