
XPeng Motors Research Intern (Multimodal)

Internship, part-time. Location: Shenzhen | Beijing | Shanghai. Status: Hiring

Requirements


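1. Currently enrolled in a bachelor's program or above in computer science, electronic engineering, artificial intelligence, or a related field
2. Solid foundation in machine learning algorithms, with research experience in computer vision, natural language processing, graphics, or a related field
3. Proficient with deep learning frameworks such as PyTorch/TensorFlow, with strong coding skills
4. Good teamwork and communication skills
[Preferred qualifications]
1. Have previously…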

Responsibilities


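1. Build industry-leading native multimodal large models and world models for embodied intelligence, with the potential to be applied to general-purpose humanoid robots and further embodied scenarios
2. Build technical influence and lead the development of the industry internationally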
Includes English-language materials
Tags: Education, Machine Learning, Algorithms, OpenCV, NLP, PyTorch, TensorFlow, Deep Learning
Related positions

XPeng
Internship

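We are committed to building the next generation of Spatial Intelligence systems, so that AI can not only "see and understand the world" but also understand spatial structure, reason about object relationships, plan action trajectories, and keep learning and evolving in virtual or real environments. You will work with the team to: develop agent models capable of spatial understanding, object perception, trajectory prediction, and interaction planning; build systems that combine vision-language models (VLM) and world models (World Model) to jointly model 3D scenes, depth, physics, and affordances; use game engines (Unreal / Unity / Isaac Sim) to build high-fidelity virtual environments for data generation and agent evaluation; build efficient multimodal data pipelines on vLLM / Ray for large-scale generation, automatic annotation, and validation; and drive the application of spatial intelligence in robotics and embodied intelligence.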

Updated 2025-10-27 | Shenzhen
Apple
Internship | Machine

The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops multimodal video quality assessment technologies for Apple platforms. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Multi-Modal LLM; Video Quality Assessment; Post-training

Updated 2025-11-04 | Beijing
Apple
Internship | Machine

The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops on-device computer vision and machine perception technologies across Apple's products. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Agentic AI; Multi-Modal LLM; Video Foundation Model; Video Generative Editing

Updated 2025-10-21 | Beijing
Apple
Internship | Machine

The computer vision algorithm intern will work in a dynamic team within the Video Engineering org, which develops on-device computer vision and machine perception technologies across Apple's products. We balance research and product to deliver the highest-quality, state-of-the-art experiences, innovating through the full stack and partnering with cross-functional teams to influence what brings our vision to life and into customers' hands. Keywords: Object Detection and Segmentation; Multiple Sensor Fusion; Activity Recognition; Video Captioning

Updated 2025-10-21 | Beijing