XPeng Motors: Research Scientist, Reinforcement Learning
Requirements
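• Education: Master's degree or above, with a background in robotics, control, artificial intelligence, computer science, automation, or a related field.
• Technical skills:
  - Solid theoretical foundation in reinforcement learning and familiarity with mainstream algorithms (e.g., PPO, TD3, SAC, Behavior Cloning).
  - Familiarity with tools and frameworks such as PyTorch, Isaac Gym, Isaac Lab, MuJoCo, Gym, and RLlib.
  - Familiarity with motion control, dynamics modeling, humanoid robot motion planning, and related topics.
  - Familiarity with techniques for accelerating training, such as parallel training, distributed sampling, and multi-environment simulation.
• Programming: Proficient in Python; familiarity with C++ is a plus. Good code organization and software engineering skills.
• English: Able to read English papers and documentation and keep up with the latest research.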
Responsibilities
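Reinforcement learning algorithm development and optimization
• Design and implement reinforcement learning algorithms suited to humanoid robots (e.g., PPO, SAC, TD3, RLHF).
• Explore methods such as imitation learning and hierarchical reinforcement learning to improve training efficiency and generalization.
Simulation environment construction, training, and debugging
• Build high-fidelity simulation environments with Isaac Gym, Isaac Lab, MuJoCo, and similar tools.
• Build a closed-loop RL training system from perception to control, including modules for reward design, state definition, and termination conditions.
• Train and debug humanoid robot skills in simulation, such as walking, standing, running, going up and down slopes, and obstacle avoidance.
Algorithm evaluation and system optimization
• Design general evaluation metrics for policy stability, convergence speed, robustness, and related properties.
• Optimize the training pipeline at the system level (e.g., parallel sampling, distributed training, reparameterization).
Collaboration with the robot hardware team
• Drive simulation-to-real (Sim2Real) transfer, taking part in migrating and debugging policies on real humanoid robots.
• Take part in system integration and debugging, including control interface adaptation and policy deployment.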
We empower our people to stay resilient and relevant in a constantly changing world. We're looking for people who are always searching for creative ways to grow and learn, and who want to make a real impact, now and in the future. Does that sound like you? Then you would make a great addition to our vibrant international team.
DAI AIX (AI Acceleration and Exploration) conducts cutting-edge research in data analytics and AI with the Siemens global technology network, and provides consulting, co-creation, and data-driven applications for end customers. The Research Scientist on this team does applied research for industrial AI applications.
We are seeking a Reinforcement Learning (RL) Specialist to lead the design, implementation, and optimization of RL-driven systems for post-training of foundation models. The primary focus of this role is advancing our RL capabilities for real-world applications such as industrial control systems and LLM agents. You will develop cutting-edge algorithms, improve post-training efficiency, and deploy scalable RL solutions in industry.
You'll make an impact by
• Researching state-of-the-art data analytics and AI technologies across a broad range of topics.
• Focusing mainly on modern foundation model applications in industrial scenarios:
  1. Context engineering for foundation models
  2. Development of agent systems for industrial applications
  3. Task-specific model fine-tuning
• Working in part on multimodal applications.
• Participating in both internal and external research projects.
• Assisting with the deployment of customer development and deployment projects.
1. Reinforcement learning development for post-training:
• Design and implement state-of-the-art RL algorithms (e.g., PPO, SAC, DQN) for post-training of foundation models such as LLMs and time series foundation models.
• Implement distributed RL training pipelines using frameworks such as Ray RLlib and DeepSpeed, or custom solutions.
• Design and implement benchmark pipelines for model evaluation.
2. Align foundation models such as LLMs and time series foundation models with specific domains and tasks through techniques such as SFT and RL.
3. Coding and infrastructure:
• Write production-grade Python code using PyTorch, NumPy, and pandas.
• Manage Linux-based clusters for distributed training and deployment.
4. Provide any other support required by the line manager.
• Design and implement advanced LLM-based architectures and agentic systems for real-world product scenarios.
• Lead model training and evaluation efforts, including data preprocessing, fine-tuning, and inference optimization.
• Collaborate across teams to deliver robust, scalable models aligned with product objectives and user value.
• Apply and adapt research ideas to solve practical challenges in reasoning, planning, memory, and alignment.
• Monitor and improve model performance post-deployment through data-driven iteration and error analysis.
• Contribute to technical discussions, model reviews, and best practices within the applied science community.
1. Generative AI Model Development:
- Design and develop generative AI models, including language models, image generation models, and multimodal models.
- Explore and implement advanced techniques in areas such as transformer architectures, attention mechanisms, and self-supervised learning.
- Conduct research and stay up-to-date with the latest advancements in the field of generative AI.
2. Data Acquisition and Preprocessing:
- Identify and acquire relevant data sources for training generative AI models.
- Develop robust data preprocessing pipelines, ensuring data quality, cleanliness, and compliance with ethical and regulatory standards.
- Implement techniques for data augmentation, denoising, and domain adaptation to enhance model performance.
3. Model Training and Optimization:
- Design and implement efficient training pipelines for large-scale generative AI models.
- Leverage distributed computing resources, such as GPUs and cloud platforms, for efficient model training.
- Optimize model architectures, hyperparameters, and training strategies to achieve superior performance and generalization.
4. Model Evaluation and Deployment:
- Develop comprehensive evaluation metrics and frameworks to assess the performance, safety, and bias of generative AI models.
- Collaborate with cross-functional teams to ensure the successful deployment and integration of generative AI models into client solutions.
5. Collaboration and Knowledge Sharing:
- Collaborate with data engineers, software engineers, and subject matter experts to develop innovative solutions leveraging generative AI.
- Contribute to the firm's thought leadership by presenting at conferences, and participating in industry events.