XPeng Motors Research Scientist - Reinforcement Learning
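Qualifications
Education: Master's degree or above, with a background in robotics, control, artificial intelligence, computer science, automation, or a related field.
Technical skills: Solid theoretical foundation in reinforcement learning and familiarity with mainstream algorithms (e.g., PPO, TD3, SAC, Behavior Cloning). Familiarity with PyTorch, Isaac Gym, Isaac Lab, MuJoCo…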
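Responsibilities
Reinforcement learning algorithm development and optimization
• Design and implement reinforcement learning algorithms suited to humanoid robots (e.g., PPO, SAC, TD3, RLHF).
• Explore methods such as imitation learning and hierarchical reinforcement learning to improve training efficiency and generalization.
Simulation environment construction, training, and debugging
• Build high-fidelity simulation environments with Isaac Gym, Isaac Lab, MuJoCo, and similar tools.
• Build a closed-loop RL training system from perception to control, including modules for reward design, state definition, and termination conditions.
• Train and debug humanoid robot skills in simulation, such as walking, standing, running, climbing and descending slopes, and obstacle avoidance.
Algorithm evaluation and system optimization
• Design general evaluation metrics for policy stability, convergence speed, robustness, and related properties.
• Optimize the training pipeline at the system level (e.g., parallel sampling, distributed training, reparameterization).
Collaboration with the robot hardware team
• Drive simulation-to-real (Sim2Real) deployment, and take part in transferring and debugging policies on real humanoid robots.
• Take part in system integration and debugging, including control interface adaptation and policy deployment.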
We empower our people to stay resilient and relevant in a constantly changing world. We're looking for people who are always searching for creative ways to grow and learn. People who want to make a real impact, now and in the future. Does that sound like you? Then it seems like you'd make a great addition to our vibrant international team.
DAI AIX (AI Acceleration and Exploration) conducts cutting-edge research in data analytics and AI within the Siemens global technology network, and delivers consulting, co-creation, and data-driven applications for end customers. The Research Scientist on this team does applied research for industrial AI applications.
We are seeking a Reinforcement Learning (RL) Specialist to lead the design, implementation, and optimization of RL-driven systems for post-training of foundation models. The primary focus of this role is advancing our RL capabilities for real-world applications such as industrial control systems and LLM agents. You will develop cutting-edge algorithms, improve post-training efficiency, and deploy scalable RL solutions in industry.
You'll make an impact by:
1. Reinforcement learning development for post-training:
• Design and implement state-of-the-art RL algorithms (e.g., PPO, SAC, DQN) for post-training of foundation models such as LLMs and time series foundation models.
• Implement distributed RL training pipelines using frameworks such as Ray RLlib, DeepSpeed, or custom solutions.
• Design and implement benchmark pipelines for model evaluation.
2. Align foundation models such as LLMs and time series foundation models with specific areas and tasks through techniques such as SFT and RL.
3. Coding & Infrastructure:
• Write production-grade Python code using PyTorch, NumPy, and pandas.
• Manage Linux-based clusters for distributed training and deployment.
4. Provide any other support required by the line manager as needed.
• Design and implement advanced LLM-based architectures and agentic systems for real-world product scenarios.
• Lead model training and evaluation efforts, including data preprocessing, fine-tuning, and inference optimization.
• Collaborate across teams to deliver robust, scalable models aligned with product objectives and user value.
• Apply and adapt research ideas to solve practical challenges in reasoning, planning, memory, and alignment.
• Monitor and improve model performance post-deployment through data-driven iteration and error analysis.
• Contribute to technical discussions, model reviews, and best practices within the applied science community.
• Owns the science roadmap for grounding (including retrieval, re-ranking, attribution, and reasoning), driving initiatives from problem framing to production impact. Designs and evolves state-of-the-art retrieval and RAG orchestration across documents, tables, code, and images.
• Builds citation and provenance systems (e.g., passage highlighting, quote-level alignment, confidence scoring) to reduce hallucinations and increase user trust. Leads experimentation and evaluation using A/B testing, interleaving, NDCG, MRR, precision/recall, and calibration curves to guide measurable trade-offs.
• Advances tool-augmented grounding through schema-aware retrieval, function calling, knowledge graph joins, and real-time connectors to databases, cloud object stores, search indexes, and the web. Partners with platform engineering to productionize models with scalable inference, embedding services, feature stores, caching, and privacy-compliant multi-tenant systems.
• Nurtures collaborative relationships with product and business leaders across Microsoft, influencing strategic decisions and driving business impact through technology. Authors white papers, contributes to internal tools and services, and may publish research to generate intellectual property.
• Bridges the gap between researchers (e.g., Microsoft Research) and development teams, applying long-term research to solve immediate product needs. Leads high-stakes negotiations to ensure cutting-edge technologies are applied practically and effectively.
• Identifies and solves significant business problems using novel, scalable, and data-driven solutions. Shapes the direction of Microsoft and the broader industry through pioneering product and tooling work.
• Mentors applied scientists and data scientists, establishing best practices in experimentation, error analysis, and incident review. Collaborates cross-functionally with PMs, research, infrastructure, and security teams to align on milestones, SLAs, and safety protocols.
• Communicates clearly through design documentation, progress updates, and presentations to executives and customers. Contributes to ethics and privacy policies, identifies bias in product development, and proposes mitigation strategies.
Are you looking to work at the forefront of Machine Learning and AI? Would you be excited to apply Generative AI algorithms to solve real world problems with significant impact? The Generative AI Innovation Center helps AWS customers implement Generative AI solutions and realize transformational business opportunities. This is a team of strategists, scientists, engineers, and architects working step-by-step with customers to build bespoke solutions that harness the power of generative AI.
Starting in 2024, the Innovation Center launched a new Custom Model and Optimization program to help customers develop and scale highly customized generative AI solutions. The team helps customers imagine and scope bespoke use cases that will create the greatest value for their businesses, define paths to navigate technical or business challenges, develop and optimize models to power their solutions, and make plans for launching solutions at scale. The GenAI Innovation Center team provides guidance on best practices for applying generative AI responsibly and cost efficiently.
You will work directly with customers and innovate in a fast-paced organization that contributes to game-changing projects and technologies. You will design and run experiments, research new algorithms, and find new ways of optimizing risk, profitability, and customer experience. We’re looking for Applied Scientists capable of using GenAI and other techniques to design, evangelize, and implement state-of-the-art solutions for never-before-solved problems.
As an Applied Scientist, you will
- Collaborate with AI/ML scientists and architects to research, design, develop, and evaluate generative AI solutions to address real-world challenges
- Interact with customers directly to understand their business problems, aid them in implementation of generative AI solutions, brief customers and guide them on adoption patterns and paths to production
- Help customers optimize their solutions through approaches such as model selection, training or tuning, right-sizing, distillation, and hardware optimization
- Provide customer and market feedback to product and engineering teams to help define product direction