NVIDIA Deep Learning Performance Architect - Intern - 2026
Requirements
• BS or higher degree in a relevant technical field (CS, EE, CE, Math, etc.).
• Strong programming skills in Python, C, C++.
• Strong background in computer architecture.
• Experience with performance modeling, architecture simulation, profiling, and analysis.
• Prior experience with LLM or generative AI algorithms.
Ways to stand out from the crowd:
• GPU Computing and parallel programming models such as CUDA and OpenCL.
• Architecture of or workload analysis on other deep learning accelerators.
• Deep neural network training, inference and optimization in …
Responsibilities
NVIDIA is developing processor and system architectures that accelerate deep learning and high-performance computing applications. We are looking for an intern deep learning system performance architect to join our AI performance modelling, analysis, and optimization efforts. In this position, you will work on DL performance modelling, analysis, and optimization on state-of-the-art hardware architectures for various LLM workloads. You will make your contributions to our dynamic, technology-focused company.
What you’ll be doing:
• Analyze state-of-the-art DL networks (LLMs, etc.), and identify and prototype performance opportunities to influence the SW and Architecture teams for NVIDIA's current and next-gen inference products.
• Develop analytical models for state-of-the-art deep learning networks and algorithms to drive innovation in processor and system architecture design for performance and efficiency.
• Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uni-processor and multiprocessor configurations.
• Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.
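1. Explore in depth the reasoning capabilities of LLMs in search scenarios and the Deep Research mode; improve information integration and summarization; build efficient, accurate intelligent search products; and drive breakthroughs in applying AI technology in practice.
2. AI search summarization Agent development:
  1) Design and implement an LLM-based search summarization Agent that improves the understanding of, reasoning over, and structured summarization of search results;
  2) Explore LLM reasoning techniques (such as chain-of-thought and multi-step reasoning) to optimize the Deep Research mode for complex queries, enabling long-text understanding and cross-document information fusion;
  3) Build an end-to-end system covering intent recognition, knowledge retrieval, result generation, and preference alignment to improve the user experience.
3. Model optimization and application:
  1) Improve how well the model adapts to search scenarios through instruction tuning, preference alignment (RLHF/DPO), and related techniques;
  2) Explore search and generation techniques that fuse multimodal information (text, code, structured data);
  3) Research innovative application scenarios for future everyday life (such as personalized knowledge assistants and automated research tools) and explore the boundaries of the technology.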
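1. Build Kuaishou's NLP technology stack, including but not limited to text classification, knowledge graphs, translation, and dialogue.
2. Communicate and collaborate with business teams to deliver core algorithm models and capabilities that meet product requirements.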
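1. Research and develop the AI Xiaokuai (AI小快) intelligent assistant bot.
2. Optimize the base model and apply frameworks derived from large models, such as RAG and Agents, to improve the relevant business metrics.
3. Continuously follow and investigate in depth frontier large-model techniques and open-source solutions, track the latest industry progress in large models and advance related research, and look for ways to apply the latest techniques to AI Xiaokuai.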
• Be a subject matter expert on databases, particularly relational databases, able to discuss database modelling, migration, performance testing, and day-to-day operations with customers
• Have wide-ranging experience with open source and commercial databases such as MySQL, PostgreSQL, Oracle, and SQL Server
• Be familiar with deploying workloads on major cloud service providers (AWS, Alicloud, GCP, Azure, etc.)
• Be a technical expert on all aspects of OceanBase (compatibility assessment, deployment, administration, development, migration, etc.)
• Facilitate introduction, discussion, and demonstration of OceanBase’s technology, vision, and value proposition through individual or group sessions with customers
• Engage with prospects to discover their pain points and business/technical challenges, and identify how OceanBase can add value
• Run Proof of Concept projects with customers to validate OceanBase’s capabilities
• Support post-sale OceanBase implementation with the OceanBase delivery team
• Maintain a deep understanding of competitive as well as complementary technologies and how to position OceanBase DB in relation to them
• Provide guidance on how to resolve customer-specific technical challenges
• Collaborate with Product Management, Engineering, and Marketing to continuously improve the OceanBase product and its position in the market