NVIDIA Senior DGX Cloud AI Infrastructure Software Engineer
Job Requirements
• 8+ years of experience developing software infrastructure for large-scale AI systems.
• Bachelor's degree or higher in Computer Science or a related technical field (or equivalent experience).
• Strong debugging skills and experience analyzing and triaging AI applications from the application level to the hardware level.
• Proven track record of building and scaling large-scale distributed systems.
• Experience with AI training, inferencing, and data infrastructure services.
• Familiarity with operating large-scale observability platforms for monitoring and logging (e.g., ELK, Prometheus, Loki).
• Proficiency in programming languages such as Python and C/C++, as well as scripting languages.
• Excellent communication and collaboration skills; a culture of diversity, intellectual curiosity, problem solving, and openness is essential.
Ways to stand out from the crowd:
• Experience working with large-scale AI clusters.
• Strong understanding of NVIDIA GPUs and network technologies (RDMA, IB, NCCL).
• Good understanding of the internals of DL frameworks such as PyTorch, TensorFlow, JAX, and Ray.
• Experience with root cause analysis of failures at datacenter scale.
• Strong background in software design and development.
NVIDIA leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for exceptional people like you to help us accelerate the next wave of artificial intelligence.
Job Responsibilities
Joining NVIDIA's DGX Cloud Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on optimizing the efficiency and resiliency of AI workloads, as well as developing scalable AI and data infrastructure tools and services. Our objective is to deliver a stable, scalable environment for AI researchers, providing them with the resources and scale they need to foster innovation.
We are seeking an AI infrastructure software engineer to join our team. You'll be instrumental in designing, building, and maintaining AI infrastructure that enables large-scale AI training and inferencing. The responsibilities include implementing software and systems engineering practices to ensure high efficiency and availability of AI systems.
As a senior DGX Cloud AI Infrastructure software engineer at NVIDIA, you will have the opportunity to work on innovative technologies that power the future of AI and data science, and be part of a dynamic and supportive team that values learning and growth. The role provides the autonomy to work on meaningful projects with the support and mentorship needed to succeed, and contributes to a culture of blameless postmortems, iterative improvement, and risk-taking. If you are seeking an exciting and rewarding career that makes a difference, we invite you to apply now!
What you’ll be doing:
• Develop infrastructure software and tools for large-scale AI, LLM, and GenAI infrastructure.
• Develop and optimize tools to improve infrastructure efficiency and resiliency.
• Root-cause, analyze, and triage failures from the application level to the hardware level.
• Enhance the infrastructure and products underpinning NVIDIA's AI platforms.
• Co-design and implement APIs for integration with NVIDIA's resiliency stacks.
• Define meaningful and actionable reliability metrics to track and improve system and service reliability.
• Apply skills in problem-solving, root cause analysis, and optimization.
NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We are hiring a Sr. Software Engineer who will help build simulators for our DGX Server platforms. Simulations play a significant role in building scalable systems at Speed of Light! You will work with world-class engineering teams across HW and SW.
What you’ll be doing:
• Contribute to architecting and developing the simulation platform for next-gen NVIDIA DGX platforms.
• Build, integrate, and enhance simulator components with new HW features, and write supporting technical documents.
• Bring the full SW stack up on the DGX Simulator; work closely with globally distributed hardware modeling, kernel, and platform driver teams.
• Improve performance, fix bugs across the user and kernel stack, and automate the execution flow.
Role
1. Collaborate with multiple third-party service providers to manage mobility cases across their full lifecycle.
2. Serve as the main point of contact for employees and provide excellent service, including conducting briefing calls and responding to internal instant messages, emails, and phone calls in a timely manner.
3. Assist in developing communications and materials related to Global Mobility matters.
4. Monitor regular reports and ensure billing accuracy.
5. Perform other job duties as assigned by your supervisor.
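About Level Infinite
Level Infinite is Tencent's international games brand, dedicated to bringing fun, authentic gaming experiences to players around the world and letting them enter game worlds anytime, anywhere. It also promotes sharing and exchange by building an inclusive, connected, and accessible player community. Level Infinite provides partner studios with a range of support and services to help them unlock the potential of their products. Titles published by Level Infinite include hit games such as PUBG MOBILE, Goddess of Victory: NIKKE, and Honor of Kings. Together with partner studios, it has also launched titles including Warhammer 40,000: Darktide from Fatshark, Dune: Awakening from Funcom, and Nightingale from Inflexion Studios. To learn more about Level Infinite, visit www.levelinfinite.com and follow our official accounts on Twitter, Facebook, Instagram, and YouTube.
Job Responsibilities
1. In line with the publishing strategy, assist the business lead in formulating business development strategy, and provide strategic support on industry and genre product development.
2. Manage budgets and resources for the department's products, including setting resource allocation standards, establishing budget management mechanisms for each sub-scenario, managing budget usage and performance data, and optimizing resource efficiency.
3. Coordinate and organize internal and external resources to ensure that project stakeholders work together toward aligned goals.
Work Location: China-Shenzhen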
This senior legal counsel is a member of the M&A legal team and will work and advise on the company’s international and domestic mergers, acquisitions, dispositions, joint ventures, investments, and similar transactions. Responsibilities include:
1. Drafting, reviewing, and negotiating transaction documents, such as equity share and asset subscription/purchase agreements, partnership/LLC/shareholders agreements, investment agreements, confidentiality agreements, side letters, engagement letters, letters of intent, bid letters, transaction term sheets, etc.
2. Working closely and coordinating with the company’s M&A investment team, financial and tax teams, and divisional business teams on the review, analysis, implementation, and execution of transactions.
3. Managing and coordinating due diligence reviews and reviewing relevant due diligence reports.
4. Retaining, managing, and liaising with outside counsel and overseeing the work product provided by outside counsel.
5. Providing legal support for post-investment company management.
6. Contributing to various internal knowledge management initiatives, including a precedent document database, template agreements, presentations, and training initiatives.