AMD AI Framework Engineer

Experienced hire | Full-time | Engineering | Location: Shanghai | Status: Hiring

Qualifications


Skilled engineer with strong technical and analytical expertise in C++ development within Linux environments. The ideal candidate will thrive in both collaborative team settings and independent work, with the ability to define goals, manage development efforts, and deliver high-quality solutions. Strong problem-solving skills, a proactive approach, and a keen understanding of software engineering best practices are essential.

KEY RESPONSIBILITIES:
• Optimize Deep Learning Frameworks: Enhance and optimize frameworks like TensorFlow and PyTorch for AMD GPUs in open-source repositories.
• Develop GPU Kernels: Create and optimize GPU kernels to maximize performance for specific AI operations.
• Develop & Optimize Models: Design and optimize deep learning models specifically for A…

Responsibilities


THE ROLE:  As a core member of the team, you will play a pivotal role in optimizing and developing deep learning frameworks for AMD GPUs. Your experience will be critical in enhancing GPU kernels, deep learning models, and training/inference performance across multi-GPU and multi-node systems. You will engage with both internal GPU library teams and open-source maintainers to ensure seamless integration of optimizations, utilizing cutting-edge compiler technologies and advanced engineering principles to drive continuous improvement.
Related Positions

AMD
Experienced hire | Enginee

Position Overview
We are seeking a highly experienced engineer specializing in large language model (LLM) inference performance optimization. You will be a core member of our team, responsible for building and optimizing high-throughput, low-latency LLM inference on AMD Instinct GPUs. If you are passionate about pushing performance boundaries and have deep, hands-on expertise with cutting-edge technologies like vLLM or SGLang, we invite you to join us.

Key Responsibilities
1. Core System Optimization: Lead the development, tuning, and customization of LLM performance optimization on AMD GPUs, leveraging and extending frameworks like vLLM or SGLang to address performance bottlenecks in production environments.
2. Performance Analysis & Tuning: Conduct end-to-end performance profiling using specialized tools. Perform deep optimization of compute-bound operators (e.g., Attention), memory I/O, and communication to significantly increase throughput and reduce latency.
3. Model Architecture Adaptation: Demonstrate expertise in mainstream LLM architectures (e.g., DeepSeek, Qwen, Llama, ChatGLM) and optimize inference for their specific characteristics (e.g., RoPE, SWA, MoE, GQA).
4. Algorithm & Principle Application: Leverage your deep understanding of core algorithms (Transformer, Attention, MoE) to implement advanced optimization techniques such as PagedAttention, FlashAttention, continuous batching, quantization, and model compression.
5. Technology Foresight & Implementation: Research and prototype state-of-the-art optimization techniques (e.g., Speculative Decoding, Weight-Only Quantization) and drive their adoption into production systems.

Qualifications: Mandatory

Updated 2025-12-02, Shanghai
AMD
Experienced hire | Enginee

THE ROLE: AMD is looking for a world-class AI frameworks engineer who can provide technical leadership in the development of various AI frameworks in the AMD ecosystem. You will drive the technical direction of next-generation frameworks for AI model training and inference across a wide variety of current and future AMD devices, such as MI Instinct and Radeon GPUs; XDNA devices, including the recently released Ryzen AI, Alveo V70, and Versal ACAP; and datacenter CPUs such as EPYC. You will work to enhance AI framework capabilities to enable cutting-edge models on AMD’s cutting-edge hardware.

Updated 2025-08-21, Beijing
NVIDIA
Experienced hire

NVIDIA is now looking for LLM Train Framework Engineers for the Megatron Core team. Megatron Core is an open-source, scalable, cloud-native framework built for researchers and developers working on Large Language Model (LLM) and Multimodal (MM) foundation model pretraining and post-training. Our GenAI Frameworks provide end-to-end model training, including pretraining, alignment, customization, evaluation, deployment, and tooling to optimize performance and user experience. Build on Megatron Core Framework's capabilities by inventing advanced distributed training algorithms and model optimizations. Collaborate with partners to implement optimized solutions.

What you’ll be doing:
• Build and develop open source Megatron Core.
• Address extensive AI training and inference obstacles, covering the entire model lifecycle including orchestration, data pre-processing, conducting model training and tuning, and deploying models.
• Work at the intersection of AI applications, libraries, frameworks, and the entire software stack.
• Spearhead advancements in model architectures, distributed training strategies, and model parallel approaches.
• Enhance the pace of foundation model training and optimization through mixed precision formulas and advanced NVIDIA GPU structures.
• Tune and optimize the performance of deep learning frameworks and software components.
• Research, prototype, and develop robust and scalable AI tools and pipelines.

Updated 2025-10-13, Shanghai
Microsoft
Experienced hire | Technolo

• Drive technical sales with decision makers using demos and PoCs to influence solution design and enable production deployments.
• Lead hands-on engagements (hackathons, code-with sessions, and architecture workshops) to accelerate adoption of Microsoft’s developer tools and cloud platforms.
• Build trusted relationships with developers and platform leads, co-designing secure, scalable architectures and solutions.
• Resolve technical blockers and objections, collaborating with engineering to share insights and improve products.
• Maintain deep expertise in AI Foundry & App architecture (Agentic AI framework, Semantic Kernel, Foundry SDK, Responsible AI) and App architecture/cloud native dev (APIs, containerization, microservices, event-driven, Python, Java or .NET).
• Maintain and grow expertise in AI Management & Security (Gen AI Ops, Sentinel, orchestrator, monitoring).
• Represent Microsoft through thought leadership in developer communities and customer forums.

Updated 2025-09-26, Shenzhen