
XPeng Motors: GPU Tools Senior/Staff/Expert Engineer

Experienced hire, full-time, 1+ years of experience. Location: Shanghai. Status: Hiring.

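Qualifications

1. At least one year of GPU tool development experience, with a solid foundation in C/C++ programming.
2. Familiarity with the design principles and implementation of at least one CUDA tool.
3. Familiarity with the UI and features of CUDA tools.
4. Proactive work attitude, with strong analytical and problem-solving skills.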

Responsibilities

1. Design and develop GPGPU CUDA tools, such as gpusmi, Nsight, trace, and debug tools.
2. Develop tool UIs.
3. Support interaction with the driver.
Includes English materials
C
C++
CUDA
Related positions

SenseTime
Experienced hire: Backend Development

1. Architect and implement agentic workflows that plan, reason, call tools/APIs, and coordinate with humans or other agents.
2. Select, extend, or build frameworks (e.g., LangChain, AutoGen, CrewAI, MetaGPT, LangGraph) to accelerate delivery while avoiding vendor lock-in.
3. Own the MLOps lifecycle: data collection, evaluation harnesses, safety filters, CI/CD, and observability for deployed agents.
4. Integrate enterprise systems & data sources (REST/GraphQL, Kafka, vector databases, Kubernetes) so agents can act on real business objects.
5. Mentor and review code for junior engineers; drive best practices in prompt engineering, evaluation, and secure coding.
6. Research emerging techniques (toolformer, self-reflection, role specialization) and translate findings into the product roadmap.

Updated 2025-07-23
Microsoft
Experienced hire: Software

• Lead software development in C/C++, Python, and GPU languages such as CUDA, ROCm, or Triton.
• Analyze metrics and identify opportunities based on offline and online testing; develop and deliver robust and scalable solutions.
• Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-of-class inference and optimal cost.
• Engage with key partners to understand and implement inference and training optimization for state-of-the-art LLMs and other models.

Updated 2025-09-23
NVIDIA
Experienced hire

• Architect Performance Tooling: Develop infrastructure tools/libraries for GPU performance analysis, visualization, and automated workflows used across the GPU SW/HW development life cycle.
• Unlock Architectural Insights: Analyze GPU workloads to identify bottlenecks and define new hardware profiling features that enhance perf debug and profiling capabilities.
• AI-Powered Automation: Build AI/ML-driven tools to automate performance analysis, generate perf optimization guidance, and improve the user experience of the profiling infrastructure.
• Cross-Stack Collaboration: Partner with kernel developers, system software teams, and hardware architects to support performance studies, improve the CUDA software stack, and co-design performance-centric solutions for current and next-generation GPU architectures.

Updated 2025-09-18
NVIDIA
Experienced hire

We are seeking a skilled developer to build production-grade AI applications, focusing on LLM-based agents and tool-using systems. You will integrate large language models (LLMs), retrieval-augmented generation (RAG), and external tools/APIs on GPU-accelerated stacks, enhancing agent frameworks for reliability, scalability, and safety.

What You’ll Be Doing:
• Design, implement, and deploy AI-powered features using LLMs, including autonomous and multi-agent workflows.
• Build agent toolchains, including planning, tool/function calling, memory management, RAG integration, and enterprise API connectivity.
• Enhance agent frameworks with custom planners, routers, concurrency control, state management, and retry mechanisms.
• Develop evaluation and observability systems to monitor agent performance (success rates, tool-call accuracy, latency, cost, traces).
• Implement safety and compliance measures, including content filtering, PII handling, and policy enforcement using guardrail frameworks.
• Optimize inference pipelines for GPU performance, latency, and cost; deploy via microservices and APIs.
• Manage CI/CD, containerization, and deployment; maintain monitoring, logging, and alerting; and produce clear documentation.

Updated 2025-10-10