小鹏汽车计算平台高级软件工程师
社招全职3年以上地点:广州状态:招聘
任职要求
1. 计算机 / 软件工程硕士或同等经验,3年及以上大规模数据处理经验;有大规模模型训练与推理场景支持经验者优先; 2. 精通 Python,具备扎实软件工程基础,良好编程规范和代码质量意识; 3. 有以下至少一项实际项目经验;两项及以上者优先: a. 大规模数据加载机制(如 PyTorch DataLoader、NVIDIA DALI、TensorFlow Dataset、Hugging Face Datasets) b. Parquet/ORC 等列式存储格式及相关生态(如Petastorm),能设计高效的分区、压缩与向量化读取流程,优化批量数据访问性能。 c. Linux文件系统与网络I/O,能针对NFS、对象存储等场景进行性能调优;有云存储系统(如阿里云OSS、CPFS、火山引擎vePFS)相关经验。 4. 具备关系型数据库(MySQL/Po…
登录查看完整任职要求
微信扫码,1秒登录
工作职责
1. 负责小鹏汽车“扶摇”AI平台数据处理相关的软件开发工作,包括数据加载工具(XDataLoader)和数据集管理平台(XDataset),提供统一的数据加载、转换、缓存与预取能力;目标解决大规模数据加载过程中出现的性能瓶颈、数据一致性、系统稳定性等问题,服务AI大模型的训练和推理; 2. 开发并维护高性能 DataLoader SDK,支持自定义采样、并行读取、缓存预取与数据增强等功能,优化多线程/进程流水线,降低I/O与预处理延迟,简化算法团队接入并提升加载效率; 3. 搭建通用Dataset管理系统,实现多源异构数据(图片、视频、点云、传感器等)的统一接入、解析与格式化; 4. 协同算法团队及其他技术团队,深入理解业务需求,快速响应并落地实现。
包括英文材料
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
中文,免费,零起点,完整示例,基于最新的Python 3版本。
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
编程规范+
[英文] Google Style Guides
https://google.github.io/styleguide/
Every major open-source project has its own style guide: a set of conventions (sometimes arbitrary) about how to write code for that project. It is much easier to understand a large codebase when all the code in it is in a consistent style.
PyTorch+
https://datawhalechina.github.io/thorough-pytorch/
PyTorch是利用深度学习进行数据科学研究的重要工具,在灵活性、可读性和性能上都具备相当的优势,近年来已成为学术界实现深度学习算法最常用的框架。
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorFlow+
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
Parquet+
https://www.youtube.com/watch?v=KLFadWdomyI
Learn all about Apache Parquet, a column-based file format that's popular in the Hadoop/Spark ecosystem.
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
性能调优+
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
MySQL+
https://juejin.cn/post/7190306988939542585
这是一篇 MySQL 通关一篇过硬核经验学习路线,包括数据库相关知识,SQL语句的使用,数据库约束,设计等。
[英文] MySQL Tutorial
https://www.mysqltutorial.org/
your go-to resource for mastering MySQL in a fast, easy, and enjoyable way.
https://www.youtube.com/watch?v=5OdVJbNCSso
MySQL SQL tutorial for beginners
https://www.youtube.com/watch?v=7S_tz1z_5bA
This beginner-friendly course teaches you SQL from scratch.
PostgreSQL+
[英文] PostgreSQL Tutorial
https://neon.com/postgresql/tutorial
This PostgreSQL tutorial helps you quickly understand PostgreSQL.
[英文] PostgreSQL Tutorial
https://www.pgtutorial.com/
This PostgreSQL tutorial will teach you about PostgreSQL from beginner to advanced.
https://www.youtube.com/watch?v=qw--VYLpxG4
It is the most advanced open source database system widely used to build back-end systems.
https://www.youtube.com/watch?v=SpfIwlAYaKk
Learn PostgreSQL, one of the world's most advanced and robust open-source relational database systems.
NoSQL+
https://nosql-database.org/
Everything about NoSQL Systems – Types, Benefits, and Real-World Uses
https://piaosanlang.gitbooks.io/mongodb/content/section1.1.html
NoSQL(NoSQL = Not Only SQL ),即"不仅仅是SQL",指的是非关系型的数据库。是对不同于传统的关系型数据库管理系统的统称。
https://www.youtube.com/watch?v=0buKQHokLK8
NoSQL databases can operate in multiple modes: as key-value store, document store or wide column store.
Redis+
[英文] Developer Hub
https://redis.io/dev/
Get all the tutorials, learning paths, and more you need to start building—fast.
https://www.runoob.com/redis/redis-tutorial.html
REmote DIctionary Server(Redis) 是一个由 Salvatore Sanfilippo 写的 key-value 存储系统,是跨平台的非关系型数据库。
https://www.youtube.com/watch?v=jgpVdJB2sKQ
In this video I will be covering Redis in depth from how to install it, what commands you can use, all the way to how to use it in a real world project.
MongoDB+
https://learnxinyminutes.com/mongodb/
MongoDB is a NoSQL document database for high volume data storage.
https://studio3t.com/academy/#courses
The fastest way to learn MongoDB
https://www.youtube.com/watch?v=c2M-rlkkT5o
This video will give you and introduction to MongoDB in 1 Hour. Afterwards I recommend exploring aggregation, replication, and sharding.
https://www.youtube.com/watch?v=ExcRbA7fy_A&list=PL4cUxeGkcC9h77dJ-QJlwGlZlTd4ecZOA
You'll learn how to use MongoDB (a NoSQL database) from scratch. You'll also learn how to integrate it into a simple Node.js API.
缓存+
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
Apache+
https://www.apache.org/
The Apache® Software Foundation (ASF) provides software for the public good, guided by community over code.
Ray+
https://github.com/ray-project/ray
Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://www.youtube.com/watch?v=FhXfEXUUQp0
In this video, I'll teach you everything you need to know about Apache Ray!
https://www.youtube.com/watch?v=fMiAyj2kgac
Using powerful machine learning algorithms is easy using Ray.io and Python.
https://www.youtube.com/watch?v=q_aTbb7XeL4
Parallel and Distributed computing sounds scary until you try this fantastic Python library.
还有更多 •••
相关职位
社招5年以上A120749
1、负责Devops平台/运维平台的整体架构设计和技术选型,制定技术发展路线; 2、主导Devops工具链的建设和集成,包括但不限于CI/CD、配置管理、监控告警、日志分析等工具; 3、优化和改进现有运维流程,通过自动化等方式提高运维效率,降低运维成本; 4、负责平台的性能优化、安全加固和高可用性设计,保障平台的稳定运行,并编写和维护平台相关的技术文档和操作手册,提供技术支持和培训。
更新于 2025-01-08北京
社招5年以上技术类
1、负责金融平台实时交易链路与清结算相关业务的测试工作,包含但不限于收费计息、公司行动、期权期货、结单、税务等业务,涉及服务端、web端以及全流程测试2、参与需求评审,以专业测试视角对需求合理性进行评估,并提出建议和意见3、根据产品需求、技术方案文档,设计并执行高质量测试用例,保证对需求的全面覆盖4、运用先进测试工具和自动化方法,提高测试效率和项目质量5、持续优化测试流程,与开发、产品等跨团队协作,共同提升产品品质
更新于 2025-08-28深圳
社招3年以上后端开发
容器统一调度与在离线混部方向 岗位职责 1.负责公司容器调度平台的架构设计和核心功能开发,包括容器资源管理、调度优化、弹性伸缩等模块。 2.设计和实现在线与离线任务的混部调度方案,优化集群资源的整体利用率,实现计算、存储和网络资源的高效调度。 3.针对不同业务场景,研究并改进 Kubernetes 调度算法,包括任务优先级、抢占机制、节点选择等,提升集群的资源分配效率和稳定性。 4.与多集群管理平台、资源隔离、QoS 管理等模块协同工作,确保在复杂场景下的资源调度策略具备高可用性和可扩展性。 5.跟踪云原生生态的最新发展趋势,研究并应用新技术以提升系统性能和调度灵活性。 6.支持系统的性能监控与故障诊断,参与系统优化和技术问题的快速解决,保障系统的高效稳定运行。
更新于 2025-09-13上海|北京|杭州