
Horizon Robotics (D-Robotics) — Large Model Deployment Engineer
Experienced hire, full-time · Chip track · Location: Beijing | Nanjing · Status: Hiring
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Large Models (LLMs)
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
Caching
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
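The eviction policies named above can be made concrete with a small sketch. Below is a minimal LRU (Least Recently Used) cache built on Python's `collections.OrderedDict`; the class name, capacity, and keys are illustrative choices, not taken from any of the linked resources.

```python
from collections import OrderedDict


class LRUCache:
    """A minimal LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        # Mark as most recently used by moving the entry to the end.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            # Evict the least recently used entry (front of the ordered dict).
            self._data.popitem(last=False)
```

For memoizing pure function calls, Python's standard library already provides this policy as the `functools.lru_cache` decorator; the hand-rolled version above is only meant to show the mechanism.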
vLLM
https://www.newline.co/@zaoyang/ultimate-guide-to-vllm--aad8b65d
vLLM is a framework designed to make large language models faster, more efficient, and better suited for production environments.
https://www.youtube.com/watch?v=Ju2FrqIrdx0
vLLM is a cutting-edge serving engine designed for large language models (LLMs), offering unparalleled performance and efficiency for AI-driven applications.
TensorRT
https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/quick-start-guide.html
This TensorRT Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.
LMDeploy
https://lmdeploy.readthedocs.io/en/latest/get_started/get_started.html
This tutorial shows the usage of LMDeploy on CUDA platform.
llama.cpp
https://blog.steelph0enix.dev/posts/llama-cpp-guide/
No LLMs were harmed during creation of this post.
https://github.com/ggml-org/llama.cpp/discussions/15396
This is a detailed guide for running the new gpt-oss models locally with the best performance using llama.cpp.
https://www.youtube.com/watch?v=EPYsP-l6z2s
In this guide, you'll learn how to run local LLM models using llama.cpp.
Related Positions

Experienced hire · Software track
1. Develop and maintain the algorithm toolchain and reference algorithm solutions for the D-Robotics chip platform. 2. Lead technical engagement with key customers on the chip algorithm toolchain; take on customized development and issue support. 3. Track trends in robotics algorithms; own the tuning and deployment of new algorithms such as LLM/VLM on the D-Robotics chip platform. 4. Identify and understand customer needs; follow industry trends in competing products, including analysis and benchmarking.
Updated 2025-02-10

Experienced hire · Algorithm track
1. Participate in deploying state-of-the-art algorithms: vision model export, quantization, engineering deployment and optimization, and sample-application development. 2. Participate in algorithm solution delivery: build algorithm pipelines, iterate on and test solutions, and support delivery work. 3. Explore the limitless possibilities of robots by applying various algorithms and open-source projects to them, exploring embodied intelligence.
Updated 2025-09-29

Experienced hire · Algorithm track
Responsible for developing and applying on-device large models in the robotics domain, and assessing future large-model trends to provide input for subsequent chip NPU planning. Main directions: 1. Explore the performance and accuracy limits of LLM, VLM, and VLA large models on-device. 2. Track and assess large-model trends to provide input for subsequent chip NPU planning.
Updated 2025-08-18