ByteDance | Content Understanding Multimodal Large Model Algorithm Engineer, E-commerce, Jindouyun (筋斗云) Talent Program

Campus recruitment | Full-time | A19978A | 2025-05-26 | Location: Singapore | Status: Hiring

Qualifications

1. Hold a doctoral degree, preferably with a background in artificial intelligence, computer science, or mathematics.
2. Possess solid programming skills, a strong foundation in data structures and algorithms, and proficiency in using various algorithmic and engineering frameworks.
3. Prior publications in international conferences or journals (including but not limited to ACL, EMNLP, NeurIPS, ICML, ICLR, CVPR) are preferred.
4. Strong foundation in machine learning, with in-depth under…

Responsibilities

Team Introduction:
The Platform Governance Algorithm Team conducts comprehensive quality and ecosystem governance for ByteDance's e-commerce products through algorithm optimization and collaboration with business teams. This work covers both combating risks, violations, and low-quality content and building and optimizing a healthy e-commerce ecosystem. The team aims to maximize the effectiveness of platform governance while improving operational efficiency and reducing costs. The team is also dedicated to advancing cutting-edge AI technologies and using technical innovation to drive business transformation and development, across fields including but not limited to NLP, CV, multimodal models, large models, graph algorithms, and sequence algorithms.

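Project Objectives and Necessity:
Intelligent review for e-commerce is a complex business. As review technology evolves, each domain faces new risks and new adversarial tactics, which pose new challenges for applying large models. For example, existing open-source large models often perform poorly on e-commerce review problems involving review PBR changes, long text, long time sequences, multiple languages, few samples, and adversarial AIGC content. We therefore urgently need to develop large models built specifically for e-commerce intelligent review, to improve their effectiveness and adaptability in e-commerce governance. In particular, given the characteristics of the e-commerce business, we need to explore automatic generation of high-quality data, efficient MoE embeddings, auto-prompt generation, high-quality CoT output, and large-model knowledge distillation. The model should also meet the needs of the e-commerce review business, achieving high-accuracy autonomous decisions and interpretable CoT generation while significantly reducing misjudgments. To handle dynamically changing review PBR, it should use a RAG module to automatically retrieve similar review cases, decompose complex review PBR into simple atomic tasks, automatically separate these into rejection and exemption atomic tasks, and automatically call the appropriate tools to solve them, thereby building an industry-leading intelligent review system that "knows to reject and knows why it rejects". Ultimately, the review quality of the large-model system needs to approach or exceed that of human review, evolving toward fully automated review.
Project Content:
A multimodal large model for e-commerce intelligent review. The main research points include but are not limited to:
1. Modality fusion: improve fine-grained understanding across text, audio, image, video, and livestream modalities, achieving high-accuracy autonomous decisions and interpretable CoT generation;
2. Few-shot capability: explore multilingual, long-time-sequence, and few-sample problems in e-commerce; strengthen few-shot and zero-shot capabilities; and support complex instructions and auto-prompt generation for frequently changing business rules;
3. Adversarial capability: study the detection of AIGC images and videos, and strengthen the review model's attack-and-defense capability against subtle and abstract generated content;
4. Agent capability: call the RAG module, use tools, and perform auto-planning; improve the large model's dynamic reasoning and reflection capabilities.
Research directions involved: large models, multimodal large models, few-shot learning, AIGC detection, AIGC data generation, reinforcement learning, and agents.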

Related Positions

Recommendation Large Model Algorithm Engineer, TikTok Algorithm, Jindouyun (筋斗云) Talent Program

Campus recruitment | A177421

Team Introduction: TikTok is a global short-video platform available in 150 countries and regions. Our mission is to inspire creativity and bring joy by helping users discover real and interesting moments that make life better. TikTok's global headquarters are in Los Angeles and Singapore, and we also have offices in New York City, London, Dublin, Paris, Berlin, Dubai, Jakarta, Seoul, and Tokyo.
TikTok Research & Development (R&D) Team: The TikTok R&D team is dedicated to building and maintaining industry-leading products that drive the success of TikTok's global business. By joining us, you'll work on core scenarios such as user growth, social features, live streaming, the consumer side of e-commerce, content creation, and content consumption, helping our products scale rapidly across global markets. You'll also face deep technical challenges in areas like service architecture and infrastructure engineering, ensuring our systems operate with high quality, efficiency, and security. Our team also provides comprehensive technical solutions for diverse business needs, continuously optimizing product metrics and improving user experience. Here, you'll collaborate with leading experts in exploring cutting-edge technologies and pushing the boundaries of what's possible. Every line of your code will serve hundreds of millions of users. Our team is professional and goal-oriented, with an egalitarian and easy-going collaborative environment.
Research Project Introduction: With advances in hardware compute and the continuous breakthroughs of large models in CV, NLP, multimodal learning, and even AGI, recommendation models driven by large-scale compute are increasingly able to capture user preferences comprehensively and in depth. This enables a deeper understanding of user needs and the discovery of latent interests, ultimately leading to a better user experience. As a critical component of the short-video recommendation system, the ranking module is responsible for fine-grained matching between users and videos, selecting the videos users are most likely to be interested in. The key research focus is therefore how to find the right path to maximize a model's memorization, generalization, and reasoning capabilities under large-scale compute.

Updated 2025-05-26 | Singapore

Siemens China Research Institute (西门子中国研究院) | Large Model AI Agent Researcher (Beijing, Suzhou)

Experienced hire | 1-3 years | R&D

We empower our people to stay curious and innovative in a fast-evolving world. We're looking for individuals who are eager to push boundaries, learn continuously, and create meaningful impact both now and in the future. Does that sound like you? Then we'd love to have you join our dynamic and diverse global team.
DAI AIX (AI Acceleration and Exploration) is at the forefront of Data Analytics and AI research within Siemens' global technology network, driving innovation, collaboration, and transformative applications for our customers. As part of our team, you'll engage in cutting-edge applied research and development.
We are currently seeking an NLP/LLM/Agent Engineer/Researcher to work on the development and deployment of next-generation language-related applications and intelligent agents. The focus of this role is advancing the capabilities of large language models (LLMs) and their integration into real-world applications such as autonomous agents and industrial workflows. You will design and implement advanced algorithms, optimize LLM architectures for specific use cases, and develop scalable solutions that drive tangible outcomes in industry.
You'll make an impact by
• Researching state-of-the-art data analytics and AI technologies across a broad range of topics.
• Focusing mainly on modern foundation model applications in industrial scenarios:
  1. Context engineering for foundation models
  2. Development of agent systems for industrial applications
  3. Task-specific model fine-tuning
• Working in part on multimodal applications.
• Participating in both internal and external research projects.
• Assisting with customer development and deployment projects.

Updated 2025-10-23 | Shanghai, Suzhou, Beijing

Large Model Algorithm Researcher (Multimodal & Code AI), TikTok AI Innovation Center, Jindouyun (筋斗云) Talent Program

Campus recruitment | A118205

Team Introduction: The TikTok AI Innovation Center is a department focused on building AI infrastructure and driving cutting-edge research in AI. We explore industry-leading AI technologies, including large language models (LLMs) and multimodal large models, with the goal of developing models that can understand multilingual content and vast amounts of video data, ultimately delivering a better content consumption experience for users. In the Code AI domain, we leverage the powerful code understanding and reasoning capabilities of LLMs to enhance program performance and R&D efficiency.
Project Introduction: Multimodal foundation large models (VLMs) are a research hotspot in the industry and a critical technology for applications in TikTok's business scenarios. In 2024, the TikTok AI Innovation Center developed VFM V1, a multimodal large model tailored to TikTok's business scenarios. It matches the performance of the best open-source model, Qwen VL, on public test sets, while significantly outperforming all other foundation models on TikTok's business test sets. In the future, we aim to keep developing foundation models with efficient perception and reasoning capabilities that can understand multilingual content and massive volumes of video, to deliver a better content consumption experience for users.
Project Challenges:
1. Enhance the multimodal perception encoder. The current encoder uses a fixed frame rate; we need to explore more efficient adaptive frame rates while also considering the integration of modalities such as audio and user behavior.
2. Fuse multimodal perception and thinking capabilities to give the model stronger overall perception and cognition.

Updated 2025-05-26 | Singapore

Structured Data Fusion Large Model Researcher, Risk Control, Jindouyun (筋斗云) Talent Program

Campus recruitment | A40464A

Team Introduction: The Risk Control R&D Team is dedicated to addressing the challenges posed by malicious activities across ByteDance's products, including Douyin and Toutiao. Its work spans multiple domains of risk governance, such as content, transactions, traffic, and accounts. By leveraging technologies such as machine learning, multimodal models, and large models, the team works to understand user behaviors and content and thereby identify potential risks and issues. By continuously deepening its understanding of the business and of user behaviors, the team drives innovation in models and algorithms with the aim of building an industry-leading risk control algorithm system.
Project Objectives: Optimize and enhance large models' ability to understand and reason about structured data (sequential data, graph data) based on risk control data.
Project Necessity: Data in risk control scenarios is primarily structured, while large models have mainly improved their understanding of text and images. Integrating the structured (non-text, non-image) data of risk control scenarios with large models, so that they can better comprehend structured data, remains an industry-wide challenge. This involves three key difficulties:
1. How to effectively align structured information with the NLP semantic space, allowing models to understand both data structure and semantic information at the same time.
2. How to use appropriate instructions to enable large models to interpret the structural information in structured data.
3. How to endow large language models with step-by-step reasoning capabilities for graph learning downstream tasks, so that they can infer more complex relationships and attributes.
Project Content: Current industry explorations of structured data include:
1. Graph data understanding (e.g., GraphGPT: enabling large models to read graph data, SIGIR 2024).
2. Graph data RAG (e.g., Microsoft GraphRAG: Unlocking LLM discovery on narrative private data).
3. Sequential data understanding (e.g., StructGPT: a large model reasoning framework for structured data, EMNLP 2023).
However, current efforts mainly focus on understanding a single type of structured data, and several challenges remain in risk control scenarios:
1. How to effectively fuse and understand various types of structured data, especially the integration of graph and sequential data.
2. How to address the difficulties listed under "Project Necessity" above.
3. How to improve step-by-step reasoning for downstream tasks, which is currently underexplored, especially reasoning over sequential data.
Research Directions:
1. Structured data understanding for large models
2. Structured data RAG for large models
3. Chain-of-thought reasoning for large models

Updated 2025-05-26 | Singapore