ByteDance | Large Model Application Algorithm Research Engineer - International Content Safety Algorithm Research - Jindouyun (筋斗云) Talent Program
Qualifications
1. A doctoral degree in Computer Science, Electronics, or a related field.
2. Extensive experience in ML/CV/NLP/Recommendation Systems, including but not limited to:
   a. Participation in competitions or industry projects in ML, Data Mining, CV, NLP, or Multimodal.
   b. Publications at conferences in ML, data mining, AI, or large models (e.g., KDD, WWW, NIPS, ICML, CVPR, ACL, AAAI).
   c. Plus points:
      1) Research experience or innovation in large models or RL.
      2) Strong hands-on skills, with contributions to large model projects in the open-source community.
      3) Practical experience in deploying large models in real-world busines…
Responsibilities
Team Introduction: TikTok Content Security Algorithm Research Team

The International Content Safety Algorithm Research Team is dedicated to maintaining a safe and trustworthy environment for users of ByteDance's international products. We develop and iterate on machine learning models and information systems to identify risks earlier, respond to incidents faster, and monitor potential threats more effectively. The team also leads the development of foundation large models for our products. In the R&D process, we tackle key challenges such as data compliance, model reasoning capability, and multilingual performance optimization. Our goal is to build secure, compliant, and high-performance foundation models that support business scenarios across the platform, including content moderation, search, and recommendation.

Research Project Background:

In recent years, Large Language Models (LLMs) have achieved remarkable progress across natural language processing (NLP) and artificial intelligence. These models have demonstrated impressive capabilities in tasks such as language generation, question answering, and text translation. However, their reasoning ability still has substantial room for improvement. Current approaches to enhancing reasoning often rely on large amounts of Supervised Fine-Tuning (SFT) data, but high-quality SFT data is expensive to acquire, which is a significant barrier to scalable model development and deployment.

To improve reasoning, OpenAI's o1 series of models has made progress by increasing the length of the Chain-of-Thought (CoT) reasoning process. While this technique has proven effective, how to scale it efficiently at test time remains an open question. Recent research has explored alternative methods such as the Process-based Reward Model (PRM), Reinforcement Learning (RL), and Monte Carlo Tree Search (MCTS) to improve reasoning. However, these approaches still fall short of the general reasoning performance achieved by OpenAI's o1 series. Notably, the recent DeepSeek R1 paper suggests that pure RL methods can enable LLMs to develop reasoning skills autonomously, without relying on expensive SFT data. Together, these works reveal the substantial potential of RL for advancing LLM capabilities.

Research Challenges:

1. Reward model design: In reinforcement learning, designing a suitable reward model is critical. The reward model must accurately reflect the quality of the reasoning process and guide the model to improve its reasoning step by step. This requires precise evaluation criteria for each task, and the reward model must also be able to adjust dynamically during training to keep pace with changes and improvements in model performance.
2. Stable training: Without high-quality SFT data, keeping RL training stable is a major challenge. RL typically involves extensive exploration and trial and error, which can destabilize training or even degrade model performance. Robust training methods are needed to ensure stability and effectiveness throughout training.
3. Extending from math and code tasks to natural language tasks: Existing RL methods for strengthening reasoning are mainly applied to math and code, where CoT data is relatively abundant. Natural language tasks are more open-ended and complex. Transferring RL strategies that succeed on these relatively simple tasks to natural language processing tasks requires in-depth research and innovation in both data processing and RL methods, with the aim of general reasoning ability across tasks.
4. Improving inference efficiency: Raising inference efficiency while preserving reasoning performance is also an important challenge, because efficiency directly affects how usable and economical a model is in real applications. One option is knowledge distillation, which transfers the knowledge of a complex model to a smaller one to reduce compute consumption. Another potential approach is to use Long Chain-of-Thought (Long-CoT) techniques to improve Short Chain-of-Thought (Short-CoT) models, raising inference speed while maintaining reasoning quality.
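1. Conduct technical research on large language models (LLMs), including but not limited to algorithm development, data strategy and data synthesis, and infrastructure strategy optimization for Pretrain, SFT, RL, and related techniques, as well as exploration and innovation in the underlying foundational technologies.
2. Develop the family of large models, including base Pretrain models, Instruct models, and reasoning models.
3. Continuously follow and investigate frontier large-model techniques and open-source solutions, track the latest industry progress in language models, advance related research, and build industry influence.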
1. Marketing Planning & Execution: Responsible for the strategy, creative planning, and execution of on-platform marketing activities and major promotions, including but not limited to holiday campaigns. Design on-platform and off-platform creative activity plans based on target user groups, marketing themes, and market demand to improve new-customer acquisition and existing-customer retention and to drive platform transaction growth.
2. Cross-Department Coordination & Collaboration: Work closely with Marketing, Product, and other teams. Organize cross-department meetings to drive the progress of activities and ensure key milestones are met on time.
3. Data Analysis & Optimization: Track campaign effectiveness and analyze metrics such as traffic, conversion, and subsidies. Optimize marketing strategies promptly and propose improvement plans based on the data so that campaigns keep improving platform results.
4. Campaign Communication: Design communication strategies and timing, and promote campaign content through on-platform channels and external media. Ensure that marketing materials align with the campaign theme and stimulate user interest and participation.
We aim to leverage AI and other leading technologies, and we are dedicated to providing safe and reliable risk control capabilities behind payments. The core technologies include rule engines, model engines, and intelligent algorithm models. We are a leading platform capable of highly concurrent real-time risk computation and massive-scale data analysis and processing. As the core risk management technology platform for the global payment business, we adopt a multi-center deployment architecture around the world. Here you may have the opportunity to learn about and participate in the design and development of the following:
1. Extreme computing optimization at the millisecond level.
2. Behavior analysis and risk mining over massive data.
3. Global multi-center system architecture planning and high-availability solution design.
4. Design and R&D of risk control systems and big data platforms.
You will also have the opportunity to explore the architectural design and implementation of cutting-edge technologies, such as privacy computing and large models, in risk control systems.
1. Plan and design search and recommendation traffic strategies to improve the user experience and conversion of search and recommendation.
2. Analyze user behavior data, uncover user needs, and optimize traffic allocation strategies to increase user stickiness and activity.
3. Collaborate closely with the technical, operations, and data teams to drive strategy implementation and iteration, ensuring product goals are met.
4. Monitor competitors, research industry trends, and continuously optimize product features to make search and recommendation more competitive.
5. Coordinate and communicate across departments to keep projects on track, and continuously monitor and optimize product performance.