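SHEIN Senior/Staff Algorithm Engineer (Marketing Risk Control / Merchant Risk Control) - Nanjing/Shanghai
Experienced hire | Full-time | 3+ years of experience | Information technology | Location: Shanghai, Nanjing | Status: Hiring
Requirements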
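1. Bachelor's degree or above in computer science, data science, mathematics, statistics, artificial intelligence, or a related field.
2. Proficient in data analysis and statistical modeling; familiar with common machine learning algorithms (e.g., logistic regression, decision trees, XGBoost, LightGBM, neural networks).
3. Proficient in at least one programming language such as Python, Java, or Scala, and in common data analysis tools (Pandas, NumPy, SQL, etc.).
4. Able to process large-scale data, with a working knowledge of distributed computing frameworks (Hadoop, Spark, Flink, etc.).
5. Familiar with common risk-control techniques, including but not limited to anomaly detection and graph-based analysis.
6. Strong logical and analytical skills, with an excellent ability to abstract and solve problems and to find the best balance between business and technology.

Track 2: Merchant Risk Control Algorithms
Responsibilities:
Risk scenario modeling: Build and optimize machine learning / deep learning models for merchant-side risk scenarios in SHEIN's global business (e.g., multiple stores run by the same operator, fake shipments, buyer-seller collusion fraud, fake orders placed to inflate reputation, malicious registration, and organized fraud by black- and gray-market groups).
Feature engineering and data mining: Mine massive merchant data in depth (basic information, behavior trajectories, transaction chains, product images and text, etc.) and build multi-dimensional features to improve the accuracy and coverage of risk detection.
Graph algorithms and complex relationship mining: Apply graph neural networks (GNN), knowledge graphs, and related techniques to uncover complex association networks among merchants, and to identify and combat group cheating and associated risks effectively.
End-to-end algorithm delivery: Own the full pipeline from data research, model training, and offline evaluation through online deployment, and continuously monitor and iterate on models to keep online services stable and efficient.
Cross-team collaboration: Work with risk-control strategy, product…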
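Responsibilities
Track 1: Marketing Risk Control Algorithms
1. For marketing scenarios (coupons, subsidies, campaign rewards, games, etc.), design and implement real-time risk-control algorithms targeting abnormal behavior such as fake transactions, bulk registration, cash-out abuse, and organized groups.
2. Use machine learning, deep learning, and related data analysis techniques to monitor and analyze user behavior and transaction data in real time, identifying suspicious behavior and potential risks.
3. Work closely with business, product, and risk-control teams to define and refine the risk-control metrics system, continuously follow up on risk cases, and develop corresponding strategies.
4. Design and implement efficient, stable risk-control data pipelines covering data cleaning, feature engineering, model training, and online prediction deployment.
5. Track risk trends in the e-commerce industry and new technology developments, and update risk-control algorithms and strategies promptly to strengthen overall prevention and control.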
Includes English-language materials
Education
Data Science
https://roadmap.sh/ai-data-scientist
Step by step roadmap guide to becoming an AI and Data Scientist
Data Analysis
[English] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step by step guide to becoming a Data Analyst in 2025
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
XGBoost
[English] What is XGBoost?
https://www.ibm.com/think/topics/xgboost
XGBoost (eXtreme Gradient Boosting) is a distributed, open-source machine learning library that uses gradient boosted decision trees, a supervised learning boosting algorithm that makes use of gradient descent.
https://www.youtube.com/watch?v=BJXt-WdeJJo
Takes a deep dive into one of the most powerful machine learning algorithms, eXtreme Gradient Boosting, using a Jupyter notebook with Python.
LightGBM
https://lightgbm.readthedocs.io/en/stable/
LightGBM is a gradient boosting framework that uses tree based learning algorithms.
https://www.youtube.com/watch?v=tSZxOd1TWZc
In this video, we explore LightGBM, a machine learning algorithm developed by Microsoft that offers superior speed, efficiency, and accuracy.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese; free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala
Pandas
[English] 10 minutes to pandas
https://pandas.pydata.org/docs/user_guide/10min.html
This is a short introduction to pandas, geared mainly for new users.
[English] Cookbook - pandas
https://pandas.pydata.org/docs/user_guide/cookbook.html#cookbook
This is a repository for short and sweet examples and links for useful pandas recipes.
https://www.kaggle.com/learn/pandas
Solve short hands-on challenges to perfect your data manipulation skills.
https://www.youtube.com/watch?v=2uvysYbKdjM
I'm super excited for this one. We're doing another complete Python Pandas tutorial walkthrough.
https://www.youtube.com/watch?v=Mdq1WWSdUtw
Filtering, Joins, Indexing, Data Cleaning, Visualizations
NumPy
https://numpy.org/doc/stable/user/absolute_beginners.html
NumPy (Numerical Python) is an open source Python library that’s widely used in science and engineering.
[English] NumPy - Learn
https://numpy.org/learn/
Below is a curated collection of educational resources, both for self-learning and teaching others, developed by NumPy contributors and vetted by the community.
https://www.kaggle.com/code/themlphdstudent/learn-numpy-numpy-50-exercises-and-solution
This kernel uses exercises of NumPy from the Machine Learning Plus webpage
https://www.youtube.com/watch?v=KHoEbRH46Zk
If you've heard of Pandas and NumPy, you may think one is simply a superset of the other.
https://www.youtube.com/watch?v=QUT1VHiLmmI
Learn the basics of the NumPy library in this tutorial for beginners.
https://www.youtube.com/watch?v=VXU4LSAQDSc
This video serves as an introduction to the NumPy Python library.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
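Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial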
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.