
Hello (哈啰) Java Intern (Risk-Control Feature Development Intern)
Internship | Part-time | Technology | Location: Wuhan | Status: Hiring
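Requirements
1. Currently enrolled in a master's program in computer science and technology, software engineering, data science, or another computer-related major.
2. Solid foundation in computer science theory; familiar with Java and Python and with data-processing libraries such as Pandas and NumPy. Experience processing financial data is a plus.
3. Understanding of basic financial risk-control concepts (such as risk-control features, credit scoring, and anti-fraud principles). Participation in finance-related projects or courses is a plus.
4. Familiar with SQL and able to query databases and extract data proficiently. Knowledge of big-data tools such as Hive and Spark is a plus.
5. Strong logical analysis skills and data sensitivity; able to quickly understand risk-control feature requirements and turn them into code logic.
6. Good communication skills and teamwork; able to integrate into the team and move work forward efficiently.
7. Able to commit to at least 4 days per week for an internship of at least 3 months. Candidates available for a long-term internship are preferred.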
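Responsibilities
1. In financial credit business scenarios (such as credit approval and anti-fraud), assist the team's engineers in mining, extracting, and processing risk-control features across multiple data dimensions, including basic user information, behavioral data, and transaction records.
2. Take part in implementing the code for financial risk-control feature engineering, including feature extraction, cleaning, transformation, and selection, and ensure that feature processing meets the rigor required of financial data.
3. Help validate the effectiveness of developed features: analyze, against financial risk metrics (such as overdue rate and bad-debt rate), how much each feature improves the risk-control models, and propose optimizations.
4. Contribute to the documentation of risk-control features, specifying each feature's meaning in the credit business, its computation logic, and its application scenarios, to support model development and business decisions.
5. Work with the team on feature adaptation when integrating third-party credit data, ensuring that features can be applied in the actual credit business workflow.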
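For candidates wondering what validating a feature against the overdue rate might look like in practice, here is a minimal sketch in plain Python. It is not taken from the team's codebase: the field names `logins_30d` and `is_overdue`, the bucket edges, and the sample records are all hypothetical, chosen only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def bucket_label(value, edges):
    """Return a readable label for the bucket that `value` falls into.

    `edges` must be sorted ascending; buckets are left-closed, right-open.
    """
    i = bisect_right(edges, value)  # number of edges <= value
    if i == 0:
        return f"<{edges[0]}"
    if i == len(edges):
        return f">={edges[-1]}"
    return f"[{edges[i - 1]}, {edges[i]})"


def overdue_rate_by_bucket(records, feature, edges, label="is_overdue"):
    """Group records by feature bucket and compute the overdue rate per bucket."""
    totals = defaultdict(int)
    overdue = defaultdict(int)
    for row in records:
        key = bucket_label(row[feature], edges)
        totals[key] += 1
        overdue[key] += row[label]
    return {key: overdue[key] / totals[key] for key in totals}


# Hypothetical sample: logins in the last 30 days and an overdue flag (1 = overdue).
records = [
    {"logins_30d": 2, "is_overdue": 1},
    {"logins_30d": 3, "is_overdue": 1},
    {"logins_30d": 4, "is_overdue": 0},
    {"logins_30d": 8, "is_overdue": 0},
    {"logins_30d": 12, "is_overdue": 1},
    {"logins_30d": 15, "is_overdue": 0},
    {"logins_30d": 25, "is_overdue": 0},
    {"logins_30d": 40, "is_overdue": 0},
]

rates = overdue_rate_by_bucket(records, "logins_30d", [5, 20])
for bucket, rate in rates.items():
    print(f"{bucket}: {rate:.2%}")
```

If the overdue rate changes steadily from bucket to bucket, as it does in this toy sample, the feature may carry useful signal. A real validation would likely use far more data and standard metrics, which are outside the scope of this sketch.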
Includes English-language materials
Data Science
https://roadmap.sh/ai-data-scientist
Step by step roadmap guide to becoming an AI and Data Scientist
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Pandas
[English] 10 minutes to pandas
https://pandas.pydata.org/docs/user_guide/10min.html
This is a short introduction to pandas, geared mainly for new users.
[English] Cookbook - pandas
https://pandas.pydata.org/docs/user_guide/cookbook.html#cookbook
This is a repository for short and sweet examples and links for useful pandas recipes.
https://www.kaggle.com/learn/pandas
Solve short hands-on challenges to perfect your data manipulation skills.
https://www.youtube.com/watch?v=2uvysYbKdjM
I'm super excited for this one. We're doing another complete Python Pandas tutorial walkthrough.
https://www.youtube.com/watch?v=Mdq1WWSdUtw
Filtering, Joins, Indexing, Data Cleaning, Visualizations
NumPy
https://numpy.org/doc/stable/user/absolute_beginners.html
NumPy (Numerical Python) is an open source Python library that’s widely used in science and engineering.
[English] NumPy - Learn
https://numpy.org/learn/
Below is a curated collection of educational resources, both for self-learning and teaching others, developed by NumPy contributors and vetted by the community.
https://www.kaggle.com/code/themlphdstudent/learn-numpy-numpy-50-exercises-and-solution
This kernel uses exercises of NumPy from the Machine Learning Plus webpage
https://www.youtube.com/watch?v=KHoEbRH46Zk
If you've heard of Pandas and NumPy, you may think one is simply a superset of the other.
https://www.youtube.com/watch?v=QUT1VHiLmmI
Learn the basics of the NumPy library in this tutorial for beginners.
https://www.youtube.com/watch?v=VXU4LSAQDSc
This video serves as an introduction to the NumPy Python library.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
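Related positions
Internship D11903
1. Develop Kuaishou's commercialization risk-control system: support rapid iteration on product requirements through agile development, continuously improve the system architecture, support business growth, and keep the service stable.
2. Take part in developing the core modules of the commercialization risk-control platform, including the workflow engine, feature center, integrated model training and inference platform, and disposal center.
3. Analyze the shortcomings of the existing systems, identify current bottlenecks, and improve system performance.
4. Help solve the technical problems and challenges that come with high concurrency, high availability, and high performance.
Updated 2025-06-24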