NVIDIA Senior Application Engineer - Big Data
Requirements
• BS, MS, or PhD in Computer Science, Computer Engineering, or a closely related field
• 12+ years of work or research experience in software development
• Excellent programming skills for manipulating data frames in Python, Scala, Java, or SQL
• Strong problem-solving skills coupled with customer-facing communication skills
• Knowledge of the open source big data ecosystem (Apache Hadoop, Spark, Hive, Presto, Airflow, Kafka, etc.)
• Able to work successfully with multi-functional teams across organizational boundaries…
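The data-frame manipulation skills listed above might be exercised as in this minimal sketch, using pandas; the table, column names, and values are hypothetical:

```python
import pandas as pd

# Hypothetical event log: one row per user action.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "event":   ["click", "buy", "click", "click"],
    "value":   [0.0, 19.99, 0.0, 0.0],
})

# Filter to purchases, then aggregate spend per user.
spend = (
    events[events["event"] == "buy"]
    .groupby("user_id", as_index=False)["value"]
    .sum()
    .rename(columns={"value": "total_spend"})
)
print(spend)
```

The same filter-group-aggregate pattern carries over directly to Spark DataFrames or SQL.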
Responsibilities
• Serve as a lead application architect for the RAPIDS Accelerator for Apache Spark.
• Define reference architectures of accelerated Apache Spark applications for major industry verticals.
• Lead the technical engagement with select customers and partners to accelerate Apache Spark applications with GPUs.
• Work closely with NVIDIA Spark engineering teams on architecture design and system implementation.
• Partner with Solution Architects to understand customers’ existing big data and ML/DL solution architectures.
• Conduct regular technical customer meetings for project/product roadmaps, feature discussions, customer issue resolution, and performance tuning.
• Build and work on PoCs for solutions that address customers’ critical business needs.
• Develop applications to promote best practices for accelerated data analytics and machine/deep learning in various industry verticals.
• Build tools to analyze data processing workloads to find opportunities for acceleration and cost savings.
• Work with major cloud service providers and Apache Spark vendors globally.
• Engage open source communities, including Apache Spark and RAPIDS, in technical discussions and contributions.
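Enabling the RAPIDS Accelerator on an existing Spark job is largely a configuration exercise. A minimal `spark-submit` sketch is shown below; the jar version, GPU resource amounts, and application file are illustrative assumptions, not a definitive setup:

```shell
# Sketch: attach the RAPIDS Accelerator plugin to a Spark job.
# Jar version and resource sizing below are hypothetical examples.
spark-submit \
  --jars rapids-4-spark_2.12-24.04.0.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=0.25 \
  my_etl_job.py
```

Supported SQL operators then run on the GPU transparently, while unsupported ones fall back to the CPU.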
• Drive reliability engineering initiatives, including infrastructure automation, service monitoring, incident response, and capacity planning.
• Lead and participate in technical design discussions across cross-functional teams.
• Collaborate with application teams to define and enforce architectural best practices, CI/CD standards, and cloud-native patterns.
• Diagnose complex production issues through in-depth troubleshooting and implement resilient solutions to prevent recurrence.
• Contribute to the development of internal tools that improve observability, system health, and operational transparency.
• Analyze and optimize existing systems, providing enhancements and ongoing support as needed.
• Stay current with new technologies and proactively recommend improvements to existing cloud architectures and processes.
• Develop and maintain server-side logic, data processing, and application workflows.
• Mentor junior engineers and promote a culture of knowledge sharing and continuous improvement.
• Design and implement end-to-end data pipelines (ETL) to ensure efficient data collection, cleansing, transformation, and storage, supporting both real-time and offline analytics needs.
• Develop automated data monitoring tools and interactive dashboards to enhance business teams’ insight into core metrics (e.g., user behavior, AI model performance).
• Collaborate with cross-functional teams (e.g., Product, Operations, Tech) to align data logic, integrate multi-source data (e.g., user behavior, transaction logs, AI outputs), and build a unified data layer.
• Establish data standardization and governance policies to ensure consistency, accuracy, and compliance.
• Provide structured data inputs for AI model training and inference (e.g., LLM applications, recommendation systems), optimizing feature engineering workflows.
• Explore innovative AI-data integration use cases (e.g., embedding AI-generated insights into BI tools).
• Provide technical guidance and best practices on data architectures that meet both traditional reporting purposes and modern AI Agent requirements.
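The extract-cleanse-transform-load flow described above can be sketched end to end in a few lines; this minimal example uses pandas and an in-memory SQLite store, with hypothetical table and column names:

```python
import sqlite3
import pandas as pd

# Extract: hypothetical raw events, typed as strings with gaps and duplicates.
raw = pd.DataFrame({
    "user_id": [1, 2, 2, None],
    "ts": ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-03"],
    "amount": ["10.5", "3.0", "3.0", "7.2"],
})

# Cleanse/transform: drop rows missing the key, de-duplicate, fix types.
clean = (
    raw.dropna(subset=["user_id"])
       .drop_duplicates()
       .assign(
           user_id=lambda d: d["user_id"].astype(int),
           ts=lambda d: pd.to_datetime(d["ts"]),
           amount=lambda d: d["amount"].astype(float),
       )
)

# Load: write the unified layer to a queryable store for downstream reporting.
con = sqlite3.connect(":memory:")
clean.to_sql("fact_events", con, index=False)
n = con.execute("SELECT COUNT(*) FROM fact_events").fetchone()[0]
```

In production the same stages would run on Spark or a warehouse, but the structure (extract, cleanse, type, load) is the same.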
• Build/improve experiment platforms for new scenarios.
• Build data pipelines on multiple computation platforms for reporting, analysis, and metrics pre-computation with stable SLAs and good quality.
• Build agents for productivity improvement.
• Design, develop, and manage streaming and batch pipelines, supporting key functionalities such as large-scale index construction, web page crawling and feature extraction, image processing, and context rewriting.
• Continuously optimize a platform to manage, schedule, and monitor hundreds of pipelines.
• Continuously optimize a platform to view, track, debug, and operate massive-scale Ads data.
• Evaluate and optimize code and designs to maximize performance and minimize complexity.
• Mentor junior SDEs and independently drive feature development from the ground up.