Apple: Software Engineer, Cloud Services Engineering

Experienced hire | Full-time | Software and Services | 2025-08-06 | Location: Shanghai | Status: Hiring

Qualifications

Minimum Qualifications
• Expertise in one or more programming languages (Java or Go), with deep experience in multiple design patterns and RESTful APIs
• Strong experience in one or more public clouds and infrastructure
• Experience with CI/CD tools and techniques, containers, Kubernetes
• Experience with AuthN and AuthZ technologies and protocols, including IAM and SSO
• Excellent communications skills and ability to establ…


Responsibilities

You will partner with developers, system and site reliability engineers, and customers to understand their challenges, work through their issues, and provide solutions that can be embraced widely. The ideal candidate has a consistent track record and deep technical knowledge and skills in delivering large-scale, distributed, complex software solutions deployed on multiple cloud platforms. This is a highly technical, hands-on role that requires wide and deep experience in leading infrastructure and applications. The successful candidate will design and implement complete products, demonstrating expertise across the entire software development lifecycle. Building and maintaining relationships with the diverse sets of customers that use the platform will be critical to ensuring the business units are successful. We are a team of highly skilled and hardworking engineers working in this groundbreaking and constantly evolving space.



Related Positions

(Senior/Staff) Site Reliability Engineer, Fleetnet

Experienced hire | Software Platform

THE ROLE
We're the small, expert team creating the next-generation server-side infrastructure to support the manufacturing and functionality of fleets of Tesla products, and we're looking for seasoned SREs with domain expertise in one or more of: containers, public clouds and cloud-native apps.

Today, Tesla owners rely on our services to safely and securely summon their cars with a tap on their mobile phones, a feature enabled by one of the many over-the-air updates we've delivered to the Tesla vehicle fleet. Tesla engineering relies on our data and analytics platform to make Tesla products better and safer. And, when an owner needs assistance, Tesla service and support rely on our applications to understand and respond to the situation.

Tomorrow, we will apply fleet learning to dispatch and deliver real-time road conditions to millions of autonomous vehicles and manage distributed energy generation and storage at grid scale.

Join us and you will work alongside world-class software and data engineers on some of the newest and most challenging IoT, manufacturing and service engineering problems in the world today. The platform you help us build and automate will be used daily by millions of Tesla owners (and tens of thousands of Tesla employees) to improve and enhance the functionality of our cars, chargers, and batteries worldwide.

RESPONSIBILITIES
• Design and write software that enables rapid prototyping by development teams, while ensuring the highest levels of reliability and availability.
• Work directly with our factory firmware team to provide highly available factory-facing services.
• Drive the migration of large-scale, distributed fleet applications towards cloud-native microservices.
• Influence architectural decisions with a focus on security, scalability and high performance.
• Automate the build and deployment of infrastructure using Docker, Kubernetes and other orchestration technologies in a hybrid-cloud environment.
• Set up and maintain monitoring, metrics and reporting systems for fine-grained observability and actionable alerting.

Shanghai

Sr. System Engineer, Linux

Experienced hire | Infrastructure

The Role
Tesla is looking for a technical and industry-experienced engineer to join a team of talented engineers. As part of the Tesla IT Operation team, we are responsible for delivering 24x7 system infrastructure and providing a portfolio of services including configuration management, engineering tools, identity access and control, and management of public and private cloud infrastructure. Ensuring security and extreme reliability is our fundamental design principle. The candidate must be hands-on on a day-to-day basis, with experience in building, operating and driving reliability and security for production systems at scale.

Responsibilities
• Responsible for the design, deployment, and support of manufacturing systems and network infrastructure.
• Provide support for China-based infrastructure build-out, including datacenter and Linux systems (both virtualized and bare-metal servers).
• Install, configure, and maintain the Linux server environment.
• Ensure the reliability of the existing systems to guarantee uptime and availability of core infrastructure services.
• Perform root-cause analysis of complex issues ranging through hardware, operating system, application, network, and information security platforms.
• Work with different business units to identify, plan, test and deploy or upgrade Linux systems according to business requirements.
• Partner with teams from across the organization to help tackle hard problems in a collaborative, high-velocity environment.
• Tackle issues across the entire stack: hardware, software, network and application.
• Manage engineering tools and platforms such as GitHub, Artifactory, etc.
• Perform analysis, troubleshooting, and introspection on core infrastructure components and handle incident response.
• Create and maintain a well-documented knowledge base and mentor junior engineers.
• Take an on-call role, respond quickly to emergency bridges, and provide quick and effective solutions to minimize system downtime.

Shanghai

IT Incident Response Engineer

Experienced hire | Production Support

THE ROLE
This role will be a support engineer within the Tesla IT Infrastructure Engineering & Operations department. The Sr. Incident Response Engineer will coordinate with cross-functional engineering teams on incident response and management to ensure high availability for Tesla Manufacturing, Business Operations, and Customer Service & Experience. We help reduce the occurrence of incidents through efficient IT operations monitoring, effective risk analysis and professional team collaboration. The Tesla APAC Incident Response Center is a growing team consisting of professionals from diverse backgrounds, which will offer you a fantastic development environment. This role will be based at Giga Factory Shanghai, China, but will provide support to Tesla business globally, in line with the growing business and mission.

RESPONSIBILITIES
• Independently lead incident response and management to minimize impact and ensure optimal response times. Develop incident response plans, conduct post-mortem analyses, and organize drills to enhance preparedness.
• Drive IT service management projects. Establish and optimize SOPs to reduce inter-team communication barriers, promote technical knowledge sharing, and improve team incident response capabilities.
• Monitor IT infrastructure and data center operations, including servers, networks, and applications. Analyze real-time stability metrics, mitigate risks, and deliver regular operational analysis reports.
• Proactively enhance team efficiency through tool automation, process refinement, and adoption of industry best practices. Support daily operations and foster a culture of continuous improvement.
• Oversee infrastructure changes to minimize risks, streamline approval workflows, and ensure compliance with change management protocols.

Shanghai

Senior AI Development Engineer, User Support

Experienced hire | Desktop Support

The Role
Tesla is offering a full-time IT Support DevOps AI position in the Information Technology Department (Work Location: Tesla Giga Factory Shanghai). If you are a versatile expert who integrates AI development with DevOps practices, someone who can efficiently tackle challenges, solve complex technical problems in user support and experience scenarios, and reject repetitive and inefficient work patterns, this role is perfect for you.

IT Support DevOps AI is a core role connecting the company’s IT systems and user-facing processes, standing at the forefront of enhanced user support implementation. You will engage in work across multiple domains, including AI technology R&D, containerized deployment, and operational support. Through technical practice, you will support the company in optimizing user interactions, improving support efficiency, and contributing to the core goal of user experience transformation.

Responsibilities
• Undertake AI algorithm R&D, model optimization, and training, with a strong emphasis on fine-tuning (FT), supervised fine-tuning (SFT), reinforcement learning (RL), and advanced tuning techniques; focus on user support scenarios such as data analysis, query resolution, issue detection, and automated assistance to ensure AI technology aligns with user experience needs.
• Complete the deployment, monitoring, and scaling of AI solutions based on container technologies like Kubernetes (K8s) and Docker, ensuring high availability and stability of the system in the operational environment, while integrating underlying AI technologies like neural networks and Transformer architectures for efficient performance.
• Participate in DevOps process development, optimize the full lifecycle of AI model and system development, testing, and deployment, and realize automated deployment, continuous integration (CI), and continuous delivery (CD), incorporating RL-based optimization and model tuning for adaptive user support systems.
• Collaborate with user support-related departments such as helpdesk, customer service, and product teams to deeply understand user pain points and provide data-driven AI technical solutions, leveraging SFT and attention mechanisms to enhance personalized user experiences.
• Respond quickly to technical requirements and faults in user-facing systems, troubleshoot issues in AI systems, container clusters, and network environments, minimize impacts on user interactions, and improve support efficiency and satisfaction through advanced AI tuning and underlying model diagnostics.
• Track cutting-edge technologies in the AI and DevOps fields (e.g., large language models with FT/SFT/RL integration, cloud-native operations) and industry trends, promote the pre-research and application of new technologies in user support scenarios, and continuously optimize system performance using techniques like model compression and quantization.

Shanghai