Xiaomi Site Reliability Engineer - Experienced

Experienced hire, Full-time, A06044, 2025-06-06, Location: Singapore, Status: Hiring

Requirements

1. Proficiency in at least one of Python, Go, or shell scripting, with a demonstrated ability to independently develop modules or platforms.
2. Familiarity with cloud computing; experience managing multi-cloud or hybrid cloud platforms (e.g., Alibaba Cloud, Azure, AWS) is preferred.
3. Strong foundation in computer science, with hands-on experience in Linux, networking, load balancing,…

Responsibilities

1. Ensure the stability, reliability, and efficient operation of Xiaomi's global business, maintaining high availability of services at all times.
2. Take responsibility for core operational tasks such as resource provisioning and management, incident response, capacity management, monitoring, and reliability improvements.
3. Review technical architecture designs, assess their soundness, and proactively identify and resolve reliability risks.
4. Conduct in-depth analysis of systemic deficiencies, identify bottlenecks, and develop optimization strategies; plan and execute projects to improve system reliability and ensure the cost-effectiveness and high availability of the systems.
5. Participate in a 24/7 on-call rotation, promptly responding to and resolving production incidents to ensure service availability.
6. Analyze and improve processes to build stable, highly available systems; drive continuous automation improvements and minimize manual intervention.

Related Positions

Senior Site Reliability Engineer

Experienced hire, Overseas

* Take ownership of internal system SRE practices, including CI/CD, observability, and system reliability.
* Manage and ensure the reliability of big data platforms (e.g., Hadoop, Spark, Flink) in cloud environments.
* Design highly available architectures tailored to business needs and define ops standards and incident playbooks.
* Lead technology choices, performance tuning, and stability enhancements for core infrastructure.

Work Location: China-Shenzhen

Updated 2025-06-13, Shenzhen

SAP China iXp Intern - System Reliability Engineer Intern - Shanghai

Internship, Administ

* Assist in designing, building, and maintaining a scalable and reliable cloud infrastructure
* Collaborate with developers, operations, and security teams to ensure that the infrastructure is performing optimally and securely
* Monitoring and alarm systems for our cloud infrastructure, applications, and services
* Monitor system performance, identify and resolve issues proactively, and troubleshoot incidents when they arise
* Develop and implement automation tools to streamline processes and improve operational efficiency
* Participate in the development of disaster recovery and business continuity plans
* Document infrastructure and processes to ensure knowledge transfer and institutional memory
* Stay up-to-date with emerging trends and technologies in cloud-native computing and SRE practices

Updated 2025-09-19, Shanghai

Data Center Regional Technical Support Electrical Engineer, Field Engineering

Experienced hire, Data Cen

* Perform design and equipment submittal review for new data centers in your region.
* Troubleshoot, conduct Root Cause Analysis (RCA), and create Corrective Action (CA) documentation for site/equipment failures.
* Directly support operational issues with ad-hoc training, complex operating procedure reviews (including essential equipment), and event support.
* Provide technical support to the design for existing data center upgrades and design solutions that add capacity, improve availability, and increase efficiency.
* Support operating partners to lead, review, and approve designs for existing data center upgrades that improve availability and efficiency.
* Interface with operating partners, the data center design engineering team, the server hardware team, and the environmental health and safety team to promote standards that maintain consistency and reliability in services delivered by operating partners.
* Work on concurrent projects, sometimes in multiple geographical regions.
* Initiate and lead engineering site audits within leased or colo data centers. Produce reports outlining risks with recommended mitigations and remediations.
* Act as resident engineer during new construction projects. Support construction, commissioning, and turnover.

A day in the life

Each day you will interact with different teams responsible for all aspects of the data centers. You will prioritize your activities to support data center capacity, availability, and safety, focusing on the actions that are most impactful. You will have the opportunity to work on projects locally and globally.

Updated 2025-08-12, Zhongwei

Product Development Engineer

Experienced hire, Enginee

THE ROLE: This dynamic role drives the product quality and reliability performance of key customers using AMD Client APU/CPU products on their production lines. Its directive is to provide a differentiated quality experience, improve customer satisfaction, and build customer confidence across the product introduction and volume ramp phases. This is a high-visibility role that acts as a key interface to the following organizations: AMD Customers, AMD Business unit, AMD Silicon & Package Reliability teams, AMD Engineering leadership, AMD Global Product Engineering & Operations organization, and AMD Sales Teams.

Updated 2025-11-26, Shanghai