Site Reliability Engineer Job Description

A site reliability engineer fills an application support group (ASG) role for Linux services and associated technologies, including CheckMK, Docker, Kubernetes, and Ansible.

Site Reliability Engineer Duties & Responsibilities

To write an effective site reliability engineer job description, begin by listing detailed duties, responsibilities and expectations. We have included site reliability engineer job description templates that you can modify and use.

Sample responsibilities for this position include:

Automate the server provisioning process to reduce the labor of our network engineering and datacenter operations teams

Perform deep dives into both systemic and latent reliability issues

Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization

Build and maintain low latency, high performance, scalable systems in a polyglot architecture

Ensure UA Record can support massive, global user growth while achieving rigorous SLAs

Enforce best practices for metrics gathering, monitoring, and alarming

Accelerate the UA Record team by making service/application deployment and builds fast, simple, and reliable

Provide basic data administration, optimization, and monitoring

Provide basic network administration and troubleshooting

Assist in the technology selection for core computing and software infrastructure along with data persistence

Site Reliability Engineer Qualifications

Qualifications for a job description may include education, certification, and experience.

Licensing or Certifications for Site Reliability Engineer

List any licenses or certifications required by the position: AWS, ITIL V3, MCSE, IAT II, RHCSA, SSL, DNS, HIPAA

Education for Site Reliability Engineer

Typically a job would require a certain level of education.

Employers hiring for the site reliability engineer role most commonly prefer candidates with a relevant degree, such as a Bachelor's or Master's degree in Computer Science, a technical field, Software Engineering, Engineering, Computer Engineering, Business, Education, Science, Technology, or Information Systems.

Skills for Site Reliability Engineer

Desired skills for site reliability engineer include:

Ansible

DNS

Networking

TCP/IP

Infrastructure components

Docker

Jenkins

Kubernetes

Linux

Python

Desired experience for site reliability engineer includes:

Experience with web server configuration, monitoring, trending, network design, high availability

5+ years of experience as a site reliability, operations, or software engineer

Proficiency in a scripting language

Practical, solid knowledge of shell scripting and at least one higher-level language (Python or Ruby preferred)

Expert level understanding of Linux servers, specifically RHEL/CentOS

Comfortable configuring DNS, DHCP, and LAN/WAN technologies

Site Reliability Engineer Examples

Site Reliability Engineer Job Description

Job Description Example

Our growing company is looking for a site reliability engineer. We appreciate you taking the time to review the list of qualifications and to apply for the position. If you don’t meet all of the qualifications, you may still be considered depending on your level of experience.

Responsibilities for site reliability engineer

Code Ansible Playbooks in an Amazon Web Services (AWS) Public Cloud environment
Maintain and operate existing applications via configuration management (Ansible), implementing it for new systems as needed
Collaborate with the centralized infrastructure team as the engineering stakeholder for UA Record, providing feedback and helping to implement the infrastructure roadmap as applicable
Act as a proactive advisor to the UA Record team on how to build scalable, manageable services
Evangelize and educate the development team on scalability, security, and reliability concerns while assisting the team in efforts to build these checks into the development workflow
40% - Performance Testing / Optimization of Applications
Fixing the interesting problems we face in the best way possible
Participate in the company product lifecycle process
A love of SRE, open-source, self-service tools, and micro-services
Experience with AWS multi-region/multi-AZ deployed systems, auto scaling of EC2 instances, CloudFormation, ELBs, VPCs, CloudWatch, SNS, SQS, S3, Route53, RDS, IAM roles, security groups, blue/green deployments, and A/B testing

Qualifications for site reliability engineer

Solid understanding of fundamental technologies like TCP/IP, HTTP
Strong working knowledge of Linux systems and applications
Must work well with and be able to influence myriad personalities at all levels
Experience with automation languages like Ruby, PowerShell, or Unix shell
Experience with automation tooling such as Chef, Docker, AWS
A bachelor’s degree in Computer Science, a related discipline, or equivalent practical experience

Site Reliability Engineer Job Description

Job Description Example

Our growing company is looking to fill the role of site reliability engineer. We appreciate you taking the time to review the list of qualifications and to apply for the position. If you don’t meet all of the qualifications, you may still be considered depending on your level of experience.

Responsibilities for site reliability engineer

Monitor and maintain applications per agreed-upon Service Level Objectives
Support and maintain configuration management for various applications and systems
Identify and resolve a broad range of problems that occur in production applications and systems
Serve as part of the architecture and development lifecycle implementing systems
Support the recovery and resiliency strategy and architecture for various applications and systems
Proactively support capacity planning and disaster recovery and resiliency aspects
Govern support processes, resiliency and automation principles for the larger organization
Provide direction and guidance to other infrastructure and DevOps engineers
Work with business teams to identify complex requirements and their integration into existing and new technologies
Building large scale messaging infrastructure, data replication, auto-scaling and stream processing

Qualifications for site reliability engineer

Comfortable with large scale production systems and technologies, for example load balancing, monitoring, distributed systems, and configuration management
Strong coding skills in at least one programming language, and a desire to pick up more
Familiarity with and enthusiasm for software engineering best practices such as testing, continuous integration and continuous delivery
A passion for solving problems using open source software
The ability to thrive in a rapidly evolving, globally distributed environment
Strong Security mindset

Site Reliability Engineer Job Description

Job Description Example

Our company is looking to fill the role of site reliability engineer. Thank you in advance for taking a look at the list of responsibilities and qualifications. We look forward to reviewing your resume.

Responsibilities for site reliability engineer

Rapidly debug and respond to user-reported issues on the DGX Platform software stack
Contribute to the overall health, performance, and capacity planning of DGX Services
Deliver AWS-based infrastructure solutions using AWS CloudFormation (JSON) for configuration management
Design, develop and implement software that improves the stability, scalability, availability and latency of the Booking.com products
Take ownership of services and have the freedom to do what is best for our business and customers
Build effective monitoring to track the health of your systems, and jump in to handle outages
Build and run capacity tests to manage the growth of your systems
Plan for reliability by designing systems to work across our multinational data centers
Develop tools to assist the product development teams with successfully deploying 1000s of change sets every day
Share the on-call rotation and be an escalation contact for incidents

Qualifications for site reliability engineer

A passion for elegantly solving problems using open source software whenever possible, avoiding complex solutions and reinventing wheels
A passion for contributing to open source software, fixing bugs and implementing features
Exposure to cloud platforms, Amazon Web Services (AWS), and APIs
Experience with automation tooling such as Chef, Docker
Applies full use and application of engineering methodologies related
Windows and Apple desktop

Site Reliability Engineer Job Description

Job Description Example

Our innovative and growing company is hiring for a site reliability engineer. We appreciate you taking the time to review the list of qualifications and to apply for the position. If you don’t meet all of the qualifications, you may still be considered depending on your level of experience.

Responsibilities for site reliability engineer

Troubleshoot/understand reliability issues
Production readiness: ensuring the environment is available and reliable
Work heavily with AWS technologies (All our systems are in AWS)
Ensure all systems are highly available, with proper DR solutions in place
Work to identify and improve upon latency issues
Work to ensure we are squeezing every bit of performance out of our systems
Writing code to automate our way through AWS and all related ops processes
Monitor infrastructure and applications (creating custom metrics, new alarms, dashboards, etc)
Serving as level 2 escalation for production issues
Capacity planning, ensuring new systems can support production load and scale appropriately

Qualifications for site reliability engineer

Mastery of Linux or Unix
Proficiency in development languages (Bash, Clojure, Go, Java, JavaScript, Python, Ruby, etc.)
In-depth understanding of web application models and key components, including HTTP
Experience in a similar role or project
Experience with various data technologies including relational and non-relational databases and message queues
At least 2 years' experience with virtualization and cloud platforms

Site Reliability Engineer Job Description

Job Description Example

Our company is hiring for a site reliability engineer. To join our growing team, please review the list of responsibilities and qualifications.

Responsibilities for site reliability engineer

Gathering and analyzing data to root out errors, discern trends, and diagnose complex customer-facing issues (pre- and post-sale)
Responding to incidents, but more importantly preventing incidents through proactive analysis and monitoring
Identify and communicate the need for technology improvements/software updates and product innovation
Work with development engineers to solve platform problems and design code fixes
Implement changes and design regression tests to make permanent solutions to platform problems
Experience managing large-scale database systems in a cloud environment
Strong preference for shipping incrementally with an understanding of the fundamentals of CI / CD
Design and deliver solutions to improve the availability, scalability, latency, and efficiency of CircleCI’s services
Diagnose and resolve production issues in conjunction with software engineering teams
Architect and implement shared infrastructure used by all services within the CircleCI platform, for both SaaS and on-prem configurations

Qualifications for site reliability engineer

4+ years' experience with development and management of Java APIs
1+ years' experience with JavaScript frameworks, AngularJS and Node.js
1+ years' experience working with cloud automation/orchestration technologies (ie
Various programming and scripting languages
Various automation languages/platforms
Multiple application platforms
