Required Skills: AWS, GCP, Azure, Python, Bash
Job Description
Job Title: Site Reliability Engineer
Location: United States
Job Type: Full-time, On-site
Visa Sponsorship: H-1B
Job Summary:
We are seeking a Site Reliability Engineer (SRE) to maintain the availability, performance, and reliability of our production systems. You will focus on automating operations, monitoring infrastructure, and improving system reliability.
Key Responsibilities:
Ensure the reliability, scalability, and availability of production systems.
Automate repetitive tasks and manual processes.
Troubleshoot incidents and drive them to timely resolution.
Develop monitoring and alerting systems.
Collaborate with development teams to build more resilient applications.
Required Skills and Qualifications:
Strong experience with Linux systems and cloud platforms (AWS, GCP, Azure).
Proficiency in scripting or programming languages such as Python, Bash, or Go.
Experience with monitoring tools (Prometheus, Grafana, New Relic).
Familiarity with CI/CD pipelines and automation.
Knowledge of incident management and root cause analysis.