
Site Reliability Engineer - Infrastructure Company - Buenos Aires

Capital Federal, Buenos Aires, Argentina


Job Description

You will help the team keep customers' applications running at peak performance. You will be the first point of contact for external customers worldwide, and you will help identify, analyze, and resolve first-tier technical issues on platforms with 500+ servers. You will also help modify and improve the monitoring infrastructure, which has 9,000+ metrics and 6,000+ graphs generated every minute from a very diverse environment.

Job Duties and Responsibilities:

  • Analyze and troubleshoot large-scale distributed systems.
  • Monitor specific metrics for availability, latency, and overall system health.
  • Develop and implement new IT infrastructure monitoring.
  • Continuously refine monitoring processes, thresholds, and configuration.
  • Create and maintain documentation on installations, incidents, and procedures.
     


Working hours: 2 PM to 11 PM

Requirements

Required Qualifications:

  • Fluent English to communicate with technical teams and customers.
  • Ability to diagnose and correlate performance bottlenecks, network issues, and failure patterns.
  • Comfortable working on a Linux system without a graphical interface.
  • Experience with monitoring tools such as Sensu and Grafana.
  • Experience with Apache, Nginx, Tomcat, etc.
  • Proficiency in scripting with Bash, Perl, Python, or Ruby.
  • Strong interpersonal and communication skills.
  • Teamwork skills.
  • Availability to travel to Las Vegas for 3 months.

Preferred Qualifications:

  • Hands-on experience with Nagios, Sensu, and Grafana.
  • Hands-on experience with Docker.
  • Experience with or understanding of infrastructure automation tools such as Puppet, Terraform, Chef, or Ansible.
  • Experience with off-premises cloud-based infrastructure, in particular Amazon Web Services.
  • Experience with ticketing tools such as Jira or Remedy.
  • An understanding of networking: DNS, routing, load balancing, and firewalls.
  • Driver's license.

Benefits

Direct contact with clients and the opportunity to share ideas.
Training and certifications.
Professional growth.
Home office.
Health insurance.
Trips to events.
…and more!

Details

Area: Technology, Systems and Telecommunications/Infrastructure

Minimum education level: Technical degree (tecnicatura)

Hiring Company

We are seeking a Site Reliability Engineer to join the Operations team of a major infrastructure company.
