Site Reliability Engineer Remote - Network / Monitoring (m/f/d)

IT & Technology

We are looking for a smart and passionate Site Reliability Engineer (m/f/d) to join our team. For this position, we are currently hiring someone who is willing to work remotely in either Germany or Portugal. The Reliability Engineering team is one of the engineering pillars responsible for critical components such as data ingestion, monitoring, quality and retrieval. Positioned at the foundation of the technology stack, the team also takes the lead on several initiatives that impact the whole engineering team, such as CI/CD pipelines, automation, engineering workflows and, in general, new technologies.

Your Responsibilities

  • You manage and monitor a multi-datacenter environment with an Infrastructure as Code methodology
  • You build automation to prevent problem recurrence and to reduce deployment times and errors
  • You participate in the design of distributed systems architectures
  • You ensure scalability, availability and performance of our software stack
  • You ensure correctness and availability of the data
  • You engage in service capacity planning and demand forecasting
  • You maintain security, backup, and redundancy strategies

Your Profile

  • BSc degree in Computer Science or a related technical field, or equivalent practical experience
  • Minimum 3 years of professional experience
  • Solid knowledge of Unix/Linux systems and their internals
  • Strong experience with at least one programming language (e.g., Python)
  • Professional experience with containers and their orchestration (e.g., Docker, Swarm, Kubernetes, Marathon/Mesos)
  • Knowledge of networking theory (OSI layers, NAT), protocols (TCP/IP, UDP, Ethernet, DNS) and networking tools (e.g., tcpdump, iptables, netstat)
  • Ability to design large-scale distributed systems
  • Experience in the management of distributed storage (e.g., Hadoop/HDFS, MinIO, Ceph, Gluster)
  • Experience with logging and monitoring systems (e.g., Elastic Stack, Prometheus, Grafana)
  • Experience with automation software (e.g., Ansible, Puppet, Jenkins, Tekton)
  • Experience with Cloud technologies (e.g., AWS, GCP, Azure)
  • Experience in managing complex backup solutions and disaster recovery plans

Our Offer

  • Be part of an exciting and ambitious start-up that puts its people at the heart of its business
  • Be part of a diverse, international, cross-disciplinary team of highly motivated, hands-on experts who tackle unique challenges with a positive spirit and lots of fun
  • A flexible work schedule, a dynamic environment where everyone can have substantial impact, career development programs and additional company holidays

Does our job offer for Site Reliability Engineer Remote - Network / Monitoring (m/f/d) sound interesting? Then we look forward to receiving your application via Campusjäger by Workwise. With our partner Campusjäger, you can apply for this job in just a few minutes without a cover letter and track the status of your application live.

Work Hours:

35 - 40 hours per week

About the company:

We offer the most advanced insights on human mobility based on cutting-edge data science, proprietary machine learning algorithms and deep technology, capturing billions of signals every day from cell towers and other unique sources. We work with leading telecom companies and data partners around the globe to capture information about people’s geographical locations, movement habits and demographics, all completely anonymized and aggregated.