SITE RELIABILITY ENGINEER (F/M/D) (SRE): PAAS
Updated: 05 Jun 2022
As a team we have three areas of responsibility: platform operations, supporting services for the product teams and telemetry. We are responsible for the continued operation of our Platform as a Service offerings, including incident handling. We work with the product development teams to establish and maintain our service offerings and provide a tight feedback loop on their services’ performance. Accordingly, we provide monitoring, logging, metrics and other cross-product infrastructure on Kubernetes, so our product teams don’t have to worry about it.
We gather all required metrics to enable data-driven decision making for our platforms and services. We are a development-focused team. While we absolutely need to work with all available tools to react to incidents, solutions should first and foremost result in code and automation. Our weapons of choice are Ansible, Lang, GitOps and CI/CD - not root shells and bash scripts.
You will be responsible for the following tasks:
- Running our Kubernetes and service infrastructure.
- Building software and systems to manage platform infrastructure and applications.
- Developing monitoring and alerting rules for symptoms and not outages.
- Participating in system design consulting, platform management, and capacity planning.
- Balancing feature development speed and reliability with well-defined service level objectives.
- Improving the deployment process to make it as uncomplicated as possible.
- Joining an on-call rotation to respond to availability incidents and to support service engineers with customer incidents.
You should bring the following skills and experience:
- Agile mindset and experience with modern development practices.
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks.
- Profound experience with cloud environments and Kubernetes.
- Profound experience with the Linux operating system.
- Experience with network fundamentals.
We offer the following benefits:
- Active corporate culture: Our cooperation is characterized by flat hierarchies, transparency, open communication and short decision-making processes
- Wide range of continuing education opportunities: We offer a wide range of personal and professional development opportunities - via e-learning and seminars, conferences as well as mentoring. Language training for German and English is available to our employees as online or face-to-face courses
- Attractive JobBike: Through JobBike, our employees can lease the bike of their choice simply and inexpensively, and we also finance this offer
- Good transport connections: Due to the central location and good transport connections, our locations are easily accessible and offer parking facilities
- Our employees receive employee discounts on products of the United Internet Group, as well as attractive offers and price reductions from external providers and for various leisure activities. In addition, we offer financial benefits, e.g. for company pension schemes
- Health measures: In addition to internal sports and health courses, we also offer lectures on health. You also receive employee discounts at selected gyms and health centers
- Each site has a canteen where our employees can enjoy meals at a reduced price. In addition, fresh fruit and a range of hot and cold drinks are offered free of charge
- Family service: Our family service advises you on questions regarding childcare, home care/elderly care or difficult personal challenges. The family service is free of charge for all employees
- Events: We celebrate our success in our teams. There are legendary summer and winter parties. Additional workshops and team events foster our unique team spirit
Does our job offer SITE RELIABILITY ENGINEER (F/M/D) (SRE): PAAS
sound interesting? Then we look forward to receiving your
application via Campusjäger by Workwise. With our partner Campusjäger, you can apply for
this job in just a few minutes without a cover letter and track the status of your application.
35 - 40 hours per week