Senior Site Reliability Engineer / DevOps Engineer

WizeNoze

Are you looking to work on challenging projects, with a diverse and motivated team, while building technology that can change the world? Do you want to join an award-winning startup in its scale-up phase? Do you want to give students all over the world access to a prosperous future? Do you want to work with large distributed systems, machine learning, web crawling, and other interesting technology?

We’re looking for a Senior Site Reliability Engineer / DevOps Engineer with 8+ years of experience in REST APIs, large-scale distributed systems, and the AWS stack, based in Amsterdam or remote within GMT ±2 time zones. You need to be curious and passionate, with a drive to continually improve yourself, your craft, your code, and your colleagues. You need to hold yourself and your peers to the highest standards to deliver the best quality products possible! You must not be complacent, and must keep working to improve your craft and skills. You must be able to step outside your role to think holistically about our systems, processes, and people, always with a view to improving efficiency and long-term maintainability, reliability, performance, and quality.

Your responsibilities:

keeping our distributed systems running smoothly on AWS: focusing on performance, monitoring, alerting, cost optimisation, etc.
infrastructure automation via infrastructure-as-code principles using Terraform, Ansible, etc.
configuration management and tracking
optimising alerting and reporting to improve the signal-to-noise ratio
code instrumentation to tie all systems together, using libraries such as Spring Micrometer to feed metrics into Prometheus
instrumenting AWS services like RDS, SQS, Elasticsearch Service, Beanstalk, etc. to feed metrics into Prometheus
building useful dashboards in Grafana to help monitor system performance and reliability over time
auditing and monitoring security across AWS
optimising network layouts, VPCs, VPNs, and security across AWS
process documentation
load testing (with infrastructure automation so we can load test routinely by standing up a clone of production)
disaster recovery planning and preparation, including routine backup testing
self-hosting Elasticsearch clusters with all necessary monitoring, alerting, disaster recovery, and self-healing in place
application performance monitoring implementation and assisting developers in profiling and optimising code
planning support rotas, standby schedules, etc.
AWS cost optimisation
automating onboarding and offboarding of employees across GSuite, AWS, and other SaaS and PaaS platforms in use

Your abilities and skills:

Fluent in English
8+ years of experience in a mix of development and ops, with at least 5 years of dedicated SRE/DevOps experience
SRE/DevOps experience must be in a cloud environment with large distributed systems
At least 2 years of AWS experience
At least 2 years of Java development experience. We need someone who is more DevOps engineer than sysadmin, able to deeply understand and solve JVM performance and memory issues together with the developers
Preferably worked in startups
Be able to communicate effectively with team members, but also be able to solve difficult problems on your own with perseverance and tenacity
Prove that you’re continually working to improve your skills and knowledge, by raising the standards for your own designs and code constantly, and by reading and learning from books, courses, and peers
Friendly and helpful to tech and non-tech team members
Linux administration
Shell scripting
Python scripting
Experience in Java, Elasticsearch, Spring, and any of the other tools under responsibilities is a plus
Comfortable working in a remote environment

Your traits:

Curious. Able to learn and apply new concepts and tools rapidly
Pragmatic. Choose the practical path to delivering a system without getting lost in fancy new technologies or techniques
Attention to detail. Understand everything you do and why you’re doing it, no copy/pasting and hoping for the best
Perseverance. Able to solve difficult problems without giving up and expecting someone else to take over
Take responsibility for your work. Have an owner’s mindset
High degree of personal responsibility over designated duties. Step outside your role if needed to get the job done well
Consistent and organised
Timely and eloquent communicator
Focused on helping the team win, before personal gain
Open to receiving objective criticism and improving upon it
Like to work in a startup environment

Category: DevOps

