Rapyd is a financial technology company that allows businesses to accept and send payments anywhere.
We invented Fintech-as-a-Service and we are redefining global commerce by powering fintech applications for businesses like Ikea, Uber, and Rappi. Now we’re building the world’s best team to help take our hyper-growth to the next level. If you like to work hard, play harder, dream the impossible, and be part of building the next generation of talent for a world-class company, then we really need to hear from you.
What’s In It For You?
In the last year, we’ve raised over $600 million in funding from the world’s most prestigious VCs, launched our own venture capital arm, and been named one of Forbes’ best, brightest, and most valuable private companies in the cloud. We offer highly competitive compensation and benefits and the opportunity to work in a fast-growing company. Our success is your success.
We believe in giving our employees the resources and support to gain new skills and develop their careers. Take off in an environment where each employee is empowered to follow their own development plan, explore growth opportunities, and get guidance from mentors and colleagues.
If you are up to the task, join us in building the SRE discipline at Rapyd and show the world how it is done in a unicorn company that’s disrupting the payments industry by building transformative technology. Join our team and play a pivotal role in developing the future of fintech.
Responsibilities
- Own the production infrastructure on AWS. Implement sustainable and scalable solutions with the goal of improving availability, performance, and security
- Identify the root cause of every incident and work to prevent incidents before they happen
- Alert on symptoms rather than on outages. Ensure every infrastructure and application alert is actionable or triggers self-healing automation
- Work closely with the R&D, Support, and NOC teams, offering education and guidance on integration, support, and monitoring across the toolset
- Follow an everything-as-code approach: run our infrastructure with Ansible, Terraform, and Kubernetes
- Document every action, turn it into a repeatable procedure, and then automate it
- Focus on the system’s observability, availability, reliability, performance/latency, and monitoring
- Conduct periodic on-call duties and emergency response
Requirements
- At least 3 years of experience in DevOps or SRE
- At least 3 years of experience with alerting and monitoring systems such as Datadog / Splunk / New Relic / Prometheus, or similar
- Cloud systems such as AWS / Google Cloud / Azure, or similar
- CI/CD tools such as Terraform / GitLab / Jenkins / CircleCI / TeamCity, or similar
- Configuration management such as Ansible / Chef / Puppet, or similar
- Proven experience with cloud networking and security: connectivity, load balancers, DNS
- Experience with Docker, Kubernetes and Helm
- Strong analytical and troubleshooting skills, with the ability to solve complex problems
- Strong verbal and written communication skills and a collaborative mindset
- OS: Linux
- SCM: Git / Bitbucket / GitLab / Phabricator / Gerrit
- Knowledge of different database technologies (MySQL administration, MongoDB)
- BSc in Computer Science or related technical certifications