Platform Engineer
Sense Street
At Sense Street, we are developing natural language understanding systems for capital markets. Our premise is simple: markets are conversations, and we help investment banks and asset managers have better, more efficient conversations.
Through our partnerships with global banks, we have access to datasets that have not previously been made available. This allows us to create language models uniquely suited to capital markets while advancing the state of the art. We are a venture-backed company founded by professionals with experience spanning machine learning, trading, and quantitative research.
The Role
We’re looking for a seasoned Platform/Site Reliability Engineer to help us scale our platform and ensure our systems are resilient, observable, and continuously improving. You’ll work closely with our software engineers and data scientists to design, automate, and operate the infrastructure that underpins our AI systems and real-time services.
This role blends hands-on systems engineering with strategic thinking: from refining our incident response processes, to implementing monitoring that gives us confidence in production, to guiding our infrastructure as we scale.
You’ll thrive here if you’re pragmatic, delivery-oriented, and enjoy being the glue between infrastructure and product delivery teams.
Requirements
- Proven experience operating and scaling production systems in cloud-native environments (AWS and/or GCP)
- Deep knowledge of Kubernetes, Terraform, Docker, Ansible, and infrastructure-as-code practices
- Experience with observability tooling (e.g. Prometheus, Grafana, Loki, ELK)
- Familiarity with CI/CD tooling and deployment pipelines (e.g. GitLab, Jenkins, GitHub Actions, Argo)
- Proficient in Python and Bash for scripting, automation, and debugging
- Solid understanding of Linux systems internals, networking, and distributed systems fundamentals
- On-prem/datacentre experience, including networking infrastructure, switching, and IPMI
- Strong communication skills — you’re comfortable collaborating across product, engineering, and data science teams
Nice to Have
- Comfortable managing incident response, SLAs/SLOs, and post-incident reviews in high-availability systems
- Able to drive reliability engineering practices like chaos testing, capacity planning, and alert tuning
- Previous experience supporting real-time APIs or AI/ML model infrastructure in production
- Exposure to financial systems, data pipelines, or regulated environments
- Experience mentoring or guiding junior engineers on reliability and operations topics
The Work Environment
- Flexible working, central Krakow location
- Highly skilled team, flat hierarchy, and opportunities for mentorship
- Ability to heavily influence platform and culture in a scaling company
- Budget and time for books, training, and attending conferences/hackathons
- Opportunity to work with cutting-edge AI technologies and unique datasets
- Regular knowledge sharing sessions and internal tech talks
Company Benefits (subject to completing the probationary period)
- Flexi Working
- Pension
- Hybrid working – home/office
- Share Options Scheme
- Private Healthcare
- Benefits system (gym, etc.)