Senior DevOps Engineer
About The Position
Zebra has set out on a mission to help hundreds of millions of people receive access to fast, accurate medical diagnosis, by teaching computers to read and diagnose medical imaging data.
Zebra Medical Vision is revolutionizing healthcare by creating an AI-based radiologist. Our product is a scalable analytic engine, which uses deep learning algorithms to detect anomalies in medical images, at global scale.
We are looking for a Senior DevOps Engineer to join our team, take on a major role in our DevOps transformation process, and bring harmony and productivity to our development, research, and production sites.
Technologies and tools: Docker, Rancher, Kubernetes, GitLab, Ansible, Prometheus, Elasticsearch, Kibana, AWS, GCP, Jenkins, CephFS, MaaS, GPU, Node.js, MongoDB, Postgres, TensorFlow, Python and more…
- Manage the data production environment, where we store all our data and train our algorithms: a fully containerized environment with PB+ on hybrid storage (private and public cloud), tens of GPUs, thousands of cores, many databases and more...
- Shorten the time from development to production - help build and maintain high-performance, fault-tolerant, scalable dev-test-deploy processes and workflows, and implement CI/CD practices that turn every commit into a fully tested and packaged app with one-click deployment to production.
- Scale our machine learning research (“ResearchOps”) - maximize our capacity to run experiments and train on huge amounts of data, while keeping all algorithm results traceable in dashboards.
- Create robust production environments - develop and integrate tools to improve our site reliability and help make our environment scalable, our apps and systems highly available, our provisioning self-service, and our data reads and writes blazing fast.
- Create a feedback loop and adopt a proactive approach - full-stack monitoring, logging, alerting and other practices and tooling, both in the cloud and on premises.
- 3+ years as a DevOps engineer in a modern software company
- Experience with CI/CD practices and relevant tools like GitLab, Jenkins, etc. - a must
- 1+ years with Docker in production - a must
- Experience with provisioning, configuration management and automation tools such as Terraform, Ansible, Salt, Chef, Puppet.
- Experience with a public cloud environment, AWS - an advantage
- Hands-on experience with monitoring and logging stacks - ELK, TICK, Prometheus
- Experience with coding in Python/Node/C#/Java or any other programming language
- Strong Linux skills
- Experience building/supporting a large distributed production environment
- Ability to deliver first-class work on tight schedules that supports a self-service approach
- Team player with good communication skills
Anything from our technology list is a big plus:
Docker, Rancher, Kubernetes, GitLab, Ansible, Prometheus, Elasticsearch, Kibana, AWS, GCP, Jenkins, CephFS, MaaS, GPU, Node.js, MongoDB, Postgres, TensorFlow, Python and more…