DevOps / Site Reliability Team Lead

Job Description

About Us

Imagine a company that combines the dynamic energy of a start-up with the backing of Siemens, a global powerhouse. A company where ideas, passion, and ingenuity are valued, and vital to solving the next generation of smart building challenges and helping customers achieve net carbon zero goals.

Enlighted is a human-centered proptech company that creates positive transformation wherever space, people, and work meet through our industry-leading technology. We empower organizations with this technology to transform physical spaces into regenerative places that fuel positive impact for people, portfolio, and our planet.

Our team is constantly evolving to deliver exceptional value to customers worldwide and remain at the forefront of future-proofing buildings with our innovative solutions. If you are passionate about turning everyday spaces into extraordinary places, join us and start making your impact today.

Siemens BRI is looking for a DevOps lead to help build and maintain the infrastructure that runs our products. As the leader of our DevOps/SRE team in India, you are responsible for managing the team that maintains the production infrastructure behind our IoT workplace and workspace products. In addition, your DevOps team provides mission-critical engineering support to our product engineering teams.

As the complexity and scale of our products grow, we need an experienced DevOps practitioner to join our team to maintain and automate our day-to-day operations. We also need someone who understands, and has experience in, maintaining enterprise-level applications. Our team's mission is to create systems that are available, sustainable, observable, secure, and efficient.

Your team monitors, scales, and automates operations in anticipation of system loads. This position requires technical knowledge of various infrastructure stacks and disciplines, including Cloud, Server, Backup, Networking, and Security Compliance. You live and breathe SLOs and ensure SLAs are met. You and your team will be the first responders when things go wrong and will work on many distinct types of challenges. You might be supporting a deployment at a large, global customer one day, and then helping with a new pilot project the next. You are the engine behind Building Robotics Inc’s Cloud Operations!

What will you need to succeed?

  • First and foremost, you have experience managing production environments with modern cloud infrastructure. You take pride in making sure these environments are reliable and available.
  • You hate to do things manually more than once. Therefore, you look for every opportunity to automate things; use of infrastructure as code and configuration management are second nature to you.
  • You are not afraid to roll up your sleeves to debug and fix things. You are intellectually curious, and you constantly look for ways to improve the status quo.
  • You are a strong communicator, and you value well-articulated solutions backed by diagrams and well-written documentation. You also understand that stakeholders and customers will sometimes look to you for guidance, and you are not afraid to engage them when the situation calls for it.
  • Agile development is second nature. You are expected to coach and guide the team in daily standups, backlog grooming, sprint planning, and retros; you will have a chance to provide input to improve our processes.

Key Responsibilities

  • You are a strong individual contributor that sets the standard for the team.
  • Mentoring and guiding your team members. You understand when to delegate and when to take charge.
  • Measuring and managing team performance by following agile best practices.
  • Managing periodic reporting on progress to management and other stakeholders.
  • Building software to help DevOps - You oversee the building and implementation of tools and services to help Engineering do better with agile development and delivery and drive deeper reliability to our systems in production.
  • Adding automation and context to alerts, leading to a better real-time collaborative response from technical responders. Additionally, maintaining runbooks, tools, and documentation to help prepare Engineering for future incidents.
  • Monitoring process adherence across the entire lifecycle, and updating or creating processes to drive improvement and minimize waste.
  • Encouraging and building automated processes wherever possible.
  • Building cloud infrastructure.
  • Identifying and deploying cybersecurity measures by continuously performing vulnerability assessments and risk management.
  • Incident management and root cause analysis.
  • Coordinating with other technical teams (Dev, QA) and customer-facing teams.
  • Striving for continuous improvement and building CI/CD pipelines and tools.

Required Skills:

  • Hands-on public cloud experience with AWS, Google Cloud, and/or Azure (multi-cloud preferred). Industry certification preferred (e.g., AWS Certified DevOps / Solutions Architect).
  • Knowledge of monitoring and observability tools: Datadog, Prometheus, Grafana, Geneos, Jaeger, OpenTelemetry, Zipkin, Splunk.
  • Experience in configuring and securing databases such as RDS, PostgreSQL, Cassandra, Redis.
  • Experience with networking and security concepts such as DNS, HTTP, HTTPS, and SSL/TLS, as well as practical knowledge of reverse proxy, load balancer, and firewall configurations.

Technologies We Use: Kubernetes, Docker, Terraform, AWS, GCP, Azure, Helm, Datadog, Grafana, SonarQube, Nexus & NexusIQ, Jenkins, GitLab CI/CD, Python & Django, Java & Tomcat, Shell Scripting, Postgres, Cassandra, Kafka, Redis, Okta, Ansible


Organization: Smart Infrastructure

Company: Siemens S.A.

Experience Level: Experienced Professional

Full / Part time: Full-time
