Job Description Summary:
We are looking for a proficient Azure Cloud Engineer to provide top-tier Cloud Engineering services to Rackspace’s diverse portfolio of enterprise clients. The successful candidate will have strong technical acumen, hands-on expertise, and the consultative skills needed to understand, shape, and fulfil our customers’ unique requirements.
Responsibilities:
- Customer Engagement: Serve as a technical point of contact for clients, ensuring clear communication and a positive relationship during issue resolution and service delivery.
- Platform Management: Manage scalable infrastructure using Azure IaaS, PaaS, and SaaS services. Experience with Kubernetes (K8s), managed Azure Kubernetes Service (AKS), Azure networking, and IaC (Terraform) is a must. The candidate must be able to estimate costs, identify cost control mechanisms for cloud resources, implement security and compliance measures, and support audits.
- DevOps Experience: Experience with Azure DevOps pipelines, infrastructure provisioning via Terraform using Azure DevOps, managing app secret expiry, etc.
- Incident Handling: Lead and participate in incident response, ensuring timely issue resolution, root cause analysis, and post-mortem reports.
- ITSM Best Practices: Apply ITSM frameworks for incident management, change management, and problem management, along with the associated reporting, to maintain operational excellence.
- Monitoring & Automation: Develop and maintain automated tools and scripts for infrastructure monitoring, alerting, and incident response using Azure Monitor, Azure Log Analytics, Application Insights, and other monitoring tools like Prometheus or Grafana.
- Team Mentoring: Provide technical mentorship to junior engineers and operational teams, fostering a culture of continuous learning and improvement.
- Windows VM Operational Support: Basic knowledge of Windows VM management, including patching and backup activities.
- Operational Excellence: Manage the reliability and performance of mission-critical systems, ensuring high availability and optimal performance across services.
- Continuous Improvement: Identify and implement process improvements to enhance system reliability, incident response times, and operational efficiencies.
- Documentation: Maintain comprehensive documentation for system architecture, configurations, and troubleshooting guides.
Skills & Experience:
- Experience: 3-5 years of relevant, hands-on experience with the Azure cloud technologies listed below.
- Strong expertise in Azure Kubernetes Service (AKS) and other Azure services including PaaS, SaaS, and IaaS (e.g., Azure App Service, Azure SQL Database, Azure Storage, Azure Functions, and Azure Active Directory).
- Proficient with Terraform for infrastructure as code (IaC).
- Experience with Azure monitoring tools like Azure Monitor, Azure Log Analytics, and Application Insights.
- Solid understanding of cloud infrastructure, container orchestration, and microservices architectures.
- Experience in managing incidents and applying ITSM principles.
- Experience in determining what to monitor and continuously refining it.
- Ability to define responses to incidents, backed by deep troubleshooting experience.
- Hands-on experience with monitoring tools like Prometheus, Grafana, or similar. Experience setting up monitoring and observability implementations, including the parameters required by SRE team(s).
- Scripting and automation skills in Python, Bash, or PowerShell.
- Basic knowledge of supporting Azure Windows and Linux VMs, including system patching, backup, and basic troubleshooting.
- Customer Facing Experience: Strong communication skills with a proven ability to engage with customers, understand their requirements, and manage expectations.
- Problem-Solving: Ability to stay calm under pressure and apply critical thinking to resolve incidents and troubleshoot technical issues quickly.
- Team Mentorship: Prior experience in coaching and mentoring junior engineers and promoting a collaborative working environment.
- Flexibility: Ability to adapt to changing priorities, manage multiple tasks, and work in a fast-paced, dynamic environment.
Additional Skills:
- Excellent verbal and written communication skills.
- Strong interpersonal skills, with the ability to work effectively across teams and stakeholders and to build relationships with diverse peer and customer groups.
- Ability to handle high-pressure situations with calmness and composure. You should be comfortable asking questions in a customer-centric and cross-cultural environment.
- Strong organizational skills and attention to detail.
- An individual with both technical and operational readiness experience who can wear different hats as situations demand.
Education:
- Graduate with a degree in a technology-related field.