We are looking for a technically strong Hazelcast DevOps & Distributed Cache Engineer to provide operational, technical, and production support for enterprise distributed caching environments. The ideal candidate has deep expertise in Hazelcast, Java backend systems, Kubernetes operations, and production troubleshooting on mission-critical enterprise platforms.
This role requires a hybrid profile combining Java backend engineering, distributed cache troubleshooting, DevOps operations, Kubernetes support, performance analysis, and production incident management.
Key Responsibilities
Hazelcast Platform Support
- Provide L2/L3 support for Hazelcast clusters and related application integrations.
- Monitor cluster health, member status, partition distribution, memory utilization, latency, and throughput.
- Configure and support distributed maps, Near Cache, eviction policies, TTL, replication, serialization, and cluster discovery.
- Troubleshoot cache-related incidents including high latency, split-brain scenarios, node restarts, data inconsistency, memory pressure, and degraded performance.
- Support capacity planning, performance tuning, and operational improvements.
- Coordinate with vendor support teams for patches, upgrades, and product-level escalations.
Java / Application Support
- Analyze Java application behavior related to Hazelcast integration.
- Troubleshoot JVM-level issues including heap usage, garbage collection, thread dumps, memory leaks, and serialization overhead.
- Work with application teams to identify inefficient cache access patterns and performance bottlenecks.
- Support Spring Boot and Java microservices integrated with distributed cache platforms.
- Review and validate application-side configurations and integration patterns.
DevOps & Kubernetes Operations
- Support Hazelcast deployments running on Kubernetes or containerized environments.
- Work with Kubernetes objects including Pods, Services, Namespaces, Deployments, StatefulSets, ConfigMaps, and Secrets.
- Troubleshoot pod restarts, resource utilization, liveness/readiness failures, and container logs.
- Support CI/CD deployment activities and configuration management.
- Assist with TLS/mTLS certificate-related troubleshooting.
- Coordinate with infrastructure and platform teams on networking, DNS, storage, compute, and security-related issues.
Monitoring & Incident Management
- Monitor platform and application metrics using AppDynamics, Splunk, Prometheus, Grafana, ELK, or similar tools.
- Participate in production incident management, troubleshooting calls, war-room support, and issue triage.
- Prepare Root Cause Analysis (RCA) reports for production incidents.
- Recommend operational improvements, automation opportunities, and preventive measures.
- Maintain runbooks, SOPs, support documentation, and knowledge base articles.
Required Skills
Mandatory Skills
- Strong hands-on experience with Hazelcast.
- Strong experience in Java backend development or platform support.
- Good understanding of JVM internals, memory management, garbage collection, heap analysis, and thread dumps.
- Experience with distributed caching and in-memory data grid platforms.
- Hands-on experience with Kubernetes, containers, Linux, and basic networking.
- Experience supporting applications in production or enterprise environments.
- Strong troubleshooting, analytical, and communication skills.
Preferred Skills
- Spring Boot and microservices architecture.
- Redis or Apache Ignite exposure.
- CI/CD tools such as Jenkins, GitLab CI, Azure DevOps, or similar.
- OpenShift, Anthos, or enterprise Kubernetes platforms.
- AppDynamics, Splunk, Prometheus, Grafana, ELK, or similar observability tools.
- Knowledge of TLS/mTLS, certificates, and secure service communication.
- Experience in Banking, Telecom, or other mission-critical enterprise environments.
Experience & Qualification
- Bachelor’s Degree in Computer Science, Information Technology, Software Engineering, or equivalent experience.
- 5+ years of experience in Java backend development, platform engineering, or production support.
- 2+ years of experience with Hazelcast, distributed systems, Kubernetes, or DevOps operations.
- Prior experience supporting enterprise production environments is preferred.
To apply for this job, email your details to rajat.tyagi@praxisconsultants.in