Infrastructure Engineer

  • Advanced Tech Placement
  • Roseland, New Jersey
  • Full Time

We are looking for an Infrastructure Engineer

We are seeking a highly skilled Infrastructure Engineer to help design, build, automate, and operate scalable, high-availability production infrastructure in a fast-paced enterprise technology environment. This individual will play a key role in driving reliability, automation, cloud infrastructure strategy, operational excellence, and AI-enabled engineering practices across mission-critical systems.

Responsibilities:

  • Design, build, automate, and support large-scale, highly available cloud infrastructure environments
  • Manage and optimize containerized production platforms and orchestration environments
  • Develop and maintain Infrastructure as Code (IaC) solutions using tools such as Terraform or Pulumi
  • Build automation tooling, operational utilities, and platform enhancements using Python or Go
  • Drive infrastructure reliability, scalability, observability, and resiliency initiatives
  • Partner closely with engineering, product, security, AI/ML, and platform teams to support enterprise-wide initiatives
  • Implement and maintain monitoring, logging, alerting, and performance management solutions
  • Troubleshoot complex production issues and proactively identify systemic risks or operational weaknesses
  • Lead infrastructure improvements with a focus on reversibility, risk mitigation, and minimizing production blast radius
  • Create operational standards, automation frameworks, and deployment strategies that improve engineering velocity and reliability
  • Support AI-driven infrastructure operations, intelligent automation initiatives, and AI-assisted engineering workflows
  • Evaluate and implement emerging AI-enabled operational tooling to improve efficiency, incident response, automation, and developer productivity
  • Collaborate with engineering teams supporting AI/ML workloads, data platforms, and model deployment pipelines
  • Own infrastructure initiatives end-to-end, including architecture, implementation, rollout, rollback planning, and operational support

Requirements:

  • 5 years of experience in Infrastructure Engineering, DevOps, Site Reliability Engineering, or similar roles supporting large-scale production environments
  • Hands-on experience operating containerized production environments and orchestration platforms in enterprise or high-growth environments
  • Strong experience with Kubernetes, Helm, and Infrastructure as Code tools such as Terraform or Pulumi
  • Experience supporting cloud infrastructure environments, preferably AWS
  • Proficiency in Python or Go for automation, tooling, and infrastructure development
  • Strong experience with monitoring, observability, and logging platforms such as Prometheus, Grafana, ELK, or equivalent technologies
  • Experience implementing resilient infrastructure designs focused on scalability, reliability, rollback strategies, and operational safety
  • Strong understanding of infrastructure tradeoffs involving reliability, cost optimization, deployment velocity, and operational risk
  • Demonstrated experience leveraging AI-assisted engineering tools and agentic AI workflows within day-to-day development and operational practices
  • Experience utilizing AI-enabled platforms such as Claude Code, Codex, GitHub Copilot, or similar tools to improve automation, troubleshooting, deployment efficiency, and operational workflows
  • Familiarity with infrastructure requirements supporting AI/ML environments, including compute scalability, data processing pipelines, model deployment, or GPU-enabled workloads is highly desirable

Required Skills:

  • Excellent communication and cross-functional collaboration skills
  • Strong analytical and problem-solving capabilities
  • Ability to challenge assumptions, identify operational gaps, and recommend innovative infrastructure solutions
  • Proven ownership mindset with experience leading infrastructure initiatives from concept through production deployment
  • Strong organizational skills with the ability to prioritize and execute in fast-paced environments
  • Passion for continuous improvement, emerging technologies, and modern AI-enabled operational practices

Preferred Skills:

  • Software engineering background with experience building and maintaining production-grade applications, services, libraries, or internal frameworks
  • Ability to read, troubleshoot, and modify application codebases supporting infrastructure platforms
  • Experience bridging infrastructure engineering and software development practices
  • Experience building reusable platform tooling, developer enablement frameworks, or internal infrastructure products
  • Experience supporting enterprise-scale cloud transformation or modernization initiatives
  • Exposure to MLOps, AI infrastructure, vector databases, model serving frameworks, or intelligent automation platforms
  • Experience supporting AI/ML engineering teams through scalable infrastructure and deployment automation

Job ID: 523507574
Originally Posted on: 6/3/2026
