System Administrator, Data Centers

Ann Arbor, MI
Full Time
Technical Operations
Mid Level
Utilidata is a fast-growing NVIDIA-backed edge AI company enabling greater visibility and control of power utilization in energy-intensive infrastructure, like the electric grid and data centers. Karman, the company’s distributed AI platform powered by a custom NVIDIA module, is transforming the way utility companies operate the grid edge and will enable data centers to unlock more compute for the same provisioned power.

The System Administrator is responsible for the day-to-day operational support, maintenance, and reliability of Karman systems deployed in high-density data center environments. This role focuses on Linux systems administration, system health, uptime, and operational excellence. It is a hands-on position ensuring stable, secure, and optimized production systems, not an architecture or design role. This position is based onsite at our company headquarters in Ann Arbor, Michigan, with flexibility for occasional remote work. Candidates will be expected to collaborate cross-functionally with remote teams based across the country.

Responsibilities
  • Perform day-to-day Linux systems administration across production environments (RHEL, Ubuntu, CentOS, or similar)
  • Install, configure, patch, upgrade, and maintain Linux servers and associated infrastructure
  • Monitor system performance, availability, and resource utilization; proactively address issues before they affect production
  • Troubleshoot and resolve operating system, application, and infrastructure-related incidents
  • Execute established deployment procedures and configuration standards for Karman systems
  • Support rack-based systems, including hardware checks and coordination around PDUs and PSUs as needed
  • Maintain system security through patching, access control management, and adherence to best practices
  • Perform log analysis, root cause analysis, and incident documentation
  • Contribute to documentation of operational runbooks, troubleshooting guides, and system procedures
  • Support Tailscale network access, user accounts, and permission policies
  • Support the observability stack (Prometheus/Grafana), including dashboards, user access management, metrics collection, and alerting
  • Provide internal technical support to engineering teams, including general escalations and package install requests
  • Work with IT and security teams on vulnerability scanning, remediation timelines, and patch prioritization
  • Support change management processes, coordinating with IT on network, firewall, and access control changes affecting production environments
  • Participate in security reviews, audits, and tabletop exercises as needed
  • Coordinate with IT on asset lifecycle management, including provisioning, decommissioning, and inventory tracking for data center hardware
  • Coordinate with the SOC on security incident detection, triage, and response for Karman production systems
  • Participate in the on-call rotation to support system uptime and rapid incident response

Minimum Qualifications 
  • 5+ years of hands-on Linux systems administration experience in production environments
  • Strong experience with Linux installation, configuration management, patching, and performance tuning
  • Experience supporting systems in data center or high-availability environments
  • Solid understanding of system monitoring, log management, and troubleshooting methodologies
  • Experience working within established infrastructure standards and operational processes
  • Comfortable working in a fast-paced, operationally focused environment

Enhanced Qualifications (Nice to Have) 
  • Experience with configuration management or automation tools (Ansible, Puppet, Chef, or similar)
  • Experience with monitoring/observability tools (Prometheus, Grafana, ELK stack, or similar)
  • Familiarity with containerized environments (Docker, Kubernetes)
  • Exposure to high-performance computing, AI infrastructure, or GPU-based systems
  • Familiarity with security frameworks such as NIST, CIS benchmarks, or SOC 2 controls
  • Experience with SIEM platforms (Splunk, Sentinel, or similar)

Salary Range: $120,000 to $150,000 base compensation, depending on experience, plus stock options. Salary will be commensurate with an individual's skills, training, and years of experience, and in line with internal compensation bands.

Location: This position is based onsite at our company headquarters in Ann Arbor, Michigan, with flexibility for occasional remote work.

Our Commitments:
Utilidata values the diversity of our team. We provide equal employment opportunities without regard to race, color, religion, creed, sex, gender, sexual orientation, gender identity or expression, national origin, age, physical disability, mental disability, medical condition, pregnancy or childbirth, genetics, genetic information, marital status, status as a covered veteran, or any other basis protected by applicable federal, state, and local laws.

We are committed to:
  • Creating a diverse and inclusive workplace that is welcoming, supportive, affirming and respectful
  • Empowering employees to solve problems and work together to make a difference
  • Providing mentorship and growth opportunities as part of a collaborative team
  • Offering a flexible work environment with flexible paid time off
  • Offering competitive compensation and benefits, including health, dental, vision, and an employer-matched 401(k)

 