Employment Opportunities


High Performance Computing Systems Administrator (Operating Systems Programmer/Analyst 2 or 3), UCP 7 or 9

Posted on August 21, 2017

Under the general supervision of the Team Lead for Research Technology, these positions are responsible for the day-to-day operations of the university’s large-scale research computing platform on the Storrs campus.

Incumbents in these positions are considered either technical specialists or technical experts, depending on their level, and are expected to have expertise with large-scale integrated computing systems. Individuals are expected to apply a wide range of problem-solving and resource management techniques to their work, which is of moderate to complex difficulty. Incumbents are expected to carry out projects of moderate to large size and complexity with minimal supervision.

The HPC cluster at UITS serves as the catalyst for research in computationally intensive disciplines, in alignment with the university’s ambitious initiatives: UConn Technology Park, Next Generation Connecticut, and Bioscience Connecticut. The cluster has over 6,000 CPU cores, a one-petabyte parallel file system (GPFS), and an InfiniBand interconnect.

PRIMARY RESPONSIBILITIES

• Install, configure, and maintain Linux operating systems (RHEL v6/v7).

• Perform operating system upgrades, patching, troubleshooting, and performance tuning.

• Manage the lifecycle of Linux-based scientific applications, including compiling from source and troubleshooting.

• Manage the HPC scheduling software layer (SLURM) and related tools.

• Install and manage configuration, monitoring, and notification tools.

• Administer an InfiniBand fabric and perform basic Ethernet network administration.

• Perform hardware maintenance and troubleshooting.

• Interact with university researchers on various topics, including the use of existing services, service policies, and research requirements.

• Provide excellent technical support and training to a diverse user base.

• Create and maintain clear and effective technical documentation.

• Interact with vendors, assessing products and making purchasing recommendations.

• May supervise students or employees.

• Perform related duties as required.

MINIMUM QUALIFICATIONS

1. Bachelor’s degree in Computer Science, Computer Engineering, or closely related field; or equivalent combination of training and experience and 2 or more years’ experience in a large-scale computing environment.

2. Two or more years of experience in the hands-on management and troubleshooting of Linux systems (RHEL, CentOS, etc.).

3. Demonstrated experience in software installation, compilation (GCC, Intel ICS, etc.), and troubleshooting in a Linux environment.

4. Demonstrated experience using scripting/programming (Bash, Python, etc.) in support of systems operations.

5. Demonstrated commitment to providing excellent technical support to a diverse user base.

6. Good organizational skills and attention to detail, along with good written and oral communication skills.

7. The ability to work with minimal supervision.

8. The ability to work effectively with staff and users at all levels, with vendors and other technical staff, and as a member of a team.

9. Demonstrated ability to meet deadlines and work under pressure.

10. Demonstrated ability to lead projects.

PREFERRED QUALIFICATIONS

1. Experience installing and troubleshooting enterprise hardware platforms, such as servers and storage.

2. Experience with HPC infrastructure components such as job schedulers (SLURM, LSF, etc.) and environment management tools (Modules, Lmod, etc.).

3. Experience managing large-scale data storage systems, preferably parallel file systems such as GPFS or Lustre.

4. Experience in systems automation (DevOps) using tools such as Ansible, Puppet, Chef, etc.

5. Experience maintaining and troubleshooting an InfiniBand fabric.

6. Certifications relevant to this position.

