Opportunity
General Fusion's research relies heavily on computer simulation to design and operate its experimental devices. The Computational Physics team is seeking a High-Performance Computing Administrator/Research Software Engineer to administer and maintain its internal computing cluster and to support the researchers who use it. The current system comprises 40 AMD EPYC 7452 processors with 5 TB of memory and 200 TB of storage, running CentOS 8.
Key Responsibilities:
- Act as a primary source of HPC expertise within General Fusion
- Manage hardware and interact with vendor support teams
- Develop customized software tools to enable physics simulation workflows
- Help maintain the software environment of the cluster
- Work collaboratively to help users optimize their codes for an HPC environment
- Help maintain General Fusion's rapidly evolving suite of physics codes
- Provide training and support to build expertise in the use of the cluster
- Maintain and update technical documentation
- Serve as an expert resource within the IT team for Linux-related issues
- Manage and advise on use of HPC cloud computing resources
Qualifications:
Required:
- Demonstrated experience programming in Fortran, C++, or Python
- Working knowledge of scripting languages such as Bash, Perl, and Python
- Excellent verbal and written communication skills; experience writing documentation
- Demonstrated application programming experience under Linux
- Demonstrated experience with cluster/MPI programming and/or other parallelization techniques
- Demonstrated knowledge across a broad range of HPC skills. Experience in a research environment is an asset
- Experience configuring and managing HPC workload management and scheduling software, preferably SLURM
Preferred:
- Proficiency in POSIX operating systems: Ubuntu, CentOS, OSX
- Good understanding of protocols such as NFS, CIFS, LDAP, DHCP, TFTP, and NTP
- Experience using and maintaining monitoring and alerting systems such as Ganglia and Nagios
- Experience installing, configuring and maintaining application tools like MariaDB, Apache/http, HDF5, netCDF
- Experience with container software like Docker or Singularity
- Experience with version control systems: Subversion, Git
- Experience with continuous integration and testing practices for research code
Education:
- Master's degree in a scientific discipline or equivalent experience
What We Offer:
- Flexible hours
- Four weeks' vacation
- Comprehensive benefits package
- RRSP Contribution
- Support for professional development
- Great company culture - social events, food trucks, bike rides, Sun Run, etc.
Applications
General Fusion is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, or age.