Job Summary:
Squarepoint is looking for a talented and highly motivated Ultra Low Latency Platform Engineer to provide solutions across Squarepoint’s global colocation (COLO) estate, which consists of 400+ servers across 30 sites. The candidate will be responsible for project delivery, support escalations, monitoring, automation, security, documentation, and capacity management for Squarepoint’s low-latency infrastructure. This will involve collaborating with our business partners, application owners, clients, vendors, and internal teams (SRE, Network, Application Support, Application Development, Quants, etc.) to deliver end-to-end solutions in a timely manner.
- Manage systems efficiently at scale through standardization, automation, testing, and in-depth monitoring
- Enforce development standards for source control, testing, and continuous integration for infrastructure, OS, patches, and configuration management
- Manage a distributed compute environment and multiple petabyte-scale storage systems
- Install, manage, and monitor the Linux operating system (RHEL-based)
- Troubleshoot complex hardware and software issues throughout the Squarepoint technology stack
- Create self-healing systems and automated recovery processes
- Respond to system incidents and participate in on-call rotations
- Conduct root cause analysis of incidents and outages
- Reduce operational toil through the development of user-driven automated workflows
- Work with business owners to regularly re-prioritize the book of work, while delivering both tactical and long-term objectives
Required Qualifications:
- 5+ years of experience working with Linux (RHEL/CentOS/Rocky preferred) in a large, complex, or niche environment, with a focus on operations, systems engineering, and systems performance.
- Server management and support: HP, Supermicro, Dell, and various overclocked servers.
- Experience with low-latency network interfaces and kernel bypass (configuration and optimization): Solarflare with Onload, Mellanox with VMA.
- Experience with build and configuration management tools, specifically Chef or Ansible.
- Experience with observability tools, specifically Grafana and Prometheus.
- Highly motivated, with a keen eye for scripting and automation in Python, Ruby, and Bash.
- In-depth knowledge of server network stack configuration, tuning, and troubleshooting, including TCP, UDP (unicast/multicast), NTP, and PTP, and of tools such as Wireshark/tshark.
- Critical thinking and problem-solving skills for troubleshooting glitches and unknown or obscure issues.
- Good understanding of trading venues such as Nasdaq, LSE, and Euronext.
- Degree in Engineering or Computer Science, or related experience.
The minimum base salary for this role is $80,000 if located in New York. This expectation is based on available information at the time of posting. This role may be eligible for discretionary bonuses, which could constitute a significant portion of total compensation. This role may also be eligible for benefits, such as health, dental, and other wellness plans, as well as 401(k) contributions. Successful candidates’ compensation and benefits will be determined in consideration of various factors.