System Design Jobs
Large-scale AI system design and architecture roles.
50 open positions
Customer Success Manager 3
Behavox
Customer Success Manager 3 role at Behavox focused on ensuring zero churn and driving ROI for enterprise financial services clients using AI/ML-powered data solutions. The position requires managing post-implementation customer success, measuring program value, guiding best practices, and expanding accounts through upsells with a deep understanding of enterprise risk and compliance domains.
Technical Sourcing Manager
Graphcore
Technical Sourcing Manager role at Graphcore, a leading AI compute hardware company, focused on managing procurement and supply chain for specialized AI hardware manufacturing. The position involves sourcing components, vendor management, and cost optimization within a distributed team across Bristol and Taiwan.
Storage Architect
Graphcore
Graphcore seeks a Storage Architect to design and optimize high-performance storage systems for AI data centers, specializing in NVMe SSDs, PCIe topologies, and Linux kernel tuning. The role focuses on eliminating I/O bottlenecks for GPU training and inference workloads, including GPU-direct storage optimization and telemetry-driven system design. This is a critical infrastructure position requiring deep expertise in storage hardware and software optimization at massive scale.
Staff UEFI Engineer
Graphcore
Graphcore seeks an experienced Staff UEFI Engineer to design and deploy firmware for AI server platforms, focusing on system initialization and hardware configuration in large-scale data center environments. This role combines deep firmware expertise with infrastructure engineering to support next-generation AI compute hardware.
Staff System Software Engineer
Graphcore
Graphcore seeks a Staff System Software Engineer to design, implement, and test low-level kernel drivers and user-space driver libraries as part of their system software team. This role focuses on building the critical driver infrastructure for Graphcore's AI compute hardware stack. The position requires deep expertise in systems programming and hardware-software integration to support their datacenter-scale AI compute platform.
Staff SoC Architect
Graphcore
Staff-level SoC Architect role at Graphcore responsible for designing and specifying sub-systems within high-performance AI acceleration silicon devices. The position requires deep expertise in silicon architecture, hardware integration, and cross-team collaboration with silicon design, verification, and software teams to deliver next-generation AI compute solutions.
Staff Silicon Verification Engineer - Bengaluru, Multiple Vacancies
Graphcore
Graphcore seeks experienced Silicon Verification Engineers for their new Bengaluru AI Engineering Campus to ensure RTL and silicon designs meet architectural specifications. The role involves verification planning, functional coverage closure, and quality assurance for advanced AI computing semiconductors. This is a staff-level position requiring deep expertise in silicon verification and hardware design methodologies.
Staff Post Silicon Validation Engineer (Bringup)
Graphcore
Staff-level post-silicon validation engineer role focused on bringing up and validating cutting-edge AI chips at Graphcore's new Austin campus. Responsibilities include leading first silicon bringup, functional validation, collaborating with cross-functional teams, and architecting test infrastructure improvements while mentoring other engineers.
Staff Microcontroller Firmware Developer
Graphcore
Graphcore seeks a Staff Microcontroller Firmware Developer to design and implement firmware for microcontroller-based management systems in AI server and rack-scale platforms. The role focuses on Zephyr RTOS-based firmware development and low-level device drivers for real-time embedded systems supporting hyperscale data center infrastructure. This is a senior-level engineering position requiring deep expertise in embedded firmware development and hardware-software integration.
Staff Machine Learning Engineer (Large Systems)
Graphcore
This is a Staff Machine Learning Engineer role at Graphcore focused on developing and optimizing AI models for specialized hardware at scale. The position involves implementing cutting-edge ML models, optimizing performance across thousands of accelerators, and collaborating with software and research teams to advance AI compute technology. Candidates should have strong expertise in distributed training, model optimization, and large-scale system implementation.
Staff Firmware Validation Engineer
Graphcore
Graphcore seeks a Staff Firmware Validation Engineer to ensure quality and reliability of firmware across ARM-based AI server platforms, including SoC firmware (UEFI), OpenBMC, and rack management systems. This role is critical for validating the infrastructure powering hyperscale AI deployments and requires deep expertise in firmware testing and system-level validation.
Staff Embedded SW/FW Engineer (Bringup) - Bengaluru
Graphcore
Staff-level embedded systems engineer role focused on bringing up and validating cutting-edge AI accelerator chips at Graphcore. Responsibilities include developing C/C++ bringup sequences, post-silicon validation, test infrastructure, and mentoring junior engineers in a collaborative hardware validation team.
Staff Bring-Up and Characterisation Engineer
Graphcore
Graphcore seeks a Staff Bring-Up and Characterisation Engineer to lead the bring-up and validation of cutting-edge AI silicon processors and high-performance blade systems at their new Austin AI Engineering Campus. This role requires deep expertise in hardware characterization, silicon validation, and cross-functional coordination with architecture, silicon engineering, and product teams. The position offers a competitive salary of $198,100-$268,000 plus phantom equity, targeting an experienced engineer capable of managing complex hardware platform launches.
Software Infrastructure Kubernetes Engineer
Graphcore
Graphcore seeks a Kubernetes-focused Software Infrastructure engineer to develop and manage critical platforms supporting their AI compute teams. The role involves building CI/CD pipelines, deployment systems, and infrastructure services for machine learning software components on HPC platforms. You'll work in cross-functional squads to eliminate operational toil and deliver long-term engineering solutions.
Software Engineer in Build Engineering
Graphcore
Join Graphcore's new Build Engineering team to develop critical infrastructure tools that power their ML software stack. You'll optimize build, test, and deployment processes for high-performance AI platforms while working with distributed systems and collaborating closely with QA and development teams.
Software Engineer
Graphcore
Graphcore seeks a Software Engineer to develop their Collectives Communication Library for AI hardware accelerators, focusing on high-bandwidth, low-latency distributed computing primitives. The role involves designing and implementing complex software systems that integrate custom hardware with existing AI ecosystems, requiring strong systems programming and distributed systems expertise.
Signal Integrity Engineer
Graphcore
This is a hardware signal integrity engineering role focused on designing and validating high-speed PCB and interconnect systems for AI compute hardware at Graphcore. The position requires expertise in simulating and optimizing SerDes (112 Gbps) and DDR interfaces, collaborating with CAD and architecture teams to ensure optimal system performance.
Senior Technical Program Manager
Graphcore
Senior Technical Program Manager role at Graphcore focused on bridging technical and management domains for AI compute infrastructure. Responsibilities include liaising between technical teams, managing project schedules, and ensuring delivery of integrated hardware-software systems across workload management, systems management, and observability platforms. Requires deep technical expertise combined with program management excellence to coordinate multi-functional teams in a complex semiconductor and AI infrastructure context.
Senior Staff Engineer - Telemetry
Graphcore
Senior Staff Engineer role focused on designing and deploying scalable management and observability solutions for AI infrastructure at Graphcore. The position requires architecting monitoring systems, establishing infrastructure controls, and creating reference designs that bridge internal engineering efforts with customer deployments.
Senior Software Engineer
Graphcore
Senior Software Engineer role at Graphcore focused on designing and developing a large-scale collective communication simulator for new AI hardware. Requires expertise in complex software systems, hardware integration, and distributed computing with leadership responsibilities for mentoring junior engineers and driving technical excellence.
Senior Silicon Verification Engineer
Graphcore
Senior Silicon Verification Engineer role at Graphcore responsible for verifying RTL implementations against architectural specifications for AI computing hardware. The position involves verification planning, functional coverage closure, and cross-functional collaboration with design and architecture teams to ensure high-quality silicon delivery.
Senior Quality Assurance Engineer - Workload Management
Graphcore
Senior QA Engineer role focused on validating Kubernetes integrations for next-generation AI accelerator hardware at Graphcore. The position involves testing workload scheduling, orchestration, and resource utilization across distributed computing environments within the SoftBank AI ecosystem. This is a critical infrastructure testing role bridging hardware validation with cloud-native deployment systems.
Senior Machine Learning Engineer (Large Systems)
Graphcore
This role focuses on developing and optimizing AI models for Graphcore's specialized hardware at large scale, working across distributed systems spanning thousands of accelerators. The engineer will collaborate with software and research teams to implement cutting-edge models, benchmark performance, optimize kernels, and contribute to reference applications that showcase the hardware's capabilities. Success requires deep expertise in machine learning implementation, system-level optimization, and the ability to identify and resolve performance bottlenecks in production-scale AI systems.
Senior Machine Learning Engineer (Large Systems)
Graphcore
Senior ML Engineer role focused on developing and optimizing AI models for specialized hardware at massive scale (thousands of accelerators). The position requires expertise in model implementation, performance optimization, and distributed systems while working closely with software and research teams to advance Graphcore's AI compute technology.
Senior Kubernetes Software Engineer (Go)
Graphcore
Graphcore seeks a Senior Kubernetes Software Engineer to develop Go-based services that integrate AI accelerator hardware into Kubernetes clusters. This role focuses on building production-grade software components at the intersection of hardware, software, and cloud platforms, including CRDs, operators, and cloud-native infrastructure solutions.
Senior Embedded SW/FW Engineer (Bringup)
Graphcore
Senior embedded software/firmware engineer role focused on post-silicon validation and bringup of advanced AI chips at Graphcore's Austin campus. Responsibilities include bringing first silicon to life, functional validation, test infrastructure architecture, and technical leadership of validation teams supporting new product introductions.
Senior Bring-Up and Characterisation Engineer - Bengaluru, Multiple Vacancies
Graphcore
Graphcore seeks a Senior Bring-Up and Characterisation Engineer to design and execute comprehensive testing and validation plans for cutting-edge AI processor silicon and system platforms. The role involves automated and manual lab testing, datacentre validation, and cross-team collaboration to ensure silicon devices operate correctly across all conditions before product release.
Senior Bring-Up and Characterisation Engineer
Graphcore
Senior hardware engineer role at Graphcore focused on bringing up and characterizing cutting-edge AI semiconductor devices and system platforms. Requires expertise in silicon validation, device testing, and collaboration across architecture, silicon, hardware, and product teams to ensure successful deployment of AI computing hardware.
Research Scientist (Embodied AI & World Models)
Graphcore
Graphcore seeks an experienced Research Scientist to advance embodied AI and world models for edge/low-power scenarios including robotics and autonomous driving. The role involves developing hardware-aware AI algorithms and deploying multimodal models while contributing to fundamental and applied research published at top-tier ML conferences. You'll join a collaborative research team across UK locations working on efficient compute, model scaling, and next-generation AI architectures.
Research Engineer
Graphcore
Graphcore seeks a Research Engineer to advance AI compute through hardware-aware algorithms and implementations. The role combines machine learning expertise with strong software engineering and performance optimization skills to deliver impactful research across efficient training/inference, world models, and reinforcement learning. You'll collaborate with researchers to translate ideas into scalable implementations and contribute to publications at leading AI conferences.
Quality Assurance Engineer - Workload Management
Graphcore
This QA Engineer role focuses on validating Kubernetes integrations for next-generation AI accelerator hardware within the Workload Management team at Graphcore. The position involves testing and ensuring efficient scheduling, orchestration, and utilization of AI accelerators across distributed computing environments. This is a critical infrastructure role bridging AI hardware and cloud-native orchestration platforms.
PyTorch Engineer
Graphcore
Graphcore seeks a PyTorch Engineer to design and optimize software for machine learning accelerators within their frameworks team. The role involves implementing new features, optimizing performance, maintaining codebases, and collaborating across engineering teams while contributing to PyTorch and Triton integration. Success requires strong software engineering fundamentals, deep PyTorch expertise, and the ability to balance code quality with business delivery.
Principal Software Architect
Graphcore
Graphcore seeks a Principal Software Architect to define and drive the architectural vision of their ML accelerator's software stack, spanning firmware to ML frameworks. The role involves designing coherent end-to-end architecture, communicating complex technical vision across engineering teams, and maintaining architectural integrity as the product evolves. This is a high-level technical leadership position requiring deep expertise in system design and hardware-software interactions.
Principal Security Firmware Engineer
Graphcore
Graphcore seeks a Principal Security Firmware Engineer to design and validate security mechanisms for their AI compute hardware platforms. The role focuses on secure firmware architecture, trusted boot systems, and secure update frameworks while collaborating with hardware and security teams. This is a senior leadership position requiring deep expertise in embedded systems security and low-level firmware development.
Principal Power Engineer
Graphcore
Graphcore seeks a Principal Power Engineer to architect power delivery systems from grid to chip for their AI computing platforms. The role requires deep expertise in power electronics and infrastructure design, collaborating across hardware, firmware, and data center operations teams. This is a strategic leadership position focused on ensuring performance, reliability, and scalability for next-generation AI computing systems.
Principal Power Engineer
Graphcore
Principal Power Engineer role at Graphcore focused on architecting power delivery systems from grid to chip for AI computing hardware. Requires deep expertise in power distribution, semiconductor systems, and data center infrastructure with cross-functional leadership across hardware, firmware, and operations teams. Based in Austin or San Jose with flexible remote work options, offering competitive compensation with phantom equity.
Principal Hardware Diagnostics Engineer
Graphcore
Graphcore seeks a Principal Hardware Diagnostics Engineer to design and develop advanced diagnostics software for monitoring hardware health and diagnosing system-level issues across their AI infrastructure platforms. The role involves building diagnostics agents, tools, and analytics frameworks to help engineers and automation systems identify, isolate, and resolve hardware issues at blade-level and rack-scale cluster levels.
Principal Hardware Design Engineer
Graphcore
Graphcore seeks a Principal Hardware Design Engineer to lead the design and development of advanced AI/ML hardware systems and compute platforms. This role requires deep expertise in electrical engineering, PCB design, and system-level architecture, with responsibility for collaborating across multiple technical disciplines including power, thermal, and firmware teams. The successful candidate will serve as a technical leader driving design excellence for next-generation AI infrastructure.
Principal Embedded SW/FW Engineer (Bringup) - Bengaluru, multiple vacancies
Graphcore
This Principal-level embedded systems role focuses on post-silicon validation and bringup of AI chips, requiring deep expertise in bare metal C/C++ development, FPGA/emulator testing, and ML workload understanding. The engineer will lead cross-functional validation efforts, architect test infrastructure, and mentor junior team members while working on high-performance AI chip products from first silicon through full characterization.
Hardware Validation Manager
Graphcore
This role leads hardware validation efforts for AI compute platforms at Graphcore, requiring deep technical expertise in hardware testing, electrical and functional validation, and reliability engineering. The successful candidate will manage a validation team, define validation strategies, and ensure products meet quality targets from prototype through production release.
Hardware Development Engineer
Graphcore
Hardware Development Engineer responsible for designing and validating circuit solutions for AI accelerator systems at Graphcore. The role involves analyzing system requirements, generating schematics, collaborating with PCB CAD teams, and executing comprehensive test campaigns to optimize processor-based hardware performance.
Graduate Electrical Engineer
Graphcore
Graphcore seeks a recent graduate in Electrical or Computer Engineering to join their Server Design team in Austin, developing next-generation AI infrastructure for hyperscale data centers. The role involves designing cutting-edge AI servers and power distribution systems while collaborating with experienced engineers across multiple disciplines to deliver high-performance, power-efficient solutions for modern AI workloads.
Global Sourcing Manager - Power Equipment
Graphcore
Strategic procurement role at Graphcore focused on sourcing power infrastructure components for AI compute platforms. The position requires expertise in global sourcing, supplier negotiations, supply chain resilience, and cross-functional stakeholder management rather than AI domain knowledge.
Distinguished Engineer - Inference Serving Network and Storage
Graphcore
Distinguished Engineer role leading end-to-end networking and storage architecture for large-scale AI inference serving systems at Graphcore. Responsible for defining serving fabric design, KV cache management, storage strategies, and driving cross-functional technical decisions that impact product differentiation and competitive advantage. Chief technologist position requiring expert-level technical leadership, strategic thinking, and influence across organizational boundaries.
Director, Silicon Logical Design
Graphcore
Director-level leadership position at Graphcore responsible for overseeing the Logical Design group's microarchitecture and RTL design strategy for advanced AI compute chips. The role requires guiding a multi-site team, ensuring world-class standards in performance/power/area/schedule, and driving cross-functional collaboration with Architecture, Physical Design, Verification, and Program Management teams to deliver silicon aligned with product roadmap.
DFT Engineer
Graphcore
Graphcore seeks a DFT Engineer to join their AI compute semiconductor team, responsible for implementing Design for Testability strategies and ensuring product quality through robust testing methodologies. The role requires strong expertise in DFT, RTL design, and hands-on experience with synthesis and timing analysis to support their advanced AI hardware platform.
BMC Engineer
Graphcore
This role focuses on developing and maintaining OpenBMC software stacks for AI server management hardware, including kernel drivers for ASPEED devices and Redfish API enhancements. The engineer will work cross-functionally with firmware, hardware, and business teams to deliver enterprise-grade baseboard and rack management solutions for Graphcore's AI compute infrastructure. This is a systems-level engineering position critical to the deployment of large-scale AI datacenter environments.
Asset & Inventory Operations Coordinator
Graphcore
Graphcore seeks an Asset & Inventory Operations Coordinator to manage end-to-end material lifecycle for R&D infrastructure, including GPU components, custom accelerators, and data center equipment across global lab networks. The role involves building frameworks, tools, and workflows to ensure tracking, delivery, compliance, and operational excellence for engineering labs and infrastructure systems.
Asset & Inventory Operations Coordinator
Graphcore
Graphcore seeks an Asset & Inventory Operations Coordinator to manage end-to-end material lifecycle for R&D infrastructure, supporting engineering labs and data centers with component tracking and delivery. The role involves building frameworks, tools, and workflows for asset management, compliance, and operational excellence across a global lab network. This is a critical operational position supporting advanced AI computing hardware infrastructure at a SoftBank Group company.
AI SoC Validation / Bring-up Lead
Graphcore
Lead the validation and bring-up of Graphcore's AI System-on-Chip (SoC) hardware, ensuring performance and reliability. This role combines deep hardware expertise with AI systems knowledge to debug, test, and optimize next-generation AI accelerator chips at a world-leading AI compute company.