Job Search Results

Software Engineer, Inference - Multi Modal

with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems. Own problems end-to-end and are excited...About the Team OpenAI's Inference team powers the deployment of our most advanced models - including our GPT models...

Company: OpenAI

Location: San Francisco, CA

Posted Date: 23 May 2025

Software Engineer, Inference - TL

across research, infra, and product teams. Mentor engineers on GPU performance, CUDA development, and distributed inference... across productivity, creativity, and more. We focus on high-performance model inference and accelerating research through efficient...

Company: OpenAI

Location: San Francisco, CA

Posted Date: 10 May 2025

Software Engineer, Machine Learning

“Apply to Job” online on this web page. Software Engineer, Machine Learning Responsibilities Innovate and implement cutting-edge deep... development, responsiveness, and recommendation quality. Leveraging LLM and other state-of-the-art deep learning techniques...

Company: Meta

Location: San Francisco, CA

Posted Date: 04 Jun 2025

Staff Software Engineer, ML Serving Platform

a Staff Software Engineer to join our ML Serving team and spearhead our technical strategy on our ML inference engine. The ML... Serving team constructs large-scale online systems and tools for model inference, deployment, monitoring, and feature fetching...

Company: Pinterest

Location: San Francisco, CA

Posted Date: 25 May 2025

Machine Learning Engineer, 6+ Years Experience

Optimizing model inference (TensorRT, ONNX, Triton Inference Server) Building Kubernetes-based systems for distributed data/ML... underscores our commitment to driving worldwide innovation. About The Role As a Machine Learning Engineer at TwelveLabs...

Company: Twelve Labs

Location: San Francisco, CA

Posted Date: 18 May 2025

ML Engineer

. We are looking for an experienced full stack ML engineer with demonstrated industry experience in productionizing large scale ML models in industrial..., analysis and serving systems for features required across our Cody LLM stack Be contributing actively to the world...

Company: Sourcegraph

Location: San Francisco, CA

Posted Date: 10 Apr 2025

Software Engineer, ML Infrastructure - Training Platform

building machine learning training pipelines or inference services in a production setting. Experience with distributed...: Experience with LLM inference latency optimization techniques, e.g. kernel fusion, quantization, dynamic batching...

Company: Scale AI

Location: New York City, NY - San Francisco, CA

Posted Date: 02 Apr 2025

AI Engineer

://www.recruitingfromscratch.com/ AI Engineer Location: Palo Alto or Remote Company Stage of Funding: Seed Office... into actionable solutions that deliver measurable business value Design and implement LLM-powered products, including prompting...

Company: Recruiting From Scratch

Location: San Francisco, CA

Posted Date: 21 Mar 2025

Keywords: Distributed LLM Inference Engineer, Location: San Francisco, CA
