Keywords: Distributed LLM Inference Engineer, Location: San Francisco, CA

Software Engineer, Inference - Multi Modal

with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems. Own problems end-to-end and are excited... About the Team OpenAI's Inference team powers the deployment of our most advanced models - including our GPT models...

Company: OpenAI
Location: San Francisco, CA
Posted Date: 23 May 2025

Software Engineer, Inference - TL

across research, infra, and product teams. Mentor engineers on GPU performance, CUDA development, and distributed inference... across productivity, creativity, and more. We focus on high-performance model inference and accelerating research through efficient...

Company: OpenAI
Location: San Francisco, CA
Posted Date: 10 May 2025

Software Engineer, Machine Learning

Software Engineer, Machine Learning Responsibilities Innovate and implement cutting-edge deep... development, responsiveness, and recommendation quality. Leveraging LLM and other state-of-the-art deep learning techniques...

Company: Meta
Location: San Francisco, CA
Posted Date: 04 Jun 2025

Staff Software Engineer, ML Serving Platform

a Staff Software Engineer to join our ML Serving team and spearhead our technical strategy on our ML inference engine. The ML... Serving team constructs large-scale online systems and tools for model inference, deployment, monitoring, and feature fetching...

Company: Pinterest
Location: San Francisco, CA
Posted Date: 25 May 2025

Machine Learning Engineer, 6+ Years Experience

Optimizing model inference (TensorRT, ONNX, Triton Inference Server) Building Kubernetes-based systems for distributed data/ML... underscores our commitment to driving worldwide innovation. About The Role As a Machine Learning Engineer at TwelveLabs...

Company: Twelve Labs
Location: San Francisco, CA
Posted Date: 18 May 2025

ML Engineer

We are looking for an experienced full stack ML engineer with demonstrated industry experience in productionizing large scale ML models in industrial..., analysis and serving systems for features required across our Cody LLM stack Be contributing actively to the world...

Company: Sourcegraph
Location: San Francisco, CA
Posted Date: 10 Apr 2025

Software Engineer, ML Infrastructure - Training Platform

building machine learning training pipelines or inference services in a production setting. Experience with distributed...: Experience with LLM inference latency optimization techniques, e.g. kernel fusion, quantization, dynamic batching...

Posted Date: 02 Apr 2025

AI Engineer

://www.recruitingfromscratch.com/ AI Engineer Location: Palo Alto or Remote Company Stage of Funding: Seed Office... into actionable solutions that deliver measurable business value Design and implement LLM-powered products, including prompting...

Posted Date: 21 Mar 2025