Location: California
Category: Engineering
Domain: Software
Experience Level: Senior Level
Compensation: $160,200 -- $322,300
Posted 8 months ago
Job Description
Join our team as a Sr. Engineering Manager for our AI Inference Platform and seize the extraordinary opportunity to embrace the world of GenAI with Firefly!
We're looking for a seasoned AI infrastructure leader to head the development of our AI Inference Platform! You will own the architecture, design, development, and testing of the platform. The team's primary goal is to enable the Firefly Product Team to easily run and deploy ML capabilities used by Adobe client applications.
In addition to the Firefly Team, Adobe Research and other App Teams will deploy thousands of models on this platform. Models will come from a variety of lifecycle stages (early research, development, productization, optimization, etc.).
We're a close-knit team dedicated to creating a dynamic platform that offers ML model serving at scale, with high cost efficiency, on a wide variety of hardware platforms, and across multiple clouds. If you enjoy a variety of exciting tasks where it's easy to draw a line from your efforts to real accomplishments, come talk to us!
What You'll Do
Partner with the Applied Research and Firefly App Integrations teams to understand their needs and goals, and use that knowledge to drive the development and evolution of the AI Inference Platform.
Identify and implement standard processes and open-source or public cloud solutions to increase the efficiency and scalability of the platform.
Ensure the inference service maximizes GPU utilization, scales models independently, and optimizes COGS.
Drive the engineering aspects of the AI Inference Platform, evolving the platform as needed.
Work closely with other teams across Adobe that need short-term training and inference.
Impact the organization through contributions to technical direction and strategic decisions.
We Expect You to Have
A proven understanding of AI/ML, including ML frameworks, public cloud, and commercial AI/ML solutions; familiarity with PyTorch, SageMaker, and Hugging Face is required.
Experience building and scaling distributed systems, along with experience in containerization and orchestration technologies (e.g., Kubernetes, EKS).
Strong communication and collaboration skills, with a record of building lasting relationships with internal customers and external partners.
A track record of attracting top talent and leading high-performing teams to deliver results in a fast-paced, dynamic AI infrastructure environment.
Demonstrated analytical and problem-solving skills, with the ability to think strategically and make data-driven decisions.
A passion for staying up to date with the latest trends and technologies in AI/ML, both in the cloud and on device; experience with ONNX, NVIDIA TensorRT, OpenAI Triton, Meta AITemplate, CoreML, or WinML is a plus.
A Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field, or demonstrated equivalent experience.