Research Scientist - Applied AI/LLM

Databricks

Location: Bellevue, WA

Category: Research

Domain: AI

Experience Level: Mid Level

Compensation: $142,500–$180,500

Posted 12 months ago

Job Description

You’ll work with teams across Databricks to conduct foundational research into the feasibility and effectiveness of solutions that help customers analyze data using natural language, and then bring those solutions into our products to make data analysis easier and more approachable for all of our customers. More broadly, our teams work on some of the hardest, most interesting problems facing the business, ranging from designing large-scale distributed AI/ML systems, to optimizing distributed GPU model serving, to developing novel modeling methodologies that scale to production use cases.

The impact you will have:

Shape the direction of our applied ML areas and intelligence features in our products, helping customers translate unstructured text into structured code, queries and data.

Drive the development and deployment of state-of-the-art AI models and systems that directly impact the capabilities and performance of Databricks' products and services.

Architect and implement robust, scalable ML infrastructure, including data storage, processing, and model serving components, to support seamless integration of AI/ML models into production environments.

Develop novel data collection, fine-tuning, and pre-training strategies that achieve optimal performance on specific tasks and domains.

Design and implement automated ML pipelines for data preprocessing, feature engineering, model training, hyperparameter tuning, and model evaluation, enabling rapid experimentation and iteration.

Implement advanced model compression and optimization techniques to reduce the resource footprint of language models while preserving their performance.

Contribute to the broader AI community by publishing research, presenting at conferences, and actively participating in open-source projects, enhancing Databricks' reputation as an industry leader.

What we look for:

PhD in Computer Science or a related field strongly preferred, or equivalent practical experience.

2+ years of machine learning engineering experience in high-velocity, high-growth companies. Alternatively, a strong background in relevant ML research in academia will be considered as an equivalent qualification.

Experience developing AI/ML systems at scale in production or in high-impact research environments.

Strong track record of working with language modeling technologies. This could include developing generative and embedding techniques, modern model architectures, fine-tuning and pre-training datasets, and evaluation benchmarks.

Strong coding and software engineering skills, and familiarity with software engineering principles around testing, code reviews and deployment.

Experience deploying and scaling language models in production; deep understanding of the unique infrastructure challenges posed by training and serving LLMs.

Strong understanding of computer science fundamentals.

Prior experience with Natural Language Processing and transforming unstructured text into structured code, queries and data is a plus.

Contributions to well-used open-source projects.
