Research Engineer Intern, Evaluations Job at TensorStax, Alameda, CA

  • TensorStax
  • Alameda, CA

Job Description

Research Engineer Intern, Evaluations & Benchmarks

Location: San Francisco (Hybrid)

About TensorStax:

TensorStax is building fully autonomous AI systems to manage and optimize mission-critical data infrastructure. Our research integrates reinforcement learning and language models to enhance reasoning over large-scale data lakes and warehouses, detect failures in pipelines, and autonomously construct and optimize data workflows with high precision.

We are looking for a Research Engineer Intern to design evaluation frameworks and benchmarks that assess the autonomy, adaptability, and reliability of AI agents in data engineering environments. This role is ideal for candidates passionate about AI evaluations, language model benchmarking, and autonomous data systems.

What You’ll Do:

  • Develop evaluation environments to test AI agents' ability to reason, plan, and act autonomously within mission-critical data pipelines.
  • Design benchmarks to assess model capabilities in failure detection, pipeline optimization, and agentic decision-making in data workflows.
  • Implement automated assessment frameworks for language model-based agents operating over data lakes and warehouses.
  • Work with synthetic and real-world datasets to create robust testing environments for AI-driven data automation.
  • Collaborate with research engineers to refine reward shaping strategies, guiding models toward more efficient and agentic behaviors in data-intensive tasks.

What We’re Looking For:

  • Experience in language model research, with a focus on benchmarking LLMs in mission-critical domains.
  • Strong background in AI evaluation methodologies, reinforcement learning, and RLHF techniques.
  • Familiarity with benchmarking language models for structured and unstructured data tasks.
  • Proficiency in Python and experience with ML frameworks like PyTorch or JAX.
  • Hands-on experience with data lakes, warehouses, and data engineering tools (Snowflake, BigQuery, dbt, Spark, Kafka).
  • High agency—proactive, resourceful, and comfortable working in a fast-paced research environment with minimal supervision.
  • Attention to detail—ability to design rigorous, reproducible experiments and evaluations.

Bonus Points:

  • Contributions to open-source AI benchmarks (e.g., SWE-bench, BIRD, Spider).
  • Contributions to open-source agentic frameworks.
  • Experience developing custom RL environments for AI evaluation.
  • Strong understanding of ETL, ELT, and data transformation pipelines.

Benefits:

  • Competitive internship stipend.
  • 100% employer-covered health, dental, and vision insurance (for eligible interns).
  • Access to Bay Club or Equinox in San Francisco.
  • Opportunity to work at the cutting edge of AI evaluations and autonomous data engineering research.

Job Tags

Internship
