Camplight

Python Engineer – Reinforcement Learning Environments for LLMs (Contractor)

This posting is listed in the following categories:

  • Anywhere

    Tech Stack / Requirements

    Are you a Python engineer with a background in AI/ML who wants to work on the systems that train large language models?

    Project Overview:

    You’ll be part of a team creating reinforcement learning environments used directly in advanced LLM training pipelines. Your work will influence how models learn complex behaviors and how those behaviors are evaluated and improved over time.

    What You’ll Be Doing:

    • Design and implement RL environments for LLM training
    • Create task prompts that specify desired model behavior
    • Build judges and automated evaluation logic
    • Design and integrate tool interfaces for model interaction
    • Work with data to create diverse, high-quality tasks
    • Run, debug, and improve environments inside virtualized execution setups
    • Collaborate with AI/ML engineers and researchers on reward signals and evaluation criteria
    • Improve robustness, reproducibility, and diversity of the environment suite

    Requirements:

    • 5+ years of professional Python programming experience
    • Background in AI / Machine Learning (industry, research, or advanced projects)
    • Solid understanding of reinforcement learning concepts
    • Experience with RL environments or frameworks (e.g. OpenAI Gym or similar)
    • Experience building systems involving automated evaluation, validation, or judging logic
    • Experience working with large language models
    • Experience designing custom RL environments for training
    • Experience with virtual machines, sandboxed execution, or containerized environments
    • Background in AI research, ML infrastructure, or training pipelines

    What do we offer?

    • Fully remote contractor role with flexible hours
    • Competitive compensation
    • Opportunity to transition into a full-time role
    • Career development opportunities within Camplight’s cooperative structure
    • Supportive, collaborative, and innovative team culture

    What does the interview process look like?

    Initial Interview: A 45-minute cultural and technical conversation with two Camplight team members.

    Technical Deep Dive: Choose one of two formats:

    • A homework assignment (2 hours) followed by a 1-hour discussion
    • A pair programming session (2 hours) focused on real-world problem-solving

    Regardless of the outcome, you’ll receive constructive feedback to help you grow.