Nº 10/Engineering

Full-stack ML Engineer

Team: Engineering
Type: Full-time
Location: Lyon or Paris · Hybrid
Reports to: CTO
Working language: English; French

The engineer who owns Scenario's model layer end to end: open-source models served on GPUs, external providers wrapped behind one interface, LoRA and fine-tune training pipelines, and the image, video, 3D, audio and text processing that surrounds them. You integrate, train, deploy and operate all of it. Because AI tooling collapses the cost of crossing boundaries, you don't stop at the model layer: you push into the compute layer that runs it, the infrastructure around it, the cloud API and the SDKs, and you step up on LLM and agent integration wherever the product needs it. Python is your primary language, but AI integration comes first. This role reports to the CTO today, with a clear path to leading the ML team as we grow.

01 Mission

Integrate and serve new models on GPUs — image, video, 3D, audio and text — from open-source weights or external provider APIs, behind one consistent interface.
Build and operate training and fine-tuning pipelines (LoRA, full fine-tunes) so customers can train custom models reliably and cheaply.
Own the processing layer around the models and generated assets — captioning, subtitles, upscaling, reframing, compositing, masking, mesh and texture rendering, etc.
Own GPU economics and reliability: latency budgets, cold starts, cost per generation, and the checkpoints/weights pipeline that ships model files to production.
Go deep in open-source code: read, debug and patch the model libraries and model repos you depend on — transformers, diffusers and friends. Fix issues upstream rather than waiting for a release.
Push past the model layer: tune the compute layer and infrastructure (GPU provisioning, serving, the cloud), open PRs in the cloud API to register and wire up new models, extend the SDKs, and step up on LLM and agent integration.
Bring AI pair-programming into the ML team's daily flow — the prompts, agents and Claude-Code-driven workflows that make model integration faster.

02 Scope of ownership

Owns

The model layer end to end: integration, training, serving and the processing pipelines.
GPU economics and reliability: latency, cold starts, cost per generation, and the checkpoints/weights pipeline.
Cross-stack reach: the compute layer and infra, model registration in the cloud API, SDK extensions, and LLM/agent integration.
AI-native ML workflows: the prompts, agents and tooling that make the team faster.

Does not own

Frontend surfaces and the web app (Engineering · Front-End), though you'll test through it and even open issues or PRs there when a model needs it.
Cloud API architecture decisions (Engineering · Cloud) — though you ship model-integration PRs within it.
Product roadmap and prioritization (Product).

03 What we look for

Strong Python and fluency in the ML ecosystem, with a track record shipping ML in production — you've deployed models on GPUs and owned latency and cost, not just notebooks.
Hands-on across modalities — image/video diffusion (Flux, Stable Diffusion, Reve, Wan), ideally 3D (Gaussian splatting, mesh/texture), audio or text/LLMs — and able to dive into open-source model code and fix it or its integration: you could debug and open a PR against Transformers or a model repo, not just call the library.
Deeply AI-native: a daily user of AI coding assistants (Claude Code, Cursor) who ships an integration live in the interview rather than describing one.
A bias for action across the stack: when a model needs registering in the API or exposing in an SDK, you open the PR. Comfortable stepping up on LLM and agent integration.
Clear written English; based around Lyon or Paris (or exceptional and willing to travel to a hub).

04 Disqualifiers

A research-only profile: trains models in notebooks but has never shipped one behind an API with a latency and cost budget.
Needs clean boundaries: refuses to open a PR in the cloud API or an SDK when that's where the model integration lives.
Not AI-native in their own work — thinks in tickets and hand-offs rather than agents and prompts, with no integration they can build live.
Treats GPU cost, cold starts and reliability as someone else's problem.

05 How we hire · 2-3 weeks

01 Intro call · 30 min
02 Product session · 60 min
03 Build live · 60-90 min
04 Written POV · Async, 48h
05 References · Parallel
06 Founder conversation · 45 min