Novo Nordisk

Lead Data Scientist

1w

London, GB · Full-time · £95,000 – £135,000

About this role

Join Clinical AI to build production-ready LLM and GenAI solutions that accelerate drug development and improve patient care across global R&D. As Lead Data Scientist, you will design, develop and deploy LLM-based and ML solutions for content retrieval, generation, summarization and inference. You will bridge data science and software engineering by architecting services, designing APIs and building backend systems, while ensuring performance, safety, privacy and regulatory compliance.

Day to day, design production-ready LLM/GenAI solutions for retrieval-augmented generation (RAG), summarization and inference to address clinical and R&D use cases. Build and integrate backend services, APIs and data pipelines using embeddings, vector DBs and knowledge bases. Support end-to-end deployment with cloud platforms and containerization, fine-tune models and implement monitoring frameworks.

Optimize models through quantization, distillation and caching, and operationalize them with LLM-Ops tools and CI/CD best practices. Ensure compliance with AI governance, data protection and regulatory requirements, including anonymization and audit logging. Produce technical documentation and runbooks for stable production use.

Join the Clinical AI & Analytics team in R&D, working across Target Discovery, Clinical Development and Medical functions. Blend advanced machine learning research with production engineering to create trustworthy AI systems. Collaborate with Data Science, Engineering, Medical and Regulatory teams to align solutions and drive technology roadmap adoption.

Stay current on LLM research, trends, best practices and the technology roadmap. Pursue publications through R&D-relevant activities. Translate technical concepts for stakeholders and ensure regulatory and privacy requirements are met.

Requirements

  • Hold a PhD, Master’s or Bachelor’s degree in Computer Science, Computer Engineering, Computational Biology, Engineering or a related quantitative discipline (PhD preferred)
  • Strong practical experience in LLMs / generative AI: model selection, fine-tuning (LoRA, PEFT), prompt engineering, evaluation and observability
  • Software engineering experience from architecture design to Infrastructure as Code (IaC), with hands-on experience in cloud platforms, containers and microservices, and in automating serverless, event-driven pipelines
  • Experience building data pipelines and retrieval systems (embeddings, vector DBs, knowledge bases) to support RAG and document understanding
  • Competence in implementing testing, monitoring and optimisation for model performance, fairness and cost; familiarity with LLM-Ops tools (e.g., LangChain, LlamaIndex, Langfuse)
  • Excellent collaboration and communication skills to work with cross-functional teams, translate technical concepts for stakeholders, and ensure regulatory and privacy requirements (GDPR) are met
  • Experience with implementing CI/CD best practices

Responsibilities

  • Design and develop production-ready LLM/GenAI solutions for retrieval-augmented generation (RAG), summarization and inference to address clinical and R&D use cases
  • Build and integrate backend services, APIs and data pipelines (embeddings, vector DBs, knowledge bases) and support end-to-end deployment using cloud platforms and containerization
  • Fine-tune and evaluate models (supervised, LoRA/PEFT, prompt engineering), implement monitoring and testing frameworks for performance, fairness, hallucination rate, latency and cost
  • Optimize models and systems (quantization, distillation, caching), and operationalise with LLM-Ops tools and CI/CD best practices for stable, secure production use
  • Ensure compliance with internal/external AI governance, data protection and regulatory requirements (anonymization, access controls, audit logging) and produce technical documentation and runbooks
  • Collaborate with internal and external stakeholders across Data Science, Engineering, Medical and Regulatory teams to align on solutions, publish outcomes and drive technology roadmap adoption
  • Stay current on LLM research, trends, best practices and the technology roadmap, and pursue publications through R&D-relevant activities