
I am a Principal Scientist Manager (Director) at Microsoft, based in Cambridge (UK), where I currently lead work on post-training, multimodal conversational AI, evaluation systems, and meeting intelligence for Microsoft Teams. Increasingly, this work also touches reinforcement learning-style optimization, feedback-driven improvement, and the practical challenges of making agentic systems useful in real products.

Before this, at Microsoft Mesh and Mixed Reality, I founded and led the Motion Lab team, working on real-time understanding and generation of human motion for remote communication in 2D and Mixed Reality. Our work contributed to shipped experiences across Teams, Microsoft Mesh, and Azure-connected services, including audio-driven avatars, face tracking, body pose estimation, and technologies for more compelling presence in immersive communication.

Until 2021 I was a full-time Professor of Computer Science at the University of Bath, which I joined in 2007 as a Royal Academy of Engineering / EPSRC Research Fellow and where I still hold a part-time position. In 2015 I founded and became Director of the Centre for the Analysis of Motion, Entertainment Research and Applications (CAMERA), building it from an initial vision into a 50+ person centre with major facilities, industrial partnerships, a strong route-to-impact model, and over £20m in funding and partner contributions, before helping secure the £45m+ MyWorld investment in the South West of the UK.

In my career so far, my interests have spanned computer vision, graphics, generative AI, multimodal modelling, reinforcement learning, and animation science and engineering applied to problems in Mixed Reality, digital humans and avatars, real-time communication, motion capture, video game animation, visual effects, and biomechanics.

Multimodal AI, Post-Training, Real-Time Communication, Digital Humans, Computer Vision

Leadership and AI Delivery

A major part of my work today is not just creating new models or algorithms, but building the environment in which AI can be delivered well: defining product strategy, establishing teams, mentoring scientists and engineers, creating evaluation and experimentation workflows, and helping organizations move from prototypes to robust customer-facing systems.

In Teams, this has meant helping drive the shift toward multimodal agents, post-trained AI systems, and proactive collaboration experiences, setting technical direction for conversational AI, and building the foundations for synthetic data generation, model evaluation, and live testing using human and AI feedback.

The same leadership pattern was central to CAMERA. Founding CAMERA was not simply a research exercise: it required creating the vision, bringing together academic and industrial partners, raising substantial funding, hiring and growing an interdisciplinary team, building studios and technical infrastructure, and turning the centre into something that delivered both strong research and real-world impact. That remains one of the clearest examples of the kind of institution-building and long-horizon technical leadership I most enjoy.

This focus on delivery has also pulled my work further toward reinforcement learning, post-training, and the broader question of how modern AI systems become reliably useful once they leave the demo stage and enter products that people use every day. I am especially interested in real-time communication settings, where models must work under latency, context, and interaction constraints, and in proactive agents that can help move collaboration forward rather than only reacting to a prompt.

Previous Product Impact: From Multimodal AI to Immersive Experiences

The examples below reflect previous product work I have been fortunate to ship into the hands of millions of customers through Microsoft Teams, Microsoft Mesh, and Azure-connected services. More recently this has included multimodal conversational AI, post-training strategy, agent systems, and the tooling required to evaluate and improve these systems in live product settings.

Within Teams, this includes work around collaborative AI experiences such as Facilitator, a chat-based agent experience aimed at helping people keep meetings and collaboration on track through structured assistance, follow-up, and shared context.

At Microsoft Teams, the product impact increasingly comes from turning advances in AI into tools that improve how people communicate and coordinate. That spans meeting intelligence, multimodal interaction, and new agentic behaviours that support real-time collaboration.

Microsoft Mesh

I established the Motion Lab team, bringing together expertise in computer vision, generative AI, graphics, and animation to solve problems in real-time human understanding. This included shipping live audio-driven avatars, real-time face tracking, and technologies for full-body reconstruction from sparse headset-based signals for devices such as HoloLens 2 and Meta Quest.

Microsoft Mesh enables meetings in 2D through Teams and in immersive 3D spaces through VR. My work across Mesh and Teams has focused on the AI systems that support presence, communication, avatar control, meeting intelligence, and richer collaborative experiences.

CAMERA

At CAMERA I created a multi-disciplinary research and production environment with three themes converging on human perception, analysis, and synthesis: entertainment, human performance enhancement, and health and rehabilitation. Each theme was partnered with industry, giving us real routes to impact and allowing us to translate research into tools, studios, and delivered projects.

As Director of CAMERA, I built research themes, partnerships, teams, and studio capability around motion capture, photogrammetry, digital humans, and AI for entertainment, sport, and health applications.

At CAMERA I also created a framework for turning research into production tools with a team of engineers, deploying these in our studio, and delivering projects to clients that used the technology in practice. This helped ship a number of video games and award-winning immersive experiences. Below is a small snapshot of products and experiences I have been fortunate enough to contribute to.

11:11 Memories Retold

With Aardman and Bandai Namco (BAFTA nominated). We delivered motion capture for the video game at the CAMERA studio.

Is Anna OK?

With BBC and Aardman. An immersive experience delivered with our in-house facial rigging, animation, and motion capture solutions.

Cosmos Within Us

With Satore Studios (Cannes Lion Winner). We built digital doubles for performers and animated them using our in-house tools.

Previous Research Impact: From Digital Humans to Biomechanics and Human Perception

The research below reflects previous work spanning digital humans, motion capture of faces, bodies and animals, animation, computer vision, biomechanics, and human perception. A central theme has always been motion: understanding it, measuring it, generating it, and building systems that use it well. More recently, this work expanded into multimodal AI systems, synthetic data, evaluation, foundation-model adaptation, reinforcement learning, and proactive agents for communication and immersive experiences. A list of publications can be found on my Google Scholar page.

Facial Performance Capture and Animation. This spans early 4D facial capture, monocular facial capture, and performance retargeting from motion capture. We built end-to-end pipelines for rigged digital facial models, automated performance transfer, and dynamic face modelling. The D3DFACS dataset later became one of the foundations of the FLAME model from Max Planck.
Generative Models of Shape and Motion. My work has touched several areas of generative AI, from early speech-driven facial animation during my PhD through to recent personalised animation and multimodal avatar work. Today, this broader theme also connects to product-facing work on multimodal AI, post-training, and systems that combine speech, vision, and interaction signals to drive useful behaviour in real products.
Synthetic Data for AI and Computer Vision. Computer graphics can be used to render realistic images and labels for training AI systems. We have applied these ideas across pose estimation, digital humans, and animal motion, and the same principles now extend naturally to modern multimodal systems where data generation, evaluation, and feedback loops are central to product quality.
Digital Humans and Avatars: Animation, Embodiment and Perception. In CAMERA we created pipelines for digital humans that could be used for video games, VR research, and immersive experiences. At Microsoft, related ideas continue in more product-facing form through avatars, presence technologies, and systems for embodied communication in Teams and Mesh.
Markerless Motion Capture and Analysis for Biomechanics. One of the core motivations of CAMERA was to apply state-of-the-art computer vision and AI to biomechanics and elite sport. We worked closely with coaches and athletes, including the British Skeleton team, to create markerless systems that could provide useful biomechanical signals in real training environments.
Human Pose Estimation from Egocentric Cameras and Head-Mounted Displays (HMDs). At Microsoft, we have explored 3D human pose estimation from head-mounted devices and egocentric viewpoints, including work that supports avatar embodiment and full-body understanding in immersive products. These technologies are central to convincing presence experiences in mixed reality.

Research Funding and Awards

For a Professor, securing funding is one of the core activities in building ambitious research programmes. Below is a selection of the major awards that supported large-scale centres such as CAMERA as well as more targeted projects in AI, computer vision, digital humans, performance capture, and perception.

2021-2026: MyWorld (~£45m FEC). UKRI (PI, University of Bath)
2020-2025: CAMERA 2.0 - Centre for the Analysis of Motion, Entertainment Research and Applications (£4,151,614 FEC). EPSRC
2019-2021: CAMERA Motion Capture Innovation Studio (£901,391). Horizon 2020
2019-2022: A Tool to Reveal Individual Differences in Facial Perception (£402,113). Medical Research Council
2018-2020: Rheumatoid Arthritis Flare Profiler (£165,126; total project value £663,290). Innovate UK
2018-2022: Bristol and Bath Creative Cluster (~£4m). AHRC
2017-2019: DOVE: Deformable Objects for Virtual Environments (£128,746; total project value £562,559 FEC). Innovate UK
2016-2018: HARPC: HMC for Augmented Reality Performance Capture (£119,025; total project value £517,616 FEC). Innovate UK
2015-2020: CAMERA (£4,998,728 FEC; ~£5m additional partner contributions). EPSRC/AHRC
2012-2016: Next Generation Facial Capture and Animation (£100,887 FEC). Royal Society Industry Fellowship
2007-2012: Exploiting 4D Data for Creating Next Generation Facial Modelling and Animation Techniques (£460,640 FEC). Royal Academy of Engineering Research Fellowship

Public Data

RGBD-Dog

RGBD-Dog contains motion capture and multiview RGB and RGBD data for dogs performing different actions. You can get the data, code to view it, and the CVPR 2020 paper from our GitHub page.

Shadow Removal

Shadow Removal Ground Truth and Evaluation provides a benchmark and dataset for single-image shadow removal, enabling open quantitative comparison across a challenging range of cases. The evaluation website is available here.

Alumni (University)

At Microsoft I lead teams of talented scientists and engineers. As a Professor and former Director of CAMERA, I have also had the privilege of working with outstanding students, researchers, technical staff, and collaborators.

Martin Parsons (CAMERA), Murray Evans (CAMERA), Yiguo Qiao (Living With/RUH/InnovateUK), Jack Saunders, George Fletcher, Jake Deane, Kyle Reed (Cubic Motion), Jose Serra (Digital Domain/ILM), Anamaria Ciucanu (MMU), Pedro Mendes, Shridhar Ravikumar (Amazon, Apple), Alastair Barber (The Foundry), Wenbin Li (Bath), Han Gong (Apple), Charalampos Koniaris (Disney Research), Daniel Beale, Sinan Mutlu (Framestore), Nicholas Swafford, Nadejda Roubtsova (CAMERA), Sinead Kearney (CAMERA), Maryam Naghizadeh, Catherine Taylor (Marshmallow Laser Feast).

Personal

I love my work, but the number one thing in my life is my family. If anything, having a family motivates me even more in my work, giving me the desire to make sure we all have the best life. It also forces me to be efficient and productive in the time I spend working, and to appreciate the time we have together even more.
