El.kz / Marina Ruzmatova / AI Grok

China builds massive 3D face database to sharpen humanoid robots using point clouds

04.03.2026 13:44

Humanoid robots are becoming harder to distinguish from real people, and a new study from China points to one reason why, El.kz reports, citing Interesting Engineering.

Researchers have developed a large-scale 3D facial dataset and a new AI model that can detect facial landmarks directly from raw 3D data, without relying on 2D images or digital templates.

The work targets a core challenge in building realistic androids and virtual humans: enabling them to express emotion, recognize identity, and interact naturally.

One of the key technical building blocks behind that capability is three-dimensional facial keypoint detection, which maps critical points on a face in 3D space.

Most existing systems depend heavily on 2D texture mapping or synthetic 3D faces. That approach can introduce errors because digital models often differ from real human facial geometry, and texture alignment is not always precise.

The new study aims to bypass those limits by working directly with real-world 3D facial scans.

Building massive 3D datasets

To support the effort, the team built a custom 3D and 4D facial acquisition system. They carried out standardized data collection and assembled a database containing around 200,000 high-fidelity 3D facial scans.

The database also includes a multi-expression 3D face dataset, a standardized 3D facial landmark dataset, a high-precision 3D human body dataset, and a dynamic 4D facial expression dataset.

Together, these multimodal biometric resources form one of the largest structured collections of real 3D human facial data reported to date. The dataset was selected for Fujian Province’s 2025 High-Quality AI Dataset Program.

Instead of feeding the AI system textured images, the researchers designed a curvature-fused graph attention network, or CF-GAT, to process unordered point clouds directly. A point cloud represents the geometry of a face as a set of points in 3D space, with no surface texture attached.
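To make "unordered point cloud" concrete, the sketch below stores a face as an N x 3 array of coordinates and builds a k-nearest-neighbor graph over it, the kind of structure a graph attention network could operate on. The random points, the `knn_graph` helper and the choice of k = 8 are assumptions for illustration only; the report does not say how CF-GAT constructs its graph.

```python
import numpy as np

def knn_graph(points, k=8):
    """Return an (N, k) array holding each point's k nearest neighbor indices."""
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)
    # A point should not be listed as its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the k smallest distances in each row, nearest first.
    return np.argsort(dist2, axis=1)[:, :k]

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3))   # stand-in for a scan: 200 xyz points, no texture
neighbors = knn_graph(cloud, k=8)

# "Unordered" means the row order carries no meaning: shuffling the points
# relabels the graph but leaves every point with the same set of neighbors.
perm = rng.permutation(len(cloud))
neighbors_perm = knn_graph(cloud[perm], k=8)
```

The brute-force distance matrix is fine for a few thousand points; a real scan with far more points would normally use a spatial index instead.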

The team introduced a geometry-driven sampling strategy that simplifies the point set while preserving key curvature information. That curvature data is encoded as an explicit geometric prior and integrated into the model’s attention mechanism. This allows the network to focus on subtle local shape variations while also modeling global relationships across the face.
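The two ideas in that description, curvature-aware sampling and curvature as a prior inside attention, can be sketched roughly as follows. This is not the authors' implementation: the curvature proxy (surface variation from a local covariance matrix), the weighted sampling rule, the additive bias term and the names `surface_variation`, `curvature_weighted_sample`, `curvature_biased_attention`, `floor` and `beta` are all assumptions chosen to keep the example small.

```python
import numpy as np

def knn_indices(points, k):
    """(N, k) indices of each point's k nearest neighbors (brute force)."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)
    np.fill_diagonal(dist2, np.inf)
    return np.argsort(dist2, axis=1)[:, :k]

def surface_variation(points, nbrs):
    """Curvature proxy per point: share of local variance along the flattest axis."""
    curv = np.empty(len(points))
    for i, idx in enumerate(nbrs):
        patch = np.vstack([points[i], points[idx]])   # the point plus its neighbors
        cov = np.cov(patch, rowvar=False)             # 3x3 local covariance
        eig = np.linalg.eigvalsh(cov)                 # eigenvalues, ascending
        curv[i] = max(eig[0], 0.0) / max(eig.sum(), 1e-12)
    return curv

def curvature_weighted_sample(curv, m, rng, floor=0.05):
    """Pick m distinct point indices, favoring high-curvature regions."""
    # The floor keeps flat regions (cheeks, forehead) from vanishing entirely.
    weights = curv + floor * curv.max() + 1e-12
    probs = weights / weights.sum()
    return rng.choice(len(curv), size=m, replace=False, p=probs)

def curvature_biased_attention(feats, curv, nbrs, beta=1.0):
    """One attention step over each point's neighbors with an additive curvature bias."""
    d = feats.shape[1]
    # Scaled dot-product scores between each point and its neighbors, shape (N, k).
    scores = np.einsum("nd,nkd->nk", feats, feats[nbrs]) / np.sqrt(d)
    scores = scores + beta * curv[nbrs]               # explicit geometric prior
    scores = scores - scores.max(axis=1, keepdims=True)   # numerically stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    # Each point's output is the attention-weighted mean of its neighbors' features.
    out = np.einsum("nk,nkd->nd", attn, feats[nbrs])
    return out, attn

rng = np.random.default_rng(0)
cloud = rng.normal(size=(300, 3))                     # stand-in for a face scan
curv = surface_variation(cloud, knn_indices(cloud, k=10))

keep = curvature_weighted_sample(curv, m=100, rng=rng)   # simplified point set
sub, sub_curv = cloud[keep], curv[keep]
sub_nbrs = knn_indices(sub, k=8)

# Raw xyz coordinates stand in for learned features here.
out, attn = curvature_biased_attention(sub, sub_curv, sub_nbrs, beta=5.0)
```

In this sketch the neighbor graph supplies only local context; a model like the one described would presumably also need a mechanism for the global relationships the researchers mention, which is left out here.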