Tom Romero: October 2007

Here is an idea I've been thinking over for giving a skeletally animated mesh the ability to lip-sync to speech.

Imagine a human ragdoll mesh. The head can be represented by a single body. If we remove the head (but nothing else, not even the neck), we still have a fully functioning ragdoll. Note that the neck joint to which the head was attached must stay in place. Let us call this neck joint "NTAG".
Now imagine that the head mesh that we have removed is stored as a separate file in MD3 format. This MD3 file contains a single "tag", which is also called "NTAG".
When the ragdoll is loaded, the head mesh is loaded too and is rejoined with the body by linking the MS3D joint "NTAG" and the MD3 tag "NTAG".
Well, what have we achieved? Not much - we have the same model as before, only now the head is stored as a frame-by-frame model while the body is stored as a ragdoll.
Next step: For the MD3 head model, create a separate animation frame for every sound the human mouth can make. Each frame should be named appropriately (e.g., the animation frame of the head making an "eeh" sound should be named "eeh", and so on).
Using SDL_mixer, load some speech sounds into the project.
Hook up an SDL_mixer sound channel to an output preprocessor function. This allows the user to access any raw sound data for that channel just before it is played.
Analyse the speech WAV as it is played using this callback function. Whenever an "ahh" sound is played, set the MD3 head mesh to the frame named "ahh", etc.
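Recognising individual sounds such as "ahh" or "eeh" from raw samples is the hard part of this idea, and I have not worked it out. As a much simpler stand-in, the rough sketch below only measures how loud each chunk of audio is and turns that into a "mouth openness" value. The function has the same parameter list as an SDL_mixer effect callback, so it could be registered on a channel with Mix_RegisterEffect. MouthState, AnalyseSpeech and kGain are made-up names, and the code assumes the mixer delivers signed 16-bit samples in native byte order.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical state shared between the audio callback and the renderer.
struct MouthState {
    float openness;  // 0 = mouth closed, 1 = mouth fully open
};

// Arbitrary scale factor (an assumption): speech is rarely near full volume.
const double kGain = 4.0;

// Same parameter list as SDL_mixer's Mix_EffectFunc_t, so it could be passed
// to Mix_RegisterEffect(channel, AnalyseSpeech, NULL, &state).
// Assumes signed 16-bit samples in native byte order.
void AnalyseSpeech(int /*channel*/, void *stream, int len, void *udata) {
    MouthState *state = static_cast<MouthState *>(udata);
    const std::int16_t *samples = static_cast<const std::int16_t *>(stream);
    const int count = len / static_cast<int>(sizeof(std::int16_t));
    if (count <= 0) {
        state->openness = 0.0f;
        return;
    }

    // Root-mean-square loudness of this chunk, normalised to 0..1.
    double sumSquares = 0.0;
    for (int i = 0; i < count; ++i) {
        const double s = samples[i] / 32768.0;
        sumSquares += s * s;
    }
    const double rms = std::sqrt(sumSquares / count);

    // Loudness stands in for phoneme detection here (a simplification).
    state->openness = static_cast<float>(std::min(1.0, rms * kGain));
}
```

The renderer could then pick or blend head frames based on openness. Since the callback is called from the audio side of the program, a real version would probably need to protect the shared state.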
To improve the effect, use linear interpolation for the MD3 frame transitioning to smooth the mouth movement and make it look more natural.
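A minimal sketch of the interpolation step, assuming each MD3 frame has already been unpacked into a plain array of vertex positions with the same vertex count in every frame. Vec3 and LerpFrames are names invented for this example, not part of any MD3 loader.

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Blend two key frames: t = 0 returns `from`, t = 1 returns `to`.
// If the frames differ in length, only the shared prefix is blended.
std::vector<Vec3> LerpFrames(const std::vector<Vec3> &from,
                             const std::vector<Vec3> &to, float t) {
    const std::size_t n = from.size() < to.size() ? from.size() : to.size();
    std::vector<Vec3> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        Vec3 v;
        v.x = from[i].x + (to[i].x - from[i].x) * t;
        v.y = from[i].y + (to[i].y - from[i].y) * t;
        v.z = from[i].z + (to[i].z - from[i].z) * t;
        out.push_back(v);
    }
    return out;
}
```

Stepping t from 0 to 1 over a few rendered frames whenever the target mouth shape changes should avoid the mouth snapping between poses.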

Introduction.

I've made some good progress with my MilkShape-ODE Ragdoll project since I last posted here. Currently, the project can load a MilkShape3D model, find and load any associated textures and create a set of linked joints and bodies to form a skeleton. When the model is rendered, the positions of its vertices are transformed based on the skeletal data.

Terminology.

ODE Bodies are simply points in space with a "mass" value. They can be joined together with other bodies using "ODE Joints". To illustrate how this works, imagine that your upper arm is one "body" and your lower arm is another, with your elbow being a "joint".

Bodies cannot collide with each other, as they are used to simulate dynamics, not collision. However, they can be paired with an "ODE geom". Geoms have no dynamics data (such as mass or inertia) but do have collision data. So if you were to create a bowling ball in ODE, the ball's body would be what gravity pulled down on - but the ball's geom would be what prevented gravity from pulling it through the floor.

Please note that ODE joints are not the same thing as MS3D joints. In my ODE skeleton, vertices are associated with ODE bodies, and two ODE bodies can be linked together with an ODE joint. In MilkShape3D, there is no such thing as a body - every vertex in a model is linked to an MS3D joint. An MS3D joint may have a parent joint or may be independent. When an MS3D joint is moved or rotated, all vertices associated with that MS3D joint move with it (see my last post for more details on how MilkShape3D joints work). So, MS3D joints are roughly a combination of both ODE joints and ODE bodies.

How the skeletal data is generated.

The first step is to generate a list of ODE bodies. This is done fairly simply: the number of ODE bodies created for a model is equal to the number of MS3D joints specified in the mesh file. ODE bodies are positioned by adding the position vector of every associated vertex together and then dividing the result by the total number of vertices used, so they are generally positioned around the center of a "vertex cloud". Any ODE bodies which do not have any associated vertices are simply positioned at [0, 0, 0], to avoid a divide-by-zero error.
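The positioning rule above can be sketched roughly as follows. This is only an illustration: Vec3, Vertex and BodyPositions are made-up names, and it assumes each vertex stores the index of its MS3D joint, with a negative value meaning "no joint".

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

struct Vertex {
    Vec3 pos;
    int joint;  // index of the MS3D joint this vertex follows, or -1 for none
};

// One body position per MS3D joint: the average of its vertices' positions.
// Joints with no vertices keep the origin, which avoids dividing by zero.
std::vector<Vec3> BodyPositions(const std::vector<Vertex> &verts,
                                std::size_t jointCount) {
    std::vector<Vec3> sum(jointCount, Vec3{0.0f, 0.0f, 0.0f});
    std::vector<int> count(jointCount, 0);

    for (const Vertex &v : verts) {
        if (v.joint < 0 || static_cast<std::size_t>(v.joint) >= jointCount)
            continue;  // vertex is not attached to a valid joint
        sum[v.joint].x += v.pos.x;
        sum[v.joint].y += v.pos.y;
        sum[v.joint].z += v.pos.z;
        ++count[v.joint];
    }

    for (std::size_t i = 0; i < jointCount; ++i) {
        if (count[i] == 0) continue;  // stays at [0, 0, 0]
        sum[i].x /= count[i];
        sum[i].y /= count[i];
        sum[i].z /= count[i];
    }
    return sum;
}
```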

The next step is to join these bodies together using ODE joints. This is not quite as simple as generating ODE bodies. There are two reasons for this:

MS3D joints are joined to other MS3D joints, while ODE joints are connected to ODE bodies.
MS3D joints do not need to have a parent, whereas if an ODE joint is not attached to anything then it will join itself to the environment. This has the same effect as nailing something to a wall, because the environment is the world - meaning it doesn't move.

The answer here is to create an ODE joint for every MS3D joint that has a parent, and to attach it to two ODE bodies: the body with the same index value as that MS3D joint (so if you were working with MS3D joint #3, you would use ODE body #3) and the body belonging to that MS3D joint's parent. This is not as complex as it sounds - it only becomes awkward because there are fewer ODE joints than there are MS3D joints.
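A small sketch of that pairing rule, working only with indices so that it does not depend on ODE itself. JointLink and BuildJointLinks are names invented for this example, and it assumes the parent of each MS3D joint has already been resolved to an index, with -1 meaning "no parent".

```cpp
#include <cstddef>
#include <vector>

// Which two ODE bodies a single ODE joint should connect.
struct JointLink {
    int childBody;   // body with the same index as the MS3D joint
    int parentBody;  // body with the same index as that joint's parent
};

// parentOf[i] is the index of MS3D joint i's parent, or -1 for a root joint.
std::vector<JointLink> BuildJointLinks(const std::vector<int> &parentOf) {
    std::vector<JointLink> links;
    for (std::size_t i = 0; i < parentOf.size(); ++i) {
        const int parent = parentOf[i];
        // Root joints get no ODE joint at all; a joint attached to only one
        // body would pin that body to the static environment.
        if (parent < 0 || static_cast<std::size_t>(parent) >= parentOf.size())
            continue;
        links.push_back(JointLink{static_cast<int>(i), parent});
    }
    return links;
}
```

Each resulting pair would then be handed to ODE, for example by creating a joint and calling dJointAttach with the two corresponding bodies. Because root joints are skipped, the list of links is shorter than the list of MS3D joints, which is where the index bookkeeping comes from.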

Current work.

There are still a few things missing from the simulation. Firstly, the ODE bodies do not have any mass values set. I think that the best solution would be to assume a uniform density value for every body in the mesh (for example, 0.4) and then approximate the total volume occupied by the vertices of an ODE body using a cuboid or capped-cylinder shape.
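As a rough sketch of the cuboid variant, the mass of a body could be estimated as density times the volume of the axis-aligned box around its vertices. ApproximateMass and Vec3 are made-up names, and using an axis-aligned box (rather than one fitted to the limb's direction) is an assumption made to keep the example short.

```cpp
#include <algorithm>
#include <vector>

struct Vec3 { float x, y, z; };

// Mass estimate: uniform density times the volume of the axis-aligned
// bounding box of the body's vertices. Returns 0 for an empty vertex list.
float ApproximateMass(const std::vector<Vec3> &verts, float density) {
    if (verts.empty()) return 0.0f;

    Vec3 lo = verts[0];
    Vec3 hi = verts[0];
    for (const Vec3 &v : verts) {
        lo.x = std::min(lo.x, v.x);  hi.x = std::max(hi.x, v.x);
        lo.y = std::min(lo.y, v.y);  hi.y = std::max(hi.y, v.y);
        lo.z = std::min(lo.z, v.z);  hi.z = std::max(hi.z, v.z);
    }

    const float volume = (hi.x - lo.x) * (hi.y - lo.y) * (hi.z - lo.z);
    return density * volume;
}
```

In practice the three box side lengths could be passed straight to ODE's dMassSetBox along with the density, instead of computing the mass by hand. A perfectly flat vertex cloud gives zero volume, so a minimum thickness would probably be needed.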

Secondly, no ODE geoms are generated yet and so there is no collision. Originally, I was planning on approximating ODE collision geoms in the same way as I would approximate mass but, while I can get away with comparatively slight inaccuracies in mass approximation, collision errors are more "visible" to the end user. My current thoughts are to generate a separate trimesh for each ODE body in the mesh. This would be very accurate, but it could be computationally expensive.

Possible collision problems.

Neither of the above approaches really solves the problem of triangles whose vertices are associated with different bodies. While most triangles in a mesh belong to a single ODE body, triangles which are used to join the vertices of two ODE bodies together (such as "elbow triangles" which stretch from the upper arm body to the lower arm body) will not have any collision data.
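To get a feel for how many triangles are affected, they can be picked out with a simple check like the one sketched below. Triangle and FindSpanningTriangles are names invented for this example, and it assumes every triangle index is a valid index into the per-vertex body table.

```cpp
#include <cstddef>
#include <vector>

struct Triangle { int v[3]; };  // three indices into the vertex list

// vertexBody[i] is the ODE body index associated with vertex i.
// Returns the indices of triangles whose corners do not all share one body.
// Assumes every triangle index is within range of vertexBody.
std::vector<std::size_t> FindSpanningTriangles(
        const std::vector<Triangle> &tris,
        const std::vector<int> &vertexBody) {
    std::vector<std::size_t> spanning;
    for (std::size_t i = 0; i < tris.size(); ++i) {
        const int b0 = vertexBody[tris[i].v[0]];
        const int b1 = vertexBody[tris[i].v[1]];
        const int b2 = vertexBody[tris[i].v[2]];
        if (b0 != b1 || b1 != b2)
            spanning.push_back(i);  // e.g. an "elbow triangle"
    }
    return spanning;
}
```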

Possible solutions are to create an ODE sphere geom for each ODE joint, or to generate a trimesh for the entire model every time it changes. The first approach could lead to redundant ODE geoms being created for joints in locations such as the top of a string, several feet above a "puppet" model, and would also require me to find a way to guess what the radius of each sphere should be. The second approach could have issues if it interferes with the "temporal coherence" data that ODE uses for trimesh collision.

Tom Romero

Sunday, 7 October 2007

Ragdolls with Lip-Sync.

Progress on ragdolls.
