a hybrid condensed finite element model with gpu acceleration for interactive 3d soft tissue cutting
TRANSCRIPT
COMPUTER ANIMATION AND VIRTUAL WORLDS
Comp. Anim. Virtual Worlds 2004; 15: 219–227 (DOI: 10.1002/cav.24)* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
A hybrid condensed finite element modelwith GPU acceleration for interactive 3Dsoft tissue cutting
By Wen Wu* and Pheng Ann Heng* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
To meet the requirement of computer-aided medical operations, apart from the real-time
deformation, it is also necessary in the design to simulate the tissue cutting and suturing in a
surgery simulation. In this paper, we present a model on topology change and deformation of
soft tissue, referred to as the hybrid condensed finite element model, based on the volumetric
finite element method. The most important advantage of our model is its ability to achieve
an interactive frame rate for the topology change in surgical simulation on standard PC
platform. This is achieved through two innovations. One is to apply the condensation
technique, by fully calculating the volumetric deformation in the operation part while only
calculating the surface nodes in the non-operation part. Secondly, the major calculation
work in the Conjugate Gradient solver for cutting and deformation is migrated from the CPU
to the contemporary GPU to promote the calculation. Test examples have been given to
show the feasibility and efficiency of the model. Copyright # 2004 John Wiley & Sons, Ltd.
KEY WORDS: surgical simulation; soft tissue cutting and deformation; the GPU computation;the hybrid condensed finite element model
Introduction
In recent decades, the minimally invasive microsurgery
has been developed as a renovation technique. It has the
advantages of reducing the operation time and the trau-
matizing degree of patients. It requires the surgeons to be
highly experienced in hand-eye coordination skill during
operations. Therefore, the surgery simulation is expected
to play an essential role in the future surgical training.
A good surgical simulation system should provide
surgeons with visual, tactile and behavioral illusion of
reality.1 To meet this requirement, it is crucial to solve
the problem in the development of a surgical simulation
system to achieve simulation realism with interactive
speed. In the previous works, various models have been
presented. Among those, the mass-spring model is most
widely applied in modeling deformable objects due to
its simplicity. Picinbono et al.2 used the mass-spring
model to simulate the stiff character of the liver capsule,
while Delingette et al.3 expressed the fat tissue elasticity
as a network of springs on a 3-simplex mesh. Although
these mass-spring models are easy to construct, they
lack realism since a discrete set of masses are used in the
models to simulate the continuum mechanics. Com-
pared to finite element models (FEM), the range of
possible dynamic behaviors of spring models is more
constrained in mass-spring model.4 Bro-Nielsen et al.5
discussed real-time simulation of deformable objects
using a 3D solid volumetric fast finite element model.
In their solution, three optimized methods are used
during the computation: firstly, make use of the con-
densation technique to reduce the computation time by
confining the full computation only to the surface nodes
of the mesh; secondly, the system matrix is explicitly
inverted for the transformation calculation in order to
achieve a very low calculation expense; and finally,
selective matrix vector multiplication is applied accord-
ing to the sparse character of the force vector. Their
system enables a solid deformation for relatively large
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd.
*Correspondence to: Wen Wu, Department of ComputerScience and Engineering, The Chinese University of HongKong, Shatin, N.T., Hong Kong.E-mail: [email protected]
objects with video frame-rate. However, this model is
not feasible for the cutting operations since in cutting
the stiffness matrix needs to be updated when the
topology change occurs. Cotin et al.6 presented a real-
time deformation method by using a preprocessing of
elementary deformations derived from a finite
element method. The linear model in their method
is enhanced by taking biomechanical results on soft
tissues into account and it produces more realistic
behavior. However simulation of tissue cutting in
real-time was impractical because several tasks should
be performed: re-meshing of the cutting area,
recalculating the stiffness matrix, and new stage of
preprocessing.
Alongwith the improvement of the hardware archi-
tecture and the increasing requirements from the
commercial entertainment/games industry, graphics
hardware has migrated away from the traditional
fixed-function pipeline to a newly reconfigurable
graphics pipeline. As a streaming processor, its out-
standing polygon and pixel processing speed has
greatly attracted researchers to utilize it into both
common graphics problems and generic computa-
tional applications. Entering 2003, the full IEEE
floating point precision data type supported in
shaders and texture stages allows the GPU to be
applied into richer body of general applications. The
applicable fields include collision detection and
object intersections,7,8 physical simulation,9,10 scienti-
fic computing such as solvers for linear algebra11 and
partial difference equations,12 and fast Fourier
transformation in image processing13 etc. In this
paper, we will demonstrate an application of the
programmable GPU in another area namely, surgery
simulation.
In this paper, we present a deformation model, called
a hybrid condensed finite element model, based on the
volumetric finite element method accelerated by GPU.
The most important advantage of our model is its ability
to achieve an interactive frame rate for the topology
change in surgical simulation. The calculation by the
GPU can be integrated into our system for taking the
sparse matrix and vector multiplication of the Conjugate
Gradient solver. In the second section, the fundamental
theory of the linear elastic finite element model and the
condensation technique will be introduced briefly. In
the third and fourth section, the structure of our model
and the mapping methods of computation onto the GPU
will be described respectively. The implementation and
experimental results are given in the fifth section.
Finally, we conclude the paper with a discussion.
Fundamental Theory
In finite element method, the continuum or object is
subdivided into elements joined at discrete node points.
The solution is subjected to constraints at the node
points and the element boundaries so that continuity
between elements is achieved.
We chose the four nodes linear tetrahedral element to
model the soft tissue in 3D domains. The displacement
variation u ¼ u; v;wð Þ within an element can be given by
the nodal displacements and the shape function as
follows
u ¼ INi; INj; INm; INp
� �ui uj um up� �T ð1Þ
where I is a 3 � 3 identity matrix and the shape function
N is defined as:
Ne ¼ae þ bexþ ceyþ dez
6V; e ¼ i; j;m; p ð2Þ
ai¼1 ¼ a1 ¼ �1ð Þi�1
xj yj zj
xm ym zm
xp yp zp
��������������
bi¼1 ¼ b1 ¼ �1ð Þi1 yj zj
1 ym zm
1 yp zp
��������������
ci¼1 ¼ c1 ¼ �1ð Þixj 1 zj
xm 1 zm
xp 1 zp
��������������
di¼1 ¼ d1 ¼ �1ð Þi�1
xj yj 1
xm ym 1
xp yp 1
��������������
In which V represents the volume of the element. The
other constants are defined by cyclic interchange of the
subscripts in the order i, j, m, p.
Linear Elasticity
The state of deformation at a point in the solid is
described by the strain tensor, which is the second order
tensor. The physical behavior of soft tissue may be
considered as linear elastic if its displacement and
deformation remain small.14 In this case, the second
order term can be neglected. We use a linear strain
approximation to represent the tissue deformation,
W. WU AND P. A. HENG* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 220 Comp. Anim. Virtual Worlds 2004; 15: 219–227
where " is a set of first order differential equations
presented by the displacement
"xx ¼@u
@x"yz ¼
@v
@zþ @w
@y
"yy ¼@v
@y"zx ¼
@w
@xþ @u
@z
"zz ¼@w
@z"xy ¼
@u
@yþ @v
@x
ð3Þ
The potential energy � of a body is the sum of the total
strain energy and the work done on the body by external
forces.15 With the assumption of the soft tissue as the
isotropic linear elastic material, there is a linear relation-
ship between the stress and the strain. We obtain the
strain energy E as follows
E ¼ 1
2
ðV�T"dV ¼ 1
2
ðV"TD"dV ð4Þ
where D is the symmetric elastic matrix which relates
physical properties of the soft tissue we modeled. It
depends on the parameters of Young’s modulus, shear
modulus and Poisson’s ratio. �T ¼ ½�xx; �yy; �zz;
�yz; �zx; �xy� and "T ¼ ½"xx; "yy; "zz; "yz; "zx; "xy� are vectors
of the stress and strain components.
Forces applied to a deformable body include: distrib-
uted body forces, distributed surface forces and con-
centrated loads. The work done by these forces can be
written as
W ¼ðVu � fbdVþ
ð�
u � f sdSþXi
ui � pi ð5Þ
where u is the displacement vector, fb x; y; z� �
are body
forces applied to the object volume dV, f s x; y; zð Þ are
surface forces applied to the object surface dS, and pi are
concentrated loads acting at the points xi; yi; zi
� �. From
(1) and (3), the strain " at arbitrary point in the element
can be expressed in terms of the nodal displacements
and the shape function
" ¼ BU ð6Þ
where U is the 3ne � 1 (ne is the number of nodes in one
element) vector of the nodal displacements. B is the
6 � 3ne matrix.
In equilibrium, the potential energy reaches the mini-
mum, which means @�=@ui ¼ 0. A set of linear equa-
tions has been derived: Keue ¼ fe, where Ke is the
stiffness matrix of a single element. To assemble all of
the stiffness matrices of the elements into one, a large
but sparse linear system in form of K u ¼ f is deduced,
where K is the global stiffness matrix in 3n� 3n, where n
is the total number of nodes in the model.
Condensation
Condensation is a process by which some of the degrees
of freedom are eliminated from the overall equilibrium
equations prior to the continuation of other phases in
the solving.16 In finite element method, it is efficient to
employ condensation at a certain stage to eliminate
variables that do not directly participate in the rest of
the solution.
To establish the equations used in condensation, we
rewrite the sparse linear system as the block matrixes
form
Kii Kib
Kbi Kbb
" #ui
ub
� �¼
f i
fb
" #ð7Þ
where the i represents the condensed degree of freedom
(DOF), and the b is the retained DOF. The upper part of
(7) can be solved for ui as
ui ¼ K�1ii f i � K�1
ii Kibub ð8Þ
To substitute this equation into the lower part of (7) for
ui, we can obtain the new equations for ub
K0bbub ¼ f 0b ð9Þ
where the reduced stiffness matrix K0bb is
K0bb ¼ Kbb � KbiK
�1ii Kib ð10Þ
and the reduced force vector f 0b is given by
f 0b ¼ fb � KbiK�1ii f i ð11Þ
The new stiffness matrix K0bb in the reduced equation (9)
is obviously denser than that in the original one. The
size of the equations is just the same as what derived
from the surface FEM. It means that, the condensed one
is mathematically equivalent to the volumetric FEM,
bearing the volume character in the solution but only in
the equal computation expense of the surface FEM
solution.
INTERACTIVE 3D SOFT TISSUE CUTTING* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 221 Comp. Anim. Virtual Worlds 2004; 15: 219–227
Structure ofHybridCondensedFinite ElementModel
In surgeries, most operations are concentrated on the local
pathologic area of organs. In these regions, tissues undergo
cutting or tearing, thus more precise models are required.
We assume the non-linear deformation or the topology
changes only occur in the operation part throughout the
whole surgery. With this assumption, the key idea of the
hybrid FEM algorithm is to separate the areas containing
the non-linear elastic properties or topology changes from
the others. It includes
* Operation part where the surgery occurs. It corresponds
to the pathological structures and only represents a
small part of the whole model in a simulator. In this
part, tearing and cutting are simulated.
* Non-operation part where only the deformation
occurs. But it greatly contributes to the realism of
the simulation.
Different models treat areas with different properties
respectively to optimize the trade-off between the com-
putation time and the realism of the simulation.
The illustration of the hybrid model is shown in
Figure 1. Since these two parts connect with the com-
mon nodes, additional boundary conditions are added
to both models. We rewrite the linear system equations
of the operation area (12) and non-operation area (13) as
a block matrix form respectively
K11 K1I
KI1 K1II
" #A1
AI
� �¼
P1
�PI
� �ð12Þ
K2II KI2
K2I K22
" #AI
A2
� �¼
PI
P2
� �ð13Þ
where the subscript 1, 2 and I represent the operation
area, non-operation area and the common nodes shared
with these two areas respectively, the superscript 1, 2
denote the common nodes represented in the operation
area and the non-operation area respectively. PI and
�PI is the force and counterforce respectively applied to
the common nodes when we analyse these two parts.
From (13) we obtain
ðK2II � KI2 � K�1
22 � K2IÞ � AI ¼ PI � KI2 � K�122 � P2 ð14Þ
Representing (14) in another form
PI ¼ K0 � AI þ P0 ð15Þ
where
K0 ¼ K2II � KI2 � K�1
22 � K2I ; P0 ¼ KI2 � K�122 � P2 ð16Þ
Substituting (15) into (12), we can get a new matrix
system (17), from which the displacements of the opera-
tion area A1 and AI can be obtained.
K11 K1I
KI1 K1II þ K0
" #A1
AI
� �¼
P1
�P0
" #ð17Þ
The next step is to obtain the deformation of the non-
operation area. As the inside nodes of the non-operation
area are unrelated with any action of the surgeon even
invisible, we can just treat them as redundant nodes for
the simulation. Computing these nodes only brings mean-
ingless burden of calculation. We adopt condensation to
remove them from the computation process. This keeps
the original physical character of the volumetric system
but the size of the condensed matrix equation is equal to
the FE surface model. Therefore, we can rewrite the non-
operation area equation in the condensed form as
K2II KIs KIi
KsI Kss Ksi
KiI Kis Kii
264
375
AI
As
Ai
264
375 ¼
PI
Ps
Pi
264
375 ð18Þ
The subscript i represents the internal nodes to be
condensed out and s represents the surface nodes to
be retained. From (18), we have
Kss � Ksi � K�1ii � Kis
� �� As ¼ Ps � Ksi � K�1
ii � Pi
þ Ksi � K�1ii � KiI � KsI
� �� AI
ð19Þ
Because the internal nodes hasn’t been acted by the
external force, the term of Ksi � K�1ii � Pi is equal to zero.
Then we can derive the new matrix equation K�As ¼ P�
which only relates to the variables of the surface nodes
K� ¼ Kss � Ksi � K�1ii � Kis;
P� ¼ Ps þ Ksi � K�1ii � KiI � KsI
� �� AI
ð20ÞFigure 1. The illustration of the hybrid model.
W. WU AND P. A. HENG* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 222 Comp. Anim. Virtual Worlds 2004; 15: 219–227
It should be noticed that, the form of P� is different from
the one derived from the condensation used in general
FEM. It has one term that relates to the common nodal
displacements AI in the hybrid FEM.
All terms in K0, P0, K� and P� are known constants and
keep unchanged in the whole surgery simulation pro-
cess because no topology change will occur in the non-
operation part. Thus they are calculated in the pre-
processing stage. This stiffness matrix in (17) will be
updated according to the topology changed during the
simulation. However, it is clear that the order of this
system is far less than the order of the global FEM
system.
In summary, the interactive update and display of the
hybrid model is achieved through the processing on the
operation area. The model of this area is updated based
on the imposed displacement or forces on the boundary.
During this stage, the displacements of the nodes and
forces applied on the common nodes are computed.
Then we can obtain the displacements of nodes on the
surface of the non-operation model according to dis-
placements of the common nodes.
CalculationontotheGPU
As mentioned above, the interaction occurs in the op-
eration area, where the collision of the scalpel and the
tissue model is detected and the cutting operation is
processed. The topology of the model is changing on-
the-fly during the simulation. The corresponding equa-
tion (17) is constructed and solved in every different
processing step. Along with the size of the model
increasing, the time spent by the collision detection rises
as a linear function of the model scale. And the time of
constructing the new global stiffness matrix is squared.
Solving the linear system is about cubed complexity,
which takes almost seventy to eighty percent of the
time executed by the operation area. Therefore, solving
the linear algebraic equations is the bottleneck of the
whole simulation. It directly impacts the feedback speed
of the system. In our system, we use the Conjugate
Gradient iteration to solve the linear equations. The
global stiffness matrix derived from FEM exhibits a
feature of large size, sparse distribution of non-zero
elements, symmetric and positive definite, this suffi-
ciently meets the requirement of the condition of ordin-
ary Conjugate Gradient algorithm. The primary
operation in each iteration of the Conjugate Gradient
algorithm is the matrix-vector multiplication, so the zero
elements of the matrix would remain unchanged in the
calculation process, which makes it more suitable for its
implementation onto the GPU.
In this section, we introduce our method of mapping
the calculation kernel onto the GPU. The Conjugate
Gradient solver includes three primary operations: mul-
tiplication of the sparse matrix and the vector, addition
of two vectors and the sum-reduction operation. The
core component is the multiplication of the sparse
matrix and the vector. We migrate this part of calcula-
tion from the CPU into the fragment processor of the
GPU, to take advantage of the fragment processor in its
efficiently manipulation of the local texture memory on
the mathematical calculation. The basic principle is to
load matrices and vectors as textures into the GPU, and
then rasterize a proper quad of pixels for invoking the
fragment program, which process the actual calculation
in the fragment processor. The result can be obtained as
the color value, transferred directly to the next pass for
execution or readback to the CPU.
Texture RepresentationofMatrices
For the sparse matrix-vector multiplication: yi ¼Pnj¼0 ai;j � xj, the matrix representation in the GPU
should consider both the system data structure
and the GPU single-instruction-multiple-data (SIMD)
character.
Since the matrix is sparse and symmetric, we use an
one-dimensional array M to store the upper-right non-
zero elements of upper-right of the matrix in our system.
Two accessorial arrays are needed to restore the original
matrix (Figure 2), N1 for the indices of the diagonal
elements in M and N2 for the column index of every
non-zero element.
Various mapping methods have been proposed in
previous works.11,12,17 In our solution, an improvement
is made to the method from12. The diagonal and off-
diagonal elements of the matrix are stored separately by
different layout textures. Two dependent textures are
used to help fetch off-diagonal elements and proper
vector elements efficiently in the fragment program
computation. The entire process of multiplication for
the off-diagonal elements is as illustrated in Figure 3.
Figure 2. The data structure of the sparse matrix.
INTERACTIVE 3D SOFT TISSUE CUTTING* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 223 Comp. Anim. Virtual Worlds 2004; 15: 219–227
Ty is the texture keeping the multiplication result, and
Tx has the same layout, which stores the vector x.
Telement keeps all the non-zero (off-diagonal) elements
of the matrix row by row and the index of every row-
starting point in Telement can be found by T index. The
texture coordinates of the corresponding elements in
vector x are stored in Tst. In12, rows were divided into
multiple groups in preprocessing, and each group re-
lated with a certain size of the texture has the equal
number of non-zero elements. The specified fragment
program processes each group. In the interactive simu-
lation, we should try to avoid such trivial work and
construct textures directly. We add one more content to
T index keeping the total number of non-zero elements in
each row, which is inspired from Buck’s presentation.18
Accordingly, in the fragment program, we use two
instructions to check and handle the case where the
pass number of processing is greater than the number of
non-zero elements in every row. To take full advantage
of the 4-channels type of texture element, we rearrange
the content of these textures with two different designs.
MethodA
Textures which have the same layout can be integrated
into one texture. As described above, using RGB chan-
nels of textures, Telement Tst and Tdiagonal T index can be
combined into one respectively (Figure 4). This compact
representation not only saves the texture memory, but
also reduces the number of instructions on texture
fetches. The pass number executed by the fragment
program is decided by the maximum number of non-
zero elements of the sparse matrix rows.
MethodB
In this method, we use all RGBA channels of Telement to
keep the non-zero (off-diagonal) elements. Because one
texture coordinates can decide four elements at one
time, if the number of non-zero elements isn’t a fourfold
number, it will be padded with zeros to let all channels
filled. Tst should provide all four coordinates of the
corresponding elements of x in Tx at one time. As
the original one texture can’t fulfill the requirement,
we use two textures Tst-a and Tst-b instead of Tst to keep
this information (Figure 5).
This strategy has more compact style of non-zero
elements representation. The fragment program is
more complex and a little bit longer. It has 13 more
instructions compared with last one, 5 of which are
texture fetches. The fragment program executes four
multiplications at one time. The current pass result
needs be accumulated with that from the previous
pass kept in a one channel texture. Therefore four
more instructions are used to accumulate these four
components before executing the accumulation between
two consecutive passes.
Implementation andExperiments
We have implemented all these algorithms by VCþþ. In
the pre-processing stage, we build an initial FEM with
both geometric shapes and physical parameters of
tissues. The geometric model of the patient organ was
generated based on CT/MRI images. The physical char-
acters include the boundary conditions of the model
such as the displacement constraints and the external
forces, the initial strains and some material parameters.
Figure 3. The entire process of the sparse matrix-vector
multiplication on the GPU.
Figure 4. The layout of Telement in method A.
Figure 5. The layout of Telement, Tst-a and Tst-b in method B.
W. WU AND P. A. HENG* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 224 Comp. Anim. Virtual Worlds 2004; 15: 219–227
After marking the surgery area interactively on the
mesh to determine the operation part, the non-operation
part and the interface between the two parts, the sub-
matrices K0, P0, K� and P�, which remain constant
throughout the simulation can be calculated in advance.
The data we employed comes from the Visible
Human Project.19 We created a volume tetrahedral
mesh of a short portion of the upper leg (Figure 6) for
the experiment. The thighbone is not shown in the
figures for clarity, thus leaving a hole in the model.
The nodes fixed on the thighbone have no displace-
ments. Three nodes on the surface of the operation area
are imposed with certain forces to simulate the stresses
caused by other organs or forces applied by surgical
tools. These forces can also cause the initial deformation
of the soft tissue. When the scalpel collides with the
tissue and the press is big enough, the cutting operation
occurs (see Figure 7). Figure 8 shows the deformation of
the tissue during the cutting process. For the model of
which the non-operation area has 633 nodes (416 surface
nodes) and 2134 tetrahedrons, the average execution
time for obtaining the displacement of the non-opera-
tion with condensed model is almost four times faster
than that of non-condensed model (Table 1).
Table 2 compares the time of the sparse matrix and
vector multiplication, which run in Conjugate Gradient
solver of our system, on the GPU and the CPU respec-
tively with different degrees of freedom of the operation
area. It’s the time of one multiplication on NVIDIA
GeForceFX 5950 Ultra with 5.2.1.6 driver, 1.5 GHz
Pentium 4 CPU. The size of the linear equation derived
from the operation area model is from 456 to 2190. The
maximum number of non-zero elements per row deci-
des the pass number of the fragment program, and at
each pass the result are added into that in the previous
pass. Because method B makes full use of the 4-channels
character to compactly fill non-zero elements and co-
ordinates indices into the texture, the pass number
required can be remarkably reduced so that time spent
on the data transfer between two consecutive passes can
be saved. That makes method B more efficient than
method A. Along with the increase of the data scale
processed, the performance of the method B on the
GPU can reach 2.15 times faster than the CPU
Figure 6. (a). The operation area mesh. (b). The non-operation
area mesh. (c). The whole volume tetrahedral mesh of the
shorter portion of the upper leg.
Figure 7. (a)–(h) The cutting operation.
INTERACTIVE 3D SOFT TISSUE CUTTING* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 225 Comp. Anim. Virtual Worlds 2004; 15: 219–227
implementation. Even for the method A on the GPU, an
improvement rate of 1.14 times can be achieved. It
proves that GPU implementation shows predominance
in processing large scale data due to its high degree of
parallelism, of which the CPU is lacking.
Conclusion andDiscussion
In this paper, we proposed a hybrid condensed finite
element model with the GPU hardware acceleration for
surgical simulation of tissue cutting and deformation.
The model processes the tissue as two different types.
For the first type of the tissue which is not involved with
any cutting operation, the deformation is calculated
only to the surface nodes, by a condensation technique,
so that no computation is required to a large number of
interior nodes. For the second type of the tissue involved
with the cutting operations, more comprehensive pro-
cess is applied. In our solution, as the new topology
constructs on-the-fly during the cutting procedure, the
stiffness matrix is refreshed to build the new linear
equations for the nodal displacements. Conjugate Gra-
dient iteration is adopted to solve the problem. To
improve the efficiency, we migrate the most computa-
tional intensive part of the calculation onto the GPU for
further acceleration. Two methods of mapping data
onto the GPU have been proposed in accordance with
the application data structure, and as a result, the
simulation can reach an interactive frame rate, even
with topology change to the model in a certain degree
of complexity.
As a future work, the force applied by surgical
instruments shouldn’t be restricted in the operation
area. Some other physical characters can be added to
the model to enhance the realism. On the other hand,
current methods of mapping the calculation on the GPU
are sensitive to the bandwidth of the sparse matrix. As
long as one row has a high bandwidth, a high number of
passes would be required so that the cost could be
sharply increased. In fact, we find that most of rows
have non-zero elements less than twenty. A direct
method to solve this problem is to optimize the mesh
prior to the simulation to make the incident edges for
each node fairly uniform in number. Another possible
solution is to investigate a better layout of texture for
storing the data.
Generalmodel Condensedmodel
Average time 11.372 2.906
Table1. Time of twomodels for 633 systemnodesin the non-operation area (Pentium4, 1.5GHz)
(Source: Time :ms)
DOF Max.Non-zero Max.PassNo. Max.PassNo. Time Time CPUElementNo. (GPUMethodA) (GPUMethod B) (GPUMethodA) (GPUMethod B) Time
456 188 188 47 1.913 1.118 0.622960 188 188 47 1.808 1.023 1.1161137 188 188 47 1.811 1.027 1.5001602 188 188 47 1.837 1.056 1.7311854 190 190 48 1.846 1.057 2.2882100 231 231 58 2.106 1.140 2.3232190 237 237 60 2.196 1.161 2.503
Table 2. Timeofthe sparsematrix-vectormultiplication,whichrunsintheConjugateGradient solverofour systemon theGPUandCPUrespectively, with differentdegrees of freedomof the operation area.ItisthetimeofonemultiplicationonNVIDIAGeForceFX5950Ultrawith5.2.1.6driver,1.5GHzPentium
4CPU. (Source:Time :ms)
Figure 8. The deformation during the cutting process.
W. WU AND P. A. HENG* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 226 Comp. Anim. Virtual Worlds 2004; 15: 219–227
ACKNOWLEDGEMENTS
This work was supported by the Research Grants Council of the
Hong Kong Special Administrative Region under Central
Allocation Grant (project no. CUHK1/00C) and Earmarked
Research Grant (project no. 4356/02E).
References
1. Bro-Nielsen M. Finite element modeling in surgery simula-tion. In Journal of the IEEE 1998; 86(3): 490–503.
2. Picinbono G, Lombardo J-C, Delingette H, Ayache N.Improving realism of a surgery simulator: linear anisotro-pic elasticity, complex interactions and force extrapolation.Tech. Rep. 4018, INRIA, Sep. 2000.
3. Delingette H, Subsol G, Cotin S, Pignon J. A craniofacialsurgery simulation testbed. In Visualization in BiomedicalComputing (VBC’94), 1994; 607–618.
4. Delingette H. Towards realistic soft tissue modeling inmedical simulation. In Proceedings of the IEEE: Special Issueon Surgery Simulation, 1998; 521–523.
5. Bro-Nielsen M, Cotin S. Real-time volumetric deformablemodels for surgery simulation using finite elements andcondensation. In Proceedings of Eurographics’96—ComputerGraphics Forum, 1996; 15: 57–66.
6. Cotin S, Delingette H, Ayache N. Real-time elasticdeformations of soft tissues for surgery simulation. IEEETransactions on Visualization and Computer Graphics 1999;5(1): 62–73.
7. Agarwal P, Krishnan S, Mustafa N, VenkatasubramanianS. Streaming geometric computations using graphics hard-ware. Tech. Rep., AT&T Labs-Research, 2002.
8. Govindaraju N, Redon S, Lin MC, Manocha D. CULLIDE:interactive collision detection between complex models inlarge environments using graphics hardware. In Proceed-ings of Graphics Hardware, 2003.
9. Harris MJ, Baxter WV III, Scheuermann T, Lastra A.Simulation of cloud dynamics on graphics hardware. InProceedings of Eurographics/SIGGRAPH Workshop onGraphics Hardware, 2003.
10. Kim T, Lin M. Visual simulation of ice crystal growth. InProceedings of ACM SIGGRAPH/Eurographics Symposiumon Computer Animation, 2003.
11. Kruger J, Westermann R. Linear algebra operators for GPUimplementation of numerical algorithms. In Proceedings ofSIGGRAPH, 2003.
12. Bolz J, Farmer I, Grinspun E, Schroder P. Sparse matrix sol-vers on the GPU: conjugate gradients and multigrid. InProceedings of SIGGRAPH, 2003.
13. Moreland K, Angel E. The FFT on a GPU. In Proceedings ofGraphics Hardware, 2003.
14. Fung YC. Biomechanics-Mechanical Properties of LivingTissues, 2nd edn. Springer-Verlag; 1993.
15. Gibson SFF, Mirtich B. A survey of deformable modelingin computer graphics. http://www.merl.com, November1997.
16. Kardestuncer H, et al. Finite Element Handbook.McGraw-Hill: New York, c1987.
17. Larsen ES, McAllister D. Fast matrix multiplies usinggraphics hardware. In Proceedings of Supercomputing, 2001;55–60.
18. Buck I, Hanrahan P. Data parallel computation on graphicshardware. Graphics Hardware 2003 Panel: GPUs as StreamProcessors, presentation ppt, 2003.
19. The Visible Human Project. Accessed April, 2004. http://www.nlm.nih.gov/research/visible/visible_human.html(*The texture of the cross section was created from the dataof the Visible Human Project.)
INTERACTIVE 3D SOFT TISSUE CUTTING* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Copyright # 2004 John Wiley & Sons, Ltd. 227 Comp. Anim. Virtual Worlds 2004; 15: 219–227