Coordinate systems
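FreeSurfer uses at least 4 voxel coordinate systems and 4 "RAS" coordinate systems. The diagrams below show how they are related; the numbered arrows are the transforms referred to in the text.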
0. Stored in volume file     
     original volume        ========>     RAS ("scanner RAS", c_(r,a,s)!=0 in general)
          |                                |
          | 1. calculated                  | identity
          |                                |
          V       2. calculated            V
   conformed volume         ========>     RAS (to have the same c_(r,a,s) as above)
      L^3 with S mm voxel                |
          |                                |
          | identity                       | 3. calculated (translation)
          |                                |
          V       4. fixed(standard)       V
   conformed volume         ========>    SurfaceRAS with c_(r,a,s) = 0
      L^3 with S mm voxel                
The functional analysis (fMRI) part uses further coordinate systems to map from the source volume.
     original volume        ========>    RAS ("scanner RAS", c_(r,a,s) != 0 in general)    
          |                                |
          | identity                       |  calculated
          V       5. fixed("standard")     V
     original volume        ========>   tkregRAS where c_(r,a,s) = 0 
          |                                |
          | 6. calculated                  |  mri2fmri (registration will give this)
          |                                |
          V        7. fixed("standard")    V                        
     overlay volume         ========>    fRAS where c_(r,a,s) = 0
With this many coordinate systems, tracing a surface or functional index back to the voxel index in the original source volume is rather difficult. Following the arrows in the diagrams above gives the chain of transforms needed: compose the transforms along the path, inverting any arrow that is traversed backwards.
Transform 2 (CORONAL coordinates) is calculated so that the following equation holds, i.e. the center voxel (L/2, L/2, L/2) of the conformed volume maps to c_(r,a,s). The direction cosine part is fixed; only the translation part (s1, s2, s3) is calculated. In this way, the conformed volume is always in the CORONAL orientation.
              [-1  0  0 s1][S 0 0 0][L/2]   [c_r]       s1 = c_r + S*L/2
              [ 0  0  1 s2][0 S 0 0][L/2] = [c_a]  ==>  s2 = c_a - S*L/2
              [ 0 -1  0 s3][0 0 S 0][L/2]   [c_s]       s3 = c_s + S*L/2
              [ 0  0  0  1][0 0 0 1][ 1 ]   [ 1 ]
where c_(r,a,s) is taken from the "scanner RAS". The axes of this "scanner RAS" have the physical meaning of the Right, Anterior and Superior directions of the head.
Transform 4 (surfaceRASFromConformedVoxel) is fixed: it is transform 2 with c_(r,a,s) = 0, that is
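As an illustration of the equation above, here is a minimal sketch that builds transform 2 and checks that the center voxel lands on c_(r,a,s). The function names, the sample c_(r,a,s) values, and the choice S = 1, L = 256 are assumptions made for the example, not taken from the FreeSurfer sources. Plain Python lists are used to keep the sketch dependency-free.

```python
def conformed_vox2ras(c_ras, S=1.0, L=256):
    """Transform 2: conformed voxel index -> scanner RAS (CORONAL)."""
    c_r, c_a, c_s = c_ras
    half = S * L / 2.0
    # Translation solved from the requirement that voxel (L/2, L/2, L/2)
    # maps to c_(r,a,s).
    s1, s2, s3 = c_r + half, c_a - half, c_s + half
    # Fixed direction cosines multiplied by the isotropic voxel size S.
    return [[-S, 0.0, 0.0, s1],
            [0.0, 0.0, S, s2],
            [0.0, -S, 0.0, s3],
            [0.0, 0.0, 0.0, 1.0]]


def apply(m, p):
    """Apply a 4x4 matrix to a 3-vector using homogeneous coordinates."""
    v = list(p) + [1.0]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


c_ras = (-1.5, 20.25, 7.0)          # hypothetical scanner-RAS center
xform2 = conformed_vox2ras(c_ras)   # S = 1 mm, L = 256 (assumed values)
center = apply(xform2, (128, 128, 128))
print(center)
```

The printed point should coincide with c_ras, which is exactly the constraint that fixes s1, s2 and s3.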
              [-1  0  0  S*L/2][S 0 0 0]
              [ 0  0  1 -S*L/2][0 S 0 0]
              [ 0 -1  0  S*L/2][0 0 S 0]
              [ 0  0  0    1  ][0 0 0 1]
Transform 1 (conformedVoxelFromVoxel) is calculated as the matrix product
     xform1 = inv(xform2) * xform0  = [1/S 0  0  0][-1  0   0  s1] * xform0
                                      [ 0 1/S 0  0][ 0  0  -1  s3] 
                                      [ 0  0 1/S 0][ 0  1   0 -s2]
                                      [ 0  0  0  1][ 0  0   0  1 ]   
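The closed-form inverse of xform2 used above can be checked numerically. The sketch below is only an illustration: the xform0 matrix is a hypothetical 256^3, 1 mm volume whose axes already point along R, A and S, and the values of S, L and c_(r,a,s) are made up for the example.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(m, p):
    """Apply a 4x4 matrix to a 3-vector using homogeneous coordinates."""
    v = list(p) + [1.0]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


S, L = 2.0, 128                      # hypothetical conformed geometry
c_r, c_a, c_s = 3.0, -12.0, 5.5      # hypothetical c_(r,a,s)
half = S * L / 2.0
s1, s2, s3 = c_r + half, c_a - half, c_s + half

xform2 = [[-S, 0, 0, s1], [0, 0, S, s2], [0, -S, 0, s3], [0, 0, 0, 1]]

# Closed form from the text: inverse scaling times inverse rigid part.
inv_scale = [[1 / S, 0, 0, 0], [0, 1 / S, 0, 0], [0, 0, 1 / S, 0], [0, 0, 0, 1]]
inv_rigid = [[-1, 0, 0, s1], [0, 0, -1, s3], [0, 1, 0, -s2], [0, 0, 0, 1]]
inv_xform2 = matmul(inv_scale, inv_rigid)

# Hypothetical stored transform 0 whose center voxel maps to c_(r,a,s).
xform0 = [[1, 0, 0, c_r - 128], [0, 1, 0, c_a - 128],
          [0, 0, 1, c_s - 128], [0, 0, 0, 1]]
xform1 = matmul(inv_xform2, xform0)

round_trip = matmul(inv_xform2, xform2)              # should be the identity
conformed_center = apply(xform1, (128, 128, 128))    # should be (L/2, L/2, L/2)
print(conformed_center)
```

Because the original and conformed volumes share the same c_(r,a,s), the center voxel of the original volume is carried to the center voxel of the conformed volume.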
Transform 3 (SurfaceRASFromRAS) is calculated as the matrix product below. Note that it is a pure translation by -c_(r,a,s), independent of the conformed voxel size S and the length L:
     xform3 = xform4 * inv(xform2) = [ 1 0 0  -c_r]
                                     [ 0 1 0  -c_a]
                                     [ 0 0 1  -c_s]
                                     [ 0 0 0    1 ]
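The independence from S and L can be seen numerically. The following sketch, with hypothetical function names and sample values, forms xform4 * inv(xform2) for two different conformed geometries and compares the results.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def surface_ras_from_ras(c_ras, S, L):
    """Transform 3 built explicitly as xform4 * inv(xform2)."""
    c_r, c_a, c_s = c_ras
    half = S * L / 2.0
    s1, s2, s3 = c_r + half, c_a - half, c_s + half
    xform4 = [[-S, 0, 0, half], [0, 0, S, -half], [0, -S, 0, half], [0, 0, 0, 1]]
    # Closed-form inverse of xform2 (inverse scaling folded into each row).
    inv_xform2 = [[-1 / S, 0, 0, s1 / S],
                  [0, 0, -1 / S, s3 / S],
                  [0, 1 / S, 0, -s2 / S],
                  [0, 0, 0, 1]]
    return matmul(xform4, inv_xform2)


c_ras = (4.0, -10.5, 22.0)                    # hypothetical c_(r,a,s)
a = surface_ras_from_ras(c_ras, 1.0, 256)     # 1 mm voxels, 256^3
b = surface_ras_from_ras(c_ras, 2.0, 100)     # 2 mm voxels, 100^3
for row in a:
    print(row)
```

Both matrices should reduce to the identity rotation with translation (-c_r, -c_a, -c_s), since the S*L/2 terms of xform4 and xform2 cancel.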
Because xform3 changes only the translation part, SurfaceRASFromVoxel (xform3*xform0) is easy to calculate and is given by
     SurfaceRASFromVoxel = [  3x3 part   (t1 - c_r)]
                           [  same as    (t2 - c_a)]
                           [  xform0     (t3 - c_s)]
                           [    0            1     ]
where t1,t2,t3 are the translation part of xform0. 
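A short sketch of this shortcut follows. It assumes, by analogy with the conformed-volume equation, that c_(r,a,s) is the scanner RAS of the voxel at (width/2, height/2, depth/2) of the original volume. The xform0 matrix, the dimensions and the function names are hypothetical.

```python
def apply(m, p):
    """Apply a 4x4 matrix to a 3-vector using homogeneous coordinates."""
    v = list(p) + [1.0]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


def surface_ras_from_voxel(xform0, c_ras):
    """Copy xform0 and subtract c_(r,a,s) from its translation column."""
    out = [row[:] for row in xform0]
    for i in range(3):
        out[i][3] = xform0[i][3] - c_ras[i]
    return out


# Hypothetical sagittal-like acquisition: 256 x 256 x 160 voxels,
# 1 x 1 x 1.25 mm, with made-up translations t1, t2, t3.
xform0 = [[0.0, 0.0, -1.25, 95.0],
          [-1.0, 0.0, 0.0, 140.0],
          [0.0, -1.0, 0.0, 120.0],
          [0.0, 0.0, 0.0, 1.0]]
dims = (256, 256, 160)
center_voxel = [d / 2.0 for d in dims]

c_ras = apply(xform0, center_voxel)           # assumed definition of c_(r,a,s)
surf = surface_ras_from_voxel(xform0, c_ras)
surface_center = apply(surf, center_voxel)
print(c_ras, surface_center)
```

The 3x3 part of the result is untouched, and the center voxel of the original volume maps to the SurfaceRAS origin, consistent with c_(r,a,s) = 0 in that system.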
Transforms 5 and 7 are calculated from the requirement that the center voxel of the volume maps to the origin:
              [-1  0  0 s1][xsize  0     0    0][width/2 ]   [0]
              [ 0  0  1 s2][  0  ysize   0    0][height/2] = [0]
              [ 0 -1  0 s3][  0    0   zsize  0][depth/2 ]   [0]
              [ 0  0  0  1][  0    0     0    1][    1   ]   [1]
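Here the name "RAS" loses its physical meaning completely: this "RAS" is used for alignment purposes only. The width, height and depth are those of the volume in question (the original volume for transform 5, the overlay volume for transform 7). The physical meaning is lost because the direction cosines are fixed to the CORONAL form even though the original volume could be sagittal or horizontal.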
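Solving the requirement above row by row gives s1 = xsize*width/2, s2 = -zsize*depth/2 and s3 = ysize*height/2. The sketch below builds the resulting matrix for a hypothetical 64 x 64 x 30 overlay volume with 3.5 x 3.5 x 4 mm voxels; the function name and the sample geometry are assumptions for illustration only.

```python
def tkreg_vox2ras(dims, sizes):
    """Transforms 5 and 7: voxel index -> tkregRAS / fRAS with center at 0."""
    width, height, depth = dims
    xsize, ysize, zsize = sizes
    # Translations solved from the center-maps-to-origin requirement.
    s1 = xsize * width / 2.0
    s2 = -zsize * depth / 2.0
    s3 = ysize * height / 2.0
    # Fixed CORONAL direction cosines multiplied by diag(xsize, ysize, zsize).
    return [[-xsize, 0.0, 0.0, s1],
            [0.0, 0.0, zsize, s2],
            [0.0, -ysize, 0.0, s3],
            [0.0, 0.0, 0.0, 1.0]]


def apply(m, p):
    """Apply a 4x4 matrix to a 3-vector using homogeneous coordinates."""
    v = list(p) + [1.0]
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(3)]


dims = (64, 64, 30)          # hypothetical overlay dimensions
sizes = (3.5, 3.5, 4.0)      # hypothetical voxel sizes in mm
xform7 = tkreg_vox2ras(dims, sizes)
origin = apply(xform7, (32, 32, 15))
print(xform7[0][3], xform7[1][3], xform7[2][3], origin)
```

For a 256^3 volume with 1 mm voxels the same function yields translations (128, -128, 128), which matches transform 4 with S = 1 and L = 256.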