My suggestion:
Use getCameraMatrix to get the x, y, z position of the camera.
Use getWorldFromScreenPosition on the cursor's screen position, with depth set to the maximum distance you want to reach, to get the cursor's 3D position at that distance.
Use processLineOfSight between the camera and cursor coordinates to find the 3D position of the point on the vehicle's surface that the cursor is pointing at.
Transform that surface point from world space to the vehicle's local space with the help of getElementMatrix. This one is a bit tricky: the element matrix makes going from local to world space easy (multiply the matrix by the vector), but going the other way (world to local) requires multiplying the vector by the inverse of that matrix, and there doesn't seem to be a built-in function for inverting a matrix in MTA. I could write one in Lua, since it's not very complicated, but it's still easy to make a mistake doing it.
Luckily, vehicles have a scale of 1 and no shearing (unless that glitch with Monster truck wheels occurs, which does exactly that kind of crap to other vehicles). That makes the rotation matrix (the 3x3 part of the element matrix) orthogonal, which in turn makes its inverse equal to its transpose.
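For the first three steps, something like the sketch below should work on the client side. The helper name getCursorHitPoint is made up for illustration. I'm assuming getCursorPosition returns the cursor position as relative 0..1 values (and false when the cursor is hidden), so it gets scaled by guiGetScreenSize before going into getWorldFromScreenPosition. processLineOfSight is called with its default flags, which as far as I know include vehicles.

```lua
-- Hypothetical helper (not an MTA built-in): returns the world-space point
-- under the cursor and the element that was hit, or nil if nothing was hit.
function getCursorHitPoint(maxDistance)
    -- Assumed: relative 0..1 cursor coordinates, false if the cursor is hidden
    local cx, cy = getCursorPosition()
    if not cx then return nil end

    -- Step 1: camera position (the first three return values)
    local camX, camY, camZ = getCameraMatrix()

    -- Step 2: cursor's 3D position at maxDistance from the screen
    local sw, sh = guiGetScreenSize()
    local farX, farY, farZ = getWorldFromScreenPosition(cx * sw, cy * sh, maxDistance)

    -- Step 3: first thing the camera-to-cursor line runs into
    local hit, hx, hy, hz, hitElement =
        processLineOfSight(camX, camY, camZ, farX, farY, farZ)
    if not hit then return nil end

    return hx, hy, hz, hitElement
end
```

You would still want to check that the returned element is actually the vehicle you care about, since the line can hit a building or another vehicle first.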
This is my attempt to transform the coordinates:
-- wx, wy, wz are the coordinates in world space
local m = getElementMatrix(vehicle)
-- offset of the point from the vehicle's position (row 4 of the matrix)
local rx, ry, rz = wx - m[4][1], wy - m[4][2], wz - m[4][3]
local lx = m[1][1]*rx + m[1][2]*ry + m[1][3]*rz
local ly = m[2][1]*rx + m[2][2]*ry + m[2][3]*rz
local lz = m[3][1]*rx + m[3][2]*ry + m[3][3]*rz
Unless I screwed something up, lx, ly, lz should be wx, wy, wz relative to the vehicle. By transforming the cursor's 3D position like that, you can get the cursor's position with respect to the vehicle, which sounds like what you need.
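One way to check the math without starting the game is a round trip: push a local point into world space, then bring it back. The sketch below is plain Lua with a hand-made matrix. worldToLocal and localToWorld are names I made up, and the matrix layout (rows 1 to 3 are the right, forward and up axes, row 4 is the position) is what I assume getElementMatrix returns.

```lua
-- Local -> world: the easy direction, position plus offsets along each axis.
function localToWorld(m, lx, ly, lz)
    return lx*m[1][1] + ly*m[2][1] + lz*m[3][1] + m[4][1],
           lx*m[1][2] + ly*m[2][2] + lz*m[3][2] + m[4][2],
           lx*m[1][3] + ly*m[2][3] + lz*m[3][3] + m[4][3]
end

-- World -> local: subtract the position, then apply the transposed rotation.
-- Only valid while the 3x3 part is orthogonal (scale 1, no shearing).
function worldToLocal(m, wx, wy, wz)
    local rx, ry, rz = wx - m[4][1], wy - m[4][2], wz - m[4][3]
    return m[1][1]*rx + m[1][2]*ry + m[1][3]*rz,
           m[2][1]*rx + m[2][2]*ry + m[2][3]*rz,
           m[3][1]*rx + m[3][2]*ry + m[3][3]*rz
end

-- Made-up test matrix: axes rotated 90 degrees around Z, positioned at (10, 20, 30).
local m = {
    { 0, 1, 0, 0},   -- right
    {-1, 0, 0, 0},   -- forward
    { 0, 0, 1, 0},   -- up
    {10, 20, 30, 1}, -- position
}

local wx, wy, wz = localToWorld(m, 1, 2, 3)
local lx, ly, lz = worldToLocal(m, wx, wy, wz)
assert(lx == 1 and ly == 2 and lz == 3, "round trip failed")
```

If the assert passes, the transpose trick gives back the original local point for that matrix. In game you would pass getElementMatrix(vehicle) as m instead of the hand-made table.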