Submitted by x11ry0 t3_126jezi in deeplearning
Hi,
I have to work with sequences of vectors.
Each vector is the spatial pose of a solid, expressed as a translation plus a quaternion for the rotation. The solid's trajectory through space follows some logic but is still too complex to be modeled mathematically. All solids evolve in a similar environment with strong winds and fixed obstacles.
At each step of the trajectory we need to predict the next position so that we know where to point the camera. The camera has a relatively narrow field of view, so the prediction must be accurate enough to keep the object in view.
The camera is not just rotating about a fixed point. It has to capture one of the faces of the solid, so its move is slow enough that we need some prediction ahead of time.
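To make "accurate enough" measurable, one option is to score a prediction by the angle between the predicted and the actual position as seen from the camera, and compare that to half the field of view. This is only a sketch: the function names, the camera position argument and the field-of-view values are assumptions for illustration, not part of the real setup.

```python
import math

def angular_error_deg(camera, predicted, actual):
    """Angle in degrees between the predicted and actual positions, seen from the camera."""
    a = [p - c for p, c in zip(predicted, camera)]
    b = [p - c for p, c in zip(actual, camera)]
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("object position coincides with the camera position")
    cos_angle = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))

def in_field_of_view(camera, predicted, actual, fov_deg):
    """True if a camera aimed at `predicted` would still see `actual` (assumed symmetric FOV)."""
    return angular_error_deg(camera, predicted, actual) <= fov_deg / 2.0

# Hypothetical numbers, only to show the call.
err = angular_error_deg((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
wide_ok = in_field_of_view((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), 100.0)
narrow_ok = in_field_of_view((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), 60.0)
```

A metric like this says directly whether a prediction is usable for aiming, which a plain mean squared error on the coordinates does not.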
A vector is always of length 9:
The solid pose: Tx, Ty, Tz, Qx, Qy, Qz, Qw
2 imposed environmental factors: Pu, Pv, which can have some influence on the trajectory. These are measured just before the prediction and do not need to be predicted.
The position of the vector in the sequence is meaningful, so we could add one more imposed coordinate: the index.
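A minimal sketch of how one step could be packed into a single vector, including the optional index. The helper name `make_step_vector` is hypothetical. Normalizing the quaternion and forcing Qw >= 0 are assumptions added here: q and -q describe the same rotation, so picking one sign keeps the inputs and targets unambiguous.

```python
import math

def make_step_vector(t, q, env, index):
    """Pack t=(Tx, Ty, Tz), q=(Qx, Qy, Qz, Qw), env=(Pu, Pv) and the step index."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("zero quaternion is not a valid rotation")
    q = [c / norm for c in q]
    # q and -q encode the same rotation; keep the representative with Qw >= 0.
    if q[3] < 0.0:
        q = [-c for c in q]
    return [*t, *q, *env, float(index)]

# Hypothetical values, only to show the layout: 3 + 4 + 2 + 1 = 10 numbers.
vec = make_step_vector((1.0, 2.0, 3.0), (0.0, 0.0, 0.0, -2.0), (0.5, -0.5), 7)
```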
The history can contain anywhere from 1 to 100 previous vectors, but roughly 5 of the last ~15 are the most meaningful for predicting the next vector, and not necessarily the most recent ones. These numbers can vary slightly. We could use an analytical algorithm to select in advance a list of 6 possibly meaningful vectors, or keep the last 15 vectors and let the system decide, or even pass in the full history.
To start experimenting, we have ~20 different trajectories of 100 items each. We will probably have a few hundred within a few months.
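The "keep the last 15 vectors and let the system decide" option could be prepared roughly like this. It is a sketch under assumptions: the name `make_windows`, the left zero-padding and the mask are illustrative choices, and the layout assumed is the 9-value vector (7 pose values followed by Pu, Pv). The next step's Pu, Pv are returned as an extra input because they are measured before the prediction, and the target is only the 7 pose values.

```python
def make_windows(trajectory, max_history=15, pose_dim=7):
    """Build (history, mask, next_env, target) samples for next-step prediction."""
    dim = len(trajectory[0])
    samples = []
    for i in range(1, len(trajectory)):
        history = trajectory[max(0, i - max_history):i]
        pad = max_history - len(history)
        # Left-pad with zeros so the most recent vector is always in the last slot.
        padded = [[0.0] * dim for _ in range(pad)] + [list(v) for v in history]
        mask = [0] * pad + [1] * len(history)  # 1 marks a real vector, 0 marks padding
        next_env = list(trajectory[i][pose_dim:pose_dim + 2])  # measured, not predicted
        target = list(trajectory[i][:pose_dim])                # pose to predict
        samples.append((padded, mask, next_env, target))
    return samples

# Dummy trajectory of 100 steps with 9 values each, only to show the shapes.
traj = [[float(i)] * 9 for i in range(100)]
samples = make_windows(traj)
```

With 100 steps this gives 99 samples per trajectory, so ~20 trajectories yield about 2,000 training samples, which is small for a Transformer and argues for starting with simple baselines.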
The goal is to give some of the first vectors to the system and let it predict the next vector. At each step we have the environmental factors available and we need to predict a next move that is realistic. It does not have to be exactly the move that will happen in reality, since we can measure and correct afterwards, but it has to be close enough to point the camera at the right place and capture the object in a relatively narrow field of view.
I was thinking about next-word prediction models, which behave similarly, namely LSTMs and Transformers. I am also considering simple position-aware decision trees and plain neural networks with the index among the input features.
Does this ring any bells about papers or concepts I should explore before testing some implementations?
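The predict, measure, correct loop can be sketched as below. The `constant_velocity_step` function is only a hypothetical placeholder baseline (repeat the last translation delta, keep the last rotation), not a proposed model. Any trained predictor with the same signature could be swapped in, and the function names are assumptions.

```python
def constant_velocity_step(history, env):
    """Placeholder baseline: extrapolate the translation linearly, keep the last rotation.

    `env` (Pu, Pv) is ignored here; a learned model would use it.
    """
    last = history[-1]
    if len(history) < 2:
        return list(last[:7])
    prev = history[-2]
    translation = [2.0 * last[k] - prev[k] for k in range(3)]
    return translation + list(last[3:7])

def rollout(seed, env_per_step, predict=constant_velocity_step):
    """Predict one pose per entry of env_per_step, feeding predictions back as history."""
    history = [list(v) for v in seed]
    predictions = []
    for env in env_per_step:
        pose = predict(history, env)
        predictions.append(pose)
        # In deployment the measured pose would replace the prediction here
        # as soon as it becomes available.
        history.append(pose + list(env))
    return predictions

# Two seed steps moving +1 along x with identity rotation (hypothetical data).
seed = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.1, 0.2],
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.1, 0.2],
]
preds = rollout(seed, [(0.1, 0.2)] * 3)
```

A baseline like this is also a useful reference point: a learned model is only worth its complexity if it beats the extrapolation on held-out trajectories.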
Thanks for any advice!
mmeeh t1_je9hfuo wrote
Why don't you dump this on ChatGPT and get a way more accurate answer?