Since it uses text-to-speech, I see no technical difficulty with implementing it. I think it would be very useful for getting extra practice with difficult vocabulary or words from readings.
On a similar note, the same applies to the listening mode: I personally don’t use it, but I could see people wanting it.