It certainly expects you to react and speak within a second; if you don’t, you have to click Record off and on again. With one answer, Italian to English, it wanted “Mosquito” for “Le zanzare”, and with one English to Italian, I clearly and swiftly answered “Saremo” and it gave “San Remo”! So I popped back to entering text, with no problems at all.
Edit: Today I said “Convento” clearly and in my best Italian, but it recorded “col vento” and then “con vento”. Getting too many red crosses, so I may give it a rest 😊 just for a while.