Tests cues with voice markup <v>.