Generates a text-to-speech message from a selected voice name and message text, and outputs it as an MPEG Audio Layer III (MP3) stream.
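
A caller may want to sanity-check that the returned bytes look like MP3 before playing or saving them. The sketch below is one way to do that; the helper name `looks_like_mp3` is hypothetical and not part of this interface. It assumes the stream starts at an ID3v2 tag or at a frame boundary, and it only inspects the leading bytes.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic check on the first bytes of a stream (hypothetical helper).

    Returns True if the data begins with an ID3v2 tag or with an MPEG
    audio frame header whose layer bits indicate Layer III.
    """
    # An ID3v2 metadata tag, when present, starts with the ASCII bytes "ID3".
    if data[:3] == b"ID3":
        return True
    if len(data) < 2:
        return False
    # An MPEG audio frame header begins with an 11-bit sync word:
    # all 8 bits of the first byte plus the top 3 bits of the second byte.
    has_sync = data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
    # Bits 2-1 of the second byte encode the layer; the value 0b01 means Layer III.
    is_layer3 = ((data[1] >> 1) & 0x03) == 0x01
    return has_sync and is_layer3
```

This is a heuristic only: it does not validate bitrate or sample-rate fields, and it will reject a stream that is joined mid-frame.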