Read DingTalk recognition text when text.content is empty, and add a handler-level regression test for voice transcript delivery.
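
A minimal sketch of the fallback and the handler-level check, not the actual patch. It assumes the voice transcript arrives under content.recognition in the incoming payload; the names extract_message_text and handle_message, and the deliver callback, are hypothetical and used only for illustration.

```python
from typing import Any, Callable, Mapping


def _as_mapping(value: Any) -> Mapping[str, Any]:
    """Return value if it is a mapping, otherwise an empty dict."""
    return value if isinstance(value, Mapping) else {}


def extract_message_text(payload: Mapping[str, Any]) -> str:
    """Return the message text, falling back to the recognition transcript."""
    # Prefer the regular text body when it is present and non-blank.
    text = _as_mapping(payload.get("text")).get("content")
    if isinstance(text, str) and text.strip():
        return text.strip()

    # Assumption: the speech-to-text result is stored under content.recognition.
    recognition = _as_mapping(payload.get("content")).get("recognition")
    if isinstance(recognition, str):
        return recognition.strip()
    return ""


def handle_message(payload: Mapping[str, Any], deliver: Callable[[str], None]) -> None:
    """Hypothetical handler: forward non-empty text to the deliver callback."""
    text = extract_message_text(payload)
    if text:
        deliver(text)


def test_voice_transcript_is_delivered() -> None:
    """Handler-level regression check: a voice payload must reach deliver."""
    delivered: list[str] = []
    voice_payload = {
        "msgtype": "audio",
        "text": {"content": ""},
        "content": {"recognition": "turn on the lights"},
    }
    handle_message(voice_payload, delivered.append)
    assert delivered == ["turn on the lights"]
```

Running the check at the handler level, rather than against the extraction helper alone, would catch a regression where the transcript is parsed correctly but dropped before delivery.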