Thanks for the great article. As an audio designer I’m intrigued by the potential of voice, and I think the quality of content in skills now has to be raised a great deal to match that of podcasts, audiobooks, radio, etc. Most voice skills don’t sound very good yet. That is fine for scenarios where the user just wants to get useful information quickly, but it isn’t suitable for those who want to engage with a skill and spend time with it, perhaps even as entertainment.
One thing I don’t agree with in your article is the ‘side-note’ about synthetic voices. Although it is possible to alter the pitch, timing and other characteristics of a synthetic voice, that definitely isn’t enough to get any tone you want. The difference in expressiveness between a human and a synthetic voice is still huge, regardless of how many SSML tags are applied to the synthetic voice.