Digital Voice Assistants – the voice of your life


What users actually want, however, is something different. Many smart home fans only began to organize their lives and homes because of the voice assistant: their primary wish was to communicate with a machine by voice, not to use smart home technology as such. This goes so far that users wish for a voice assistant that not only has its own personality and name, but also becomes a partner, assistant or even friend. A virtual person!

This wish is not so far-fetched, since the systems are fundamentally conceived as "private personal assistants who can speak", even if today's stage of development cannot yet deliver this in full. Well-known role models include Jarvis from Iron Man, Star Trek, of course, and many others. In Hollywood, the virtual person has long been a successful theme that appears in many films.

Because people place their greatest emotional trust in hearing and voice, Alexa, Google Assistant, Siri and co. are the logical choice for the most human interface possible. Users also understand their environment best through ear and voice. So although voice interfaces are often not strictly required, they give users security and a good sense of control, since the voice is also the most barrier-free communication channel for humans.

Which application should it be?

Admittedly, interacting with voice assistants currently works well only for device control or simple use cases, whereas deeper dialogues usually (still) lead to frustration quite quickly. That is why smart home / living use cases are the real "killer application" for digital voice assistance, although many voice developers do not like to hear that.

The really exciting use cases for smart technology in conjunction with voice assistants are the ones that developers themselves do not necessarily come up with and that are part of everyday life and work. Starting the robotic vacuum cleaner is less exciting than organizing complete processes, even though designing such processes is currently often cumbersome, especially if the configuration of the necessary devices and functions is inconsistent or too cryptic.

When planning such processes, one quickly notices how complex even simple ones can be, and how many errors can occur and have to be intercepted that one had not previously thought of. With the robotic vacuum cleaner, for example, these include getting stuck on cables or under cabinets. A process (routine, sequence) is therefore rarely, if ever, a straight line, but rather a diagram with multiple paths and possibilities. Many graphical user interfaces for designing smart home / living functions already support this idea. Creating such processes by voice alone is currently not possible, not even multimodally (with voice plus a screen). What is possible, of course, is triggering processes by voice or simply outputting information by voice. This always raises the question: in which situations of human life is the voice good enough as an input / output channel? Logically, this differs from case to case, and no general statement can really be made. Nonetheless, there are fundamental criteria that determine whether the use of voice and hearing through voice assistants makes sense in a given situation.
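The idea that a routine is a diagram with several paths rather than a straight line can be sketched in a few lines of Python. This is only an illustration: the step names, the outcomes and the `run_routine` helper are hypothetical and do not correspond to any particular smart home platform.

```python
# Hypothetical vacuum-robot routine modeled as a graph of steps.
# Each step maps a reported outcome to the next step. All names are
# illustrative assumptions, not taken from a real product.
ROUTINE = {
    "start": {"ok": "vacuum"},
    "vacuum": {"ok": "return_to_dock", "stuck": "notify_user"},
    "notify_user": {"freed": "vacuum", "ignored": "abort"},
    "return_to_dock": {"ok": "done"},
}
TERMINAL = {"done", "abort"}


def run_routine(outcomes):
    """Walk the graph, consuming one reported outcome per step."""
    step = "start"
    path = [step]
    for outcome in outcomes:
        if step in TERMINAL:
            break
        transitions = ROUTINE[step]
        if outcome not in transitions:
            # An unplanned outcome is exactly the kind of mistake
            # that has to be intercepted at design time.
            raise ValueError(f"step {step!r} has no path for outcome {outcome!r}")
        step = transitions[outcome]
        path.append(step)
    return path


happy = run_routine(["ok", "ok", "ok"])
rescued = run_routine(["ok", "stuck", "freed", "ok", "ok"])
print(" -> ".join(happy))
print(" -> ".join(rescued))
```

Even this tiny routine has three distinct endings (finished directly, finished after a rescue, aborted), which is why a graphical editor tends to suit the design task better than voice alone.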

Voice assistance that makes sense

The most basic criterion is simply the environment itself. If the environment does not allow speaking and listening without disruption, then voice and hearing are not the tools of choice, unless one works around the problem, e.g. with headphones for the ears and an HBI (Human Brain Interface). There are already first pilot projects that make it possible to read human brainwave patterns while "talking inwardly".


Such an input method could, for example, give a voice back to people who have lost the ability to speak.

The next key criterion for using a speech interface is ergonomics. Depending on the application, it may be more efficient to trigger a sequence by switch, sensor or timer. Voice and hearing are then only the alternative path.

One use of voice assistance that always makes sense is the status query. Familiar variants include asking about the weather, the next train or the remaining time on a timer. However, status queries whose answers are more complex can no longer be delivered comfortably by voice and should instead be presented to the user multimodally, with a screen (the saddle point of multimodality).
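The point at which a spoken answer becomes too complex can be pictured as a simple rule. The following Python sketch is a deliberately crude illustration: the word-count threshold, the channel names and the `choose_output` function are assumptions made for this example, not how any existing assistant decides.

```python
# Hypothetical heuristic for picking an output channel.
# The threshold of 25 words is an arbitrary assumption.
MAX_SPOKEN_WORDS = 25


def choose_output(answer: str, screen_available: bool) -> str:
    """Return 'voice' for short answers, 'voice+screen' for long ones."""
    word_count = len(answer.split())
    if word_count > MAX_SPOKEN_WORDS and screen_available:
        # Complex answer and a display is present: go multimodal.
        return "voice+screen"
    # Short answer, or no display to fall back on.
    return "voice"


timer_answer = "The timer has four minutes left."
timetable_answer = " ".join(["departure"] * 40)  # stand-in for a long timetable

print(choose_output(timer_answer, screen_available=True))
print(choose_output(timetable_answer, screen_available=True))
print(choose_output(timetable_answer, screen_available=False))
```

A real system would more likely look at the structure of the answer (lists, tables, several options) than at its length, but the decision remains the same: past a certain complexity, voice alone stops being pleasant.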

Equally useful are all applications in which the voice becomes a third hand, a wish shared by every craftsman and every parent in the world. Those who do not have their hands free at the moment can, as if by magic, trigger processes through voice assistants in conjunction with the appropriate smart home technology and move things by motor. For people with physical or motor impairments in particular, such applications, for example opening a door, are extremely valuable aids to daily life.

Logically, the availability of voice assistance is the foundation of its use. Ideally, therefore, the digital voice assistant should always be close to the person and, best of all, move with them. Whether the assistant listens constantly or only on instruction is determined by the user through appropriate access settings. The alternative to an assistant carried on the person is a suitably equipped environment in which a voice assistant can listen wherever the desired application needs it. Both variants have their advantages and disadvantages. That said, modern people almost always have their smartphone with them or are near some device that could host the voice assistant.

Jarvis, do you understand me?

Man is a social being; if he cannot live out his social nature, he gets sick. The desire for social contact is therefore given by nature. A digital voice assistant can also serve as a social contact, as a virtual person, insofar as it has correspondingly human traits. There are already developments in which voice assistants are meant to give lonely people a better quality of life through social interaction. What sounds rather perverse in a world as crowded as today's is still better than the alternative of loneliness or illness. In this sense, digital voice assistants can also be the glue between people and society. It is quite possible that some people, as often portrayed in movies, feel more at ease in a virtual society than in a real one. If you take a closer look at some societies, this desire for a virtual society and its fulfillment may not even seem outlandish. Whether this is the right way to a peaceful world, of course, nobody knows.

Luckily, most people want to live in reality and see the digital voice assistant as a practical addition to their lives that they can adjust to their needs. The corporations behind voice assistance systems such as Amazon Alexa or Apple Siri, of course, want to keep their influence on the user as large as possible. For this reason, the corporations do not really want the personality of the digital voice assistant to be highly adaptable. Apple's Siri should always be recognizable as Apple's Siri, just as Alexa should always leave users with the impression that she is on their side. Anyone who really wants to give their personal virtual assistant a personality of their own choosing will have to wait for appropriate services and offers. At present, the presentation of the "brand" is still too important to the corporations for them to let users personalize the character of the voice assistant, even though this was usually the user's original reason for buying a smart speaker with a voice assistant at all.

The development of digital voice assistants is proceeding in large steps, even if this is not always directly visible to the user. Years will pass before the systems have the capabilities shown in the movies. Ultimately, it is also a generational issue: today's users need to learn how to use voice assistants, whereas for tomorrow's users, things that cannot speak will be "broken".

The development itself has not yet passed certain saddle points, although the systems were designed for this from the beginning. Above all, what is meant here is the "intention" (intent) that people convey in their situational feeling, thinking, acting and speaking. Developers are still working on solving problems rather than offering solutions, and thus on the fulfillment of human intention (what the user really wants!). But this is a completely normal development, as with every previous medium. The scientific ambition behind digital voice assistance is far too complex and too demanding for it to turn into a hype. The sale of gadgets that can be controlled by voice, on the other hand, is the hype we currently see on the market, and one that the corporations are happy to serve with great investment and enthusiasm. It does not matter whether the user can really do something with them or not.

Let's stay curious ….


Our experts on

In our reporting, we draw on the deep industry knowledge of renowned experts such as Robert Mendez on the topics of smart home, voice assistants and electromobility. In guest contributions we publish their exclusive assessments and background information.

You can find further expert contributions in our expert directory.

Become an expert now!

Are you also an expert in the field of Smart Home, IoT, Electromobility or Connected Living? We give selected experts the opportunity to share their expertise at homeandsmart with our readers.
