A stylized computer screen graphic showing the Wordly logo in the center, with arrows extending out to three different graphics of people, each with speech bubbles in different languages (English, French, and German).

Wordly Live AI Translation and Captioning [Review]

Disclaimer: Skift Meetings receives a fee to review this product. The content below is verified for accuracy and aims to provide an objective look at the product’s key features, potential uses, and pricing, among other factors.

Wordly offers a cost-effective AI solution for real-time translation and captioning at live events — whether in-person, hybrid, or virtual.

What Is It?

Wordly is a leading provider of AI-powered live translation and captioning services, with both speech-to-speech and speech-to-text capabilities. In other words, the software can translate speech into either a natural-sounding AI-generated voice or captions (sometimes referred to as subtitles). The tool can also generate same-language captions with virtually zero lag time. Wordly’s translation software can process over 50 different languages.

Wordly can be used at in-person, virtual, and hybrid events, as well as for smaller meetings and webinars. Fully powered by AI, it does not require the use of human interpreters or any special equipment — at in-person events, Wordly’s product can be connected directly into the audio mixer via a laptop, tablet, or smartphone.

A four-part diagram outlining the steps involved in each stage of Wordly's translation system. The text reads as follows:
Presenter - Audio Input, 50+ languages
Event AV System - Audio Mixer or Meeting Platform
Wordly AI System - Real-time Transcription and Translation
Attendees - Audio & Caption Output, 50+ Language Options

Below are icons for personal devices (laptop, tablet, phone) and an in-room display (a large screen).

Main Feature Categories

Live Speech to Speech and Speech to Text Translation

Wordly delivers real-time, high-quality translations as audio, text captions, or both. For those who choose the audio option, translations are delivered in a natural-sounding AI-generated voice that attendees can listen to using personal headphones. For captions, there are multiple display and delivery options, depending on the event format and personal preference. At in-person events, attendees can read them on their phones or on a stage screen (either as a side panel or as a lower third). For virtual events and webinars, attendees can access text translations as an embedded live feed within the event platform, as a lower third on the livestream, or in a separate tab.

Live Speech to Text Captioning

Wordly’s software can also be used to create real-time captioning in the original language of the presentation. This service can be helpful for those who prefer to combine reading and listening. Much like text translations, same-language captions come with multiple display and delivery options.

Admin Portal

Event organizers have access to a back-end portal, providing a centralized place to set up and manage new live translation sessions. This portal offers quick access to upcoming sessions, the account balance, and reports. It also makes it easy to add more translation minutes to the account, with the added services provided instantly. Additionally, the portal allows organizers to create a custom translation glossary and access translation transcripts.

Web-Based Attendee App

Attendees can quickly join a live translation session via a QR code or URL using their computer or mobile device. Because the app is web-based, attendees do not need to create a Wordly account or download anything to access live translation and captions.

A 3-part graphic showing a QR code and link to Wordly's platform for step 1, a window for selecting the translation language in step 2, and icons for headphones and a phone screen with the instructions "Read Captions on Device Use Headset for Audio."
Instructions for joining a Wordly session.

Integration Hub

Wordly works alongside all major video and event management platforms. It has pre-built integrations with Zoom, Cvent, and many other platforms — along with an API to support easy integration with other systems. (It is also possible to use Wordly as a standalone service.)


As the leading provider in the industry, Wordly offers a user-friendly and affordable AI-powered translation and captioning service. The setup is quick and simple. While the speaker should wear a headset or earbuds, there’s no need for specialized equipment. At live events, the organizer (or production team) can connect Wordly to the sound mixer via a laptop, tablet, or phone. Further, Wordly’s customer service team provides onboarding services as well as ongoing assistance.

On the attendee side, there’s never any need to sign in or download the app.

According to Wordly, roughly 50% of clients use their service for in-person events. The audience typically connects on their phones by scanning a QR code or clicking on a link. Sometimes, organizers opt to set up a live display on stage screens as well.

For the other 50%, which are virtual events, hybrid events, or webinars, the most common arrangement is to embed Wordly’s captioning (either through API integration or iFrame) within the event platform or streaming tool.

A split-screen image of a city council member speaking on one side, and the Wordly interface for selecting a language on the other. It also showcases a QR code for Wordly access in the lower third of the city council meeting live stream.
Wordly on Zoom at a city council meeting.

A significant selling point for Wordly is that it comes at a much lower cost than a human translator or interpreter. Additionally, Wordly offers convenient, round-the-clock service; organizers simply schedule a translation/captioning session in the admin portal, with no prior notice required.

Accuracy and Customization

Wordly’s captioning and translation engine offers a high level of accuracy, and organizers have the option to add a customizable glossary to improve results and block inappropriate words. Wordly does not, however, currently disambiguate between different speakers — in other words, the captions don’t signal when a new person begins speaking. With that said, the text appears in “digestible” chunks, so it is usually clear when a new sentence begins.

In terms of the visual presentation, it’s possible to customize the appearance of captions (font size, background color, etc).

Further, the transcripts can be downloaded in multiple file formats and translated into multiple languages. It’s also possible to edit subtitle files (e.g. to add speaker names in parentheses or correct any errors in translation or transcription).

Language Options

Wordly’s packages charge a flat fee that covers all available translation languages (over 50). It’s up to the organizer to choose how many of these language options will appear for the attendee in Wordly’s dropdown menu.

At live events, it’s possible to simultaneously display more than one language translation on the big screen. It’s simply a matter of deciding how to arrange the layout in a way that preserves readability.

Just as it’s possible to translate into multiple languages, it’s also possible to translate from more than one language. For example, one speaker could present in Spanish, and another could present in Mandarin. It’s worth noting, however, that this transition does not happen automatically. The admin or AV tech has to toggle the settings each time the speakers change languages, but the interface makes this switch simple.

An enlarged phone screen is visible on the left-hand side of the image showing the Wordly drop-down menu of language options (Chinese, Dutch, English, etc.). This screenshot is overlaid on a photo of two men looking at a phone together, with one smiling and seeming to give instructions to the other.
Speakers and attendees select from 50+ translation languages.

Bandwidth Requirements and Connectivity

At in-person events, venue WiFi is generally sufficient both to process translations (i.e., to connect the speaker’s audio to the Wordly platform) and to support large numbers of attendee connections. Wordly recommends one megabyte per second to cover audience bandwidth needs, and five megabytes per second for the input connection.

For virtual live streams, the Wordly engine streams in sync with the video stream, so there is no lag.

Who is it for?

Wordly’s translation and captioning services could be useful for any meetings and events where some of the participants don’t speak the same native/first language as the presenter or simply prefer to read or listen in another language.

Wordly works for in-person, virtual, hybrid, and webinar formats. It is currently used by over 1,000 organizations and 2 million users. Common use cases include industry conferences, association meetings, customer webinars, sales kickoff meetings, corporate town halls, partner training, and employee onboarding. Wordly’s customers encompass small and large businesses, universities, nonprofit organizations, industry associations, religious organizations, and governments.

A view of a Dreamforce event from behind the audience, looking ahead at the stage at the far right, a large screen displaying a slideshow in the center, and a small monitor to the left displaying Wordly's live captioning.
Wordly live captions displayed on a monitor (left-hand side) at an in-person event.

Who is it not for?

While Wordly could be used purely for transcriptions, it is designed for live translations and captioning. Those who do not need real-time translations or captioning might consider standalone transcription and translation services.


Language Options

  • Multiple languages – 50+ languages, including the top international business languages
  • AI-generated audio translations – Natural-sounding AI-generated voices are available in male or female options, with nearly instant translation.
  • Flexible display options – Attendees can follow captions/translations on their devices, but there is also the option to display captions on large LED screens at in-person events. There are also customization options for formatting (font size, background color, lower third vs. side screen, etc.).
  • Transcripts – Transcriptions are translatable and available in multiple file formats, including subtitle formats.
  • Same-language captioning – Wordly provides an affordable solution for those who prefer to read while listening.


  • Custom Glossaries – It’s possible to create an unlimited number of custom glossaries, with the ability to “boost,” “block,” or “replace” words. Organizers can add industry jargon, acronyms, abbreviations, and important names to improve the accuracy of transcriptions and translations; additions can be made minutes before the session. Organizers can also identify inappropriate words. The transcription and audio translation will then skip over the blocked words. (Wordly can also provide a suggested list of words to block.) Finally, organizers can set replacement rules for words or names that are likely to be misheard (e.g. replace “IMAX” with “IMEX”). This tool can also be used for possible mistranslations (for example, a coffee industry conference might suggest replacing “judía” with “grano de café” in Spanish).
  • Glossary Service – Although normally self-service, Wordly can provide a custom glossary creation service and on-site support.
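The "block" and "replace" rules described above can be pictured as simple text substitutions on a transcript. The following is a minimal sketch of that idea only. It is not Wordly's implementation or API: the function name `apply_glossary` and its parameters are hypothetical, and "boost" is omitted because it influences speech recognition rather than the finished text.

```python
import re


def apply_glossary(text, blocked=(), replacements=None):
    """Apply hypothetical 'replace' and 'block' glossary rules to a transcript line.

    Illustrative only: this is an assumed model of the behaviour the review
    describes, not Wordly's actual engine.
    """
    replacements = replacements or {}

    # Replace rules: swap a likely mishearing for the intended term.
    for heard, intended in replacements.items():
        pattern = rf"\b{re.escape(heard)}\b"
        text = re.sub(pattern, lambda m, s=intended: s, text, flags=re.IGNORECASE)

    # Block rules: drop the word entirely so it is skipped in the output.
    for word in blocked:
        pattern = rf"\b{re.escape(word)}\b"
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)

    # Collapse any double spaces left behind by removed words.
    return re.sub(r"\s{2,}", " ", text).strip()


print(apply_glossary("Welcome to IMAX Frankfurt", replacements={"IMAX": "IMEX"}))
print(apply_glossary("That was a darn good talk", blocked=["darn"]))
```

The first call mirrors the review's "IMAX" to "IMEX" example; the blocked word in the second call is a placeholder chosen for illustration.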

Integrations and Ease of Access

  • Integrations – Wordly integrates with a wide range of video, streaming, and conference platforms — including Zoom, Teams, Cvent, vFairs, SpotMe, and many more. It also works alongside major event services providers, including Encore and Freeman.
  • No Integrations Required – Wordly can connect directly to the audio mixer, meaning that it’s not necessary to use an event management platform. Attendees would then connect to Wordly’s web-based app in a browser on their personal devices through the QR code or link.
  • Consistent Access – On-demand live translation is available 24/7, from anywhere.
  • Auto-generated links – Links provide easy connectivity, both for attendees and speaker/AV tech teams. Attendees do not need to download the app or sign in. AV tech can have control over individual sessions without access to the full account.
  • Simple Admin Interface – The dashboard provides a snapshot of key information, including minutes left, minutes used, and quick access to set up sessions.
A screenshot of the Wordly dashboard, with an easy-to-scan overview of the available time, scheduled time, and recent usage, as well as quick links to schedule a session.

Scalability and Security

  • Scalability – Wordly supports up to 100,000 users per session. It is able to translate to multiple languages simultaneously for different users.
  • Data Security and Privacy – Wordly offers strong security and privacy protection. The organizer/admin owns all the data. Moreover, the data set is not used to train the model, nor is it even stored. Notably, Wordly meets SOC 2 Type II compliance requirements.


  • Multiple translation and captioning options with high accuracy and near real-time output
  • Translates to over 50 different languages
  • AI-generated audio translations for live events
  • Highly affordable compared to human interpreters
  • Strong security measures with a well-established track record of success
  • Easy setup and far less logistically demanding than organizing human translation services


  • Does not currently disambiguate between different speakers
  • Can only interpret one input language per audio source at a time, although it’s easy to toggle between input languages. That said, it supports multiple audio input sources, and each source can have its own designated language, so an event can have presenters speaking different languages simultaneously.


Wordly is a SaaS solution that is sold as a subscription service. Pricing is based on the size of the package, which is based on the number of hours and attendees. Packages include live translation and captions into 50+ languages, along with text transcripts.

Depending on the number of languages needed for the event, organizers can usually save 75% or more compared to hiring human interpreters to individually translate into each language. For example, to run a 10-hour event for four languages with human interpreters who each charge $150 per hour, the translation services would cost $6,000. The same event with Wordly would only cost $1,500, saving customers 75%.
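The arithmetic behind that comparison can be checked with a short sketch. It assumes, as the example implies, one human interpreter per language billed hourly. The function names are illustrative, and the $1,500 Wordly figure is simply the number quoted above rather than a value derived from a published rate.

```python
def interpreter_cost(hours, languages, hourly_rate):
    """Cost of hiring one human interpreter per language for the whole event."""
    return hours * languages * hourly_rate


def savings_fraction(baseline_cost, alternative_cost):
    """Share of the baseline cost saved by choosing the alternative."""
    return (baseline_cost - alternative_cost) / baseline_cost


human_total = interpreter_cost(hours=10, languages=4, hourly_rate=150)
wordly_total = 1500  # figure quoted in the review's example

print(human_total)                                   # 6000
print(savings_fraction(human_total, wordly_total))   # 0.75
```

Because the human-interpreter cost scales with the number of languages while the Wordly package fee in this example does not, the saving grows as more languages are added.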

Packages start at 10 hours, and customers have 12 months to use the hours. It’s possible to spread the time across multiple events — short or long — with any time used counted in minutes. Volume and non-profit discounts are available.


Wordly is the leading provider of AI-generated live translations and captioning for meetings and business events. The company has over 5 years of experience powering over 300 million minutes of translation at over 50 thousand sessions for over 2 million users. It is used by over 1,000 organizations, including top event organizations like ASAE, ICCA, MPI, and hundreds of global companies.

By leveraging the power of AI, Wordly offers an affordable solution to make events more adaptable to different language needs.

To learn more about Wordly’s many use cases, visit the company website to schedule a demo.
