Is Social Audio The New Way To Connect?


In a quest to find the best way to connect during isolation, the world discovered the next big thing in social media—social audio apps. Social audio apps allow people to reach out to the rest of the world using just their voices.

Social audio has gained traction since the onset of the pandemic, and a year later, it shows no signs of slowing down. If anything, its growth is accelerating.

People turn to social audio apps because voice gives them an emotional satisfaction that texting and online messaging can’t. Text lacks nuance, which makes it harder to convey the emotion and empathy that are critical to establishing a real human connection. On the other end of the spectrum, video calls can put too much pressure on users, who are constantly aware of their surroundings. Are there family members who might pass by? Are my kids around? Will my dog jump on my lap during an important meeting? Social audio sits right in the middle of these two extremes. Tech industry analyst Jeremiah Owyang named this perfect middle the ‘goldilocks medium’ [1].

Among the first to discover the potential of this middle ground was Clubhouse, a California-based live social audio app. The app does not let users post texts, videos, or photos, or slide into one another’s DMs. On Clubhouse, there is only one thing users need to communicate: their voices. The app is designed so users can casually enter live group conversations with friends or listen to strangers talk about a variety of topics such as sports, fashion, finance, politics, technology, and anything else under the sun. Much of the app’s fame can be attributed to its exclusivity and to the way huge personalities like Elon Musk can suddenly pop into a random conversation. With its Series C funding in April 2021, the company is now valued at $4 billion [2]. Currently, the platform is available for millions of users to download, following the release of its Android version. However, only users who have received an invitation from an existing user can log in and use the app [3].

The popularity of Clubhouse has spurred the biggest social media platforms to integrate social audio features into their existing products. By the end of 2020, Twitter introduced Spaces, which works similarly to Clubhouse [4]. A user can create a Space where other users can hop in and either join the live conversation or just listen in. With billions of active users, Facebook is also on its way to releasing its own version, called Live Audio Rooms, which is expected to roll out by the summer of 2021 [5]. Even LinkedIn has confirmed that it is working on its own social audio experience. However, LinkedIn claims that its version will differ from the competition in that it will be designed to represent its users’ professional identities [6].

Discord, a messaging app that has gone beyond its gaming roots to welcome users with different interests, has also introduced a similar feature called Stage Channels [7]. While this feature attempts to compete with Clubhouse, Discord’s version departs from the ephemeral nature of most social audio apps—Stage Channels do not disappear into oblivion. Users can return to conversations they previously joined anytime they want, unlike typical social audio apps where conversations are permanently deleted once they are over [8].

Discussing the ephemeral nature of social audio apps is complex. We are just beginning to understand their potential, yet their impact has already proven to be polarizing. On the one hand, the social audio experience is the closest users can get to a real-world conversation, where everyone has an equal opportunity to speak. For example, activists and social media influencers in Cuba use Twitter Spaces to facilitate critical conversations about Cuban politics and current events, which would otherwise be impossible through short-form, scattered tweets [9]. Cuban users who post tweets critical of the Cuban government can potentially be harassed, intimidated, or worse, arrested, which is why the ephemerality of social audio is so important.

On the other hand, social audio apps can also make certain groups of users feel unsafe. In September 2020, activist Ashoka Finley moderated an event on Clubhouse to discuss anti-Semitism in Black communities. However, the honest attempt to create a healthy conversation on a divisive topic went south. According to a Jewish listener who attended the event, the conversation “devolved into anti-Semitic comments, one after another” [10].

Given these highlights and lowlights, this rising subindustry will require a multifaceted approach to content moderation [11]. The gaming sector has already started implementing one solution that could address the challenge of moderating social audio apps: recording conversations.

Recorded audio can also be transcribed. For example, in an effort to make Xbox a more inclusive gaming platform, Microsoft equipped Xbox Party Chat with speech-to-text transcription technology [12]. This technology can convert words spoken by the participants in a party into text and vice versa, which is particularly helpful for gamers who are hard of hearing. The existence of this technology means that reviewing content at scale is possible. Moreover, feeding these transcriptions through a natural language processing filter makes it much easier to categorize content into themes and to flag words or phrases that are usually associated with previously flagged content.
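
To make the transcription-and-filter idea concrete, here is a minimal sketch in Python. It assumes the audio has already been transcribed to text by some speech-to-text service, and it stands in for a real natural language processing filter with simple keyword matching. The theme keywords, the watchlist, and the names `review_transcript` and `Review` are all hypothetical and chosen only for illustration; this is not how Xbox, Clubhouse, or any other platform actually moderates content.

```python
import re
from dataclasses import dataclass

# Hypothetical theme keywords and watchlist, for illustration only.
# A production system would rely on curated, regularly reviewed lists
# or a trained classifier rather than a handful of hard-coded terms.
THEME_KEYWORDS = {
    "sports": {"game", "team", "score"},
    "finance": {"stocks", "market", "invest"},
}
WATCHLIST = ["stupid idiot", "shut up"]


@dataclass
class Review:
    themes: list            # themes whose keywords appear in the transcript
    flagged_phrases: list   # watchlist phrases found in the transcript

    @property
    def needs_human_review(self):
        # Any watchlist hit sends the transcript to a human moderator.
        return bool(self.flagged_phrases)


def review_transcript(text):
    """Categorize a transcript into themes and flag watchlist phrases."""
    normalized = text.lower()
    words = set(re.findall(r"[a-z']+", normalized))

    # A theme applies if at least one of its keywords was spoken.
    themes = sorted(
        theme for theme, keywords in THEME_KEYWORDS.items() if words & keywords
    )

    # Match whole phrases only, so "shut up" does not match "shut upstairs".
    flagged = [
        phrase
        for phrase in WATCHLIST
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized)
    ]
    return Review(themes=themes, flagged_phrases=flagged)


if __name__ == "__main__":
    result = review_transcript("Our team won the game. Shut up, you stupid idiot!")
    print(result.themes, result.flagged_phrases, result.needs_human_review)
```

Keyword matching like this misses context, sarcasm, and misspellings, so in practice it would only be a first pass that routes conversations to human moderators rather than a final decision.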

While making sure that all users can express themselves without the fear of being harassed is fundamental to a great user experience, helping users find their way into the communities of people who share their interests is just as crucial. To achieve this, relevant content has to be an easy click away. Content such as this will serve as guideposts for users, and will eventually lead them to like-minded communities. Better content tagging and integration will increase the likelihood of the right content reaching intended listeners, generating higher view counts and increased interaction. 

TaskUs uses AI operations for tasks such as video tagging, photo tagging, and video processing to leverage big data initiatives for content generation and audience consumption insights. If you want to know how TaskUs keeps digital communities safe and how we can help create an awesome live social audio user experience, reach out to us through this link.


  1. Why Social Audio Blew Up. And Why Clubhouse And Twitter Spaces Will Spawn ‘Thousands Of Apps’
  2. Clubhouse closes an undisclosed $4B valuation Series C round, as tech giants’ clones circle
  3. Clubhouse Outlines Plan for Next Stage of Android Roll Out
  4. Twitter: Spaces is here, let’s chat
  5. Be Heard: Bringing Social Audio Experiences to Facebook
  6. LinkedIn confirms it’s working on a Clubhouse rival, too
  7. Discord is launching new Clubhouse-like channels for audio events
  8. How Clubhouse Compares To Discord
  9. Cubans are using Twitter’s live audio platform Spaces to slip past government censors
  10. A Clubhouse conversation has sparked accusations of anti-Semitism
  11. The Challenge of Moderating Audio-First Platforms
  12. Xbox is testing accessible chat options like transcription and speech synthesis

