How to Design Apple Music: Cloud Architecture
(To the readers: This blog is an attempt to design a streaming music system. Existing music systems like Spotify, SoundCloud, Rdio, Amazon Music, Deezer, Apple Music, Pandora, etc. are very complex and were developed over a long period of time by smart engineers. This blog tries to simplify the conception and development of a streaming music system so that a smart designer or product leader can easily comprehend it.)
Intro
Music has been an integral part of human life. Research shows that music has a calming effect on babies and even animals. Music is reported to have an emotional impact on living beings (including animals and trees). The advent of the internet, the (radical) P2P music sharing apps like Napster and the still-evolving internet etiquette on music sharing have created a booming business for music creation, playing, listening and sharing.
EchoNest concluded that, though island populations produce disproportionately more music, a large number of tracks are created by artists worldwide and made available through streaming systems. There were 317 billion streams (total tracks played by all music systems for all music lovers) in 2015. SoundCloud is reported to have 130m tracks (including remixes). Streaming music is more popular than music downloads. According to the IFPI 2016 report, downloads went down by 10% while streaming revenue went up by 96%. The total world digital music revenue is $6.7 billion. There are 68 million paid subscribers in the streaming music industry. The sheer numbers make it enticing for new businesses to enter streaming music and for the existing music players to strengthen their offerings.
There are many music players that cater to these needs. Though I focus on music players that are popular in the US, there is a booming business all over the world for local-language music.
Let's have a quick look at these players. The music players come in various shapes and forms. They are available as mobile apps and as desktop apps. Many of them also play in browsers and on home audio systems. The apps are available for the popular operating systems, viz. iOS, Windows, Android and macOS. Users can sign up for free, paid or freemium services. Some of the free services have ads. These players have a good collection of songs and tracks. The library size of these streaming music players varies between 1 million and 130 million songs. The players can stream at different bit rates (kbps).
The commonly heard players are Apple Music, Pandora, Soundcloud, Spotify, Amazon Music, Tidal, Deezer, Rdio, Rhapsody, etc. If you want to know the total list of streaming music players available in USA (according to IFPI 2016 report), go to the bottom of this blog.
Music players use different offerings to acquire customers. Free (ad supported), paid, freemium, offline play, telco-bundles, family packages, advertisement through TV & Radio Ads, free concert tickets and product upgrades are some of the ways. The last feature (i.e. product upgrades) is of particular interest to me as a product leader.
Why are some music systems more popular?
Some of the factors that drive user base are loyalty, quality of songs, size of the song collection, fees, visually appealing mobile app (with good interaction design), ease of use of the system, subscription packages (paid, family, bundled, etc.), live streaming, concert tickets & info, artists (not all players have all the artists).
Additional drivers that influence the choice of a player are the quality of the songs being played (bit rates range from 128 kbps to 320 kbps, or lossless), the playlists available, the relevance of the songs recommended, the ease of sharing with friends, etc. The amount of storage (in megabytes) the player consumes on a mobile device could also be a factor.
The following are some of the features that are available in popular streaming music players:
- Offline play: Users can download the song to the device and play without internet connection
- Radio channel: The player has multiple channels that cater to different needs and moods
- Playlists selected by the company: A default list of songs (curated or otherwise) made available to listeners
- Share playlists with friends
- Lossless compression: Downloaded songs take up less storage than uncompressed audio without losing quality
- Create a personalized playlist every week
- Human curated playlist (picked by music experts)
- Add songs to playlist: If the listener feels that a song is missing in the playlist, they could add to the playlist.
- Buy a song
- Early and exclusive access to new music for a short time
If you are a product manager of a music system, here are some additional ideas you may want to add to your player (I got these ideas based on inputs from a few teenagers):
- Live broadcast of music
- Instantly capture the details of the song being played on the radio; share it; store it for repeat listening (similar to Shazam; read about Shazam below in this blog)
- Lyrics on the screen while the song is playing; access to lyrics
- Remix songs; templates to remix; share with friends
- Temporal and contextual playing of songs: Christmas, morning, running, raining, etc.
- Support features: Listing of live shows, buy tickets; alerts / reminders about shows, share with friends, share live shows with friends with comments
- Memes based on songs; help with making memes using standard filters and templates
- Allow users to modify the lyrics, sing it and share with friends
- Ring tone or song pieces: cut a 20-second clip out of a song for free (to make a meme, to share with friends, to remix, to make a ring tone)
- First time free: Listen to the song first time free
- Karaoke: Sing karaoke and share with others
User behavior
Many online systems that cater to the daily needs of people have to study user behavior. For example, a news site or a music site needs to understand user behavior even more closely than a bank or a travel site does. The user may want to listen to favorite news or music during the early morning drive to work, play favorite videos during an afternoon workout, or listen to favorite music just before retiring for the day. The mood changes multiple times in a day and hence the system has to be designed to cater to that. Knowing this info will help the music player personalize the experience.
Let’s start
Let's design a music streaming system. Let's call it Bliss. Bliss will be a streaming music system that everyone wants. In the modern world, people of all ages, even plants and animals, need music. I want Bliss to be handy and easy to use, to have a large collection of tracks, to be quick to suggest what I want to listen to, to be sensitive to my tastes and moods, to play appropriate music, and to be device agnostic. Bliss will be so attuned to its customers that I will address the listeners as music lovers.
PRD
PURPOSE: Design a streaming music mobile app that transforms its listeners’ moods and makes them happy.
USER PROFILE: Music lovers in the 10-25 age group living in urban areas with decent Internet bandwidth (2 Mbps). The users are currently in school or college, or have entered the workforce just recently. The users will have 0-2 hrs. a day to listen to music. They listen to music to relax, to think or to just be aware of the latest songs. They listen while working / studying / texting / talking / walking / sleeping.
PRODUCT PRINCIPLES: Super easy to download, install, find, play and store the songs.
FEATURE SET
1. Easy to download and easy to use
2. Show playlists and more
3. “Play instantly when I click”; no buffering
4. Prevent fraud in streaming: Do not let bots (or other means) fake streams of songs. For example, a fraudster records mediocre songs, uploads them, and then writes a bot that plays them continuously, forcing the streaming company to pay “per stream”. Also prevent silent songs: a disgruntled (or otherwise motivated) artist could upload silent tracks and ask fans to play them, or write a bot to play them.
5. Load on the system:
   - Amount of music uploaded per min: 12 hrs of music
   - Total tracks available to search and play: 100m
   - # users: 150m/month
   - # artists: 10m
   - # streams per day: 1 billion
   - # events per day: 90 billion (an event is an activity that the system captures, including state transitions; examples of events are: app downloaded, app opened, song selected, browsed, played, skipped, paused, shared with friends, checked friends' comments, etc.)
   - Skip time to register royalty: 30 seconds
6. “Arrange the songs on the screen in the order I like (those I like the most are at the top of the screen; also arrange them contextually)”
7. Examples that create the worst experience for music lovers:
   - “My playlist is missing”
   - “The song I picked did not play”
   - “Song is buffering”
   - “The quality of song is bad”
   - “The music app crashed or takes too long to launch”
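To get a feel for what the load figures in the PRD mean per second, here is a back-of-the-envelope calculation in Python. It uses only the numbers from the PRD and assumes, for simplicity, that traffic is spread evenly over the day. Real traffic has peaks, so treat the results as averages rather than capacity targets.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the "Load on the system" item of the PRD.
STREAMS_PER_DAY = 1_000_000_000
EVENTS_PER_DAY = 90_000_000_000
UPLOAD_HOURS_PER_MINUTE = 12


def per_second(daily_total):
    """Average rate per second, assuming an even spread over the day."""
    return daily_total / SECONDS_PER_DAY


avg_streams_per_sec = per_second(STREAMS_PER_DAY)
avg_events_per_sec = per_second(EVENTS_PER_DAY)
events_per_stream = EVENTS_PER_DAY / STREAMS_PER_DAY
upload_hours_per_day = UPLOAD_HOURS_PER_MINUTE * 60 * 24

print(f"average streams per second: {avg_streams_per_sec:,.0f}")
print(f"average events per second:  {avg_events_per_sec:,.0f}")
print(f"events captured per stream: {events_per_stream:.0f}")
print(f"hours of music uploaded per day: {upload_hours_per_day:,}")
```

On average that is roughly 11,600 streams and a little over 1 million events every second, which is why the event pipeline, more than the audio delivery, is likely to dominate the analytics design.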
Design components
Bliss is a content-heavy application. The design of such an application is discussed in a separate blog. Hence I will not go into the details here.
Nevertheless, here are a few special cases of design suggestions for Bliss.
Storage
The music system needs a master storage that stores all the tracks. The master storage needs to be replicated and geographically distributed. Hosting it on public clouds (Google, Amazon, Microsoft, etc.) may be efficient, cheap and quick to retrieve from. The storage should have a mechanism to add new albums and new tracks. As tracks are added to the master storage frequently, and as replication is a major requirement to reduce latency, I prefer to use Cassandra. Cassandra has a built-in replication feature and stores large amounts of data. Cassandra is also open source, with a huge developer community to support it.
Bliss should also keep the “popular” songs in a cache, which needs to be replicated further.
CDNs should be used to store the tracks and make them accessible to users with low latency. Though some high-profile companies (Amazon, Apple, Netflix) have used custom-made CDNs, off-the-shelf CDNs would serve the purpose well. Bliss is expected to have 100m tracks (as stated in the PRD), though the tracks that are heard often may be only a fraction of that.
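As a minimal sketch of the “popular songs in cache” idea, here is a small least-recently-used (LRU) cache in Python. LRU is my assumption here, not a requirement of the design: it is one simple way to approximate popularity, because tracks that are played often keep getting moved to the fresh end of the cache. The class name `TrackCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class TrackCache:
    """Keeps the most recently played tracks in memory (LRU eviction)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._tracks = OrderedDict()  # track_id -> audio bytes, oldest first

    def get(self, track_id):
        """Return cached audio, or None on a miss (caller goes to master storage)."""
        if track_id not in self._tracks:
            return None
        self._tracks.move_to_end(track_id)  # mark as recently used
        return self._tracks[track_id]

    def put(self, track_id, audio):
        """Insert or refresh a track, evicting the least recently used one if full."""
        self._tracks[track_id] = audio
        self._tracks.move_to_end(track_id)
        if len(self._tracks) > self.capacity:
            self._tracks.popitem(last=False)  # drop the oldest entry
```

On a miss, the caller would fetch the track from the master storage (or the CDN) and `put` it into the cache. A production system would more likely rely on the CDN's own caching or a dedicated cache service than on a hand-written class like this.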
Recommendation
Music could be recommended as single tracks or as playlists. Bliss should have both system-recommended and personalized tracks and playlists.
System-recommended playlists are typically created and curated by music experts hired by Bliss. Creation of such playlists could be automated using content-based filtering, which groups tracks by their own attributes. At a macro level, tracks could be categorized by genre, artist, geography, mood, etc.
Personalized recommendations could be created with collaborative filtering, for example item-item filtering, which recommends tracks that are often heard by the same listeners. Some recommendation engines combine content-based and collaborative techniques into a hybrid method; matrix factorization is a popular way to implement the collaborative part. (To know more about the item-item filtering method and recommendation engines, please read a different blog.)
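To make item-item filtering concrete, here is a minimal Python sketch. It assumes the only signal is which user listened to which track (no ratings, no play counts), and it measures how similar two tracks are with cosine similarity over their sets of listeners. The function names and the data layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(listens):
    """listens: dict mapping user -> set of track ids the user has heard.

    Returns sims[a][b] = cosine similarity between the listener sets of a and b.
    """
    listeners = defaultdict(set)
    for user, tracks in listens.items():
        for track in tracks:
            listeners[track].add(user)

    sims = defaultdict(dict)
    track_ids = sorted(listeners)
    for i, a in enumerate(track_ids):
        for b in track_ids[i + 1:]:
            shared = len(listeners[a] & listeners[b])
            if shared:
                score = shared / sqrt(len(listeners[a]) * len(listeners[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def recommend(user, listens, sims, top_n=3):
    """Rank tracks the user has not heard by similarity to tracks they have heard."""
    heard = listens[user]
    scores = defaultdict(float)
    for track in heard:
        for other, score in sims.get(track, {}).items():
            if other not in heard:
                scores[other] += score
    # Highest score first; ties broken by track id to keep the output stable.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

The pairwise loop is quadratic in the number of tracks, so this only illustrates the idea. With 100m tracks, similarities would have to be computed offline and restricted to tracks that actually share listeners.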
Search
Music lovers should be able to search by track, artist, keywords, genre, time period and context (for example the mood of the listener, a festival or a season). Every track is fingerprinted and has a unique International Standard Recording Code (ISRC). Different recordings, edits, remixes or covers of the same song have different ISRCs.
Identifying the right track that the listener is looking for is important. There are various algorithms available to fingerprint a track. The ideal fingerprinting algorithm should satisfy a few expectations: it should be accurate despite distortion in the songs (due to recordings, remixes, compression, etc.), it should be easy and quick to compute, and it should create a unique fingerprint for the track.
Browse
Music lovers could browse the tracks based on factors like artists, genre, title of the track, album, etc.
Availability
What do music lovers hate the most? They hate it when Bliss starts buffering or when the song is of low quality. Bliss should be intelligent enough to stream a track at the right quality for the available bandwidth. At the same time, Bliss should have a load balancer to manage client traffic.
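Picking the quality based on bandwidth can be sketched in a few lines of Python. The bit-rate ladder below is an assumption (it stays within the 128 kbps to 320 kbps range mentioned earlier, plus a lower fallback), and so is the safety factor that keeps some headroom so that small dips in bandwidth do not cause buffering.

```python
# Hypothetical bit-rate ladder in kbps; a real service would define its own.
BITRATES_KBPS = (96, 128, 256, 320)


def pick_bitrate(measured_kbps, safety=0.8, ladder=BITRATES_KBPS):
    """Return the highest bit rate that fits within a fraction of the bandwidth.

    measured_kbps: recent download throughput measured by the client.
    safety: fraction of the measured bandwidth we allow the stream to use.
    Falls back to the lowest rung when even that does not fit.
    """
    budget = measured_kbps * safety
    fitting = [rate for rate in ladder if rate <= budget]
    return max(fitting) if fitting else min(ladder)
```

For the 2 Mbps user in the PRD, `pick_bitrate(2000)` picks the top rung of 320 kbps, while a client measuring only 200 kbps would be served 128 kbps. The client would re-measure periodically and switch rungs between chunks of the track.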
Shuffling
Shuffling of tracks in a playlist could be achieved by randomly rearranging the songs. But complete randomness may not work for music.
Here is the reason. If there are 50 songs in a playlist from 3 different artists, a completely random re-arrangement of the tracks may end up placing several tracks of the same artist back to back. This may not give a good user experience. Hence identifying different sub-categories and sub-sub-categories within a playlist (like artist, mood, time period, etc.) and re-arranging the songs randomly within a sub-category or sub-sub-category may work well.
Let me explain with an example. Let’s say that the playlist looks like the following:
AAAAAAABBBBBBBCCCCCCC (A, B and C being different artists),
a completely random re-arrangement (using random() in Python / Java; or rand() in PHP; or <random> in C++) may end up as the following:
ABCAAABCCCABBCBCBACBA
Though the above sequence is random enough, the music lover will be forced to listen to artist A three times in a row and then artist C three times in a row, which may hurt the user experience. An example of a sequence that may result in a better user experience is
ABCBCACABACBCABCBACBA
Hence Bliss needs to pick the right shuffling algorithm.
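Here is one possible artist-aware shuffle, sketched in Python. It is not necessarily what any real player does. The idea: shuffle each artist's tracks, then build the playlist one track at a time, always choosing (at random among ties) the artist with the most tracks left who did not play last. The function name and the (artist, title) tuple layout are hypothetical.

```python
import random
from collections import defaultdict


def artist_aware_shuffle(tracks, rng=random):
    """Shuffle (artist, title) tuples while avoiding back-to-back artists.

    rng: the random module or a random.Random instance (useful for testing).
    If one artist has more than half of the tracks, repeats are unavoidable
    and the function falls back to playing that artist again.
    """
    remaining = defaultdict(list)
    for track in tracks:
        remaining[track[0]].append(track)
    for queue in remaining.values():
        rng.shuffle(queue)  # random order within each artist

    result = []
    last_artist = None
    while remaining:
        # Prefer any artist other than the one just played.
        candidates = [a for a in remaining if a != last_artist] or list(remaining)
        most_left = max(len(remaining[a]) for a in candidates)
        fullest = [a for a in candidates if len(remaining[a]) == most_left]
        artist = rng.choice(fullest)
        result.append(remaining[artist].pop())
        if not remaining[artist]:
            del remaining[artist]
        last_artist = artist
    return result
```

Calling it on the 21-track playlist from the example (7 tracks each for A, B and C) gives a different order every time, but never the same artist twice in a row. The same approach extends to mood or time period by changing what is used as the grouping key.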
User & App data analytics
To get the analytics right, to understand user behavior accurately and to provide the best user experience, many data points need to be captured. Here is a short list of data points.
- Monitor the marketplace to track the health of the application and its usage. I would call this use case aggregations. There are a lot of aggregations possible: (i) what the music lovers are listening to, (ii) where they are located, (iii) which client applications they use (iPhone, Android, Windows, etc.), (iv) who the top music lovers are, (v) what the top songs are, (vi) average, sum, count and percent of songs and users, (vii) temporal data (i.e. users per day/hour, etc.)
- Check the state transitions of a single user and his/her usage pattern: which song, when, where, whether he/she skipped, friends, playlists, time of the day/week, etc.
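A few of these aggregations can be sketched in Python over a list of events. The event layout (a dict with `user`, `track`, `client`, `type` and `hour` keys) is hypothetical; at 90 billion events a day the real computation would run on a streaming or batch data platform, but the logic stays the same.

```python
from collections import Counter, defaultdict


def aggregate(events, top_n=3):
    """Compute a few marketplace aggregations from an iterable of event dicts."""
    plays_by_track = Counter()
    plays_by_client = Counter()
    users_by_hour = defaultdict(set)
    type_counts = Counter()

    for event in events:
        type_counts[event["type"]] += 1
        users_by_hour[event["hour"]].add(event["user"])
        if event["type"] == "played":
            plays_by_track[event["track"]] += 1
            plays_by_client[event["client"]] += 1

    played = type_counts["played"]
    return {
        "top_tracks": plays_by_track.most_common(top_n),
        "plays_by_client": dict(plays_by_client),
        "active_users_by_hour": {
            hour: len(users) for hour, users in sorted(users_by_hour.items())
        },
        # Skips relative to plays; 0.0 when nothing was played.
        "skip_rate": type_counts["skipped"] / played if played else 0.0,
    }
```

The per-user view in the second bullet is the same data grouped by `user` instead of by track or hour.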
Preventing misuse and fraud
Streaming music companies pay the artists on a “per-stream” basis.
Let me explain. Every subscriber pays a standard amount, say $10 per month. The streaming company takes a standard cut, say 30%, i.e. $3, as its share. The rest ($7) is distributed to the artists. What is the math for distribution? In simple terms, the streaming company divides the money among all artists based on the number of times their songs were heard.
Here is the math:
Let's say there are 1 million subscribers and they paid $10 each per month. Let's assume there are 3 tracks of artist A (i.e. the names of the tracks are A1, A2, A3), 2 tracks of artist B (B1 and B2) and 4 tracks of artist C (C1, C2, C3, C4). Let's say the subscribers listened to
A1 = 5 million times, i.e. 5 million streams for A1
A2 = 7 million times, i.e. 7 million streams for A2
A3 = 3 million times, i.e. 3 million streams for A3
A1+A2+A3 = 15 million streams
Similarly, let’s say B1+B2 = 7 million streams and C1+C2+C3+C4 = 14 million streams.
Total streams = A1+A2+A3+B1+B2+C1+C2+C3+C4 = 36 million
The per-stream revenue is ($7 * 1 million subscribers) / total streams = $7 million / 36 million = $0.1944.
Artist A gets (0.1944 * 15 million) = $2.917m, B gets $1.361m and C gets $2.722m.
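The same arithmetic in Python, as a small sketch that can be re-run with other numbers. The function name and parameters are my own; the inputs are the ones from the example above.

```python
def pro_rata_payouts(subscribers, monthly_fee, platform_cut, streams_by_artist):
    """Split the artists' pool in proportion to each artist's share of streams.

    Returns (per_stream_rate, payouts) where payouts maps artist -> dollars.
    """
    pool = subscribers * monthly_fee * (1 - platform_cut)
    total_streams = sum(streams_by_artist.values())
    per_stream_rate = pool / total_streams
    payouts = {
        artist: per_stream_rate * streams
        for artist, streams in streams_by_artist.items()
    }
    return per_stream_rate, payouts


rate, payouts = pro_rata_payouts(
    subscribers=1_000_000,
    monthly_fee=10,
    platform_cut=0.30,
    streams_by_artist={"A": 15_000_000, "B": 7_000_000, "C": 14_000_000},
)
print(f"per-stream rate: ${rate:.4f}")
for artist, dollars in payouts.items():
    print(f"artist {artist}: ${dollars / 1e6:.3f}m")
```

Because the pool is fixed at $7 million, every extra stream for one artist lowers the per-stream rate for everyone else.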
But there is a possibility of misuse. One could write a bot (or an automated process) that continuously plays the same track (of an artist), though no one is actually listening to the track, benefiting that artist.
How to mitigate fraud clicks?
Here are some ideas.
- Detect unusual streaming of a single track
- Detect unusual streaming by a single subscriber
- Detect and remove low quality songs (a difficult one to implement, and human intensive)
- Instead of paying per stream, pay the artists “per subscriber who listened to you”. In other words, instead of pooling the money from all subscribers and dividing it among all artists, pay each artist from the money of those subscribers who actually heard that artist
- If a subscriber is a heavy user like a hair salon, gym, dentist, restaurant, etc. (small business), the streaming business may make them pay more (similar to telephone companies restricting usage of “unlimited talking minutes”)
- Untested idea (inspired by Taylor Swift’s WSJ article): The artist specifies the price to listen to the song (the more you listen, the less you pay). The streaming system will charge you no more than a monthly limit and alert you as you approach that limit (a la a cell phone company’s alerts on data usage). Research suggests that gains in streaming sales compared to losses in permanent downloads are revenue-neutral. In other words, there is evidence that the losses in permanent downloads roughly offset the gains in the streaming business. (Note to the Product Manager: This idea may complicate the pricing policy. As a product manager, I feel that pricing policies should be very simple so that subscribers can sign up easily.)
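The first two ideas (unusual streaming of a single track, or by a single subscriber) can be sketched with two simple rules in Python. Both rules and their thresholds are assumptions for illustration: the first uses the 30-second royalty threshold from the PRD to compute how many royalty-bearing streams one listener could physically play back to back in a day, and the second flags subscribers who spend almost all of a busy day on one track. A real system would tune such thresholds on actual data and review flagged accounts before acting.

```python
from collections import Counter, defaultdict

SECONDS_PER_DAY = 24 * 60 * 60
MIN_STREAM_SECONDS = 30  # royalty threshold from the PRD


def flag_suspicious_subscribers(day_streams, min_streams=200,
                                max_single_track_share=0.9):
    """day_streams: iterable of (subscriber_id, track_id) for one day.

    Returns a dict mapping flagged subscriber -> list of reasons.
    min_streams and max_single_track_share are hypothetical thresholds.
    """
    per_subscriber = defaultdict(Counter)
    for subscriber, track in day_streams:
        per_subscriber[subscriber][track] += 1

    # Most 30-second streams that fit in 24 hours of continuous playback.
    physical_cap = SECONDS_PER_DAY // MIN_STREAM_SECONDS

    flagged = {}
    for subscriber, tracks in per_subscriber.items():
        total = sum(tracks.values())
        top_track, top_count = tracks.most_common(1)[0]
        reasons = []
        if total > physical_cap:
            reasons.append("more royalty-bearing streams than fit in one day")
        if total >= min_streams and top_count / total >= max_single_track_share:
            reasons.append(f"{top_count / total:.0%} of streams on track {top_track}")
        if reasons:
            flagged[subscriber] = reasons
    return flagged
```

The same counting, grouped by track instead of by subscriber, would surface tracks whose streams come from a suspiciously small set of accounts.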
I like a song but do not know about it...
What if you hear a song on the radio and instantly like it? You would like to know more about the song but don't know the artist, the album or the name of the song (i.e. you don’t know the metadata of the track). You need a song recognition feature that will listen to the track and identify it for you. Shazam is a music recognition company that invented a simple yet effective method to recognize a track. The algorithm is explained in a paper.
Let me explain how this works in simple words. If you hear an unknown song and want to know more about it, first record the song for 30 seconds. Let's call this the sample-song.
First, Shazam makes a spectrogram of the sample-song. A spectrogram of a song is a three-dimensional graph of frequency vs. amplitude vs. time. Then Shazam picks the peaks of the graph, which are unique to the song. The peak points together form the unique signature. The peaks remain recognizable even if the sample-song is recorded over noise or poorly recorded. Once the signature of the sample-song is identified, Shazam compares this signature with the signatures of millions of stored songs. Once a match is found, Shazam extracts the details of the song for you.
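Here is a toy Python sketch in the spirit of that description. It is a heavy simplification and not Shazam's actual implementation. It assumes the spectrogram is already available as a list of time frames, each a list of amplitudes per frequency bin; it keeps only the single strongest bin per frame as the "peak", and it builds the signature by pairing each peak with the next few peaks. All function names are hypothetical.

```python
def frame_peaks(spectrogram):
    """Return the index of the strongest frequency bin in each time frame."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in spectrogram]


def fingerprint(peaks, fan_out=3):
    """Build a signature: a set of (peak, later_peak, time_gap) triples.

    Pairing peaks makes each element far more distinctive than a lone peak.
    """
    hashes = set()
    for t, f1 in enumerate(peaks):
        for gap in range(1, fan_out + 1):
            if t + gap < len(peaks):
                hashes.add((f1, peaks[t + gap], gap))
    return hashes


def best_match(sample_spectrogram, library):
    """library: dict mapping song name -> fingerprint set of the full song.

    Returns (best_song_name, scores), scoring by number of shared triples.
    """
    sample = fingerprint(frame_peaks(sample_spectrogram))
    scores = {name: len(sample & stored) for name, stored in library.items()}
    return max(scores, key=scores.get), scores
```

Because only the position of the strongest bin matters, moderate background noise in the sample does not change the signature, which mirrors the robustness described above. A real system would keep several peaks per frame, also check that the matching triples line up in time, and index the stored signatures so that millions of songs are not compared one by one.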
Streaming music players in USA
The total list of streaming music players available in USA is (according to IFPI 2016 report):
7digital, Acoustic Sounds, Amazon, AOL Radio Plus, Apple Music, ArtistXite, Beatport, CD Universe, ChristianBook.com, Classical Archives, Classics Online HD, Daily Motion, Deezer, Free All Music, Freegal Music, GhostTunes, Google Play, Groove Music Pass, Guvera, Hastings, HD Tracks, Hoopla, Hulu, iOldies, iTunes, Listen, MetroPCS, Microsoft Music Store, MTV, Music Choice, Naxos, Onkyo, Platinum League, Playster, Pono, Presto Classical, Pro Studio Masters, Pulselocker, Qello, Rhapsody, Rithm, Rokmobile, Shuv, Slacker, SongPop, SoundCloud, Spotify, Sprint, TheOverflow, TIDAL, Verizon Wireless, Vessel, VEVO, VidZone, Virgin, Yahoo! Music, YouTube
Side note: Why did I name the system Bliss?
Many systems do not have inspiring names. Names like Apple Music, Amazon Music, Microsoft Music Store, etc. are too bland. Spotify, Rdio, Deezer, etc. are just good wordplay.
I think music has a purpose. It soothes the mind, makes you calm and brings pleasure. It brings pure ecstasy to music lovers. Music can also bring harmony among humans. History shows that music has been used as a medium to bring social change. Hence music impacts humans and makes them blissful. People find great joy in music. Listening to music is pure bliss, hence the name. (A cyber squatter has taken the Bliss domain name.)
Conclusion
Music is very dear to living things. There is a lot of innovation yet to arrive in the streaming music player industry to bring the best user experience. The profit potential and the potential impact on the human mind are so high that more players and more consolidation will be seen in the near future.
Also, people commonly listen to music in one language. Does this mean that there is no market for foreign-language music, or that a huge untapped market exists?
This blog focuses exclusively on audio music streaming players. I did not pay attention to video streaming music players. There is a big white space that needs to be addressed. More innovations in bandwidths, compression and decompression technologies, smart transmission of video, efficient storage, intelligent video finger-printing technologies, will arrive soon to make the video streaming music industry take off.
Thanks for following my Blog.
Nashet Ali
Senior Software Development Engineer.