A developer’s guide to building a WhatsApp chatbot



Long ago, I built a web app for an enterprise (200k+ employees). Feedback on the web app prompted me to do an experiment. What would happen if I were to create a WhatsApp chatbot to solve the specific use case, instead of a web app? Would people like it better? Would it be more useful?

But the story is much bigger than that. Chatbots have so much potential! In this WhatsApp chatbot tutorial, I wanted to share my learning journey, best practices I discovered, and my predictions on the app vs. chatbot question. And beyond: Will chatbots one day replace all traditional apps?

The company I worked for had an office for around 2,000 employees in the middle of downtown Tel Aviv. The parking space is very limited, so people park in spaces intentionally designed for double-parking, where one car is blocking another. In every double-parking space, the first driver parks inside, and the second parks outside.

From a whiteboard to a web app

Before there was an app, there was a big board. All drivers wrote down where they were parked, along with their names and phone numbers. Before leaving the office, a driver had to check whether anyone was blocking their car. If so, they would have to call the other driver and hope that person was not in the middle of a meeting and could come and move their car out of the way.

To make the process better, I created a web app.

The original web app concept, showing rows of inside and outside double-parking spaces with numbers and sometimes names. Those with a name are blue, and those without a name are green.

It was simple. No back end. No server hosting. No database maintenance. Not even any UI framework. No webpack and no JS bundles at all! Just vanilla JavaScript.

It was hosted on GitHub’s free static page hosting. The database was Firebase, so we had real-time updates and JSON support, with no need for a back end.

The user interface was straightforward. Users would see all parking spots and click on an empty one to fill in their details. If they were already parked, it would take the data from the browser’s local storage. If they clicked on a registered slot, they would see the relevant contact details and could choose to call the driver.

It worked great for almost a year. Less than one day of development saved time for many people: a good investment.

From a web app to chatbot

One day, Facebook announced that it was going to release an API for WhatsApp. The next day, my brother bought an Amazon Echo, featuring Alexa. Around that time, I also started to see Google Assistant everywhere.

I started to think that maybe the world was moving toward chatbots, so I should experiment. Would users prefer to use chatbots? Would I need to give less support? Would it introduce any new meta-features simply by leveraging different infrastructure?

I had received some feedback on the regular web app, and I believed a WhatsApp chatbot might address it:

  • The app didn’t work well on some old mobile phones.
  • It didn’t work underground (where the parking is—there is no good mobile signal there).
  • Drivers wished to send messages to the blockers instead of opening the phone dialer.
  • Drivers wanted to get push notifications if someone was blocking them, instead of opening the web app every time before leaving.

It’s important to remember that the developers of chat platforms like Telegram or WhatsApp had worked days and nights for years to ensure the stability of their apps. By building on their platforms and developing only a small engine for answering questions, I could leave the hard work of maintenance to the chat platform developers. All I had to do was dig into how to make a WhatsApp chatbot.

Immediately after I started developing the new parking assistant chatbot, I realized how fantastic the idea was. It was so easy and fast to add new features, and I didn’t even need to do end-to-end testing.

Not only that, I no longer needed a complicated CI/CD process. If it worked in a chat emulator, it would work everywhere. No .apk, no Xcode, no App Store, and no Google Play. The chatbot was able to send messages to users without me needing to register devices, use PubSub or similar services for push notifications, or save user tokens. There was no need for an authentication system either: I was using the user’s phone number as identification.

No signal? No problem. I didn’t need to add offline support using manifest files: WhatsApp gave it to me out of the box. The message would go out soon enough, once the user reached an upper level where reception was better.

Then I realized that every time a chat platform introduced a new feature, my app would immediately benefit from it. Now that’s a really good investment. (To be fair, there is also the risk that new features may limit functionality or introduce breaking changes that require more development effort, so think carefully before implementing business-critical tasks.)

Writing parking assistant, a prototype WhatsApp chatbot

To create a WhatsApp chatbot, the first challenge is to get messages from WhatsApp to your program. The simplest solution I found is to use a shared Twilio phone number. It’s just for development—when moving to production, developers will want to use a dedicated phone number.

Twilio’s free numbers are each shared across many Twilio users. To differentiate an app’s end users from those of other Twilio users’ apps, end users have to send a predefined message to the chatbot.

After a user sends a special message to the shared number, all the messages from their number will be directed to your Twilio account and webhooks. This is why a dedicated number is needed in production—there’s no guarantee that a given user will only want to use one app on a given shared number.

Sending WhatsApp messages

On Twilio’s “Programmable SMS Dashboard,” there’s a “WhatsApp Beta” link in the left-hand navbar:

A screenshot of Twilio's "Programmable SMS Dashboard," showing a graph of recent messages, and another graph of recent errors and warnings. The fourth option in the navbar is "WhatsApp Beta."

Clicking that, developers will see a page with the option “Sandbox.”

A screenshot of the setup step of Twilio's WhatsApp sandbox, waiting for a special WhatsApp message to be sent to a particular number.

To be associated with your app, users need to send one special message to the number Twilio provides. Once they do that, we can start sending messages to them and processing messages from them via Twilio.

Here is an example of sending a message using cURL:

curl 'https://api.twilio.com/2010-04-01/Accounts/{user_account}/Messages.json' -X POST \
  --data-urlencode 'To=whatsapp:+{to_phone_number}' \
  --data-urlencode 'From=whatsapp:+{from_phone_number}' \
  --data-urlencode 'Body={escaped_message_body}' \
  -u '{user_account}:{user_token}'

This is a simple text message. But you can also attach media (images, etc.) to your messages. Here’s a Node.js example:

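// Sends a WhatsApp message through Twilio's REST API.
// `media` is an optional URL of an attachment (an image, for example).
// Uses the global fetch available in Node.js 18+.
async function sendWhatsApp(to, body, media) {
  // Placeholder credentials, joined with ":" for HTTP Basic authentication.
  const auth = "twilio_clientid:twilio_api"
  const sendURL =
    "https://api.twilio.com/2010-04-01/Accounts/{account_id}/Messages.json"
  const res = await fetch(sendURL, {
    headers: {
      Authorization: "Basic " + Buffer.from(auth).toString("base64"),
    },
    method: "POST",
    body: objToFORM(
      JSONRemoveUndefined({
        // Strip dashes and the leading 0, then add the Israeli country code.
        To: "whatsapp:+972" + to.replace(/-/g, "").replace(/^0/, ""),
        From: "whatsapp:+18454069614",
        Body: body,
        MediaUrl: media, // removed below when no media is passed
      }),
    ),
  })
  return res
}

// Converts a plain object to form-encoded parameters.
function objToFORM(obj) {
  const params = new URLSearchParams()
  for (const key of Object.keys(obj)) {
    params.append(key, obj[key])
  }
  return params
}

// JSON.stringify skips undefined values, so a round trip drops those keys.
function JSONRemoveUndefined(obj) {
  return JSON.parse(JSON.stringify(obj))
}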

That’s it: Now we can start sending messages to clients! But it’s important to remember the two most crucial technical limitations of WhatsApp messages:

  1. When the bot receives a message, you can send one text reply for free. Anything more than that costs money.
  2. The bot can send messages to users only during the 24-hour window starting when it receives a message from a user. Outside of this window, the bot can send only messages using approved templates, as we will see later on.
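The second limitation can be checked in code before each send. The following is only a minimal sketch: it assumes the bot stores the time of each user's last incoming message, and the names `canSendFreeform` and `chooseBody` are hypothetical helpers, not part of Twilio's or WhatsApp's API.

```javascript
// Length of the session window described above, in milliseconds.
const SESSION_WINDOW_MS = 24 * 60 * 60 * 1000

// Hypothetical helper: returns true when a free-form message is still allowed,
// given the time of the user's last incoming message (a Date, or null).
function canSendFreeform(lastInboundAt, now = new Date()) {
  if (!lastInboundAt) return false // the user has never written to the bot
  const elapsed = now.getTime() - lastInboundAt.getTime()
  return elapsed >= 0 && elapsed < SESSION_WINDOW_MS
}

// Hypothetical helper: picks the free-form text inside the window,
// and falls back to an approved template body outside of it.
function chooseBody(lastInboundAt, freeformBody, templateBody, now = new Date()) {
  return canSendFreeform(lastInboundAt, now) ? freeformBody : templateBody
}
```

Passing `now` as a parameter keeps the check easy to test without waiting for real time to pass.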

Receiving WhatsApp messages

Sending messages was fairly easy, but receiving and processing messages is even easier.

A screenshot of the "Twilio Sandbox for WhatsApp" page. The Sandbox Configuration section has two endpoint URL fields, for "when a message comes in" and "status callback URL." The Sandbox Participants section lists user ids (in the format "whatsapp:" followed by a phone number) and has the same instructions as before on how to invite friends to the sandbox via sending a special message.

On Twilio’s “sandbox” page, developers can define where Twilio should send messages it receives at the shared WhatsApp number. During development, services like Ngrok or Serveo can provide public URLs that route to local developer machines.

Twilio WhatsApp messages look like this:

{
  "NumMedia": "0",
  "SmsSid": "{sms_id}",
  "SmsStatus": "received",
  "Body": "Example Message from user",
  "To": "whatsapp:+{phone_number}",
  "NumSegments": "1",
  "MessageSid": "{message_sid}",
  "AccountSid": "{account_sid}",
  "From": "whatsapp:+{phone_number}",
  "ApiVersion": "2010-04-01"
}

That is all we need. We can use any programming language to get this message, parse it, and try to understand what the users are asking. This will probably result in some CRUD operations on a database, after which the bot can deliver the appropriate information (or success/fail message) to the user in its reply. Those are the basics of how to create a WhatsApp chatbot.
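As a rough illustration of that flow, here is a minimal Node.js sketch of a webhook receiver. It assumes Twilio delivers the fields shown above as a form-encoded POST body and that the reply is returned as TwiML in the response. The commands "park 12" and "leave" and all function names are hypothetical, and the database operations are left out.

```javascript
const http = require("http")

// Turns a form-encoded POST body into a plain object.
function parseWebhookBody(rawBody) {
  return Object.fromEntries(new URLSearchParams(rawBody))
}

// "whatsapp:+15550001111" -> "+15550001111"; the phone number doubles as the user ID.
function userIdFromSender(from) {
  return (from || "").replace(/^whatsapp:/, "")
}

// Escapes the characters that are special inside XML text.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")
}

// Hypothetical command handling; a real bot would run its CRUD operations here.
function buildReply(message) {
  const userId = userIdFromSender(message.From)
  const text = (message.Body || "").trim().toLowerCase()
  const park = text.match(/^park (\d+)$/)
  if (park) return `Got it, ${userId}. I marked you in spot ${park[1]}.`
  if (text === "leave") return "Thanks! I freed your spot."
  return 'Sorry, I did not understand. Try "park 12" or "leave".'
}

// Wraps the reply text in a TwiML response document.
function toTwiml(reply) {
  return `<?xml version="1.0" encoding="UTF-8"?><Response><Message>${escapeXml(reply)}</Message></Response>`
}

// Creates the server without starting it; call .listen(port) to run it.
function createWebhookServer() {
  return http.createServer((req, res) => {
    let raw = ""
    req.on("data", (chunk) => (raw += chunk))
    req.on("end", () => {
      const message = parseWebhookBody(raw)
      res.writeHead(200, { "Content-Type": "text/xml" })
      res.end(toTwiml(buildReply(message)))
    })
  })
}
```

To try it locally, call `createWebhookServer().listen(3000)` and point the sandbox's "when a message comes in" URL at the public address that tunnels to that port.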

Message templates

As mentioned, chatbots can send freestyle messages only to users who are “currently” interacting with them, i.e., during the 24-hour window. But if you wish to send messages to new users, or outside of the window, you must use pre-approved message templates. This is to prevent spam.

In my use case, I wanted to update drivers when someone was blocking them, even if they weren’t users of the chatbot. In Twilio, click on “sender” and “configure.”

A screenshot of Twilio's "WhatsApp Enabled Senders" page, listing numbers, their business display names, and statuses (one listed is marked Approved, the other, "Waiting for Approval from WhatsApp.")

This is the template I chose:

{{1}} is blocking your exit from the parking lot. I will notify you when they leave.

Several days later, Facebook approved my template, and I could start to send those messages to everyone who had WhatsApp, not just drivers who had sent a message to the chatbot.

Sending a message from a template is exactly like sending a regular message, using the same API. WhatsApp automatically sees that it matches a template and approves the message.
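Because the message body has to match the approved text, it helps to build it from the template string rather than typing it by hand. The sketch below is one possible way to do that: `fillTemplate` is a hypothetical helper, and the driver name "Dana" is made up for illustration.

```javascript
// Replaces {{1}}, {{2}}, ... with the matching entries of `values`.
function fillTemplate(template, values) {
  return template.replace(/\{\{(\d+)\}\}/g, (placeholder, index) => {
    const value = values[Number(index) - 1]
    if (value === undefined) {
      // Fail loudly instead of sending a body that no longer matches the template.
      throw new Error(`Missing value for ${placeholder}`)
    }
    return String(value)
  })
}

// The approved template text from above.
const BLOCKED_TEMPLATE =
  "{{1}} is blocking your exit from the parking lot. I will notify you when they leave."

// "Dana" is a made-up name used only for this example.
const blockedBody = fillTemplate(BLOCKED_TEMPLATE, ["Dana"])
```

The resulting string can then be passed as the message body to the same send call used for regular messages.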

Not only for parking assistant

This strategy is exciting for me when I imagine an online store: Perhaps one day people will be able to buy anything using chatbots. It would be as easy as sending WhatsApp messages and attaching images. Just imagine if users were able to attach real money to each WhatsApp message. It would be very straightforward to buy things. Users would easily be able to purchase anything by speaking with a supplier’s chatbot.

Imagine a chatbot that replaces Waze or Google Maps. You send it a text message with your destination. The chat platform tracks your location, and the chatbot sends you recorded messages that play automatically, giving you spoken navigation directions in real time.

It’s not fantasy. WhatsApp currently supports location sharing in real time—all that they need is the option to autoplay received messages, and voilà.

Think about a Waze chatbot or a taxi chatbot instead of the Gett or Uber apps. You send a message saying where you are, then the taxi arrives, and you pay using WhatsApp. So simple.

Some readers may be thinking, “Don’t users prefer graphical interfaces, and not just typing?” I believe that chatbot platforms will give the chatbot owner the option to send buttons, images, and pure HTML boxes during the conversation. Facebook already supports Webview for Messenger. Users don’t need to install anything, just use their preferred instant-messaging app.

These advantages are why developers are looking to create WhatsApp chatbots to handle important tasks, like giving instant authoritative answers about the coronavirus pandemic, to help curb the spread of misinformation.

TL;DR: 7 conclusions about migrating web apps to chatbots

In summary:

  • Many times, developing a chatbot can cut development time significantly, because there’s no need to design and plan a graphical user interface. (That said, it’s worth looking at the finer points of chatbot UX design before beginning, to learn from the mistakes of others.)
  • It is much easier to add new features to chatbots. Developers don’t need to redesign or change any current elements. The chatbot just needs to start understanding the new type of message.
  • Chatbots are much more accessible by default to people with special needs.
  • No need to customize a cross-platform solution. The chatbot platform does that already.
  • Users trust chatbots much more for sharing information. You don’t need to ask for permission or show warnings. For example, the user can simply choose an image from their gallery and send it to your chatbot, because permission to access the image gallery has already been given to the chat platform.
  • Chat platforms make it easy to handle push notifications. Push notifications are what make the difference between apps that users forget and apps users will engage with regularly.
  • Chat platforms handle moving between offline and online conditions for you.

How to build a WhatsApp chatbot: Parting advice and best practices

The merits of writing a chatbot are pretty clear. Developers who are ready to build one are advised to start small, with a chatbot that understands one message. And handles it well.

Chatbots should stick to short messages. People don’t read long messages. When you have something important to send that can’t be expressed concisely, it’s better to split it into several smaller messages.
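The splitting advice above can be automated. The following is a minimal sketch: `splitMessage` is a hypothetical helper, and the default limit of 160 characters is an arbitrary assumption rather than a WhatsApp rule.

```javascript
// Splits `text` into chunks of at most `maxLength` characters, breaking at spaces.
// A single word longer than `maxLength` is kept whole in its own chunk.
function splitMessage(text, maxLength = 160) {
  const chunks = []
  let current = ""
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? current + " " + word : word
    if (candidate.length > maxLength && current) {
      // Adding this word would overflow, so close the current chunk first.
      chunks.push(current)
      current = word
    } else {
      current = candidate
    }
  }
  if (current) chunks.push(current)
  return chunks
}
```

Each returned chunk can then be sent as its own message, keeping in mind that every extra message may count toward the cost limits described earlier.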

Chatbots with personality are received better. Even some bare-minimum “human speak” goes a long way compared with a “system message formality” approach: “I will update the parking map for you,” instead of, “The database has been updated.” A chatbot should leave the user with the sense that it’s a machine that’s there to serve the user, rather than a black box performing technical operations they may not be in a position to understand.

This WhatsApp chatbot tutorial didn’t get into the specifics of parsing the natural-language messages users will send to chatbots. But aspiring providers of chatbot development services are welcome to peruse the source code of the WhatsApp Parking Assistant bot (particularly hackparkDialogFlow.ts, which accepts requests from the user as actions) to get a feel for how that aspect works.

For a more in-depth article about how to detect different types of user messages—while also following the dependency injection approach to programming—see Toptal’s TypeScript chatbot tutorial.

