An interactive TikTok Live robot that engages with the audience in real time, replying to comments with custom TTS voices and augmenting its responses with long-term memory stored in a vector database powered by Zep, so interactions feel personal and human-like.
Background Project|TikTok was blowing up with a trend called "checking spiritual guides" (khodam). People would drop their names in a host’s livestream, and the host, acting like a shaman, would “reveal” their spiritual guide just from the name. Honestly, I thought it was ridiculous… yet somehow the host kept getting loads of gifts. Social media never fails to surprise me. So, out of curiosity, I made a bot that does the exact same thing, just to see if it could grab viewers’ attention too.
Kino AI Version 1|This is the exploration version, where I was simply figuring out what’s actually needed to build a livestream bot from scratch.
Tech Stack V1 – Next.js detail|Builds the bot’s public-facing page.
Tech Stack V1 – Tikfinity detail|WebSocket tool for capturing TikTok livestream comments in real time.
Tech Stack V1 – OpenAI detail|Handles the AI logic to answer questions or generate responses.
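Comment Parsing Sketch V1|A minimal sketch of how raw WebSocket messages from the comment-capture tool could be turned into typed comments before they reach the AI step. The message shape (`event`, `data.nickname`, `data.comment`) is an assumption for illustration, not taken from the Tikfinity documentation, and `parseChatMessage` is a hypothetical helper.

```typescript
// Hypothetical shape of a comment once parsed; field names are assumptions.
interface ChatComment {
  user: string;
  text: string;
}

// Parse one raw WebSocket message.
// Returns null for anything that is not a non-empty chat comment.
function parseChatMessage(raw: string): ChatComment | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  // Assumed envelope: { event: "chat", data: { nickname, comment } }
  const msg = parsed as { event?: unknown; data?: unknown };
  if (msg.event !== "chat") return null;
  if (typeof msg.data !== "object" || msg.data === null) return null;

  const data = msg.data as { nickname?: unknown; comment?: unknown };
  if (typeof data.nickname !== "string" || typeof data.comment !== "string") {
    return null;
  }

  const text = data.comment.trim();
  return text.length > 0 ? { user: data.nickname, text } : null;
}
```

Dropping malformed and non-chat messages at this boundary keeps the rest of the pipeline from having to re-check the payload.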
Result V1|Boooooring. The bounce rate was high because the bot wasn’t giving viewers anything exciting. Without interesting or surprising responses, they quickly lost interest and scrolled away. On top of that, the async process could only handle one question at a time, and each OpenAI request often took 1–5 seconds, forcing viewers to stare at a loading screen. That only made them more impatient, with complaints like “why isn’t my comment being read!” I only got 1–5 viewers lol.
Kino AI Version 2|Learning from V1, I added a queue system so the bot can process multiple comments in parallel, so no more blank screens. I also removed the “name only” rule so viewers can ask anything, introduced an anime-style character for personality, and added TTS to make the stream feel alive.
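Queue Sketch V2|The queue idea above can be sketched as a small concurrency-limited job queue. This is an illustration under assumptions, not the actual Kino AI code: `CommentQueue`, `Job`, and the concurrency limit are hypothetical names and choices.

```typescript
// A job is any function returning a promise,
// e.g. "ask the model about one comment, then speak the reply".
type Job<T> = () => Promise<T>;

class CommentQueue {
  private readonly concurrency: number;
  private readonly waiting: Array<() => void> = [];
  private active = 0;

  constructor(concurrency: number) {
    this.concurrency = Math.max(1, concurrency);
  }

  get activeCount(): number {
    return this.active;
  }

  get waitingCount(): number {
    return this.waiting.length;
  }

  // Enqueue a job. The returned promise settles with the job's own result.
  add<T>(job: Job<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.waiting.push(() => {
        this.active++;
        // Wrapping in a promise turns a synchronous throw into a rejection.
        new Promise<T>((start) => start(job()))
          .then(resolve, reject)
          .then(() => {
            this.active--;
            this.drain();
          });
      });
      this.drain();
    });
  }

  // Start waiting jobs until the concurrency limit is reached.
  private drain(): void {
    while (this.active < this.concurrency) {
      const run = this.waiting.shift();
      if (!run) return;
      run();
    }
  }
}
```

With a limit of, say, 3, a slow 5-second request no longer blocks the next comments: up to three are answered at once and the rest wait their turn instead of being dropped.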
Tech Stack V2 – Next.js detail|Builds the bot dashboard and public-facing page.
Tech Stack V2 – Tikfinity detail|WebSocket tool for capturing TikTok livestream comments in real time.
Tech Stack V2 – OpenAI detail|Answers viewers’ questions.
Tech Stack V2 – Applio detail|A simple, high-quality voice conversion tool focused on ease of use and performance.
TTS Exploration|Applio needs a clean voice dataset to train its custom TTS, so I used TopMediai to generate some sample paragraphs. However, the result wasn’t what I wanted: the silence at commas and periods was too long, which made the speech sound unnatural. To fix this, I used FFmpeg to automatically trim the excessive silence and produce smoother audio.
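Silence Trimming Sketch|One way to do the FFmpeg trimming described above is the `silenceremove` audio filter. The sketch below only builds the argument list; the exact filter settings used for Kino AI are not known, so `buildTrimSilenceArgs`, the 0.3 s duration, and the -40 dB threshold are assumptions to tune by ear.

```typescript
interface TrimOptions {
  // Silence up to roughly this length is kept; longer pauses get cut.
  keepSilenceSeconds: number;
  // Audio quieter than this level (in dB) is treated as silence.
  thresholdDb: number;
}

// Build the ffmpeg argument list for trimming long pauses from one file.
function buildTrimSilenceArgs(
  inputPath: string,
  outputPath: string,
  opts: TrimOptions
): string[] {
  const filter =
    "silenceremove=" +
    [
      // A negative stop_periods applies trimming to silence throughout
      // the clip, not only at the end.
      "stop_periods=-1",
      `stop_duration=${opts.keepSilenceSeconds}`,
      `stop_threshold=${opts.thresholdDb}dB`,
    ].join(":");

  // -y overwrites the output file if it already exists.
  return ["-y", "-i", inputPath, "-af", filter, outputPath];
}
```

In Node.js the result could be passed to `spawn("ffmpeg", args)` from `child_process`, once per generated sample.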
Result V2 – Pros text|It caught a lot of attention: my live viewers jumped to 50–100 per stream. I barely received any gifts or diamonds, but going from just 1–5 viewers is still solid progress. I also often gained over 100 new followers by the end of each stream.
Result V2 – Cons text|I had no idea how unpredictable and sometimes toxic anonymous viewer questions could be. I initially thought the questions would stay lighthearted and fun, but quickly realized that with anonymity comes a lot of unexpected or inappropriate prompts. I tried blocking certain keywords and even banning some viewers, but it didn’t work effectively. Inappropriate and sexual questions still slipped through. It became clear that simple filtering wasn’t enough to maintain the positive vibe I wanted, so I decided to change the bot’s character from a human figure to a robot figure instead.
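Keyword Filter Sketch|A minimal sketch of the kind of keyword blocking described above, and of why it leaks. The blocklist entries are placeholders, not the words actually blocked on stream, and `isBlocked` is a hypothetical helper.

```typescript
// Placeholder blocklist for illustration only.
const BLOCKLIST: string[] = ["badword", "spamlink"];

// Naive filter: flag a comment if it contains any blocked word as a substring.
function isBlocked(comment: string, blocklist: string[]): boolean {
  const lower = comment.toLowerCase();
  return blocklist.some((word) => lower.includes(word));
}

// Caught: plain spelling, any casing.
const caught = isBlocked("this is a BadWord", BLOCKLIST);

// Missed: character swaps and spacing defeat substring matching.
const missedLeet = isBlocked("this is a b4dw0rd", BLOCKLIST);
const missedSpaced = isBlocked("this is a b a d w o r d", BLOCKLIST);
```

Every new spelling needs a new rule, while a viewer only needs one variation that is not on the list, which is why filtering alone could not keep up.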
Kino AI Version 3 – Intro|In this version, I focused on building features to earn some diamonds.
Tech Stack V3 – Next.js detail|Builds the bot dashboard and public-facing page.
Tech Stack V3 – Tikfinity detail|WebSocket tool for capturing TikTok livestream comments in real time.
Tech Stack V3 – Applio detail|A simple, high-quality voice conversion tool focused on ease of use and performance.
Tech Stack V3 – Flowise detail|A low-code visual tool for building and managing AI workflows, making it easier to connect APIs, models, and processing steps without complex coding.
Tech Stack V3 – Zep Memory detail|A long-term conversational memory service that stores and retrieves interactions, enabling the bot to remember context across sessions.
Tech Stack V3 – OpenAI detail|Provides embeddings for memory storage and retrieval (via Zep) and generates the responses sent to viewers.
Tech Stack V3 – Three.js detail|A JavaScript 3D library used to create and render interactive 3D visuals in the livestream overlay.
Tech Stack V3 – Socket.IO detail|Enables real-time communication between my PC and an external device, allowing me to manage and control the livestream remotely.
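Memory Retrieval Sketch V3|The embedding-based memory lookup mentioned in the Zep and OpenAI entries above can be illustrated with a tiny in-memory version. This is not Zep’s API and uses toy 2-dimensional vectors instead of real embeddings; `StoredMemory`, `cosineSimilarity`, and `recall` are hypothetical names that only show the idea of ranking stored interactions by similarity to the new comment.

```typescript
interface StoredMemory {
  text: string;        // what the viewer said earlier
  embedding: number[]; // vector representation of that text
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK stored memories most similar to the query vector.
function recall(
  query: number[],
  memories: StoredMemory[],
  topK: number
): StoredMemory[] {
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(query, memory.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.memory);
}
```

The recalled texts would then be added to the prompt, so the bot can refer back to what a returning viewer said in an earlier stream.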
Result V3|Although the viewer count didn’t change much (still around 50–100 viewers), I received a good number of gifts, most of them from the polling feature. I typically earned 50–900 diamonds per livestream, with each session lasting at least 3 hours. I still can’t compete with the “Shaman” I mentioned earlier, but I think this is good progress.
Blocker paragraph|TikTok has gotten stricter lately. I used to leave Kino AI running while I did other things, even while sleeping, but now a captcha challenge appears at random times, and I can’t always complete it. This led first to a 1-day ban, then a 1-week ban, and finally a 1-month ban. I think these frequent bans are why my viewer count has dropped.