In June we hosted our third Tech Meetup at Orbit! This time, we were happy to welcome Ferdi Güran, who works as an Innovation Architect at Deepshore GmbH and took us on a deep dive into the world of large language models.
A brief history of chatbots
We started the evening with a brief history of chatbots, introduced by our colleague Ellen. The first chatbot was ELIZA, created by Joseph Weizenbaum in 1966, which simulated a psychotherapist. It was followed in the early 1970s by PARRY, which simulated a person with paranoid schizophrenia. PARRY passed a variation of the famous Turing test: subjects could identify it as non-human in only 48 percent of the test cases. In 1972, there was even a strange conversation between ELIZA and PARRY (read part of the encounter here).
More chatbots followed in the subsequent years and decades, and after the first precursors of personal assistants as well as today's popular assistants such as Alexa and Siri, we have now arrived at ChatGPT.
Learn more about the history of chatbots at onlim.com.
Building a custom knowledge chatbot
After this tour through chatbot history, Ferdi showed us hands-on how to develop a custom knowledge chatbot. It is accessible on the Deepshore website and provides a prompt for searching topics in their knowledge base.
Using embeddings, the desired content (in this case Deepshore's blog posts) is converted into vectors, which are then stored in a vector store. To answer a user's question, the prompt is likewise converted into a vector by the model, so the index can locate it together with the content in the same vector space. The closer two vectors are, the more related their content. This way you can ask “What does k8s stand for?”, and the chatbot deduces the definition from the blog posts and adds a link to the relevant post.
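To make the retrieval step concrete, here is a minimal sketch, not Deepshore's actual implementation: it assumes the sentence-transformers package, and the model name and toy documents are purely illustrative.

```python
# Minimal sketch of embedding-based retrieval (not Deepshore's actual code).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Stand-in knowledge base (in the real chatbot these would be blog posts).
documents = [
    "Kubernetes (k8s) is an open-source platform for container orchestration.",
    "t-SNE projects high-dimensional vectors into two dimensions for plotting.",
    "Vector stores index embeddings so that similar content can be found quickly.",
]

# Convert the content into vectors and keep them as an in-memory "vector store".
doc_vectors = model.encode(documents, normalize_embeddings=True)

# Convert the user's prompt into a vector in the same way ...
query_vector = model.encode("What does k8s stand for?", normalize_embeddings=True)

# ... and find the closest document: with normalized vectors,
# the dot product is the cosine similarity.
scores = doc_vectors @ query_vector
best = int(np.argmax(scores))
print(f"{documents[best]} (similarity: {scores[best]:.2f})")
```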
To make the location of the data in the vector space more tangible, the vectors can be visualized, for example with t-SNE.
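As an illustration (not from the talk), the following sketch assumes scikit-learn and matplotlib; random vectors merely take the place of real document embeddings.

```python
# Illustrative only: visualize (stand-in) document embeddings with t-SNE.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

# Random vectors in place of real embeddings (e.g. 20 documents, 384 dimensions).
rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(20, 384))

# t-SNE squeezes the high-dimensional vectors down to 2D;
# perplexity must be smaller than the number of samples.
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(doc_vectors)

plt.scatter(coords[:, 0], coords[:, 1])
plt.title("Document embeddings projected to 2D with t-SNE")
plt.show()
```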
Check out this blog post from Deepshore for more details on how to create your own chatbot.
Takeaways on data protection
1. When processing data, consider who else gains access to it. With sensitive data, for example, you may want to think twice before having the data or prompts converted into vectors by OpenAI.
2. In addition, most frameworks transmit telemetry data to their vendors by default. To avoid this, you have to opt out explicitly (see the first sketch after this list).
3. To keep the greatest possible control over your data, you can run your own model. Well-trained models are already available for personal and also commercial use. However, this requires quite a bit of computing power, so you have to plan the hardware requirements accordingly (see the second sketch after this list).
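As an example of such a telemetry opt-out, here is a hedged sketch assuming the Chroma vector store (not necessarily the framework from the talk), which collects anonymized telemetry unless it is explicitly disabled:

```python
# Example opt-out, assuming the Chroma vector store: anonymized telemetry
# is on by default and must be disabled explicitly in the client settings.
import chromadb
from chromadb.config import Settings

client = chromadb.Client(Settings(anonymized_telemetry=False))
collection = client.create_collection("blog_posts")
```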
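And as a minimal sketch of running a model locally, so prompts never leave your own hardware: the model below is a deliberately small, openly licensed stand-in so the example runs on a laptop; the capable chat models the talk referred to are far larger and need correspondingly more hardware.

```python
# Sketch of running an openly licensed model on your own machine.
# google/flan-t5-base is a tiny stand-in; larger chat models typically
# need a GPU with plenty of memory.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")
answer = generator("Answer the question: What does k8s stand for?")
print(answer[0]["generated_text"])
```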
Join us at our 4th Meetup
After the deep dive there was open, relaxed networking with delicious pizza and drinks. The next Meetup will take place on September 5th at the Orbit office (Rödingsmarkt 20, Hamburg).
In September we will compare frontend frameworks based on a practical example, followed by a relaxed exchange, of course with tasty food and drinks.