ChatGPT. It is the hot topic of 2023. Whether you have embraced OpenAI with open arms, pretended it doesn’t exist, or landed somewhere in between, you have undoubtedly heard about it, and everyone is asking about it.
At Tiber Solutions, we are embracing the technology with open arms. Some of our developers use it to debug misbehaving code or even produce original code faster. Others are building generative AI products with our customers.
Before diving too deep into the future, it is important to understand where we have been, where we are, and why. Textual data has lagged behind numerical data for years because numerical data is inherently easier to visualize and display. For decades, we have organized household budgets, statistics from our favorite sports teams, and business sales data in tables. Today, Excel and SQL databases are commonplace. Modern computers can read numerical data and efficiently store it in this tabular form.
Textual data, on the other hand, is a bit trickier to house and manipulate. Spreadsheets are far less suitable. In order to “read” textual data, machines need it converted to numbers. Large language models do this by converting words into vector embeddings: long sequences of numbers that uniquely represent words. For humans, however, the numerical representation of a word is completely unreadable. OpenAI’s GPT series of models translate prompts to numbers, complete the prompt in numerical form, and translate the completion back into human-readable form. At its core, this is not a new technique, but OpenAI has managed to throw more data and more computing power at the problem than ever before, constructing the most powerful models to date.
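The idea of mapping words to vectors can be sketched with a toy lookup table. Real models learn embeddings with hundreds or thousands of dimensions; the words and numbers below are hand-picked purely for illustration and are not taken from any actual model:

```python
# Toy illustration of word embeddings: each word maps to a vector of
# numbers. Real embeddings are learned and far higher-dimensional;
# these tiny hand-picked vectors are illustrative only.
toy_embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.8, 0.6, 0.9],
    "apple": [0.1, 0.9, 0.5],
}

def embed(sentence):
    """Convert a sentence into a list of vectors, one per known word."""
    return [toy_embeddings[w] for w in sentence.lower().split()
            if w in toy_embeddings]

vectors = embed("King queen")  # two 3-dimensional vectors
```

A model operates entirely on sequences of vectors like these, then translates its numerical output back into words for the reader.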
For data scientists, software engineers, and developers alike, this innovation is likely very unsettling. With this fear, however, comes an equal and opposite reaction: excitement. As one piece of the industry gets automated, a brand new field comes alive. As debugging and code generation get easier with the help of AI, prompt engineering, cost control, and output customization come to the forefront of a developer’s mind.
The output from OpenAI is revolutionary, but it is not a specialized finished product at the enterprise level. At Tiber, we look at this technology much like a beautifully built home. The foundation is laid and the walls are up, but there is no furniture and there are no appliances. Everyone has a different style, and the look and feel of the interior design might suit one family well but not another. Many businesses are undoubtedly using large language models, but their use cases differ from one business to the next. The engineers who grow alongside this technology, learn to manage the models, and effectively become world-class interior designers will become invaluable to businesses looking to leverage artificial intelligence.
To illustrate the need for oversight and proper configuration of OpenAI’s models, we used OpenAI to build a simple chatbot that answers basic sports statistics questions for people. We asked Tiberbot the following question:
Without any model configuration, GPT-3.5’s chat-based model gave a sufficient answer, but a longer one than we needed. By adding brief instructions at the beginning of the conversation, we got the following response to the same question from Tiberbot:
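Instructions like these are typically supplied as a system message at the start of the conversation. The sketch below shows the shape of such a payload; the instruction wording and the sample question are our own illustrative placeholders, not Tiberbot’s actual configuration, and the API call itself is omitted:

```python
# Sketch: constraining a chat model's responses with an up-front
# system message. The instruction text and question are illustrative;
# a real application would send this list of messages to a chat
# completion endpoint.
def build_messages(user_question):
    return [
        {"role": "system",
         "content": ("You are Tiberbot, a sports statistics assistant. "
                     "Answer with the number only, no extra prose.")},
        {"role": "user", "content": user_question},
    ]

messages = build_messages("How many points did the team score last night?")
```

Because the system message rides along with every request, keeping it short while still pinning down the response format is part of the configuration work described here.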
Notice here we only got a brief numeric answer. Given the context, however, this shorter response sufficiently answers the question we asked Tiberbot. While this nuanced detail may seem irrelevant at first glance, changes like this can save companies delivering products built on ChatGPT quite a bit of money when using OpenAI’s models. Users of the GPT series of models are charged by the “token”. While tokens are not exactly the same as words, they are highly correlated. Longer prompts and responses require more tokens and therefore incur more cost. By shortening the response to use fewer tokens here, we were able to cut our cost per API call significantly.

Much in the same way that architects and engineers leveraging cloud computing must manage costs on AWS, Azure, and GCP, cost management and proper configuration of solutions are paramount to a successful integration of OpenAI’s technology. The best interior designers know how to tailor the aesthetic of a home to the tastes of a client while staying within a budget. The same can be said for the data professionals building artificial intelligence products for clients, for employers, or for new innovations hitting the market.
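The savings compound quickly at scale. A back-of-the-envelope calculation makes the point; the per-1K-token rate and the token counts below are illustrative placeholders, not quoted OpenAI prices or measured values:

```python
# Back-of-the-envelope token cost comparison. The rate and token
# counts are illustrative placeholders, not actual OpenAI pricing.
RATE_PER_1K_TOKENS = 0.002  # placeholder rate, in dollars

def call_cost(prompt_tokens, completion_tokens, rate=RATE_PER_1K_TOKENS):
    """Approximate cost of one API call, charged per token used."""
    return (prompt_tokens + completion_tokens) / 1000 * rate

verbose = call_cost(50, 400)  # unconstrained, long-winded answer
brief = call_cost(80, 10)     # instructions add prompt tokens but
                              # shrink the completion dramatically

# Over a million calls, the per-call difference adds up.
savings_per_million = (verbose - brief) * 1_000_000
```

Even though the added instructions make the prompt slightly longer, the much shorter completions dominate, which is why trimming response length is one of the first levers to pull when controlling spend.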
The future is bright for the data space, as innovation continues to disrupt the industry. While unsettling, new advances in artificial intelligence are opening new doors for skilled developers to walk through and continue to provide value.