In recent months, we have written regularly about developments in artificial intelligence, specifically large language models (LLMs). ChatGPT, and large language models in general, give companies an opportunity to build new offerings for customers efficiently. Chatbots, audio processing, and improved search are some of the most common implementations we have seen across the marketplace, and we have built some for our clients.
The release of OpenAI’s public suite of models has brought a predictable, and much deserved, hype cycle. The accessibility and price point of these models have opened to the public a field of artificial intelligence that had primarily been reserved for highly specialized and highly capitalized experts. This hype cycle has not come without side effects: we often see companies trying to use large language models for problems that do not require that level of horsepower.
While large language models and neural networks are easier and cheaper to implement than ever before, traditional machine learning models and old-fashioned linear models are still cheaper, easier to implement, and often more effective. I could butter my toast with a saw. It would likely get people at the dinner table to look at me, but it is also difficult to handle, it might break, and it may or may not butter my toast properly. A butter knife is likely a better choice here, even if it isn’t as loud and does not get the entire dinner table to look my way. This is to say, just because we can does not always mean we should.
When working with clients, we usually work our way up the ladder of complexity instead of down it. Given the excitement around large language models, we see many companies treating them as the place to begin. We treat them as the place to finish, once we have exhausted other options. Simple approaches allow clients to maintain low-cost models and go to market quickly. Only when convinced that a simpler approach does not work do we move up the ladder of complexity and cost. As enticing as it might be to add “LLM” to an investor deck, it is not the tool for every job, and certainly not always the most efficient one. Traditional linear models and machine learning techniques are not obsolete as a result of LLMs.
When given tabular data with a concrete dependent variable, more traditional approaches often offer better results. Why is that? Traditional modeling approaches are built for exactly this setting: given a structured set of predictors X and a known outcome Y, they learn the relationship between the two so that we can predict Y for new observations. LLMs, on the other hand, are built to handle unstructured language, hence the name large language models. They are trained to predict the next word (more precisely, the next token) when given a phrase. For instance, an LLM could easily predict that the next word in the phrase “the quick brown fox jumped over the lazy” will be “dog”. In practice, this is what makes large language models useful in content creation and other more open-ended problems.
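To make the tabular case concrete, here is a minimal sketch of the traditional approach: an ordinary least squares model fit with NumPy. The data is synthetic and the function names (`fit_linear_model`, `predict`) are our own, chosen for illustration. It is not code from a client project, just one simple way to predict Y from structured predictors X.

```python
import numpy as np


def fit_linear_model(X, y):
    """Fit ordinary least squares with an intercept and return the coefficients."""
    # Prepend a column of ones so the first coefficient is the intercept.
    X_design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return coef


def predict(coef, X):
    """Predict outcomes for new rows of X using fitted coefficients."""
    X_design = np.column_stack([np.ones(len(X)), X])
    return X_design @ coef


# Synthetic tabular data (an assumption for illustration):
# two predictors and an outcome with a known linear relationship plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Fit on the first 150 rows, then predict the held-out 50 rows.
coef = fit_linear_model(X[:150], y[:150])
preds = predict(coef, X[150:])
rmse = float(np.sqrt(np.mean((preds - y[150:]) ** 2)))

print("intercept and slopes:", np.round(coef, 2))
print("held-out RMSE:", round(rmse, 3))
```

A model like this fits in milliseconds on a laptop, and its coefficients can be read directly as the estimated effect of each predictor. That is a large part of why we start here before moving up the ladder.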
None of this is to say we do not love OpenAI, large language models, and their explosion in the last year. We view them as a wonderful addition to the suite of tools already at the disposal of a gifted data scientist or data engineer, but they have not replaced that tool set. We have embraced the new tool and will continue to do so, but we will not neglect the tried-and-true modeling methods that are leaner, cheaper, and in many instances more effective.