Update: This post has been revised with findings from GPT-4. Funnily enough, Tom Tunguz reported a similar incident during his testing.
Ever since machine learning started gaining traction, numerous companies have talked about “Democratizing AI”.
However, only one company has actually been able to do it. I’d heard of OpenAI before, when they launched Gym and when their bot beat a professional Dota 2 player at The International in 2017. More recently, thanks to the actual democratization of AI, I’ve had the opportunity to use it for both work and personal projects.
Numbers game
I wanted to know if ChatGPT could decide which phone number it would deem best, given a list of phone numbers. In my initial experimentation, I found that the earlier model (3.5) simply refused to attempt the problem, calling it subjective, and did not try to reason about it at all. GPT-4, however, goes a step further and “thinks” about the problem before responding.
GPT 3.5
To give GPT 3.5 a fair shot, I gave it only one heuristic and asked it to judge the numbers based on that: the fewest unique digits. For example, 9999999999 is a better phone number than 9999999990 because the former has only one unique digit (9) while the latter has two (9 and 0).
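For reference, here is a quick sketch of the heuristic in Python (my own, not anything ChatGPT produced): the count of unique digits is just the size of the set of characters in the number.

```python
# Quick sketch of the heuristic, written by hand for illustration:
# the number of unique digits is the size of the set of characters.
def unique_digit_count(phone_number: str) -> int:
    """Return how many distinct digits appear in the phone number."""
    return len(set(phone_number))

print(unique_digit_count("9999999999"))  # 1 -> only the digit 9
print(unique_digit_count("9999999990"))  # 2 -> the digits 9 and 0
```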
When I asked, it turned out ChatGPT had some trouble counting.
I gave it a different set of phone numbers, but the result wasn’t any different.
At this point, I figured ChatGPT was just bad at math, so I asked it to write a program instead. This time, the result was correct (although the comment in the code snippet it produced was still wrong).
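I haven’t reproduced ChatGPT’s actual snippet here, but the program it needed to write boils down to something like this sketch (the candidate numbers are made up for illustration):

```python
# Minimal sketch of the kind of program I asked for (not ChatGPT's
# actual output): pick the phone number with the fewest unique digits.
def best_phone_number(phone_numbers: list[str]) -> str:
    # min() with a key function returns the number whose set of
    # distinct digits is smallest; ties go to the earliest entry.
    return min(phone_numbers, key=lambda n: len(set(n)))

candidates = ["9999999990", "9876543210", "9999999999"]
print(best_phone_number(candidates))  # 9999999999
```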
GPT 4
My initial sentiment was to stay away from ChatGPT when it came to numbers, but when GPT-4 came out, I gave it an attempt as well, and the results were much better (read: accurate)!
It even goes a step further and “thinks” up its own heuristics for judging a phone number!
AI has never been easier to access and apply in day-to-day lives, and I hope we continue to make more progress on this front.