ChatGPT-4 Is a Smart Analyst, Unlike GPT-3.5

ChatGPT has been trending on social media platforms. It crossed one million users in just a week. For those who haven't heard of ChatGPT, it's a large language model trained by OpenAI. In simple words, it's a chatbot that answers your questions with responses that can sound human-like. It's an impressive machine learning solution. With the release of GPT-4, we can rely on it over Google Search for learning about any topic.

Update: This article now includes a review of GPT-4's responses.

Why ChatGPT-3.5 Isn't Smart Enough, but GPT-4 Is

You can't trust ChatGPT-3.5 when preparing for any certification or exam. It's a big NO if you think you can rely on ChatGPT-3.5 to answer questions in a telephonic interview round. Yes, I know it's cheating even to use Google for this, but I wanted to give a WARNING because many people do it, and many social media influencers have posted about how to leverage ChatGPT-3.5 to crack interviews. After the release of GPT-4, I can confidently say that it will be a useful resource for exam preparation. GPT-4 is a huge improvement over GPT-3.5. It can stun you with its ability to create human-like responses with very high factual precision and creativity.

I asked ChatGPT a few questions related to statistics and programming. See the responses below.

Before digging into questions related to analytics, let's begin with a very generic question that does not require any special skill to answer.

See the comment shown in the image above. Do you seriously think you can rely on ChatGPT-3.5 for anything after this? It says what you want to hear. It does not correct you even when you are completely wrong.

Update: GPT-4's response was informative; it simply said that this question has no relevance, which is correct. See the response below.

The number of letters in a word doesn’t necessarily reflect its meaning or significance. The word "cool" and "intelligent" both have their origins and meanings that are independent of the number of letters they contain.

Q1. How is the c statistic calculated?

A. percent concordant + (1.5 * percent tied)
B. percent concordant + (0.5 * percent tied)
C. percent discordant + (0.5 * percent tied)
D. percent discordant + (1.5 * percent tied)

Correct Answer : B
ChatGPT-3.5 : A

Surprisingly, if you phrase the same question differently, the answer is totally different.

Next I gave ChatGPT-3.5 the statement "c statistic = percent concordant + (0.5 * percent tied)". It replied that this is correct. Does that sound human? Maybe yes: human-like nervousness in an interview.

Update: GPT-4 : B, which is the correct answer.
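
Option B can be checked by counting pairs by hand. The sketch below is a minimal Python illustration, not the procedure any particular statistical package uses; the function name `c_statistic` and the scores are made up for this example. It pairs every event with every non-event and calls a pair concordant when the event has the higher predicted score, tied when the two scores are equal, and discordant otherwise.

```python
def c_statistic(event_scores, nonevent_scores):
    """Return (concordant, discordant, tied, c) over all event/non-event pairs."""
    concordant = discordant = tied = 0
    for e in event_scores:
        for n in nonevent_scores:
            if e > n:
                concordant += 1
            elif e < n:
                discordant += 1
            else:
                tied += 1
    total = concordant + discordant + tied
    # c = proportion concordant + 0.5 * proportion tied (option B)
    c = (concordant + 0.5 * tied) / total
    return concordant, discordant, tied, c


# Hypothetical predicted probabilities, for illustration only.
events = [0.9, 0.7, 0.4]
nonevents = [0.8, 0.4, 0.2, 0.1]
concordant, discordant, tied, c = c_statistic(events, nonevents)
print(concordant, discordant, tied, round(c, 4))
```

With these made-up scores, 9 of the 12 pairs are concordant and 1 is tied, so c = 9/12 + 0.5 * 1/12, roughly 0.79.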

Q2. Does multicollinearity influence the predictions or the precision of the predictions?

Correct Answer : No, multicollinearity does not influence the predictions or the precision of those predictions.
ChatGPT-3.5 : Yes, multicollinearity can influence the predictions made by a model and the precision of those predictions.

Update: GPT-4 : Multicollinearity affects the coefficients and p-values, but it does not influence the predictions, precision of the predictions, and the goodness-of-fit statistics. If your primary goal is to make predictions, and you don’t need to understand the role of each independent variable, you don’t need to reduce severe multicollinearity.
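
One way to build intuition for this answer is the limiting case of perfect collinearity. The Python sketch below is a toy illustration under that assumption, not a proof of the general claim; the data, the coefficients, and the helper `predict` are all hypothetical. When x2 is an exact copy of x1, only the sum of the two coefficients matters, so very different coefficient pairs give identical predictions.

```python
def predict(intercept, b1, b2, x1, x2):
    """Linear predictions for a model with two predictors."""
    return [intercept + b1 * u + b2 * v for u, v in zip(x1, x2)]


# Hypothetical data: x2 duplicates x1, i.e. perfect collinearity.
x1 = [1, 2, 3, 4]
x2 = list(x1)

# Three coefficient sets that all satisfy b1 + b2 = 5.
preds_a = predict(1, 2, 3, x1, x2)
preds_b = predict(1, 5, 0, x1, x2)
preds_c = predict(1, -95, 100, x1, x2)

print(preds_a)
print(preds_a == preds_b == preds_c)
```

The individual coefficients cannot be pinned down (2 and 3, 5 and 0, or -95 and 100 all fit equally well), which matches the point that multicollinearity affects coefficients while leaving predictions alone.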

Q3. Does oversampling of rare events affect sensitivity and specificity in a rare event model?

Correct Answer : Oversampling does not affect sensitivity and specificity measures but false positive and negative rates are affected.
ChatGPT-3.5 : Oversampling of rare events can potentially affect sensitivity and specificity in a rare event model.

Update: GPT-4 : Oversampling does not affect sensitivity or specificity measures but false positive and negative rates are affected. ROC curve is not affected by oversampling. Oversampling does not affect rank ordering (sorting based on predicted probability) because adjusting oversampling is just a linear transformation.
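
The arithmetic behind this answer can be sketched with confusion-matrix counts. The Python example below is an idealized illustration with made-up numbers: it assumes oversampling replicates every event exactly k times and that the classification cutoff stays fixed. It also assumes "false positive and negative rates" refers to the share of predicted events (or predicted non-events) that are wrong, which is an interpretation on my part. The names `rates`, `false_pos_share` and `false_neg_share` are hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Share of predicted events that are actually non-events.
        "false_pos_share": fp / (fp + tp),
        # Share of predicted non-events that are actually events.
        "false_neg_share": fn / (fn + tn),
    }


# Hypothetical counts at a fixed cutoff: 100 events, 9,900 non-events.
tp, fn, fp, tn = 80, 20, 990, 8910

original = rates(tp, fn, fp, tn)
# Oversample events 10x: only the event counts (tp, fn) scale.
k = 10
oversampled = rates(tp * k, fn * k, fp, tn)

for name in original:
    print(name, round(original[name], 4), round(oversampled[name], 4))
```

Sensitivity and specificity are each computed within a single class, so scaling one class leaves them unchanged. The two "share" measures mix both classes and therefore move with the event rate.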

Q4. Can PROC MEANS analyze ONLY the character variables in SAS?

Correct Answer : No
ChatGPT-3.5 : Yes

Update: GPT-4 : No, PROC MEANS cannot analyze character variables. The core function of PROC MEANS is to analyze the values of variables that are defined as numeric variables.

Q5. You have two vectors in R, as defined below:

x <- c(4,5,6)
y <- c(2,3)

If you run z <- x*y, what would be the output? What would be the length of z?

Correct Answer : It returns 8 15 12 with a warning message. The length of z is 3.
ChatGPT-3.5 : Error message

Update: GPT-4 : In R, when two vectors of different lengths are used in an arithmetic operation, the elements of the shorter vector are recycled to match the length of the longer vector. In this case, y will be recycled to c(2,3,2) to match the length of x. The resulting vector z will be equal to x*y, which is c(4*2, 5*3, 6*2), or c(8,15,12). The length of z would be 3.
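
R itself is the place to confirm this behaviour. As a language-neutral sketch, the recycling rule can be emulated in Python; the helper `recycle_multiply` is hypothetical and only mimics the arithmetic, not R's warning about the longer length not being a multiple of the shorter one.

```python
from itertools import cycle, islice


def recycle_multiply(x, y):
    """Element-wise product with the shorter list recycled, R-style."""
    n = max(len(x), len(y))
    xs = islice(cycle(x), n)  # repeat x until n elements are produced
    ys = islice(cycle(y), n)  # repeat y until n elements are produced
    return [a * b for a, b in zip(xs, ys)]


x = [4, 5, 6]
y = [2, 3]
z = recycle_multiply(x, y)
print(z, len(z))  # [8, 15, 12] 3
```

Here y is stretched to 2, 3, 2 before multiplying, which reproduces the 8 15 12 result of length 3.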

Final Note

ChatGPT-3.5 also answers some questions correctly, which I have not included in this post. The whole intent of this post is to compare GPT-4 with GPT-3.5 and to show that the responses of the GPT-3.5 model are not reliable. As of now, ChatGPT-4 can be an alternative to Google Search. With Microsoft's new Bing, it can also answer questions related to recent events (unlike ChatGPT, whose training data goes up to 2021 only).

About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective: make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he has worked with global clients in various domains such as Banking, Insurance, Private Equity, Telecom and HR.
