Example 1:
Kurtosis: leptokurtic distribution
Skewness: right-skewed distribution
Example 2:
Kurtosis: leptokurtic distribution
Skewness: left-skewed distribution
Example 3:
Kurtosis: platykurtic distribution
Skewness: right-skewed distribution
Example 4:
Kurtosis: platykurtic distribution
Skewness: left-skewed distribution
Example 5:
Kurtosis: mesokurtic (normal) distribution
Skewness: right-skewed distribution
Example 6:
Kurtosis: mesokurtic (normal) distribution
Skewness: left-skewed distribution
Example 7:
Kurtosis: leptokurtic distribution
Skewness: zero-skewed distribution
Example 8:
Kurtosis: platykurtic distribution
Skewness: zero-skewed distribution
Example 9:
Kurtosis: mesokurtic (normal) distribution
Skewness: zero-skewed distribution
Which of the above examples can actually be observed in a dataset?
To examine the distribution characteristics of a dataset, we look at measures of skewness and kurtosis. Skewness measures the asymmetry of a distribution, while kurtosis describes how heavy its tails are relative to a normal distribution and therefore indicates how prone it is to outliers. The two measures are examined separately, and a coefficient is calculated for each. The 9 examples above cover every possible combination of the two.
Which of these combinations can actually occur in a single dataset? AI tools (such as Gemini and GPT) say that all of them are possible and describe example scenarios. However, I think that Examples 3, 4, 5 and 6 cannot be observed in a dataset. For this reason, I asked AI tools to create example DataFrames in Python, but they failed: the resulting kurtosis and skewness coefficients did not match the intended combination. The code below is one such attempt, aimed at Example 5. I would be very happy if you could share your thoughts on this matter and give me some inspiration.
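One way to test the doubt about Examples 3 to 6 without any sampling noise is to look at closed-form moments. The sketch below uses the standard skewness and excess-kurtosis formulas for the Beta(a, b) distribution; the helper name `beta_skew_excess_kurtosis` and the parameter pairs are illustrative choices, not something from the post. It also checks Pearson's inequality, excess kurtosis >= skewness^2 - 2, which is the only general constraint linking the two coefficients.

```python
import math

def beta_skew_excess_kurtosis(a, b):
    """Closed-form skewness and excess kurtosis of a Beta(a, b) distribution."""
    skew = 2 * (b - a) * math.sqrt(a + b + 1) / ((a + b + 2) * math.sqrt(a * b))
    excess = (6 * ((a - b) ** 2 * (a + b + 1) - a * b * (a + b + 2))
              / (a * b * (a + b + 2) * (a + b + 3)))
    return skew, excess

# Illustrative parameter choices (assumptions made for this sketch).
cases = {
    "Example 3 (platykurtic, right-skewed)": (1.0, 2.0),
    "Example 4 (platykurtic, left-skewed)": (2.0, 1.0),
    "Example 5 (approximately mesokurtic, right-skewed)": (2.0, 5.5),
    "Example 6 (approximately mesokurtic, left-skewed)": (5.5, 2.0),
}

for label, (a, b) in cases.items():
    skew, excess = beta_skew_excess_kurtosis(a, b)
    # Pearson's inequality must hold for every distribution with finite moments.
    assert excess >= skew ** 2 - 2
    print(f"{label}: skewness={skew:+.3f}, excess kurtosis={excess:+.3f}")
```

Under these assumptions, swapping `a` and `b` mirrors the distribution, which flips the sign of the skewness and leaves the kurtosis unchanged. That is why each right-skewed case has a left-skewed twin.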
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'Gender': np.random.choice(['Woman', 'Man'], size=1000)})
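# Attempt: lognormal draws, intended to give a right-skewed and mesokurtic distribution.
# Note: a lognormal with sigma=0.5 has a theoretical excess kurtosis of about 5.9, so it is leptokurtic.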
df['Grade'] = np.random.lognormal(mean=1, sigma=0.5, size=1000)
print(df.head())
print(df['Grade'].describe())
print('Kurtosis:', df['Grade'].kurtosis())  # target: close to 0 (pandas returns excess kurtosis, Fisher's definition)
print('Skewness:', df['Grade'].skew())  # target: greater than 0
sns.kdeplot(data=df, x='Grade', fill=True)  # fill replaces the deprecated shade argument
plt.title('KDE Distribution of Grades')
plt.show()
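The lognormal attempt above can be compared with draws from a Beta distribution, which is one possible way (an assumption of this sketch, not the only option) to generate skewed data that is not leptokurtic. The helper `sample_skew_excess_kurtosis`, the seed, the sample size and the Beta parameters are all illustrative choices. The helper uses the simple moment-based estimators, while pandas' `skew()` and `kurtosis()` apply a small-sample bias correction, so at this sample size the two should agree closely.

```python
import numpy as np

def sample_skew_excess_kurtosis(x):
    """Moment-based (uncorrected) sample skewness and excess kurtosis."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()  # standardize with the population std (ddof=0)
    return float(np.mean(z ** 3)), float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(42)  # fixed seed, chosen arbitrarily
n = 200_000                      # large sample to keep estimation noise small

samples = {
    "lognormal(mean=1, sigma=0.5)": rng.lognormal(mean=1, sigma=0.5, size=n),
    "Beta(2, 5.5), aimed at Example 5": rng.beta(2.0, 5.5, size=n),
    "Beta(1, 2), aimed at Example 3": rng.beta(1.0, 2.0, size=n),
}

results = {}
for name, x in samples.items():
    results[name] = sample_skew_excess_kurtosis(x)
    skew, excess = results[name]
    print(f"{name}: skewness={skew:+.3f}, excess kurtosis={excess:+.3f}")
```

For left-skewed counterparts (Examples 4 and 6), swapping the two Beta parameters should be enough. To get values on a grade-like scale, the draws can be multiplied by a constant, since skewness and kurtosis do not change under a positive linear rescaling.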