Commit 2ab1a120 authored by Theophile Pace

Merge branch 'projet' into 'master'

Projet

See merge request !6
parents 80e010ed d740657c
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Student Alcohol Consumption\n",
"\n",
"WHAT??? \n",
"\n",
"Predict students' grades from factors such as alcohol consumption? \n",
"PS: whatever we end up concluding, hands off our beloved... \n",
"\n",
"We look at a social dataset that gathers information about high-school students and their living conditions. The grades they obtained in their exams are also included. We would like to predict these grades from the students' social conditions.\n",
"\n",
"More information about the dataset: https://www.kaggle.com/uciml/student-alcohol-consumption/home"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading the data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(395, 33)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"df = pd.read_csv('./student-alcohol-consumption/student-mat.csv')\n",
"df.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have 395 rows and 33 columns. Each row corresponds to one student. The columns are the variables: they describe information about the student such as age, sex, travel time to school, time spent studying at home, alcohol consumption, etc. The full list of these variables (the features) and their descriptions is available here: https://www.kaggle.com/uciml/student-alcohol-consumption/home"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['school', 'sex', 'age', 'address', 'famsize', 'Pstatus', 'Medu', 'Fedu',\n",
" 'Mjob', 'Fjob', 'reason', 'guardian', 'traveltime', 'studytime',\n",
" 'failures', 'schoolsup', 'famsup', 'paid', 'activities', 'nursery',\n",
" 'higher', 'internet', 'romantic', 'famrel', 'freetime', 'goout', 'Dalc',\n",
" 'Walc', 'health', 'absences', 'G1', 'G2', 'G3'],\n",
" dtype='object')"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to predict the students' school results. Each student has 3 grades: G1 (first period grade), G2 (second period grade) and G3 (final grade).\n",
"\n",
"Question: does alcohol consumption affect the students' results? Here is the answer!"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1152x216 with 3 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# For each exam (G1 to G3), plot the mean grade per weekday alcohol consumption category\n",
"# (yes, on weekdays). Dalc is the corresponding variable (see the dataset documentation).\n",
"fig, axes = plt.subplots(1, 3, figsize=(16, 3))\n",
"for grade, ax in zip([\"G1\", \"G2\", \"G3\"], axes):\n",
"    # pd.cut splits the range of Dalc into 3 equal-width bins\n",
"    df.groupby(pd.cut(df[\"Dalc\"], 3))[grade].mean().plot.bar(ax=ax, title=grade)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It turns out that NO, it has no influence (well, almost none)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More seriously"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We still want to predict a student's grade from all the available variables. To simplify the problem, we choose to keep only G3. We could also average the 3 grades; either choice works. G3 will therefore be our variable to predict (the target feature)."
]
},
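{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the averaging alternative mentioned above, the cell below computes the mean of the three grades on a small toy table. The values and the column name `G_mean` are illustrative assumptions and are not part of the dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy grades (made-up values, only to illustrate the computation)\n",
"toy = pd.DataFrame({\"G1\": [10, 8, 15], \"G2\": [12, 7, 14], \"G3\": [11, 9, 16]})\n",
"\n",
"# Hypothetical alternative target: row-wise mean of the three grades\n",
"toy[\"G_mean\"] = toy[[\"G1\", \"G2\", \"G3\"]].mean(axis=1)\n",
"toy"
]
},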
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['school', 'sex', 'age', 'address', 'famsize', 'Pstatus', 'Medu', 'Fedu',\n",
" 'Mjob', 'Fjob', 'reason', 'guardian', 'traveltime', 'studytime',\n",
" 'failures', 'schoolsup', 'famsup', 'paid', 'activities', 'nursery',\n",
" 'higher', 'internet', 'romantic', 'famrel', 'freetime', 'goout', 'Dalc',\n",
" 'Walc', 'health', 'absences', 'G3'],\n",
" dtype='object')"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = df.drop([\"G1\", \"G2\"], axis=1)\n",
"data.columns"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>school</th>\n",
" <th>sex</th>\n",
" <th>age</th>\n",
" <th>address</th>\n",
" <th>famsize</th>\n",
" <th>Pstatus</th>\n",
" <th>Medu</th>\n",
" <th>Fedu</th>\n",
" <th>Mjob</th>\n",
" <th>Fjob</th>\n",
" <th>...</th>\n",
" <th>internet</th>\n",
" <th>romantic</th>\n",
" <th>famrel</th>\n",
" <th>freetime</th>\n",
" <th>goout</th>\n",
" <th>Dalc</th>\n",
" <th>Walc</th>\n",
" <th>health</th>\n",
" <th>absences</th>\n",
" <th>G3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>55</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>16</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>A</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>other</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>yes</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>155</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>15</td>\n",
" <td>R</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>at_home</td>\n",
" <td>services</td>\n",
" <td>...</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>151</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>16</td>\n",
" <td>U</td>\n",
" <td>LE3</td>\n",
" <td>T</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>at_home</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>82</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>15</td>\n",
" <td>U</td>\n",
" <td>LE3</td>\n",
" <td>T</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>services</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>10</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>15</td>\n",
" <td>R</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>2</td>\n",
" <td>4</td>\n",
" <td>services</td>\n",
" <td>health</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>17</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>services</td>\n",
" <td>services</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>4</td>\n",
" <td>5</td>\n",
" <td>16</td>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>15</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>teacher</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>150</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>18</td>\n",
" <td>U</td>\n",
" <td>LE3</td>\n",
" <td>T</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>other</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>yes</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>85</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>15</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>services</td>\n",
" <td>services</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>yes</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>261</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>18</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>teacher</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>10 rows × 31 columns</p>\n",
"</div>"
],
"text/plain": [
" school sex age address famsize Pstatus Medu Fedu Mjob Fjob \\\n",
"55 GP F 16 U GT3 A 2 1 other other \n",
"155 GP M 15 R GT3 T 2 3 at_home services \n",
"151 GP M 16 U LE3 T 2 1 at_home other \n",
"82 GP F 15 U LE3 T 3 2 services other \n",
"24 GP F 15 R GT3 T 2 4 services health \n",
"18 GP M 17 U GT3 T 3 2 services services \n",
"75 GP M 15 U GT3 T 4 3 teacher other \n",
"150 GP M 18 U LE3 T 1 1 other other \n",
"85 GP F 15 U GT3 T 4 4 services services \n",
"261 GP M 18 U GT3 T 4 3 teacher other \n",
"\n",
" ... internet romantic famrel freetime goout Dalc Walc health absences \\\n",
"55 ... yes yes 5 3 4 1 1 2 8 \n",
"155 ... no no 4 4 4 1 1 1 2 \n",
"151 ... no yes 4 4 4 3 5 5 6 \n",
"82 ... yes no 4 4 4 1 1 5 10 \n",
"24 ... yes no 4 3 2 1 1 5 2 \n",
"18 ... yes no 5 5 5 2 4 5 16 \n",
"75 ... yes no 4 3 3 2 3 5 6 \n",
"150 ... yes yes 2 3 5 2 5 4 0 \n",
"85 ... yes yes 4 4 4 2 3 5 6 \n",
"261 ... yes no 4 3 2 1 1 3 2 \n",
"\n",
" G3 \n",
"55 10 \n",
"155 8 \n",
"151 14 \n",
"82 6 \n",
"24 8 \n",
"18 5 \n",
"75 10 \n",
"150 0 \n",
"85 8 \n",
"261 8 \n",
"\n",
"[10 rows x 31 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.sample(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with categorical variables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some variables are categorical: their values are not numeric and continuous. For example, `school` is either GP or MS (the names of the two schools), and `Mjob` (mother's job) takes values in {teacher, health, services, ...}. Such values cannot be used in computations as they are. Only decision trees can, in principle, accept these values directly (why?).\n",
"\n",
"A transformation is therefore needed. Two strategies are possible:\n",
"\n",
"* **Integer encoding**: each modality is given an integer value (teacher becomes 0, health becomes 1, services becomes 2, etc.). The problem is that this introduces a bias: 0 < 2, so would _teacher_ be \"lower\" than _services_? This strategy is therefore not always appropriate, except for variables whose modalities have a natural order.\n",
"\n",
"* **One-hot encoding**: the variables are binarized. For example, `school` takes the values 0 and 1: if school=GP we put 0 instead, and if school=MS we put 1. But what happens when a variable has more than two possible modalities (`Mjob`)? New variables must be created! `Mjob_teacher` is 1 if the job is teacher and 0 otherwise, `Mjob_health` likewise, etc. In fact, with $n$ modalities, we can create $n$ variables. In practice $n-1$ suffice, because the last modality corresponds to the case where all $n-1$ created variables are 0. An article covering this method: https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/\n",
"\n",
"Our dataset contains many categorical variables. We spare you the preprocessing they require, but it is always good to keep this in mind. In practice, most datasets need to go through this step!"
]
},
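{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the two strategies above concrete, here is a minimal sketch on a toy table, using `pd.factorize` for integer encoding and `pd.get_dummies` for one-hot encoding. The toy values and variable names are illustrative assumptions. This is not the preprocessing actually applied to the dataset, which is not shown in this notebook and appears to name its columns differently."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Toy data (illustrative values only, not taken from the dataset)\n",
"toy = pd.DataFrame({\n",
"    \"school\": [\"GP\", \"MS\", \"GP\", \"MS\"],\n",
"    \"Mjob\": [\"teacher\", \"health\", \"services\", \"teacher\"],\n",
"})\n",
"\n",
"# Integer encoding: each modality gets an arbitrary integer code,\n",
"# assigned in order of first appearance\n",
"codes, modalities = pd.factorize(toy[\"Mjob\"])\n",
"toy_int = toy.assign(Mjob_code=codes)\n",
"\n",
"# One-hot encoding: one 0/1 column per modality (n columns per variable)\n",
"onehot_full = pd.get_dummies(toy, columns=[\"school\", \"Mjob\"], dtype=int)\n",
"\n",
"# n-1 columns per variable: the dropped modality is the all-zeros case\n",
"onehot_reduced = pd.get_dummies(\n",
"    toy, columns=[\"school\", \"Mjob\"], drop_first=True, dtype=int\n",
")\n",
"onehot_reduced"
]
},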
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>traveltime</th>\n",
" <th>studytime</th>\n",
" <th>failures</th>\n",
" <th>famrel</th>\n",
" <th>freetime</th>\n",
" <th>goout</th>\n",
" <th>Dalc</th>\n",
" <th>Walc</th>\n",
" <th>health_Mjob</th>\n",
" <th>...</th>\n",
" <th>no_nursery</th>\n",
" <th>yes_nursery</th>\n",
" <th>no_higher</th>\n",
" <th>yes_higher</th>\n",
" <th>no_internet</th>\n",
" <th>yes_internet</th>\n",
" <th>no_romantic</th>\n",
" <th>yes_romantic</th>\n",
" <th>no</th>\n",
" <th>yes</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>108</th>\n",
" <td>15</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>349</th>\n",
" <td>18</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>182</th>\n",
" <td>17</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>384</th>\n",
" <td>18</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>226</th>\n",
" <td>17</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>15</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>93</th>\n",
" <td>16</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td>15</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",