Commit 2ab1a120 authored by Theophile Pace

Merge branch 'projet' into 'master'

Projet

See merge request !6
parents 80e010ed d740657c
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Student Alcohol Consumption\n",
"\n",
"WHAT??? \n",
"\n",
"Predict students' grades from factors such as alcohol consumption? \n",
"PS: whatever we end up concluding, hands off our dear... \n",
"\n",
"We look at a social dataset that gathers information about high-school students and their living conditions. The grades they obtained in their exams are also included. We would like to predict those grades from the students' social conditions.\n",
"\n",
"More information about the dataset: https://www.kaggle.com/uciml/student-alcohol-consumption/home"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading the data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(395, 33)"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"df = pd.read_csv('./student-alcohol-consumption/student-mat.csv')\n",
"df.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have 395 rows and 33 columns. Each row corresponds to one student. The columns are the variables: they describe the student, such as age, sex, travel time to school, time spent studying at home, alcohol consumption, etc. The full list of these variables (the features) and their descriptions is available here: https://www.kaggle.com/uciml/student-alcohol-consumption/home"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['school', 'sex', 'age', 'address', 'famsize', 'Pstatus', 'Medu', 'Fedu',\n",
" 'Mjob', 'Fjob', 'reason', 'guardian', 'traveltime', 'studytime',\n",
" 'failures', 'schoolsup', 'famsup', 'paid', 'activities', 'nursery',\n",
" 'higher', 'internet', 'romantic', 'famrel', 'freetime', 'goout', 'Dalc',\n",
" 'Walc', 'health', 'absences', 'G1', 'G2', 'G3'],\n",
" dtype='object')"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to predict the students' academic results. Each student has 3 grades: G1 (first period grade), G2 (second period grade) and G3 (final grade).\n",
"\n",
"Question: does alcohol consumption affect the students' results? Here is the answer!"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 1152x216 with 3 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# For each exam (G1-G3), plot the mean grade per category of weekday alcohol\n",
"# consumption (yes, weekday). Dalc is the corresponding variable (see the dataset docs).\n",
"# pd.cut splits the range of Dalc values into 3 equal-width bins.\n",
"fig, axes = plt.subplots(1, 3, figsize=(16, 3))\n",
"for col, ax in zip([\"G1\", \"G2\", \"G3\"], axes):\n",
"    df.groupby(pd.cut(df[\"Dalc\"], 3))[col].mean().plot.bar(ax=ax, title=col)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Judging by these averages, the answer is NO: it has no effect (well, almost none)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More seriously"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We still want to predict a student's grade from all the available variables. To simplify the problem, we choose to keep only G3. We could also average the three grades; either choice works. G3 will therefore be our variable to predict (the target variable)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['school', 'sex', 'age', 'address', 'famsize', 'Pstatus', 'Medu', 'Fedu',\n",
" 'Mjob', 'Fjob', 'reason', 'guardian', 'traveltime', 'studytime',\n",
" 'failures', 'schoolsup', 'famsup', 'paid', 'activities', 'nursery',\n",
" 'higher', 'internet', 'romantic', 'famrel', 'freetime', 'goout', 'Dalc',\n",
" 'Walc', 'health', 'absences', 'G3'],\n",
" dtype='object')"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = df.drop([\"G1\", \"G2\"], axis=1)\n",
"data.columns"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>school</th>\n",
" <th>sex</th>\n",
" <th>age</th>\n",
" <th>address</th>\n",
" <th>famsize</th>\n",
" <th>Pstatus</th>\n",
" <th>Medu</th>\n",
" <th>Fedu</th>\n",
" <th>Mjob</th>\n",
" <th>Fjob</th>\n",
" <th>...</th>\n",
" <th>internet</th>\n",
" <th>romantic</th>\n",
" <th>famrel</th>\n",
" <th>freetime</th>\n",
" <th>goout</th>\n",
" <th>Dalc</th>\n",
" <th>Walc</th>\n",
" <th>health</th>\n",
" <th>absences</th>\n",
" <th>G3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>55</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>16</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>A</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>other</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>yes</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>155</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>15</td>\n",
" <td>R</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>at_home</td>\n",
" <td>services</td>\n",
" <td>...</td>\n",
" <td>no</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>151</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>16</td>\n",
" <td>U</td>\n",
" <td>LE3</td>\n",
" <td>T</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>at_home</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>no</td>\n",
" <td>yes</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>14</td>\n",
" </tr>\n",
" <tr>\n",
" <th>82</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>15</td>\n",
" <td>U</td>\n",
" <td>LE3</td>\n",
" <td>T</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>services</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>10</td>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>15</td>\n",
" <td>R</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>2</td>\n",
" <td>4</td>\n",
" <td>services</td>\n",
" <td>health</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>17</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>services</td>\n",
" <td>services</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>4</td>\n",
" <td>5</td>\n",
" <td>16</td>\n",
" <td>5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>75</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>15</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>teacher</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>150</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>18</td>\n",
" <td>U</td>\n",
" <td>LE3</td>\n",
" <td>T</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>other</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>yes</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>85</th>\n",
" <td>GP</td>\n",
" <td>F</td>\n",
" <td>15</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>services</td>\n",
" <td>services</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>yes</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>6</td>\n",
" <td>8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>261</th>\n",
" <td>GP</td>\n",
" <td>M</td>\n",
" <td>18</td>\n",
" <td>U</td>\n",
" <td>GT3</td>\n",
" <td>T</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>teacher</td>\n",
" <td>other</td>\n",
" <td>...</td>\n",
" <td>yes</td>\n",
" <td>no</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>10 rows × 31 columns</p>\n",
"</div>"
],
"text/plain": [
" school sex age address famsize Pstatus Medu Fedu Mjob Fjob \\\n",
"55 GP F 16 U GT3 A 2 1 other other \n",
"155 GP M 15 R GT3 T 2 3 at_home services \n",
"151 GP M 16 U LE3 T 2 1 at_home other \n",
"82 GP F 15 U LE3 T 3 2 services other \n",
"24 GP F 15 R GT3 T 2 4 services health \n",
"18 GP M 17 U GT3 T 3 2 services services \n",
"75 GP M 15 U GT3 T 4 3 teacher other \n",
"150 GP M 18 U LE3 T 1 1 other other \n",
"85 GP F 15 U GT3 T 4 4 services services \n",
"261 GP M 18 U GT3 T 4 3 teacher other \n",
"\n",
" ... internet romantic famrel freetime goout Dalc Walc health absences \\\n",
"55 ... yes yes 5 3 4 1 1 2 8 \n",
"155 ... no no 4 4 4 1 1 1 2 \n",
"151 ... no yes 4 4 4 3 5 5 6 \n",
"82 ... yes no 4 4 4 1 1 5 10 \n",
"24 ... yes no 4 3 2 1 1 5 2 \n",
"18 ... yes no 5 5 5 2 4 5 16 \n",
"75 ... yes no 4 3 3 2 3 5 6 \n",
"150 ... yes yes 2 3 5 2 5 4 0 \n",
"85 ... yes yes 4 4 4 2 3 5 6 \n",
"261 ... yes no 4 3 2 1 1 3 2 \n",
"\n",
" G3 \n",
"55 10 \n",
"155 8 \n",
"151 14 \n",
"82 6 \n",
"24 8 \n",
"18 5 \n",
"75 10 \n",
"150 0 \n",
"85 8 \n",
"261 8 \n",
"\n",
"[10 rows x 31 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.sample(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with categorical variables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dataset contains so-called categorical variables, whose values are not continuous numbers. For example, the variable `school` is either GP or MS (the names of the two schools), and `Mjob` (mother's job) takes its values in {teacher, health, services, ...}. These values cannot be used in computations as they are. In principle, only decision trees can take them directly (why?).\n",
"\n",
"A transformation is therefore needed. Two strategies are possible:\n",
"\n",
"* **Integer encoding**: each category is given an integer value (teacher becomes 0, health becomes 1, services 2, etc.). The problem is that this introduces a bias: 0 < 2, so would _teacher_ be \"less than\" _services_? This strategy is therefore not always appropriate, except for variables whose categories have a natural order.\n",
"\n",
"* **One-hot encoding**: the variables are binarized. For example, `school` takes the values 0 and 1: if school=GP we put 0 in its place, if school=MS we put 1. But what happens when a variable has more than two possible categories (`Mjob`)? We have to create new variables! `Mjob_teacher` is 1 if the job is teacher and 0 otherwise, likewise for `Mjob_health`, etc. In fact, with $n$ categories we can create $n$ variables. In practice $n-1$ are enough, because the last category corresponds to the case where all $n-1$ created variables are 0. An article covering this method: https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/\n",
"\n",
"Our dataset contains many categorical variables. We spare you the preprocessing they require, but it is always good to keep this in mind. In practice, most datasets need to go through this step!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>age</th>\n",
" <th>traveltime</th>\n",
" <th>studytime</th>\n",
" <th>failures</th>\n",
" <th>famrel</th>\n",
" <th>freetime</th>\n",
" <th>goout</th>\n",
" <th>Dalc</th>\n",
" <th>Walc</th>\n",
" <th>health_Mjob</th>\n",
" <th>...</th>\n",
" <th>no_nursery</th>\n",
" <th>yes_nursery</th>\n",
" <th>no_higher</th>\n",
" <th>yes_higher</th>\n",
" <th>no_internet</th>\n",
" <th>yes_internet</th>\n",
" <th>no_romantic</th>\n",
" <th>yes_romantic</th>\n",
" <th>no</th>\n",
" <th>yes</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>108</th>\n",
" <td>15</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>349</th>\n",
" <td>18</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>182</th>\n",
" <td>17</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>3</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>384</th>\n",
" <td>18</td>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>226</th>\n",
" <td>17</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>102</th>\n",
" <td>15</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>93</th>\n",
" <td>16</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>...</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td>15</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>5</td>\n",
" <td>...</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",