Commit 80e010ed authored by Theophile Pace

Merge branch 'TP2_mardi' into 'master'

Tp2 mardi

See merge request !4
parents 79bda59f 2b048b42
File added
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Supervised learning lab: Regression\n",
"In this lab we will do regression, which analyzes how one variable relates to one or more others."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use the Boston dataset.\n",
"https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html\n",
"\n",
"House prices in Boston (see the site for the variable descriptions)\n",
"https://scikit-learn.org/stable/datasets/index.html#boston-dataset\n",
"\n",
"Import this morning's libraries: `numpy` and `sklearn.datasets`.\n",
"Read the dataset documentation.\n",
"\n",
"Load the Boston dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploratory analysis and dataset preparation\n",
"Study the correlations using `np.corrcoef`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Split the Boston dataset\n",
"\n",
"To do so, use the scikit-learn function `sklearn.model_selection.train_test_split`. Import this function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linear regression\n",
"A classic model: fairly limited in power, but interpretable. It is based on the mean squared error (MSE) and is therefore very sensitive to outliers.\n",
"Find the model in scikit-learn."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run it on the Boston data. Display the coefficient of each feature. Which features are significant?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Regression tree\n",
"![](https://fr.wikipedia.org/wiki/Arbre_de_d%C3%A9cision#/media/File:Arbre_de_decision.jpg)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try a maximum depth of 3."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try a maximum depth of 5."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try a maximum depth of 10."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compare the results."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Random forest\n",
"Find it in scikit-learn\n",
"image\n",
"model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try with 3 trees."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try with 10 trees."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try with 100 trees."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compare with the regression trees. What are the advantages?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"_Optional:_ plot the result with 1 tree, 3 trees, and 100 trees."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## If you are bored\n",
"Compare the different models by running all of them on the test set.\n",
"\n",
"Run a regression on the output of a PCA (tricky).\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
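The linear regression cell of the notebook above states that the model is based on the mean squared error and is very sensitive to outliers. The following is a minimal pure-Python sketch of that claim, not the scikit-learn implementation: the data are made up, the helpers `fit_line` and `mse` are hypothetical names introduced only for illustration, and only a single feature is handled.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a * x + b for one feature.

    Hypothetical helper for illustration; it minimizes the mean squared error.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def mse(xs, ys, a, b):
    """Mean squared error of the line y = a * x + b on the given points."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys_clean = [1.0, 3.0, 5.0, 7.0, 9.0]
# Same data, with the last target replaced by a single outlier.
ys_outlier = [1.0, 3.0, 5.0, 7.0, 30.0]

a_clean, b_clean = fit_line(xs, ys_clean)
a_out, b_out = fit_line(xs, ys_outlier)

print("clean fit:   slope=%.2f intercept=%.2f" % (a_clean, b_clean))
print("outlier fit: slope=%.2f intercept=%.2f" % (a_out, b_out))
print("MSE of clean fit on clean data: %.4f" % mse(xs, ys_clean, a_clean, b_clean))
```

Because the errors are squared, the single outlier dominates the loss: on this toy data it moves the fitted slope from 2 to roughly 6.2, even though four of the five points are unchanged.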
@@ -222,7 +222,7 @@
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x10cea2748>]"
"[<matplotlib.lines.Line2D at 0x106acd860>]"
]
},
"execution_count": 10,
@@ -265,7 +265,7 @@
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x10ce03748>]"
"[<matplotlib.lines.Line2D at 0x106a30860>]"
]
},
"execution_count": 11,
@@ -329,7 +329,7 @@
{
"data": {
"text/plain": [
"\u001b[0;31mInit signature:\u001b[0m \u001b[0mGaussianNB\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpriors\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvar_smoothing\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1e-09\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mInit signature:\u001b[0m \u001b[0mGaussianNB\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpriors\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mvar_smoothing\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1e-09\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mDocstring:\u001b[0m \n",
"Gaussian Naive Bayes (GaussianNB)\n",
"\n",
@@ -411,6 +411,24 @@
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"nbc = GaussianNB"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"nbc = GaussianNB"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
@@ -446,7 +464,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 17,
"metadata": {},
"outputs": [
{
@@ -512,7 +530,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
@@ -521,7 +539,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 19,
"metadata": {},
"outputs": [
{
@@ -559,7 +577,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 20,
"metadata": {},
"outputs": [
{
@@ -587,7 +605,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 21,
"metadata": {},
"outputs": [
{
@@ -605,7 +623,7 @@
" -5.79028509e-01, -6.93277983e-01, -1.24763925e-01]])"
]
},
"execution_count": 19,
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
@@ -676,7 +694,7 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 22,
"metadata": {},
"outputs": [
{
@@ -714,7 +732,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 23,
"metadata": {},
"outputs": [
{
@@ -737,7 +755,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 24,
"metadata": {},
"outputs": [
{
@@ -774,15 +792,15 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"204 µs ± 6.73 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
"829 µs ± 56.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
"183 µs ± 4.71 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n",
"776 µs ± 6.52 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
]
}
],
@@ -838,25 +856,14 @@
},
{
"cell_type": "code",
"execution_count": 43,
"execution_count": 33,
"metadata": {},
"outputs": [
{
"ename": "NotImplementedError",
"evalue": "INSTALLEZ GRAPHVIZ",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNotImplementedError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-43-e8b47aed4505>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0msklearn\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtree\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mDecisionTreeClassifier\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mNotImplementedError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"INSTALLEZ GRAPHVIZ\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;31m# ne vous occupez pas de cette fonction, c'est juste de la visu\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;31m#!pip install graphviz\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mNotImplementedError\u001b[0m: INSTALLEZ GRAPHVIZ"
]
}
],
"outputs": [],
"source": [
"from sklearn.tree import DecisionTreeClassifier\n",
"\n",
"raise NotImplementedError(\"INSTALLEZ GRAPHVIZ\")\n",
"# for users of normal operating systems:\n",
"! conda install graphviz python-graphviz\n",
"# raise NotImplementedError(\"INSTALLEZ GRAPHVIZ\")\n",
"# don't worry about this function, it is only for visualization\n",
"\n",
"from graphviz import Source\n",
@@ -879,15 +886,407 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 40,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"image/svg+xml": [
"<svg height=\"790pt\" viewBox=\"0.00 0.00 950.00 790.00\" width=\"950pt\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g class=\"graph\" id=\"graph0\" transform=\"scale(1 1) rotate(0) translate(4 786)\">\n",
"<title>Tree</title>\n",
"<polygon fill=\"#ffffff\" points=\"-4,4 -4,-786 946,-786 946,4 -4,4\" stroke=\"transparent\"/>\n",
"<!-- 0 -->\n",
"<g class=\"node\" id=\"node1\">\n",
"<title>0</title>\n",
"<polygon fill=\"none\" points=\"584,-782 395,-782 395,-699 584,-699 584,-782\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"489.5\" y=\"-766.8\">worst concave points &lt;= 0.142</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"489.5\" y=\"-751.8\">gini = 0.464</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"489.5\" y=\"-736.8\">samples = 341</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"489.5\" y=\"-721.8\">value = [125, 216]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"489.5\" y=\"-706.8\">class = benign</text>\n",
"</g>\n",
"<!-- 1 -->\n",
"<g class=\"node\" id=\"node2\">\n",
"<title>1</title>\n",
"<polygon fill=\"none\" points=\"471,-663 342,-663 342,-580 471,-580 471,-663\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-647.8\">worst area &lt;= 929.8</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-632.8\">gini = 0.17</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-617.8\">samples = 234</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-602.8\">value = [22, 212]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-587.8\">class = benign</text>\n",
"</g>\n",
"<!-- 0&#45;&gt;1 -->\n",
"<g class=\"edge\" id=\"edge1\">\n",
"<title>0-&gt;1</title>\n",
"<path d=\"M460.4706,-698.8796C454.3145,-690.0534 447.7549,-680.6485 441.4064,-671.5466\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"444.2448,-669.4978 435.6533,-663.2981 438.5033,-673.5024 444.2448,-669.4978\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"431.2669\" y=\"-684.2103\">True</text>\n",
"</g>\n",
"<!-- 16 -->\n",
"<g class=\"node\" id=\"node17\">\n",
"<title>16</title>\n",
"<polygon fill=\"none\" points=\"623,-663 494,-663 494,-580 623,-580 623,-663\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-647.8\">worst area &lt;= 444.3</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-632.8\">gini = 0.072</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-617.8\">samples = 107</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-602.8\">value = [103, 4]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-587.8\">class = malignant</text>\n",
"</g>\n",
"<!-- 0&#45;&gt;16 -->\n",
"<g class=\"edge\" id=\"edge16\">\n",
"<title>0-&gt;16</title>\n",
"<path d=\"M513.6329,-698.8796C518.6461,-690.2335 523.9813,-681.0322 529.1581,-672.1042\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"532.2758,-673.7047 534.2641,-663.2981 526.2202,-670.1934 532.2758,-673.7047\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"540.6897\" y=\"-683.7582\">False</text>\n",
"</g>\n",
"<!-- 2 -->\n",
"<g class=\"node\" id=\"node3\">\n",
"<title>2</title>\n",
"<polygon fill=\"none\" points=\"306,-544 149,-544 149,-461 306,-461 306,-544\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-528.8\">symmetry error &lt;= 0.009</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-513.8\">gini = 0.054</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-498.8\">samples = 215</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-483.8\">value = [6, 209]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-468.8\">class = benign</text>\n",
"</g>\n",
"<!-- 1&#45;&gt;2 -->\n",
"<g class=\"edge\" id=\"edge2\">\n",
"<title>1-&gt;2</title>\n",
"<path d=\"M343.8945,-579.8796C329.3146,-570.1868 313.6849,-559.7961 298.7618,-549.8752\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"300.3355,-546.7185 290.0702,-544.0969 296.4601,-552.5479 300.3355,-546.7185\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 11 -->\n",
"<g class=\"node\" id=\"node12\">\n",
"<title>11</title>\n",
"<polygon fill=\"none\" points=\"478.5,-544 334.5,-544 334.5,-461 478.5,-461 478.5,-544\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-528.8\">worst texture &lt;= 19.33</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-513.8\">gini = 0.266</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-498.8\">samples = 19</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-483.8\">value = [16, 3]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"406.5\" y=\"-468.8\">class = malignant</text>\n",
"</g>\n",
"<!-- 1&#45;&gt;11 -->\n",
"<g class=\"edge\" id=\"edge11\">\n",
"<title>1-&gt;11</title>\n",
"<path d=\"M406.5,-579.8796C406.5,-571.6838 406.5,-562.9891 406.5,-554.5013\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"410.0001,-554.298 406.5,-544.2981 403.0001,-554.2981 410.0001,-554.298\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 3 -->\n",
"<g class=\"node\" id=\"node4\">\n",
"<title>3</title>\n",
"<polygon fill=\"none\" points=\"115,-417.5 0,-417.5 0,-349.5 115,-349.5 115,-417.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"57.5\" y=\"-402.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"57.5\" y=\"-387.3\">samples = 1</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"57.5\" y=\"-372.3\">value = [1, 0]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"57.5\" y=\"-357.3\">class = malignant</text>\n",
"</g>\n",
"<!-- 2&#45;&gt;3 -->\n",
"<g class=\"edge\" id=\"edge3\">\n",
"<title>2-&gt;3</title>\n",
"<path d=\"M168.0422,-460.8796C150.6804,-448.7263 131.7512,-435.4759 114.5425,-423.4297\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"116.4467,-420.4904 106.2472,-417.623 112.4324,-426.225 116.4467,-420.4904\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 4 -->\n",
"<g class=\"node\" id=\"node5\">\n",
"<title>4</title>\n",
"<polygon fill=\"none\" points=\"322,-425 133,-425 133,-342 322,-342 322,-425\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-409.8\">worst concave points &lt;= 0.111</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-394.8\">gini = 0.046</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-379.8\">samples = 214</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-364.8\">value = [5, 209]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"227.5\" y=\"-349.8\">class = benign</text>\n",
"</g>\n",
"<!-- 2&#45;&gt;4 -->\n",
"<g class=\"edge\" id=\"edge4\">\n",
"<title>2-&gt;4</title>\n",
"<path d=\"M227.5,-460.8796C227.5,-452.6838 227.5,-443.9891 227.5,-435.5013\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"231.0001,-435.298 227.5,-425.2981 224.0001,-435.2981 231.0001,-435.298\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 5 -->\n",
"<g class=\"node\" id=\"node6\">\n",
"<title>5</title>\n",
"<polygon fill=\"none\" points=\"171.5,-298.5 65.5,-298.5 65.5,-230.5 171.5,-230.5 171.5,-298.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"118.5\" y=\"-283.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"118.5\" y=\"-268.3\">samples = 185</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"118.5\" y=\"-253.3\">value = [0, 185]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"118.5\" y=\"-238.3\">class = benign</text>\n",
"</g>\n",
"<!-- 4&#45;&gt;5 -->\n",
"<g class=\"edge\" id=\"edge5\">\n",
"<title>4-&gt;5</title>\n",
"<path d=\"M189.3771,-341.8796C178.8014,-330.3337 167.3188,-317.7976 156.7367,-306.2446\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"159.2667,-303.825 149.9313,-298.8149 154.1049,-308.5531 159.2667,-303.825\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 6 -->\n",
"<g class=\"node\" id=\"node7\">\n",
"<title>6</title>\n",
"<polygon fill=\"none\" points=\"391.5,-306 189.5,-306 189.5,-223 391.5,-223 391.5,-306\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"290.5\" y=\"-290.8\">mean fractal dimension &lt;= 0.056</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"290.5\" y=\"-275.8\">gini = 0.285</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"290.5\" y=\"-260.8\">samples = 29</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"290.5\" y=\"-245.8\">value = [5, 24]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"290.5\" y=\"-230.8\">class = benign</text>\n",
"</g>\n",
"<!-- 4&#45;&gt;6 -->\n",
"<g class=\"edge\" id=\"edge6\">\n",
"<title>4-&gt;6</title>\n",
"<path d=\"M249.5343,-341.8796C254.064,-333.3236 258.8815,-324.2238 263.5618,-315.3833\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"266.7859,-316.7736 268.3716,-306.2981 260.5994,-313.4983 266.7859,-316.7736\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 7 -->\n",
"<g class=\"node\" id=\"node8\">\n",
"<title>7</title>\n",
"<polygon fill=\"none\" points=\"274,-179.5 159,-179.5 159,-111.5 274,-111.5 274,-179.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"216.5\" y=\"-164.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"216.5\" y=\"-149.3\">samples = 3</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"216.5\" y=\"-134.3\">value = [3, 0]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"216.5\" y=\"-119.3\">class = malignant</text>\n",
"</g>\n",
"<!-- 6&#45;&gt;7 -->\n",
"<g class=\"edge\" id=\"edge7\">\n",
"<title>6-&gt;7</title>\n",
"<path d=\"M264.6184,-222.8796C257.7121,-211.7735 250.2361,-199.7513 243.2825,-188.5691\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"246.0917,-186.4587 237.8387,-179.8149 240.1473,-190.1552 246.0917,-186.4587\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 8 -->\n",
"<g class=\"node\" id=\"node9\">\n",
"<title>8</title>\n",
"<polygon fill=\"none\" points=\"436.5,-187 292.5,-187 292.5,-104 436.5,-104 436.5,-187\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"364.5\" y=\"-171.8\">worst texture &lt;= 33.23</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"364.5\" y=\"-156.8\">gini = 0.142</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"364.5\" y=\"-141.8\">samples = 26</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"364.5\" y=\"-126.8\">value = [2, 24]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"364.5\" y=\"-111.8\">class = benign</text>\n",
"</g>\n",
"<!-- 6&#45;&gt;8 -->\n",
"<g class=\"edge\" id=\"edge8\">\n",
"<title>6-&gt;8</title>\n",
"<path d=\"M316.3816,-222.8796C321.8142,-214.1434 327.5992,-204.8404 333.2053,-195.8253\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"336.1993,-197.6383 338.5079,-187.2981 330.2549,-193.9418 336.1993,-197.6383\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 9 -->\n",
"<g class=\"node\" id=\"node10\">\n",
"<title>9</title>\n",
"<polygon fill=\"none\" points=\"351,-68 252,-68 252,0 351,0 351,-68\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"301.5\" y=\"-52.8\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"301.5\" y=\"-37.8\">samples = 24</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"301.5\" y=\"-22.8\">value = [0, 24]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"301.5\" y=\"-7.8\">class = benign</text>\n",
"</g>\n",
"<!-- 8&#45;&gt;9 -->\n",
"<g class=\"edge\" id=\"edge9\">\n",
"<title>8-&gt;9</title>\n",
"<path d=\"M341.0411,-103.9815C336.1078,-95.2504 330.8926,-86.0202 325.9248,-77.2281\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"328.8263,-75.2483 320.8597,-68.2637 322.7319,-78.6918 328.8263,-75.2483\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 10 -->\n",
"<g class=\"node\" id=\"node11\">\n",
"<title>10</title>\n",
"<polygon fill=\"none\" points=\"484,-68 369,-68 369,0 484,0 484,-68\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"426.5\" y=\"-52.8\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"426.5\" y=\"-37.8\">samples = 2</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"426.5\" y=\"-22.8\">value = [2, 0]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"426.5\" y=\"-7.8\">class = malignant</text>\n",
"</g>\n",
"<!-- 8&#45;&gt;10 -->\n",
"<g class=\"edge\" id=\"edge10\">\n",
"<title>8-&gt;10</title>\n",
"<path d=\"M387.5865,-103.9815C392.4415,-95.2504 397.574,-86.0202 402.4629,-77.2281\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"405.6467,-78.7043 407.4476,-68.2637 399.5288,-75.3025 405.6467,-78.7043\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 12 -->\n",
"<g class=\"node\" id=\"node13\">\n",
"<title>12</title>\n",
"<polygon fill=\"none\" points=\"437,-417.5 340,-417.5 340,-349.5 437,-349.5 437,-417.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"388.5\" y=\"-402.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"388.5\" y=\"-387.3\">samples = 2</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"388.5\" y=\"-372.3\">value = [0, 2]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"388.5\" y=\"-357.3\">class = benign</text>\n",
"</g>\n",
"<!-- 11&#45;&gt;12 -->\n",
"<g class=\"edge\" id=\"edge12\">\n",
"<title>11-&gt;12</title>\n",
"<path d=\"M400.2045,-460.8796C398.5911,-450.2134 396.8499,-438.7021 395.2162,-427.9015\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"398.6468,-427.179 393.6905,-417.8149 391.7255,-428.226 398.6468,-427.179\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 13 -->\n",
"<g class=\"node\" id=\"node14\">\n",
"<title>13</title>\n",
"<polygon fill=\"none\" points=\"626,-425 455,-425 455,-342 626,-342 626,-425\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"540.5\" y=\"-409.8\">worst smoothness &lt;= 0.088</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"540.5\" y=\"-394.8\">gini = 0.111</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"540.5\" y=\"-379.8\">samples = 17</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"540.5\" y=\"-364.8\">value = [16, 1]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"540.5\" y=\"-349.8\">class = malignant</text>\n",
"</g>\n",
"<!-- 11&#45;&gt;13 -->\n",
"<g class=\"edge\" id=\"edge13\">\n",
"<title>11-&gt;13</title>\n",
"<path d=\"M453.3667,-460.8796C463.8125,-451.6031 474.9781,-441.6874 485.711,-432.1559\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"488.2801,-434.5553 493.4333,-425.2981 483.632,-429.3213 488.2801,-434.5553\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 14 -->\n",
"<g class=\"node\" id=\"node15\">\n",
"<title>14</title>\n",
"<polygon fill=\"none\" points=\"543,-298.5 446,-298.5 446,-230.5 543,-230.5 543,-298.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"494.5\" y=\"-283.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"494.5\" y=\"-268.3\">samples = 1</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"494.5\" y=\"-253.3\">value = [0, 1]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"494.5\" y=\"-238.3\">class = benign</text>\n",
"</g>\n",
"<!-- 13&#45;&gt;14 -->\n",
"<g class=\"edge\" id=\"edge14\">\n",
"<title>13-&gt;14</title>\n",
"<path d=\"M524.4114,-341.8796C520.2034,-330.9935 515.655,-319.227 511.4057,-308.2344\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"514.6348,-306.8804 507.7646,-298.8149 508.1056,-309.4043 514.6348,-306.8804\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 15 -->\n",
"<g class=\"node\" id=\"node16\">\n",
"<title>15</title>\n",
"<polygon fill=\"none\" points=\"676,-298.5 561,-298.5 561,-230.5 676,-230.5 676,-298.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"618.5\" y=\"-283.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"618.5\" y=\"-268.3\">samples = 16</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"618.5\" y=\"-253.3\">value = [16, 0]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"618.5\" y=\"-238.3\">class = malignant</text>\n",
"</g>\n",
"<!-- 13&#45;&gt;15 -->\n",
"<g class=\"edge\" id=\"edge15\">\n",
"<title>13-&gt;15</title>\n",
"<path d=\"M567.7806,-341.8796C575.1323,-330.6636 583.0964,-318.5131 590.4874,-307.2372\" fill=\"none\" stroke=\"#000000\"/>\n",
"<polygon fill=\"#000000\" points=\"593.4531,-309.0972 596.0079,-298.8149 587.5986,-305.2598 593.4531,-309.0972\" stroke=\"#000000\"/>\n",
"</g>\n",
"<!-- 17 -->\n",
"<g class=\"node\" id=\"node18\">\n",
"<title>17</title>\n",
"<polygon fill=\"none\" points=\"607,-536.5 510,-536.5 510,-468.5 607,-468.5 607,-536.5\" stroke=\"#000000\"/>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-521.3\">gini = 0.0</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-506.3\">samples = 2</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-491.3\">value = [0, 2]</text>\n",
"<text fill=\"#000000\" font-family=\"Times,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"558.5\" y=\"-476.3\">class = benign</text>\n",
"</g>\n",
"<!-- 16&#45;&gt;17 -->\n",