diff --git a/TP/TP2_mardi/TP_Regression.ipynb b/TP/TP2_mardi/TP_Regression.ipynb index f99517df7d7bdc576be97e4372801b7fc1cdf47c..85629602908f0c770e554bba2d5a451c4d6f5c52 100644 --- a/TP/TP2_mardi/TP_Regression.ipynb +++ b/TP/TP2_mardi/TP_Regression.ipynb @@ -4,8 +4,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Regression\n", - "Rappel problème de régression" + "# TP Apprentissage supervisé: Régression\n", + "Dans ce TP, on va faire la regression. C'est pour analyser la relation d'une variable par rapport à une ou plusieurs autres." ] }, { @@ -19,32 +19,56 @@ "cell_type": "markdown", "metadata": {}, "source": [ + "On va utiliser les données Boston.\n", + "https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html\n", + "\n", + "Prix des maisons à Boston (cf le site pour les variables)\n", + "https://scikit-learn.org/stable/datasets/index.html#boston-dataset\n", + "\n", + "Importez les libraries de ce matin: `numpy` et `scikit datasets`.\n", "Consultation de la doc du dataset\n", + "\n", "Chargement du dataset boston" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analyse exploratoire et préparation du dataset\n", - "Étudier les corrélations" + "Étudier les corrélations en utilisant `np.corrcoef`" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": null, "metadata": {}, - "source": [ - "Split du dataset boston" - ] + "outputs": [], + "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "##" + "Split du dataset boston\n", + "\n", + "Pour cela, utilisez la fonction scikit-learn `sklearn.model_selection.train_test_split`. Importez cette méthode, " ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -54,6 +78,13 @@ "Trouver le modèle sur scikit learn." ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -61,15 +92,28 @@ "Run sur boston. afficher les coef de chaque features. Quelles features sont significative?" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Arbre de régression\n", - "Rappel modèle\n", - "image" + "![](https://fr.wikipedia.org/wiki/Arbre_de_d%C3%A9cision#/media/File:Arbre_de_decision.jpg)" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -77,6 +121,13 @@ "Essayer avec une profondeur max de 3" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -84,6 +135,13 @@ "Essayer avec une profondeur max de 5" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -91,6 +149,13 @@ "Essayer avec une profondeur max de 10" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -98,6 +163,13 @@ "Comparer les résultats" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -108,6 +180,13 @@ "modèle" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -115,6 +194,13 @@ "Essayer avec 3 arbres" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -122,6 +208,13 @@ "Essayer avec 10 arbres" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -129,6 +222,13 @@ "100 arbres" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -136,6 +236,13 @@ "Comparer avec les arbres de régression. Quels sont les avantages?" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -143,6 +250,13 @@ "_optionel_ Tracer le résultat avec 1 arbre, 3 arbres et 100 arbres " ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "metadata": {}, @@ -152,6 +266,13 @@ "\n", "Faire une régression sur le résultat d'une PCA (touchy)\n" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -170,7 +291,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.6" + "version": "3.7.2" } }, "nbformat": 4,