Commit 80e010ed by Theophile Pace

### Merge branch 'TP2_mardi' into 'master'

```
Tp2 mardi

See merge request !4
```
parents 79bda59f 2b048b42
TP/.DS_Store 0 → 100644
File added
 { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab: Supervised Learning: Regression\n", "In this lab, we will do regression, which analyzes the relationship between one variable and one or more others." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use the Boston data.\n", "https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html\n", "\n", "House prices in Boston (see the site for the variables)\n", "https://scikit-learn.org/stable/datasets/index.html#boston-dataset\n", "\n", "Import this morning's libraries: `numpy` and `sklearn.datasets`.\n", "Read the dataset documentation.\n", "\n", "Load the Boston dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory analysis and dataset preparation\n", "Study the correlations using `np.corrcoef`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Split the Boston dataset\n", "\n", "To do this, use the scikit-learn function `sklearn.model_selection.train_test_split`. Import this function, " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear regression\n", "Classic model: not very powerful, but interpretable. Based on the mean squared error. Very sensitive to outliers.\n", "Find the model in scikit-learn." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run it on Boston. Display the coefficient of each feature. Which features are significant?"
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Regression tree\n", "![](https://fr.wikipedia.org/wiki/Arbre_de_d%C3%A9cision#/media/File:Arbre_de_decision.jpg)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try a maximum depth of 3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try a maximum depth of 5" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try a maximum depth of 10" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Random forest\n", "Find it in scikit-learn\n", "image\n", "model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try with 3 trees" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try with 10 trees" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try with 100 trees" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare with the regression trees. What are the advantages?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_Optional:_ plot the result with 1 tree, 3 trees, and 100 trees" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## If you are bored\n", "Compare the different models by running them all on the test set\n", "\n", "Run a regression on the output of a PCA (tricky)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }
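The notebook above leaves every code cell empty. As a rough sketch of the workflow its prompts describe (correlations with `np.corrcoef`, a `train_test_split`, then linear regression, regression trees of depth 3, 5 and 10, and random forests of 3, 10 and 100 trees), something like the following could serve as a starting point. It is not the author's solution. It assumes synthetic data from `make_regression` in place of the Boston dataset, since `load_boston` was removed from scikit-learn in version 1.2; the sample count, `test_size`, `random_state` values, and the `fit_and_score` helper are illustrative choices, not part of the original lab.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: synthetic data with 13 features stands in for the Boston
# dataset, because load_boston was removed from scikit-learn in 1.2.
X, y = make_regression(n_samples=500, n_features=13, n_informative=5,
                       noise=10.0, random_state=0)

# Correlation matrix of the features plus the target. rowvar=False treats
# each column as a variable; the last row holds the target's correlations.
corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
feature_target_corr = corr[-1, :-1]

# Hold out 25% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)


def fit_and_score(model):
    """Fit on the training split and return the MSE on the test split."""
    model.fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))


# Linear regression: coef_ holds one coefficient per feature.
linear = LinearRegression()
results = {"linear": fit_and_score(linear)}
print("coefficients:", np.round(linear.coef_, 2))

# Regression trees at the three depths the notebook asks for.
for depth in (3, 5, 10):
    results[f"tree_depth_{depth}"] = fit_and_score(
        DecisionTreeRegressor(max_depth=depth, random_state=0))

# Random forests with the three ensemble sizes the notebook asks for.
for n_trees in (3, 10, 100):
    results[f"forest_{n_trees}_trees"] = fit_and_score(
        RandomForestRegressor(n_estimators=n_trees, random_state=0))

for name, mse in results.items():
    print(f"{name}: test MSE = {mse:.1f}")
```

Collecting every score in one dictionary keeps the "compare the results" steps to a single loop. Because the synthetic target is linear in the features by construction, the ranking of models here says little about how they would rank on real housing data.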
This diff is collapsed.
@@ -2260,7 +2260,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.6"
+   "version": "3.6.7"
   },
   "toc": {
    "colors": {