Commit ddb7750f authored by Mathilde Rineau

Questions, limits, variants of the methods

parent 6aab3d9c
......@@ -23,7 +23,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
......@@ -51,9 +51,18 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'numpy.ndarray'>\n",
"(610, 340, 103)\n"
]
}
],
"source": [
"image_data = image['paviaU']\n",
"print(type(image_data))\n",
......@@ -74,7 +83,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
......@@ -95,7 +104,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
......@@ -118,9 +127,17 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(207400, 3)\n"
]
}
],
"source": [
"from sklearn.decomposition import PCA\n",
"\n",
......@@ -161,6 +178,47 @@
"\n",
"plt.imshow(output_image)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Questions about the solution\n",
"As requested, we reduced the data to 3 principal components, which is a substantial reduction given that the original dimension was 103. We could instead have chosen the reduction from the percentage of explained variance rather than a fixed number of principal components, or by inspecting the associated scree plot.\n",
"\n",
"## Limits\n",
"By reducing the dimension from 103 to 3 we lose information, and we cannot tell exactly what that information was or how significant it was. However, the reduction makes the data much easier to visualize: band reduction is expected to give the same color to similar objects.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploring variants of the methods\n",
"\n",
"As mentioned above, we could have used the explained variance to decide how many principal components to retain.\n",
"The two plots below illustrate this. The first is a scree plot showing the variance explained by each principal component (from 1 to 103).\n",
"The second shows the cumulative explained variance as a function of the number of principal components.\n",
"On the first plot, only the first 3 principal components carry a significant share of the variance, which is consistent with the second plot, where the cumulative explained variance rises very quickly towards 1.\n",
"Consequently, selecting the number of components from the explained variance would have led to the same result."
]
},
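{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.decomposition import PCA\n",
"# Fit a PCA that keeps every component of X_scaled\n",
"pca = PCA()\n",
"pca.fit(X_scaled)\n",
"components = range(1, pca.n_components_ + 1)\n",
"# Scree plot: variance explained by each principal component\n",
"plt.figure()\n",
"plt.bar(components, pca.explained_variance_)\n",
"plt.xlabel('Principal component')\n",
"plt.ylabel('Explained variance')\n",
"plt.show()\n",
"# Cumulative explained variance ratio versus number of components\n",
"plt.figure()\n",
"plt.plot(components, np.cumsum(pca.explained_variance_ratio_))\n",
"plt.xlabel('Number of principal components')\n",
"plt.ylabel('Cumulative explained variance ratio')\n",
"plt.show()"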
]
}
],
"metadata": {
......@@ -168,7 +226,8 @@
"hash": "75384ff26d2d5ce1444ca190d7bd826ecec6b011430d85ce2059ae14ba6abb78"
},
"kernelspec": {
"display_name": "Python 3.9.6 64-bit ('AOS1-3HAiNONq': pipenv)",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
......@@ -180,7 +239,8 @@
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
......
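The "Questions about the solution" cell in this commit mentions choosing the reduction from the percentage of explained variance rather than a fixed number of components. A minimal sketch of that variant is given here, assuming scikit-learn's `PCA` with a fractional `n_components`. The data is a synthetic stand-in (`X_demo`, a hypothetical name), not the notebook's `X_scaled`, and the 0.95 threshold is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the flattened image (pixels x bands).
# Hypothetical data: 3 latent factors spread over 103 "bands" plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2000, 3))
mixing = rng.normal(size=(3, 103))
X_demo = latent @ mixing + 0.05 * rng.normal(size=(2000, 103))

# Standardize each band before PCA
X_demo_scaled = StandardScaler().fit_transform(X_demo)

# A float n_components in (0, 1) keeps the smallest number of components
# whose cumulative explained variance ratio exceeds that threshold.
pca_var = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca_var.fit_transform(X_demo_scaled)

print("components kept:", pca_var.n_components_)
print("variance explained:", np.cumsum(pca_var.explained_variance_ratio_)[-1])
```

With this approach the number of retained components is an output of the fit (`pca_var.n_components_`) rather than an input, so it can be compared directly with the fixed choice of 3 used in the notebook.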