This commit is contained in:
garanews 2021-10-01 15:18:56 +02:00 коммит произвёл GitHub
Родитель 816bbc05d6
Коммит 835874614b
Не найден ключ, соответствующий данной подписи
Идентификатор ключа GPG: 4AEE18F83AFDEB23
17 изменённых файлов: 20 добавлений и 20 удалений

Просмотреть файл

@ -145,7 +145,7 @@ Part of the data scientist's role is to demonstrate the quality and nature of th
Visualizations can also help determine the machine learning technique most appropriate for the data. A scatterplot that seems to follow a line, for example, indicates that the data is a good candidate for a linear regression exercise.
One data visualization libary that works well in Jupyter notebooks is [Matplotlib](https://matplotlib.org/) (which you also saw in the previous lesson).
One data visualization library that works well in Jupyter notebooks is [Matplotlib](https://matplotlib.org/) (which you also saw in the previous lesson).
> Get more experience with data visualization in [these tutorials](https://docs.microsoft.com/learn/modules/explore-analyze-data-with-python?WT.mc_id=academic-15963-cxa).

Просмотреть файл

@ -79,7 +79,7 @@
"\n",
"`install.packages(c(\"tidyverse\", \"tidymodels\", \"janitor\", \"ggbeeswarm\"))`\n",
"\n",
"Alternatiely, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing."
"Alternately, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing."
],
"metadata": {
"id": "KPmut75XkmXY"

Просмотреть файл

@ -48,7 +48,7 @@ You can have them installed as:
`install.packages(c("tidyverse", "tidymodels", "janitor", "ggbeeswarm"))`
Alternatiely, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing.
Alternately, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing.
```{r, message=F, warning=F}
suppressWarnings(if (!require("pacman"))install.packages("pacman"))

Просмотреть файл

@ -93,7 +93,7 @@
"\n",
"`install.packages(c(\"tidyverse\", \"tidymodels\", \"DataExplorer\", \"here\"))`\n",
"\n",
"Alternatiely, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing."
"Alternately, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing."
],
"metadata": {
"id": "ri5bQxZ-Fz_0"
@ -247,7 +247,7 @@
" filter(cuisine == \"korean\")\r\n",
"\r\n",
"\r\n",
"# Find out how much data is avilable per cuisine\r\n",
"# Find out how much data is available per cuisine\r\n",
"cat(\" thai df:\", dim(thai_df), \"\\n\",\r\n",
" \"japanese df:\", dim(japanese_df), \"\\n\",\r\n",
" \"chinese_df:\", dim(chinese_df), \"\\n\",\r\n",

Просмотреть файл

@ -64,7 +64,7 @@ You can have them installed as:
`install.packages(c("tidyverse", "tidymodels", "DataExplorer", "here"))`
Alternatiely, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing.
Alternately, the script below checks whether you have the packages required to complete this module and installs them for you in case they are missing.
```{r, message=F, warning=F}
suppressWarnings(if (!require("pacman"))install.packages("pacman"))
@ -145,7 +145,7 @@ korean_df <- df %>%
filter(cuisine == "korean")
# Find out how much data is avilable per cuisine
# Find out how much data is available per cuisine
cat(" thai df:", dim(thai_df), "\n",
"japanese df:", dim(japanese_df), "\n",
"chinese_df:", dim(chinese_df), "\n",

Просмотреть файл

@ -573,7 +573,7 @@
"source": [
"Now we have to decide which algorithm to use for the job 🤔.\r\n",
"\r\n",
"In Tidymodels, the [`parsnip package`](https://parsnip.tidymodels.org/index.html) provides consistent interface for working with models across different engines (packages). Please see the parsnip documentation to explore [model types & engines](https://www.tidymodels.org/find/parsnip/#models) and their corresponding [model arguements](https://www.tidymodels.org/find/parsnip/#model-args). The variety is quite bewildering at first sight. For instance, the following methods all include classification techniques:\r\n",
"In Tidymodels, the [`parsnip package`](https://parsnip.tidymodels.org/index.html) provides consistent interface for working with models across different engines (packages). Please see the parsnip documentation to explore [model types & engines](https://www.tidymodels.org/find/parsnip/#models) and their corresponding [model arguments](https://www.tidymodels.org/find/parsnip/#model-args). The variety is quite bewildering at first sight. For instance, the following methods all include classification techniques:\r\n",
"\r\n",
"- C5.0 Rule-Based Classification Models\r\n",
"\r\n",

Просмотреть файл

@ -158,7 +158,7 @@ Now we are ready to train a model 👩‍💻👨‍💻!
Now we have to decide which algorithm to use for the job 🤔.
In Tidymodels, the [`parsnip package`](https://parsnip.tidymodels.org/index.html) provides consistent interface for working with models across different engines (packages). Please see the parsnip documentation to explore [model types & engines](https://www.tidymodels.org/find/parsnip/#models) and their corresponding [model arguements](https://www.tidymodels.org/find/parsnip/#model-args). The variety is quite bewildering at first sight. For instance, the following methods all include classification techniques:
In Tidymodels, the [`parsnip package`](https://parsnip.tidymodels.org/index.html) provides consistent interface for working with models across different engines (packages). Please see the parsnip documentation to explore [model types & engines](https://www.tidymodels.org/find/parsnip/#models) and their corresponding [model arguments](https://www.tidymodels.org/find/parsnip/#model-args). The variety is quite bewildering at first sight. For instance, the following methods all include classification techniques:
- C5.0 Rule-Based Classification Models

Просмотреть файл

@ -472,7 +472,7 @@
"anaconda-cloud": "",
"kernelspec": {
"display_name": "R",
"langauge": "R",
"language": "R",
"name": "ir"
},
"language_info": {

Просмотреть файл

@ -5,7 +5,7 @@
"anaconda-cloud": "",
"kernelspec": {
"display_name": "R",
"langauge": "R",
"language": "R",
"name": "ir"
},
"language_info": {
@ -467,7 +467,7 @@
"id": "pLYyt5XSLXzG"
},
"source": [
"The plot shows a large reduction in WCSS (so greater *tightness*) as the number of clusters increases from one to two, and a further noticable reduction from two to three clusters. After that, the reduction is less pronounced, resulting in an `elbow` 💪in the chart at around three clusters. This is a good indication that there are two to three reasonably well separated clusters of data points.\n",
"The plot shows a large reduction in WCSS (so greater *tightness*) as the number of clusters increases from one to two, and a further noticeable reduction from two to three clusters. After that, the reduction is less pronounced, resulting in an `elbow` 💪in the chart at around three clusters. This is a good indication that there are two to three reasonably well separated clusters of data points.\n",
"\n",
"We can now go ahead and extract the clustering model where `k = 3`:\n",
"\n",

Просмотреть файл

@ -273,7 +273,7 @@ kclusts %>%
geom_point(size = 2, color = "#FF7F0EFF")
```
The plot shows a large reduction in WCSS (so greater *tightness*) as the number of clusters increases from one to two, and a further noticable reduction from two to three clusters. After that, the reduction is less pronounced, resulting in an `elbow` 💪in the chart at around three clusters. This is a good indication that there are two to three reasonably well separated clusters of data points.
The plot shows a large reduction in WCSS (so greater *tightness*) as the number of clusters increases from one to two, and a further noticeable reduction from two to three clusters. After that, the reduction is less pronounced, resulting in an `elbow` 💪in the chart at around three clusters. This is a good indication that there are two to three reasonably well separated clusters of data points.
We can now go ahead and extract the clustering model where `k = 3`:

Просмотреть файл

@ -90,7 +90,7 @@ Here they are grouped in a way that might be easier to examine:
| Average Score | Total Number Reviews | Reviewer Score | Negative <br />Review | Positive Review | Tags |
| -------------- | ---------------------- | ---------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------- | ----------------------------------------------------------------------------------------- |
| 7.8 | 1945 | 2.5 | This is currently not a hotel but a construction site I was terroized from early morning and all day with unacceptable building noise while resting after a long trip and working in the room People were working all day i e with jackhammers in the adjacent rooms I asked for a room change but no silent room was available To make thinks worse I was overcharged I checked out in the evening since I had to leave very early flight and received an appropiate bill A day later the hotel made another charge without my concent in excess of booked price It s a terrible place Don t punish yourself by booking here | Nothing Terrible place Stay away | Business trip Couple Standard Double Room Stayed 2 nights |
| 7.8 | 1945 | 2.5 | This is currently not a hotel but a construction site I was terroized from early morning and all day with unacceptable building noise while resting after a long trip and working in the room People were working all day i e with jackhammers in the adjacent rooms I asked for a room change but no silent room was available To make thinks worse I was overcharged I checked out in the evening since I had to leave very early flight and received an appropriate bill A day later the hotel made another charge without my concent in excess of booked price It s a terrible place Don t punish yourself by booking here | Nothing Terrible place Stay away | Business trip Couple Standard Double Room Stayed 2 nights |
As you can see, this guest did not have a happy stay at this hotel. The hotel has a good average score of 7.8 and 1945 reviews, but this reviewer gave it 2.5 and wrote 115 words about how negative their stay was. If they wrote nothing at all in the Positive_Review column, you might surmise there was nothing positive, but alas they wrote 7 words of warning. If we just counted words instead of the meaning, or sentiment of the words, we might have a skewed view of the reviewers intent. Strangely, their score of 2.5 is confusing, because if that hotel stay was so bad, why give it any points at all? Investigating the dataset closely, you'll see that the lowest possible score is 2.5, not 0. The highest possible score is 10.

Просмотреть файл

@ -50,7 +50,7 @@ class TimeSeriesTensor(UserDict):
- **dataset**: original time series
- **target** name of the target column
- **H**: the forecast horizon
- **tensor_structures**: a dictionary discribing the tensor structure of the form
- **tensor_structures**: a dictionary describing the tensor structure of the form
{ 'tensor_name' : (range(max_backward_shift, max_forward_shift), [feature, feature, ...] ) }
if features are non-sequential and should not be shifted, use the form
{ 'tensor_name' : (None, [feature, feature, ...])}

Просмотреть файл

@ -50,7 +50,7 @@ class TimeSeriesTensor(UserDict):
- **dataset**: original time series
- **target** name of the target column
- **H**: the forecast horizon
- **tensor_structures**: a dictionary discribing the tensor structure of the form
- **tensor_structures**: a dictionary describing the tensor structure of the form
{ 'tensor_name' : (range(max_backward_shift, max_forward_shift), [feature, feature, ...] ) }
if features are non-sequential and should not be shifted, use the form
{ 'tensor_name' : (None, [feature, feature, ...])}

Просмотреть файл

@ -50,7 +50,7 @@ class TimeSeriesTensor(UserDict):
- **dataset**: original time series
- **target** name of the target column
- **H**: the forecast horizon
- **tensor_structures**: a dictionary discribing the tensor structure of the form
- **tensor_structures**: a dictionary describing the tensor structure of the form
{ 'tensor_name' : (range(max_backward_shift, max_forward_shift), [feature, feature, ...] ) }
if features are non-sequential and should not be shifted, use the form
{ 'tensor_name' : (None, [feature, feature, ...])}

Просмотреть файл

@ -50,7 +50,7 @@ class TimeSeriesTensor(UserDict):
- **dataset**: original time series
- **target** name of the target column
- **H**: the forecast horizon
- **tensor_structures**: a dictionary discribing the tensor structure of the form
- **tensor_structures**: a dictionary describing the tensor structure of the form
{ 'tensor_name' : (range(max_backward_shift, max_forward_shift), [feature, feature, ...] ) }
if features are non-sequential and should not be shifted, use the form
{ 'tensor_name' : (None, [feature, feature, ...])}

Просмотреть файл

@ -153,7 +153,7 @@
"source": [
"## Q-Learning\n",
"\n",
"Build a Q-Table, or multi-dimensional array. Since our board has dimentions `width` x `height`, we can represent Q-Table by a numpy array with shape `width` x `height` x `len(actions)`:"
"Build a Q-Table, or multi-dimensional array. Since our board has dimensions `width` x `height`, we can represent Q-Table by a numpy array with shape `width` x `height` x `len(actions)`:"
],
"cell_type": "markdown",
"metadata": {}

Просмотреть файл

@ -253,7 +253,7 @@
"source": [
"## Q-Learning\n",
"\n",
"Build a Q-Table, or multi-dimensional array. Since our board has dimentions `width` x `height`, we can represent Q-Table by a numpy array with shape `width` x `height` x `len(actions)`:"
"Build a Q-Table, or multi-dimensional array. Since our board has dimensions `width` x `height`, we can represent Q-Table by a numpy array with shape `width` x `height` x `len(actions)`:"
],
"cell_type": "markdown",
"metadata": {}