
intros to each technical ML learning section

Mike Tokic 2021-10-18 15:05:57 -07:00
Parent 05345088d5
Commit 52795844c8
1 changed file: 36 additions and 0 deletions


@ -23,6 +23,8 @@
<!-- ![](images/3.2.gif) -->
<!-- [source](https://dilbert.com/strip/2018-08-03) -->
Getting started with the right developer environment can save tons of headaches further down the road. While there are many options for which integrated development environment (IDE) to use, the ones below are quickly becoming the standard for each language.
### Python
* [Coding Tools for Python Development](https://docs.microsoft.com/en-us/learn/modules/install-code-tools-python-nasa/)
* [VSCode Python Setup](https://www.youtube.com/playlist?list=PLo32uKohmrXt7VB91DLB-hMcDcWswE53Y)
@ -38,6 +40,8 @@
<!-- ![](images/3.3.gif) -->
<!-- [source](https://dilbert.com/strip/2017-07-10) -->
Learning how to manipulate data outside of existing tools like Excel or Power BI quickly gives you data superpowers you never thought possible. Breaking out of the four walls of Excel and into the data universe with languages like Python and R unlocks far more potential for impact in whatever job you do. Even if you don't plan to build your own machine learning models, knowing the basics of data manipulation is an important skill, and it builds the data foundation that machine learning rests on if you ever want to come back and start building models.
### Python
* Pick one of the following
+ [Option 1 - Free Code Camp](https://www.freecodecamp.org/learn/data-analysis-with-python/)
@ -76,6 +80,8 @@
<!-- ![](images/3.4.gif) -->
<!-- [source](https://dilbert.com/strip/2000-09-21) -->
If you plan to work with others on any project that contains code, knowing version control, and specifically git, is a must. Get up to speed on how to use git and its most popular hosting service, GitHub. This skill opens up new opportunities to contribute to open source projects and even build your own open source software. It's also required on any technical team that collaborates on projects.
### High Level Overview
* [Git and GitHub for Beginners](https://www.youtube.com/watch?v=RGOj5yH7evk&t=491s)
@ -98,6 +104,10 @@
<!-- ![](images/3.5.gif) -->
<!-- [source](https://dilbert.com/strip/2014-07-04) -->
Let's get our feet wet with the introductory concepts of machine learning. Learn more of the terminology, build a few models, and start to understand how the data science life cycle takes shape. This section is by no means a comprehensive view of machine learning today, but it's a good starting point.
Future sections will cover most of these topics again in more depth. Some repetition of terms and concepts will help reinforce what you learn and show you that there are always different angles from which to attack data problems with machine learning.
### High Level Topics
* [Three Things to do when Starting Out in Data Science](https://www.youtube.com/watch?v=ilUbD7EoQnk)
@ -144,6 +154,8 @@
<!-- ![](images/3.6.gif) -->
<!-- [source](https://dilbert.com/strip/2016-06-21) -->
Regression deals with predicting numerical quantities. It will quickly become your bread and butter for leveraging machine learning in finance. Understanding how to use software packages to train models and how each model works are both crucial to leveraging regression techniques to the fullest. Most of the resources here deal with examples of regression in action. Take time to soak in how these tutorials and experts approach a regression problem, how they structure their code, and the way they communicate the outputs.
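
As a rough illustration of "predicting numerical quantities", here is a minimal sketch of a one-feature least-squares fit using only the Python standard library. The spend and revenue numbers are made up, and the helper names `fit_line` and `predict` are hypothetical; the linked resources use full modeling packages instead.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a numerical value for a new input x."""
    return slope * x + intercept


# Made-up data: marketing spend vs. revenue, both in thousands of dollars.
spend = [1.0, 2.0, 3.0, 4.0]
revenue = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, revenue)
forecast = predict(slope, intercept, 5.0)
print(f"revenue ~ {slope:.2f} * spend + {intercept:.2f}; forecast at 5.0: {forecast:.2f}")
```

Real problems usually involve many features and noisy data, which is where the packages covered in the tutorials come in.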
### High Level Topics
* [Making Friends with Regression](https://www.youtube.com/watch?v=WNvOtwP_yf4)
@ -189,6 +201,10 @@
<!-- ![](images/3.7.gif) -->
<!-- [source](https://dilbert.com/strip/2001-12-29) -->
Time series forecasting is a subdomain of regression in which we try to forecast a numerical quantity over time. Prediction over time is a separate world in machine learning and has deep roots in classic statistical methods.
While most regression models can be turned into time series models by incorporating various date-based features, there are also traditional statistical models that have been used solely for time series forecasting for decades. An interesting aspect of time series forecasting is that it can use multivariate data as well as univariate data. For example, you could forecast sales revenue using only its own historical values (univariate), or add external regressors like country holidays and population size to help the forecast (multivariate). Knowing both types of models is a key part of being an expert time series practitioner.
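
The idea of turning a regression model into a time series model can be sketched as a feature-building step. The example below is a minimal sketch using only the standard library: the sales values, the holiday set, and the helper name `build_features` are all assumptions for illustration. The lag column is the univariate part, and the holiday flag plays the role of an external regressor.

```python
from datetime import date, timedelta


def build_features(dates, values, holidays=frozenset()):
    """Turn a daily series into rows a regression model could train on.

    Each row predicts values[i] from the previous day's value plus
    features derived from the date itself.
    """
    rows = []
    for i in range(1, len(values)):
        d = dates[i]
        rows.append({
            "lag_1": values[i - 1],            # history of the target (univariate)
            "month": d.month,                  # date-based feature
            "day_of_week": d.weekday(),        # Monday is 0, Sunday is 6
            "is_holiday": int(d in holidays),  # external regressor (multivariate)
            "target": values[i],
        })
    return rows


# Made-up daily sales from 2021-12-30 through 2022-01-02.
start = date(2021, 12, 30)
dates = [start + timedelta(days=i) for i in range(4)]
sales = [100.0, 120.0, 80.0, 110.0]
holidays = {date(2022, 1, 1)}

rows = build_features(dates, sales, holidays)
for row in rows:
    print(row)
```

Dropping the `is_holiday` column leaves a purely univariate setup; adding more external columns makes it more multivariate.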
### High Level Topics
* [Time Series Forecasting with Machine Learning](https://www.youtube.com/watch?v=_ZQ-lQrK9Rg)
@ -228,6 +244,10 @@
<!-- ![](images/3.8.gif) -->
<!-- [source](https://dilbert.com/strip/2018-08-02) -->
Classification models try to predict the outcome of an event, for example whether a credit card transaction is fraudulent or whether a self-driving car sees a stop sign next to the road. Usually the prediction is a binary yes or no, often accompanied by a probability score between 0 and 1, where 1 means a 100% probability of something occurring. Classification models can also predict an outcome across multiple categories or buckets, like whether a picture of a fruit shows an apple, a pear, an orange, etc.
Classification models are some of the most widely used machine learning models across industries today. Within finance there are many important applications, ranging from compliance to risk management.
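
To make the probability score concrete, here is a minimal sketch of how a raw model score is commonly mapped to a probability and then to a decision. The raw scores, the 0.5 threshold, and the helper name `classify` are assumptions for illustration; a trained model would produce the scores.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability between 0 and 1 (binary case)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Map raw scores to probabilities that sum to 1 (multi-class case)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(probability, threshold=0.5):
    """Turn a probability into a yes/no decision."""
    return probability >= threshold


# Binary: hypothetical raw score for one credit card transaction.
fraud_probability = sigmoid(2.0)
is_fraud = classify(fraud_probability)

# Multi-class: hypothetical raw scores for a picture of a fruit.
labels = ["apple", "pear", "orange"]
fruit_probs = softmax([2.0, 1.0, 0.1])
predicted = labels[fruit_probs.index(max(fruit_probs))]

print(f"fraud probability {fraud_probability:.3f}, flagged: {is_fraud}")
print(f"predicted fruit: {predicted}")
```

The threshold is a business choice: lowering it flags more transactions at the cost of more false alarms.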
### High Level Topics
* [Classification in Machine Learning](https://www.youtube.com/watch?v=pXdum128xww) (skip tutorial at end)
@ -278,6 +298,8 @@
<!-- ![](images/3.9.gif) -->
<!-- [source](https://dilbert.com/strip/2018-01-06) -->
Unsupervised learning is an evolving field of machine learning, and many say it is the future of AI in general. Instead of learning from existing data with known outcomes, as supervised learning (regression and classification) does, unsupervised learning tries to discover structure in a data set without needing to know the answer ahead of time. This can be a game changer in finance when trying to segment customers into specific groups based on their purchasing behavior or to find anomalies to flag for potential fraud or corruption.
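
Customer segmentation can be sketched with a tiny k-means loop. This is a minimal one-dimensional sketch with made-up spend values and caller-supplied starting centers (an assumption that keeps the run deterministic); the helper name `kmeans_1d` is hypothetical, and the linked modules use library implementations on richer data.

```python
def kmeans_1d(points, centers, iterations=10):
    """Run k-means on 1-D data, starting from the given centers.

    Returns the final centers and the cluster index assigned to each
    point during the last pass.
    """
    centers = list(centers)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return centers, assignments


# Made-up monthly spend per customer; no labels are provided.
monthly_spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]
centers, segments = kmeans_1d(monthly_spend, centers=[0.0, 50.0])
print(centers, segments)
```

No customer was labeled "low spender" or "high spender" in advance; the two groups emerge from the data alone.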
### Python
* [Train and Evaluate Clustering Models](https://docs.microsoft.com/en-us/learn/modules/train-evaluate-cluster-models/)
@ -301,6 +323,8 @@
<!-- ![](images/3.10.gif) -->
<!-- [source](https://dilbert.com/strip/2017-07-13) -->
Natural language processing (NLP) is all about extracting insight from unstructured data in the form of text. Our world is drowning in openly available text from Twitter, blogs, and countless documents like PDFs that could be useful in our finance jobs. Knowing how to extract insights from a pile of documents is a superpower worth learning!
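
A very first step toward extracting insight from text is counting which terms appear most often. The sketch below uses only the standard library; the documents, the small stop-word list, and the helper names `tokenize` and `top_terms` are assumptions for illustration, and real NLP pipelines go far beyond word counts.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list; real pipelines use much larger ones.
STOP_WORDS = {"the", "a", "of", "to", "and", "in", "is", "for"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(documents, n=3):
    """Return the n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts.most_common(n)


# Made-up snippets standing in for a pile of finance documents.
documents = [
    "Revenue growth beat the forecast for the quarter.",
    "The forecast assumes revenue growth in cloud services.",
    "Cloud revenue is the largest driver of growth.",
]

for term, count in top_terms(documents):
    print(term, count)
```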
### Python
* [Natural Language Processing: Kaggle](https://www.kaggle.com/learn/natural-language-processing)
@ -325,6 +349,8 @@
<!-- ![](images/3.11.gif) -->
<!-- [source](https://dilbert.com/strip/2021-03-12) -->
The most rapidly evolving area of AI is deep learning, which uses a modeling architecture called neural networks. Many of the most exciting advancements in AI over the last decade have come from training neural networks on huge data sets. Deep learning has the potential to totally change how we build predictions across all types of machine learning.
### High Level Topics
* [Deep Learning Crash Course](https://www.youtube.com/watch?v=VyWAvY2CF9c)
@ -354,6 +380,8 @@
<!-- ![](images/3.12.gif) -->
<!-- [source](https://dilbert.com/strip/1998-12-10) -->
You will often be asked to explain how a particular machine learning model arrived at its prediction. Knowing how to leverage various interpretability frameworks helps decode the black box of these models, which drives better adoption by non-technical business partners and gives you a better understanding of which features are most impactful in your model.
### High Level Topics
* [Interpretable Machine Learning](https://christophm.github.io/interpretable-ml-book/)
@ -374,6 +402,8 @@
<!-- ![](images/3.13.gif) -->
<!-- [source](https://dilbert.com/strip/2015-02-10) -->
With great power comes great responsibility. As machine learning becomes more ingrained in our society, the ethical consequences of poorly deployed models will only increase. Make sure you are building models that help enrich a diverse and inclusive future by checking out the resources below.
### High Level Topics
* [Intro to AI Ethics: Kaggle](https://www.kaggle.com/learn/intro-to-ai-ethics)
@ -387,6 +417,8 @@
## Web Apps
Building user interfaces that bring machine learning models directly to end users, code-free, can be a total game changer for your business partners. You don't have to be a web developer to build applications that your users will love, thanks to some amazing packages within the data science community. Check them out below.
<!-- ![](images/3.14.jpg) -->
<!-- [source](https://dilbert.com/strip/2016-03-06) -->
@ -409,11 +441,15 @@
<!-- ![](images/3.15.gif) -->
<!-- [source](https://dilbert.com/strip/2014-01-17) -->
One of the harder aspects of machine learning is getting your work into a production environment to run at scale. This involves deploying models to run on a cloud platform like Microsoft Azure.
## Life as a Data Scientist
<!-- ![](images/3.16.gif) -->
<!-- [source](https://dilbert.com/strip/2018-04-03) -->
Ready to commit to data science as a career? Check out the content below, which features interviews with working data scientists and best practices for being a great data practitioner.
### Build Models and Build Community
### Coding Best Practices