sqlworkshops/k8stobdc
Buck Woody ededf98189
Merge pull request #242 from microsoft/buckk8tobdc
Bravo-Lock for K8's course for Modules README, 00, 01, 04 and 05.
2020-02-19 06:39:28 -08:00

Workshop: Kubernetes - From Bare Metal to SQL Server Big Data Clusters

A Microsoft Course from the SQL Server team

About this Workshop

Welcome to this Microsoft solutions workshop on Kubernetes - From Bare Metal to SQL Server Big Data Clusters. In this workshop, you'll learn how to set up a production-grade SQL Server 2019 big data cluster environment on Kubernetes. Topics covered include hardware, virtualization, and Kubernetes, with a full deployment of a SQL Server big data cluster on the environment that you will use in the class. You'll then walk through a set of Jupyter Notebooks in Microsoft's Azure Data Studio tool to run T-SQL, Spark, and Machine Learning workloads on the cluster. You'll also receive valuable resources to learn more and go deeper on Linux, Containers, Kubernetes and SQL Server big data clusters.

The focus of this workshop is to understand the hardware, software, and environment you need to work with SQL Server 2019's big data clusters on a Kubernetes platform.

You'll start by understanding Containers and Kubernetes, moving on to a discussion of the hardware and software environment for Kubernetes, and then to more in-depth Kubernetes concepts. You'll follow on with the SQL Server 2019 big data clusters architecture, and then learn how to use the entire system in a practical application, all with a focus on how to extrapolate what you have learned to create other solutions for your organization.

NOTE: This course is designed to be taught in person with hardware provided by the instructional team. You will also get instructions for setting up your own hardware, virtual, or Cloud environments for Kubernetes, either as a backup for the workshop or if you are not attending in person.

This GitHub README.md file explains how the workshop is laid out, what you will learn, and the technologies you will use in this solution.

(You can view all of the source files for this workshop, along with other workshops, on this GitHub site. Open this link in a new tab to find out more.)

Learning Objectives

In this workshop you'll learn:

  • How Containers and Kubernetes work and when and where you can use them
  • Hardware considerations for setting up a production Kubernetes Cluster on-premises
  • Considerations for Virtual and Cloud-based environments for a production Kubernetes Cluster

The concepts and skills taught in this workshop form the starting points for:

  • Solution Architects, to understand how to design an end-to-end solution.
  • System Administrators, Database Administrators, or Data Engineers, to understand how to put together an end-to-end solution.

Business Applications of this Workshop

Businesses require stable, secure environments at scale, which work in on-premises and in-cloud configurations. Using Kubernetes and Containers allows for manifest-driven DevOps practices, which further streamline IT processes.
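As a rough illustration of what "manifest-driven" means, the sketch below builds a Kubernetes Deployment manifest as data and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. It is a minimal sketch, not part of the workshop materials: the `deployment_manifest` helper, the `demo-web` name, and the `nginx:1.17` image are placeholders chosen for this example.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build a Kubernetes Deployment manifest as a plain dictionary.

    The helper name and its arguments are hypothetical; only the
    apps/v1 Deployment layout comes from Kubernetes itself.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the Pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Placeholder values for illustration only.
manifest = deployment_manifest("demo-web", "nginx:1.17", replicas=3)
print(json.dumps(manifest, indent=2))
```

Because the desired state lives in a file like this rather than in a sequence of manual steps, it can be reviewed, versioned, and re-applied in the same way as application source code.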

Technologies used in this Workshop

The solution includes the following technologies. You are not limited to these, but they form the basis of the workshop. At the end of the workshop you will learn how to extrapolate these components into other solutions. You will cover these at an overview level, with references to much deeper training provided.

| Technology | Description |
| --- | --- |
| Linux | The primary operating system used in and by Containers and Kubernetes |
| Containers | The atomic layer of a Kubernetes Cluster |
| Kubernetes | The primary clustering technology for manifest-driven environments |
| SQL Server Big Data Clusters | Relational and non-relational data at scale with Spark, HDFS and application deployment capabilities |

Before Taking this Workshop

There are a few requirements for attending the workshop, listed below:

  • You'll need a local system that you are able to install software on. The workshop demonstrations and examples all use Microsoft Windows as the operating system. Optionally, you can use a Microsoft Azure Virtual Machine (VM) to install the software on and work with the solution.
  • You must have a Microsoft Azure account with the ability to create assets.
  • This workshop expects that you understand computer technologies, networking, SQL Server, HDFS, Spark, and general use of Hypervisors.
  • The Setup section below explains the steps you should take prior to coming to the workshop.

If you are new to these, here are a few references you can complete prior to class:

Setup

A full pre-requisites document is located here. These instructions should be completed before the workshop starts, since you will not have time to cover these in class. Remember to turn off any Virtual Machines from the Azure Portal when not taking the class so that you do not incur charges (shutting down the machine in the VM itself is not sufficient).

Workshop Details

This workshop uses Kubernetes to deploy a workload, with a focus on Microsoft SQL Server's big data clusters deployment for Big Data and Data Science workloads.

Primary Audience: Technical professionals tasked with configuring, deploying and managing large-scale clustering systems
Secondary Audience: Data professionals tasked with working with data at scale
Level: 300
Type: In-Person (self-guided possible)
Length: 8

Related Workshops

Workshop Modules

This is a modular workshop, and in each section, you'll learn concepts, technologies and processes to help you complete the solution.

| Module | Topics |
| --- | --- |
| 01 - An introduction to Linux, Containers and Kubernetes | This module covers Container technologies and how they are different from Virtual Machines. You'll learn about the need for container orchestration using Kubernetes. |
| 02 - Hardware and Virtualization environment for Kubernetes | This module explains how to make a production-grade environment using "bare metal" computer hardware or a virtualized platform, and most importantly covers the storage hardware aspects. |
| 03 - Kubernetes Concepts and Implementation | This module covers deploying Kubernetes, Kubernetes contexts, cluster troubleshooting and management, services (load balancing versus node ports), understanding storage from a Kubernetes perspective, and making your cluster secure. |
| 04 - SQL Server Big Data Clusters Architecture | This module digs deep into the anatomy of a big data cluster by covering topics that include the data pool, storage pool, compute pool and cluster control plane, Active Directory integration, development versus production configurations, and the tools required for deploying and managing a big data cluster. |
| 05 - Using the SQL Server big data cluster on Kubernetes for Data Science | Now that your big data cluster is up, it's ready for data science workloads. This Jupyter Notebook and Azure Data Studio based module covers the use of Python and PySpark, T-SQL, and the execution of Spark and Machine Learning workloads. |
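To give a feel for the "load balancing versus node ports" topic in Module 03, here is a minimal sketch that builds the two kinds of Service manifest side by side. The `service_manifest` helper, the `demo-web` names, and the port numbers are assumptions made for this example and are not taken from the workshop; the 30000-32767 range is the Kubernetes default for node ports and may differ on a given cluster.

```python
import json

# Default Kubernetes node port range; a cluster may be configured differently.
NODE_PORT_RANGE = range(30000, 32768)


def service_manifest(name, app, port, service_type, node_port=None):
    """Build a Service manifest (hypothetical helper for illustration)."""
    if service_type not in ("NodePort", "LoadBalancer"):
        raise ValueError(f"unsupported service type: {service_type}")
    port_spec = {"port": port, "targetPort": port}
    if node_port is not None:
        if node_port not in NODE_PORT_RANGE:
            raise ValueError(f"nodePort {node_port} is outside the default range")
        port_spec["nodePort"] = node_port
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            # Traffic is routed to Pods carrying this label.
            "selector": {"app": app},
            "ports": [port_spec],
        },
    }


# NodePort: the Service is reachable on a fixed port of every cluster node.
node_port_svc = service_manifest("demo-web-np", "demo-web", 80, "NodePort", node_port=30080)
# LoadBalancer: the cluster asks its environment for an external load balancer.
load_balancer_svc = service_manifest("demo-web-lb", "demo-web", 80, "LoadBalancer")

print(json.dumps([node_port_svc, load_balancer_svc], indent=2))
```

The two manifests differ only in `spec.type` and the optional `nodePort`, which is why the choice between them is mostly about what the surrounding environment can provide.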

Next Steps

Next, continue to Pre-Requisites

Workshop Authors and Contributors

Legal Notice

Kubernetes and the Kubernetes logo are trademarks or registered trademarks of The Linux Foundation in the United States and/or other countries. The Linux Foundation and other parties may also have trademark rights in other terms used herein. This Workshop is not certified, accredited, affiliated with, nor endorsed by Kubernetes or The Linux Foundation.