Documentation refactor (#40)

* Added significant project overview text; moved Getting Started and Sample App content to stand-alone markdown pages
* Additional edits
* Resolved PR comments
* Fixed additional review comments related to typos and grammar
* Add newline at end of file
2018-12-27 10:26:18 -08:00 · 2018-12-27 10:26:18 -08:00 · 4d846138e6
--- a/GettingStarted.md
+++ b/GettingStarted.md
@@ -0,0 +1,8 @@
+## Quickstart
+
+This project is composed of many different pieces. This section is designed to get you up and running as quickly as possible.
+
+* The largest component of this service is the Java Backend - see [the Backend Readme](./api/README.md)
+* Our UI component is a separate service that's built using React and Webpack - see [the UI Readme](./ui/README.md)
+* To scale our service on Azure, we leverage ARM templates - see [the Infrastructure Readme](./infrastructure/README.md)
+
--- a/README.md
+++ b/README.md
@@ -1,16 +1,6 @@
-# SpringDAL
+# Containerized Java REST Services on Azure App Service with a CosmosDB backend

-> Looking to get running quickly? Jump ahead to our [quickstart](#quickstart).
-
-A RESTful DAL (Database Abstraction Layer) reference implementation written using Spring.
-
-# Introduction
-
-This project provides a reference implementation for Java-based microservices with REST APIs that read and write data stored in Azure Cosmos DB. The services are hosted in containers running in Azure App Service for Containers, (FUTURE: with Azure Redis providing caching). HA/DR is provided by hosting the microservices in multiple regions, as well as CosmosDB's native geo-redundancy. Traffic Manager is used to route traffic based on geo-proximity, and Application Gateway provides path-based routing, service authentication and DDoS protection.
-
-Cosmos DB is configured to use the NoSQL MongoDB API.
-
-In order to demonstrate Cosmos DB performance with large amounts of data, the project imports historical movie data from [IMDb](https://www.imdb.com/interfaces/). See (https://datasets.imdbws.com/). The datasets include 8.9 million people, 5.3 million movies and 30 million relationships between them.
+## Project Health

 API Build Status: [![Build Status](https://dev.azure.com/csebostoncrew/ProjectJackson/_apis/build/status/GitHub%20Builds/ProjectJackson-API-GitHub?branchName=master)](https://dev.azure.com/csebostoncrew/ProjectJackson/_build/latest?definitionId=22?branchName=master)

@@ -18,55 +8,71 @@ UI Build Status: [![Build Status](https://dev.azure.com/csebostoncrew/ProjectJac

 Infrastructure Build Status: [![Build Status](https://dev.azure.com/csebostoncrew/ProjectJackson/_apis/build/status/GitHub%20Builds/ProjectJackson-Infrastructure-GitHub?branchName=master)](https://dev.azure.com/csebostoncrew/ProjectJackson/_build/latest?definitionId=23?branchName=master)

-## Architecture
+## Contents:

-This solution provides a foundation to build and deploy microservices solutions, using the following technologies: 
+* Introduction & Overview (this document)
+* [Quick Start for Developers](./GettingStarted.md)
+* [Sample Application and REST APIs](./SampleApp.md)

- Java-based microservices
- Data stored in Cosmos DB
- Redis-based caching
- High Availability & Disaster Recovery (HA/DR)
- CI/CD pipeline
- Load and failure simulators to validate scale, resiliency and failover
+## Introduction

-### API Routes
+This project was created to demonstrate end-to-end best practices building and running "enterprise-class"
+applications on Azure. This document explains what the project provides and why, and it provides instructions for getting started.

-This solution uses three kinds of models: `Person`, `Title`, and `Principal`. The `Person` model represents a person who participates in media, either in front of the camera or behind the scenes. The `Title` represents the title of the piece of media, be it a movie, a TV series, or some other kind of media. Finally, the `Principal` model and its derivative child class `PrincipalWithName` represent the intersection of Person and Title, ie. what a particular person does or plays in a specific title.
+## Enterprise-Class Applications Defined

-To meaningfully access this IMDb dataset and these models, this solution provides the following API:
+We are using the term "enterprise-class app" to refer to an end-to-end solution that delivers the following 
+capabilities:

-+ `/people`
-  + `POST` - Creates a person, and returns information and ID of new person
-  + `GET` - Returns a small number of people entries
-+ `/people/{nconst}` > nconst is the unique identifier
-  + `GET` - Gets the person associated with ID, and returns information about the person
-  + `PUT` - Updates a person for a given ID, and returns information about updated person
-  + `DELETE` - Deletes a person with a given ID, and returns the success/failure code
-+ `/people/{nconst}/titles` > nconst is the unique identifier
-  + `GET` - Gets the titles in the dataset associated with the person with specified ID and returns them in an array
-+ `/titles`
-  + `POST` - Creates a title, and returns the information and ID of the new titles
-  + `GET` - Returns a small number of title entries
-+ `/titles/{tconst}` > tconst is the unique identifier
-  + `GET` - Gets the title of piece given the ID, and returns information about that title
-  + `PUT` - Updates the title of a piece given the ID, and returns that updated information based on ID
-  + `DELETE` - Deletes the piece of media given the ID, and returns the success/failure code
-+ `/titles/{tconst}/people` > tconst is the unique identifier
-  + `GET` - Gets the people in the dataset associated with the given title, and returns that list
-+ `/titles/{tconst}/cast` > tconst is the unique identifier
-  + `GET` - Gets the people in the dataset associated with the given title who act, and returns that list
-+ `/titles/{tconst}/crew` > tconst is the unique identifier
-  + `GET` - Gets the people in the dataset associated with the given title who participate behind the scenes, and returns that list
+* **Horizontal scalability:** Add capacity by adding additional containers and/or VMs
+* **Infrastructure as code:** Create and manage Azure environments using template code that is under source control
+* **Agile engineering and rapid updates:** Use CI/CD for automated builds, tests and deployments, safe code check-ins, and frequent updates to the production environment and application.
+* **High Availability:** Design and deploy robust applications and infrastructure, so that the application continues to run normally even when some components fail or go offline.
+* **Blue/Green (aka Canary Deployments):** Roll out updates to a "green" application instance, while the existing deployment continues to run on the "blue" instance. The green instance is initially exposed to only a small number of users. Monitoring is performed to look for any degradations in service related to the green instance. If everything looks good, traffic is gradually diverted to the green instance. Should the service quality degrade, the deployment is rolled back by returning all traffic to the blue instance.
+* **Testable:** Continuously test the application in production to validate scalability, resilience, and security.
+* **Hardened:** Assure that the application and infrastructure are intrinsically resistant to attacks from bad actors, such as Distributed Denial of Service (DDoS) attacks.
+* **Networking compliance:** Comply with enterprise network security requirements, such as the use of ExpressRoute to communicate with enterprise data-centers and/or on-premises networks, and private IPs for all but public endpoints.
+* **Monitoring and Analytics:** Capture telemetry to enable operations dashboards and automatic alerting of critical issues.
+* **Service Authentication:** Allow only authorized access to services via token- or certificate-based service authentication.
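+* **Simulated Traffic:** Use an integrated traffic simulator to demonstrate that the solution auto-scales properly, maintaining application performance as scale increases.
+* **Chaos Testing:** Shut down different portions of the architecture, Chaos Monkey-style, to validate that resilience measures keep everything running in the event of any single failure.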

-For more details, check out the [Swagger documentation](./api/swagger.yml).
+## OSS Technology Choices

-### Why We Chose App Services
+Our team, Commercial Software Engineering (CSE), collaboratively codes with Microsoft's biggest and most important customers.
+We see a huge spectrum of technology choices at different customers, ranging from all-Microsoft to all-OSS. More commonly, we see a mix.

-This solution uses Azure App Services instead of Azure Kubernetes Cluster because Azure App Services provides better control over scaling the app accross regions with less configuration in this scenario.   In addition, Azure App Services has an easy-to-use, built-in load testing service that we utilize to test the container scaling of our app. Out of the box, Azure App Services offers auto-scaling, authentication, and deployment slots. In the future, because Azure App Services is a PaaS provider, we can implement [Platform Chaos](https://github.com/Azure/platform-chaos) to do chaos testing. While this approach does not provide as much control of the server itself, the deployed docker container will keep the JVM consistent across deployments.
+Given the wide range of technology choices, it's difficult to create a one-size-fits-all solution. For this project, we selected a set of technologies that are of interest to many of our customers.

-The following articles provide further details: 
- - [Container? Why not App Services?](https://blogs.msdn.microsoft.com/premier_developer/2018/06/15/container-why-not-app-services/)
- - [Azure Deployment Models](https://stackify.com/azure-deployment-models/)
+
+This OSS solution uses the following OSS technologies:
+
+* **GitHub:** Publishing this project to GitHub indicates our desire to share it widely and to encourage community contributions. 
+* **Docker:** Though there are other container technologies out there, Docker/Moby is pretty much synonymous with the idea.
+* **Java Version 8 (1.8.x):** A very common choice of programming language for many enterprises.
+* **Spring Boot:** One of the most widely used and capable Java frameworks.
+* **Spring Data REST:** A simple way to build REST APIs in a Spring Boot application that are backed by a persistent data repository.
+* **Maven:** A commonly used tool for building and managing Java projects.
+* **React:** Popular JavaScript framework for building UI. (Additional OSS tools used in the UI sample include TypeScript, webpack, and Jest.)
+
+## Azure Technologies & Services
+
+As with our OSS technology choices, we intentionally selected a set of Azure technologies and services that support common enterprise requirements, including:
+
+* **Azure DevOps:** Microsoft's CI/CD solution, which is the Azure-branded version of Microsoft's mature and widely used VSTS solution.
+* **Azure Resource Manager (ARM):** Azure's solution for deploying and managing Azure resources via JSON-based templates.
+* **App Services:** A robust platform-as-a-service (PaaS) solution for application hosting. App Services hides the complexity of provisioning and managing VMs, auto-scaling, creating public IPs, etc.
+
+>**Note:** App Services is appropriate for a wide range of enterprise apps, including certain highly scaled apps, though we often recommend 
+>Azure Kubernetes Service (AKS) for apps that require certain advanced capabilities.
+
+* **Cosmos DB:** Cosmos DB is perhaps the fastest and most reliable NoSQL data storage service in the world. It is an excellent choice when performance and reliability are a must, and when enterprises require multi-region write capabilities, which are essential for both application/service performance and for HA/DR scenarios.
+* **Azure Traffic Manager:** DNS-based routing service that connects users to the nearest data center and redirects traffic to a healthy location when a region goes offline. It also enables the recommended method for blue-green (aka canary) deployments with Azure App Services.
+* **Application Gateway:** Provides a single public end-point (public IP) and acts as a reverse proxy (based on URI path) to send requests to the correct App Service instance.
+* **App Insights:** Enterprise developers use App Insights to monitor and detect performance anomalies in production applications.
+
+The solution leverages Azure DevOps for Continuous Integration
+and Delivery (CI/CD), and it deploys complete Azure environments via Azure Resource Manager (ARM) templates.

 ## Key Benefits

@@ -79,17 +85,9 @@ Key technologies and concepts demonstrated:
 | CI/CD pipeline | Continuous integration/continuous delivery (CI/CD) is implemented using Azure DevOps with a pipeline of environments that support dev, testing and production
 | Automated deployment | <li>Azure ARM templates<li>App Service for Containers<li>Azure container registry
 | High Availability/Disaster Recovery (HA/DR) | Full geo-replication of microservices and data, with automatic failover in the event of an issue in any region:<br><br><li>Cosmos DB deployed to multiple regions with active-active read/write<li>Session consistency to assure that user experience is consistent across failover<li>Stateless microservices deployed to multiple regions<li>Health monitoring to detect errors that require failover<li>Azure Traffic Manager redirects traffic to healthy region
-| Infrastructure best practices | <li>Application auto-scaling<li>Minimize network latency through geo-based DNS routing<li>API authentication<li>Distributed denial of service (DDoS) protection & mitigation
-| Load and performance testing | The solution uses an integrated traffic simulator to demonstrate auto-scaling 
-| Application resiliency | A Chaos Monkey-style solution that shuts down different portions of the architecture to validate that service-level resiliency
-  
-## Quickstart
-
-This project is composed of three discrete pieces - This section is designed to get you up and running quickly.
-
-* Java Backend - this is the largest component - see [Backend Readme](./api/README.md)
-* UI component - built using React and Webpack - see [UI Readme](./ui/README.md)
-* ARM template - to deploy and scale  - see [Infrastructure Readme](./infrastructure/README.md)
+| Demonstrates infrastructure best practices | <li>Application auto-scaling<li>Minimize network latency through geo-based DNS routing<li>API authentication<li>Distributed denial of service (DDoS) protection & mitigation
+| Load and performance testing | The solution includes an integrated traffic simulator to demonstrate that the solution auto-scales properly, maintaining application performance as scale increases
+| Proves application resiliency through chaos testing | A Chaos Monkey-style solution to shut down different portions of the architecture in order to validate that resilience measures keep everything running in the event of any single failure

 ## Contribute

--- a/SampleApp.md
+++ b/SampleApp.md
@@ -0,0 +1,44 @@
+# Sample Application
+
+This project includes a Java sample application, built on Spring Boot and Spring Data REST, that exposes multiple REST services to read and write data stored in Azure Cosmos DB.
+
+The REST services are hosted in containers running in Azure App Service for Containers.
+
+HA/DR is provided by hosting the services in multiple regions, as well as Cosmos DB's native geo-redundancy.
+
+Traffic Manager is used to route traffic based on geo-proximity, and Application Gateway provides path-based routing, service authentication and DDoS protection.
+
+Cosmos DB is configured to use the NoSQL MongoDB API. *(Note: We are currently working to add a sample that uses the Cosmos SQL API.)*
+
+To demonstrate Cosmos DB performance with large amounts of data, the project imports historical movie data from [IMDb](https://www.imdb.com/interfaces/); the dataset files are available at https://datasets.imdbws.com/. The datasets include 8.9 million people, 5.3 million movies and 30 million relationships between them.
+
+## REST API
+
+We're using three kinds of models: `Person`, `Title`, and `Principal`. The `Person` model represents a person who participates in media, either in front of the camera or behind the scenes. The `Title` model represents what it sounds like: the title of a piece of media, be it a movie, a TV series, or some other similar kind of media. Finally, the `Principal` model and its derived child class `PrincipalWithName` represent the intersection of `Person` and `Title`, i.e., what a particular person does or plays in a specific title.
+
+To meaningfully access this IMDb dataset and these models, there are a few routes one can access on the API.
+
++ `/people`
+  + `POST` - Creates a person, and returns the information and ID of the new person
+  + `GET` - Returns a small number of people entries
++ `/people/{nconst}` > nconst is the unique identifier
+  + `GET` - Gets the person associated with the ID, and returns information about the person
+  + `PUT` - Updates a person for a given ID, and returns information about the updated person
+  + `DELETE` - Deletes a person with a given ID, and returns the success/failure code
++ `/people/{nconst}/titles` > nconst is the unique identifier
+  + `GET` - Gets the titles in the dataset associated with the person with the specified ID, and returns them in an array
++ `/titles`
+  + `POST` - Creates a title, and returns the information and ID of the new title
+  + `GET` - Returns a small number of title entries
++ `/titles/{tconst}` > tconst is the unique identifier
+  + `GET` - Gets the title with the given ID, and returns information about that title
+  + `PUT` - Updates the title with the given ID, and returns the updated information
+  + `DELETE` - Deletes the piece of media with the given ID, and returns the success/failure code
++ `/titles/{tconst}/people` > tconst is the unique identifier
+  + `GET` - Gets the people in the dataset associated with the given title, and returns that list
++ `/titles/{tconst}/cast` > tconst is the unique identifier
+  + `GET` - Gets the people in the dataset associated with the given title who act, and returns that list
++ `/titles/{tconst}/crew` > tconst is the unique identifier
+  + `GET` - Gets the people in the dataset associated with the given title who participate behind the scenes, and returns that list
+
+For more details, check out the [Swagger documentation](./api/swagger.yml).
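
The route list in `SampleApp.md` can be illustrated from a client's point of view. The following is a minimal sketch, not code from this repository: the `SampleApiRoutes` class, the `http://localhost:8080` base URL, and the example identifiers `nm0000001` and `tt0000001` are all assumptions made for illustration. It only builds the URIs for the documented routes; issuing the requests is left to whatever HTTP client the caller prefers.

```java
import java.net.URI;

/**
 * Hypothetical helper (not part of the project) that builds URIs
 * for the routes listed in SampleApp.md.
 */
final class SampleApiRoutes {
    private final String baseUrl;

    SampleApiRoutes(String baseUrl) {
        // Drop a single trailing slash so paths can be appended uniformly.
        this.baseUrl = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    // /people: POST creates a person, GET returns a small number of entries
    URI people() { return URI.create(baseUrl + "/people"); }

    // /people/{nconst}: GET, PUT, DELETE for one person
    URI person(String nconst) { return URI.create(baseUrl + "/people/" + nconst); }

    // /people/{nconst}/titles: GET the titles associated with a person
    URI titlesForPerson(String nconst) { return URI.create(baseUrl + "/people/" + nconst + "/titles"); }

    // /titles: POST creates a title, GET returns a small number of entries
    URI titles() { return URI.create(baseUrl + "/titles"); }

    // /titles/{tconst}: GET, PUT, DELETE for one title
    URI title(String tconst) { return URI.create(baseUrl + "/titles/" + tconst); }

    // /titles/{tconst}/people, /cast, /crew: GET people associated with a title
    URI peopleForTitle(String tconst) { return URI.create(baseUrl + "/titles/" + tconst + "/people"); }
    URI cast(String tconst) { return URI.create(baseUrl + "/titles/" + tconst + "/cast"); }
    URI crew(String tconst) { return URI.create(baseUrl + "/titles/" + tconst + "/crew"); }

    public static void main(String[] args) {
        // Base URL and IDs are illustrative assumptions, not project defaults.
        SampleApiRoutes routes = new SampleApiRoutes("http://localhost:8080/");
        System.out.println(routes.person("nm0000001"));
        System.out.println(routes.cast("tt0000001"));
    }
}
```

Keeping the route construction in one place like this means a change to the API paths only has to be mirrored once on the client side; the Swagger file remains the authoritative description of the API.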