Building Microservices, 2nd Edition=Sam Newman;Note=Erxin.txt

Building Microservices, 2nd Edition=Sam Newman;Note=Erxin

# What are microservices 
- They are a type of service-oriented architecture, albeit one that is opinionated

- Independent deployability is the idea that we can make a change to a microservice, deploy it, and release that change to our users

front end <=> back end <=> dba 

- modular monolith, single process monolith, modular with decoupled db monolith 

- much simpler developer workflows, and monitoring, troubleshooting, and activities like end-to-end testing can be greatly simplified as well.

- Log Aggregation and Distributed Tracing

With the increasing number of processes you are managing, it can be difficult to understand how your system is behaving in a production setting

- Container orchestration platforms like Kubernetes do exactly that, allowing you to distribute container instances in such a way as to provide the robustness and throughput your service needs

- Streaming, Kafka, dedicated stream-processing solutions like Apache Flink. Debezium is an open source tool developed to help stream data

- Heterogeneity architecture 

document store 

graph db 

blob store 

- robustness 

- scaling 

smaller services can scale just those services need scaling, run other parts of the system on smaller less powerful hardware 

- compatibility
- microservice pain points, like the JVM can limit the number of microservices that can be run on a single developer machine. 
- a monolithic system, if our CPU is stuck at 100% for a long time, we know it’s a big problem.
- monitor performance 

- When working with smaller teams with just a handful of developers, I’m very hesitant to suggest microservices for this reason.



# How to model microservices 
- information hiding 
- coupling 

loosely coupled, a change to one service should not require a change to another.
 
A structure is stable if cohesion is strong and coupling is low.5

-  shared models between two services 



# Splitting the monolith 
- Prematurely decomposing a system into microservices can be costly, especially if you are new to the domain. In many ways, having an existing codebase you want to decompose into microservices is much easier

- hotspot view in CodeScene, helping identify parts of the codebase that change frequently

- decomposition by layer 

- code first 
- data first, extracted first before application code 
- decompositional patterns 

    + strangler fig pattern, a term coined by Martin Fowler. Inspired by a type of plant, the pattern describes the process of wrapping an old system with the new system over time, allowing the new system to take over more and more features of the old system incrementally.
    
    If the functionality is still provided by the monolith, the call is allowed to continue to the monolith itself.
    
    + parallel run 
    
    + A feature toggle is a mechanism that allows a feature to be switched off or on, or to switch between two different implementations of some functionality. 
    
- data decomposition concerns 
    + performance 
    + data integrity 
    + transactions 
    + tooling
    
    https://flywaydb.org/, Version control for your database
    https://www.liquibase.org/, Liquibase Community is an open source project that helps millions of developers rapidly manage database schema changes.
    
    + reporting database 
    
    
    
# Microservice communication styles 
- from in-process to interprocess 
- changing interfaces, communication between microservices 
- error hanlding 

carash 
omission 
timing 
response 
arbitrary 

- styles of microsservices communication 
synchronous blocking 
asynchronous nonblocking 
request response 
event driven 
common data, collaborate via some shared data source 
just an id 

- kafka have limit message size to 1mb, rabbitmq is 500mb. the performance of the kafka is better than rabbitmq 



# Implementing microservice communication 
- APIs used for communication between microservices technology agnostic. This means avoiding integration technology that dictates what technology stacks
- technical choices 
remote procedure calls 
REST
graphql 
message broker 

- remote procedure calls 
    + Most of the technology in this space requires an explicit schema, such as SOAP or gRPC. In the context of RPC, the schema is often referred to as an interface definition language (IDL), with SOAP referring to its schema format as a web service definition language (WSDL)
    
    + Java RMI, calls for a tighter coupling between the client and the server
    
    + gRPC, SOAP, and Thrift are all examples that allow for interoperability between different technology stacks
    
- REST over HTTP application protocol like you can with RPC implementations. This has often lead to people creating REST APIs that provide client libraries for consumers 
- GraphQL has gained more popularity, due in large part to the fact that it excels in one specific area. Namely, it makes it possible for a client-side device to define queries that can avoid the need to make multiple requests to retrieve the same information
    
- message brokers 
kafka 
rabbitmq 
rocketmq 

- serialization formats 

    + textual formats 
    
    json, xm, json5 https://json5.org/
        
    avro
    ```
    {
      "type": "record",
      "name": "thecodebuzz_schema",
      "namespace": "thecodebuzz.avro",
      "fields": [
        {
          "name": "username",
          "type": "string",
          "doc": "Name of the user account on Thecodebuzz.com"
        },
        {
          "name": "email",
          "type": "string",
          "doc": "The email of the user logging message on the blog"
        },
        {
          "name": "timestamp",
          "type": "long",
          "doc": "time in seconds"
        }
      ],
      "doc:": "A basic schema for storing thecodebuzz blogs messages"
    }
    ```

    + binary format 
    
    + schemas 
    
- coexist incompatible microservice versions 

- tracking usage 
- extreme measures 
- sharing code via libraries 
- service discovery 
- domain name system 
- dynamic service registries 

    + zookeeper, hadoop project 
    
    + consul, both configuration management and service discovery 
    
    + etcd and kubernetes, Kubernetes is no different, and it comes partly from etcd, a configuration management store bundled with Kubernetes
    
    + rolling your own 

- service mesh and API gate way 

Common features implemented by service meshes include mutual TLS, correlation IDs, service discovery and load balancing, and more. 

Kubernetes, you would deploy each microservice instance in a pod with its own local proxy

Monzo is one organization that has spoken openly about how its use of a service mesh, https://oreil.ly/5dLGC

-  OpenAPI as a schema format

- During the early evolution of SOA, standards like Universal Description, Discovery, and Integration (UDDI) emerged to help us make sense of what services were running.



# Workflow 
- ACID transaction 

atomicity, ensures the operations attempted with the transaction either all complete or all fail 

consistency

isolation 

durability, transaction has been completed 

- transactional boundary 

- distributed transactions two phase commits 

two phase broke into two phase, one is voting phase and second is committing phase 

When a two-phase commit works, at its heart it is very often just coordinating distributed locks. The workers need to lock local resources to ensure that the commit can take place during the second phase.

two-phase commits are typically used only for very short-lived operations. The longer the operation takes, the longer you’ve got resources locked!

- a saga is by design an algorithm that can coordinate multiple changes in state, but avoids the need for locking resources for long periods of time. A saga does this by modeling the steps involved as discrete activities that can be executed independently. 

- reordering workflow steps to reduce rollbacks 



# Build 
- CI 
build > compile and fast test > slow tests > performance tests > production 


# Deployment 
- multiple instances 

- don't share database 
- microservice deployment 
isolated execution 
focus on automation 
infrastructure as code, configuration store this code in source control, allow environments to be re-created 
zero-downtime deployment, take independent 
desired state management 

- isolation 
physical machine > virtual machine > container 

- Automation is also how you can make sure that your developers still remain productive. 

- infrastructure as code 

specialist tools in this area such as Puppet, Chef, Ansible, and others, all of which took their lead from the earlier CFEngine

- AWS CloudFormation and the AWS Cloud Development Kit (CDK) are examples of platform-specific tools

 flexibility of a cross-platform tool like Terraform.
 
- Desired state management is the ability to specify the infrastructure requirements you have for your application

Kubernetes is one such tool that embraces this idea, and you can also achieve something similar using concepts such as autoscaling groups on a public cloud provider like Azure or AWS

-  shutting down the managed virtual machine instances (provided by AWS’s EC2 product) to save money

- GitOps, a fairly recent concept pioneered by Weaveworks, brings together the concepts of desired state management and infrastructure as code

-  infrastructure configuration tools like Chef or Puppet, this model is familiar for managing infrastructure. When using Chef Server or Puppet Maste

- deployment options 

physical machine 
virtual machine 
container 
application container, a microservice instance is run inside an application container 
platform as a service, include heroku, google app engine and aws beanstalk 
function as a service, a microservice instance is deployed as one or more functions which are run an dmanaged by an underlying platform like aws lambda or azure functions 

- vm 

Type 2 virtualization is the sort implemented by AWS, VMware, vSphere, Xen, and KVM

- container 
    + linux 
    
    A container can run its own operating system, but that operating system makes use of a part of the shared kernel—it’s in this kernel that the process tree for each container lives
    
    + Microsoft reacted to this by creating a cut-down operating system called Windows Nano Server. The idea is that Nano Server should have a small-footprint OS and be capable of running things like microservice instances. 
    
    still need 1gb well linux like alpine will take up only a few mb 
    
    + With process isolation, each container runs in part of the same underlying kernel, which manages the isolation between the containers. With Windows containers, you also have the option of providing more isolation by running containers inside their own Hyper-V VM. This gives you something closer to the isolation level of full virtualization
    
- fitness for microservices 

- Wasm is an official standard that was originally defined to give developers a way of running sandboxed programs written in a variety of programming languages on client browsers
    
- a big fan of Pulumi, which eschews the use of domain-specific languages (DSLs) in favor of using normal programming languages to help developers manage their cloud infrastructure. 

- kubernetes cluster 

have multple pod, each pod will contain several containers, each pod have a public port to connect outside world 
- multitenancey and federation 

federation management software 

- The Cloud Native Computing Foundation (CNCF for short) is an offshoot of the nonprofit Linux Foundation. The CNCF focuses on curating the ecosystem of projects to help promote cloud native development

- Knative is an open source project that aims to provide FaaS-style workflows to developers, using Kubernetes under the hood. 

- projects like OpenFaaS are already being used in production by organizations all over the world

The people you’ll have managing the platform may need a deeper dive. Katacoda has some great online tutorials for coming to grips with the core concepts

- progressive delivery 

    + separating deployment from release 
    
    + on to progressive delivery, ability to control the potential impact of our newly released software.
    
    + feature toggles 
    
    + canary release 
    
    To err is human, but to really foul things up you need a computer.7

    Tools like Spinnaker for example have the ability to automatically ramp up calls based on metrics, such as increasing the percentage of calls to a new microservice version
     
    + parallel run 



# Testing 
- types of tests 

acceptance testing 

exploratory testing 

unit testing

property testing , response time, scalability, performance, security tools 

- UI tests end-to-end tests, which I’ll do from this point forward.

- Service tests are designed to bypass the user interface and test our microservices directly. 

- End-to-end tests are tests run against your entire system. Often they will be driving a GUI through a browser

- standard way to handle end to end test across services 

- Pact is a consumer-driven testing tool that was originally developed in-house at realestate.com.au 

https://docs.pact.io/

- from preproduction to in-production testing 

Smoke tests are another example of in-production testing. 

Canary releases, which we covered in Chapter 8, are also a mechanism that is arguably about testing.

- making testing in production safe 

- mean time between failures (MTBF) and optimizing for mean time to repair (MTTR).

- cross functional testing 



# From monitoring to observability 
- single microservice, mulitple servers 

- building blocks for observability 

log aggregation 

metrics aggregation 

distributed tracing, tracking a flow of calls across multiple microservice 

are you doing ok 

alerting 

semantic monitoring 

testing in production 

- Elasticsearch itself, and therefore much of the Elastic company as a whole, was built on technology

- Kibana was a laudable attempt to create an open source alternative to expensive commercial options like Splunk

- log aggregation options, a big fan of Humio, many people like to use Datadog for log aggregation

- shortcomings 

- metridcs aggregations 

Prometheus

- A/B testing, deploy two different versions of the same functionality with users seeing either the A or B 

- chaos engineering 

- standardization 

- democratic 

- easy to integrate 



# Security 
- topics 

core principles 

five functins of cybersecurity 

fundationos application security 

implicit trust versus zero trust 

securing data 

authentication and authorization 

- core principles 

automation 

build security into the delivery process 

-  such as Brakeman for Ruby; there are also tools like Snyk, which among other things can pick up dependencies on third-party libraries 

- userful five part model for various activities in cybersecurity 
identify who your potental attackers 

protect your key assets 

detect if an attack has happend 

respond when you found out something bad has occurred 

recover in the wake of an incident 

- gdpr 

https://pactflow.io/gdpr-dpa/

- elastic stack 

https://www.elastic.co/cn/kibana/

- dynamic threat analysis for containers 

https://www.aquasec.com/products/container-analysis/

- splunk, the data to everything platform 

- secrets 

 secrets that a microservice might need include:

Certificates for TLS

SSH keys

Public/private API keypairs

Credentials for accessing databases

- limiting scope 

- backups 

- implicit trust versus zero trust 

- server to server authentication 

- single sign-on (SSO) solution to ensure that a user has only to authenticate themselves only once per session

- implementation of OAuth 2.0, based on the way Google and others handle SSO. It uses simpler REST calls

- decentralizing authorization 

SAML, JWT 



# Resilience 
- four concepts of resilience 

robustness, absorb expected perturbation 

rebound, ability to recover after a traumatic event 

graceful extensibility, deal with a situation that is unexpected 

sustained adaptability, changing environments, stakeholders, demands 

graceful extensibility 

sustained adaptability 

- timeout, retries, 
- redundancy 
- middleware 
- In idempotent operations, the outcome doesn’t change after the first application, even if the operation is subsequently applied multiple times
- spreading your risk 

- CAP theorem,  consistency, availability, and partition tolerance. Specifically, the theorem tells us that we get to keep two in a failure mode.


# Scaling 
- four axes of scaling 

veritcal, getting a bigger machine 

horizontal duplication, having multiple things capble of doing the same work 

data partitioning, dividing work based on some attribute of the data 

functional decomposition, separation of work based on the type 

- autoscaling 



# Userinterfaces
- page based decomposition 

- widget based decompoistion 

react etc. 

- central aggregating gate way 

- multiple concerns 

- graphql 



# Organization structures 
- loosely coupled organizations 

- conway's law, software's structure is a reflection of the organization's structure 

- Adding manpower to a late software project makes it later.

- small teams has the autonomy to do the job it is responsible for. This means we need to give the teams more power to make decisions,

- Amazon’s famous Two-Pizza Teams, most people miss the point. It’s not about the team’s size. It’s about the team’s autonomy, a small team within an organization to operate independently and with agility.

- No matter how it looks at first, it’s always a people problem.



# evolution architecture 
- summary 

strategic goals    <-     architectural principles    <-    design and delivery practices 

enable scalable business 

support entry into new markets 

support innovation in existing markets