About Me

Rohit is an investor, startup advisor and an Application Modernization Scale Specialist working at Google.

Wednesday, October 14, 2020

Why Is AWS #1 in the Cloud?

I will let you in on a secret. It's not their gazillion managed services. It's not their five-year head start in the cloud. It's the AWS workload consumption engine. The AWS machinery for driving consumption of cloud services is second to none.

AWS does a phenomenal job of sparking consumption; their key assets are:

1. 7Rs framework 

2. AWS Migration Hub (Migration Tracking, Migration tooling for discovery and planning, Migration Technology for app profiling and workload discovery) https://aws.amazon.com/migration-hub/ 

3. AWS Migration Acceleration Program https://aws.amazon.com/migration-acceleration-program/ 

4. AWS Cloud Adoption Framework https://aws.amazon.com/professional-services/CAF/ 

5. AWS Well Architected Framework https://aws.amazon.com/architecture/well-architected/ 

6. AWS Managed Services https://aws.amazon.com/managed-services/

Sparking consumption is a combination of these plays and needs everyone (Pre-sales, Solutions, PSO & Partners) to execute with assets and programs to match.


This is the bar set for all hyperscalers and wannabes :-)





Insights From State Of Spring 2020 Survey

From a strategy perspective, the State of Spring 2020 survey (https://tanzu.vmware.com/content/ebooks/state-of-spring-2020), a survey of 1,024 Spring developers, is a must-read. Here are the TL;DR top 10 insights as they pertain to the Kubernetes (Java => container runtime) GTM:

  • The majority of the growth in Spring Boot applications will come from new projects (82%), followed by enhancements to existing apps (56%) and migration of legacy apps (53%).
  • By far the largest use case for Spring is now development of internal and external APIs.
  • Almost all respondents (95%) will containerize their Spring Boot apps, with 65% already doing so and a further 30% planning to.
  • Of the 95% of Spring users that are containerizing their apps, 44% have already deployed on Kubernetes, and a further 31% plan to do so within the next 12 months. Migration of containerized Spring Boot apps to Kubernetes is well underway, and most plan to complete the migration within a 12-month window.
  • Other than core Spring and Spring Boot, Spring Security, Spring Data and Spring MVC are the top projects, with Spring WebFlux getting an honorable mention.
  • Reactive (Project Reactor) and serverless (Spring Cloud Function) significantly lag the use of Spring for microservices.
  • The Spring Cloud portfolio is popular: 1) Spring Cloud Services, 2) Spring Cloud Gateway, 3) Spring Cloud Data Flow.
  • Respondents recognize that the scale, strength and diversity of the open source Spring ecosystem are among the greatest assets of the platform.
  • Developers struggle to understand what all the components do and how to apply them, while a smaller group is looking for greater runtime efficiencies for their Spring-based applications, like reduced memory footprint and start time.
  • Enthusiasm for GraalVM is growing, with 14% already using it to some degree while another 26% plan to use it, enabling a reduction in memory usage and faster startup times.

Wednesday, August 19, 2020

Why Hybrid Cloud?

 # Why Hybrid Multi-cloud

On-prem, AWS, GCP (... Alibaba, IBM Cloud, ORCL)
1. Insurance against vendor lock-in
2. Leverage the power of the hyperscale cloud providers
  - on-demand IaaS
  - value-added services
3. Regulatory
4. Disaster recovery across providers
5. ORCL, VMware, IBM legacy DC incumbency reasons
  - VMC for AWS, VMC for GCP, VMC for Azure
  - zero-change VM migration
  - VMC control plane for VMs
6. 2-10% penetration of public cloud
7. Proliferation of platforms - SINGLE multi-cloud control plane
  - CF
  - Mesosphere
  - K8s (Anthos, TMC, Arc, Crossplane, ...)
  - SINGLE CONTROL PLANE

# Replication Edge <=> Central

  - Edge workloads will be 75% of cloud workloads
  - Call back home - Edge Architecture
  - Replicate -> offline-online
    - Data constraints
    - Network constraints
  - STATE Management
  - Cell Towers, POS, Autonomous Robots

# Stateful workloads - Databases

 - Operational pain - engineers
 - containers, VMs (legacy)
 - Operators
 - StatefulSets
    - CSI
    - Portworx
 - Performance reasons for staying on a VM
 - self-service, faster changes, choice
 - Data resiliency, backup, DR (BCDR)
 - GemFire replication (pattern)

# Best Practices on active-active for data

https://tanzu.vmware.com/content/white-papers/multi-site-pivotal-cloud-foundry-deployment-topology-patterns-and-practices

Monday, August 17, 2020

Modernizing Powerbuilder Apps

If you have a massive DataWindow system baked into a PowerBuilder client application persisting to an Oracle database, you are likely in a world of pain. This is a business-critical system that generates billions of dollars of revenue. Multiple attempts to modernize the system have cratered due to big-bang, vendor-product-driven, technology-infatuated soirees. You want to dip your toes into the modern cloud native, microservices, developer-friendly world but have been burnt twice already. You have no interest or inclination in fielding yet another "rewrite the system from scratch in 2 years" sales pitch. What the hell do you do? Hit the bottle? Pay ORCL another $25M in licensing fees? Put a mask on and upgrade PowerBuilder to 2019 R2? Is there a way out of this black hole?

Yes. But it ain't easy.

First, acknowledge that this is a tough task. The shortcuts were already taken, which is why we are at this F*#$ed-up point.

Here is a path that can be trodden. A couple of the smart guys in data, Alastair Turner and Gideon Low, have heavily influenced my thinking on this topic.

First, figure out the primary driver for modernization: cost or productivity. Depending on the answer, a different set of strategies needs to be followed.

Cost

Let's assume all your customer support professionals are well versed in the screens and actually love the DataWindow UI. The application is functional and can be enhanced with speed in production. The only issue is the licensing and cost associated with running PowerBuilder. In such a scenario, migration is perhaps the better option, i.e. migrate all the data to Postgres. Check to see if your SAP or Appeon version of PowerBuilder supports PostgreSQL as a backend. You might be so bold as to consider migrating your DB to the cloud with AWS Database Migration Service.

Depending on the cost, you may choose to use code generators that auto-generate Java or C# code from PowerBuilder libraries. Both Appeon and Blu Age have such tools; however, buyer beware. Any tool that charges for modernization by LOC is immediately suspect. Code generators are like Vietnam: easy to get in, hard to get out.

Productivity

You want to develop new features and microservices and expose APIs to newer channels and other consumers of the service. Here you have a fork in the road. 

1. Upgrade to the latest GA version of PowerBuilder, 2019 R2, and begin an expensive project to RESTify existing DataWindows as web services. The limitation with using conversion tools is that you don't really get a chance to identify and fix various classes of important problems; you don't pay down your technical debt. This is like trading one form of debt for another, like replacing your high-interest-rate debt from Visa with slightly lower-interest-rate debt from Capital One. What's in your wallet?

2. The RIGHT way. Start by cataloging the business use cases and rebuilding each one. The legacy system's only real role is to validate that behavior hasn't changed. If you can't get the business rules from the business, you will need to reverse-engineer the stored procedures and persistence using ER diagrams, SchemaSpy, or the Oracle dependency catalog views to determine the object dependency tree (a query sketch follows this list). Visualization tools can't hurt, but SO MUCH is usually trapped in the stored procedure logic that their use can be as much a hindrance as anything. A good way to visualize the entire business process is to leverage techniques from the app world like Event Storming or user journey, workflow and interaction mapping.
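
To make the dependency-catalog idea concrete, here is a minimal JDBC sketch that walks Oracle's ALL_DEPENDENCIES view downward from a root stored procedure. The connection URL, schema owner (CLAIMS) and procedure name (PROCESS_CLAIM) are placeholders for illustration, and the hierarchical query is only a starting point for building the dependency tree.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// A minimal sketch, not a finished tool: walk Oracle's ALL_DEPENDENCIES view to
// print what a given stored procedure depends on, level by level.
public class DependencyTree {

    public static void main(String[] args) throws Exception {
        String sql =
            "SELECT LEVEL AS depth, owner, name, type, " +
            "       referenced_owner, referenced_name, referenced_type " +
            "FROM all_dependencies " +
            "START WITH owner = ? AND name = ? " +
            "CONNECT BY NOCYCLE PRIOR referenced_name = name AND PRIOR referenced_owner = owner";

        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//db-host:1521/ORCLPDB1", "scott", "tiger"); // placeholder connection
             PreparedStatement ps = conn.prepareStatement(sql)) {

            ps.setString(1, "CLAIMS");          // schema owner (placeholder)
            ps.setString(2, "PROCESS_CLAIM");   // root stored procedure (placeholder)

            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    int depth = rs.getInt("depth");
                    // Indent by depth so the output reads like a dependency tree
                    System.out.printf("%s%s.%s (%s) -> %s.%s (%s)%n",
                        "  ".repeat(depth - 1),
                        rs.getString("owner"), rs.getString("name"), rs.getString("type"),
                        rs.getString("referenced_owner"), rs.getString("referenced_name"),
                        rs.getString("referenced_type"));
                }
            }
        }
    }
}
```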

There is no substitute for hard work, and no real substitute for case-by-case refactoring. Start by documenting the problem, visualizing it and identifying the starting points. Thereafter, pick a particular steel thread, aka end-to-end slice, and identify the right set of APIs. Leverage the tactical patterns from Sam Newman's book Monoliths To Microservices (chapter 4) for decomposing data and pulling out services; a small routing sketch of the strangler-fig facade follows.
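
As a taste of what the strangler-fig facade can look like in practice, here is a minimal routing sketch using Spring Cloud Gateway (one option among many; the /api/quotes path and the service hosts are hypothetical). A carved-out slice is routed to the new Spring Boot service while everything else continues to hit the legacy facade:

```java
import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Strangler-fig routing sketch: one extracted slice plus a catch-all legacy route.
@Configuration
public class StranglerRoutes {

    @Bean
    public RouteLocator stranglerRouteLocator(RouteLocatorBuilder builder) {
        return builder.routes()
            // Carved-out slice: already rebuilt as a new microservice
            .route("quotes-extracted", r -> r.path("/api/quotes/**")
                .uri("http://quote-service:8080"))
            // Default: everything not yet modernized keeps hitting the legacy system
            .route("legacy-default", r -> r.path("/**")
                .uri("http://legacy-facade:8080"))
            .build();
    }
}
```

Routes are matched in order, so each carved-out slice must be declared before the catch-all legacy route; every new slice you extract becomes one more route above the default.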


Start the journey of 10,000 miles with the first step. Start small, iterate, and demonstrate value to the business at the end of each iteration. This is the only way to be successful in the long term with any modernization of a critical, complex system at scale.

Good Luck !!


References

See how ZIM Shipping moved from .NET/PowerBuilder development to Spring Boot with Tanzu Application Service

Common Process Challenges When Breaking Monoliths

Three difficult challenges that we often come across when our team works with clients as they try to break monoliths:

1. Silver Bullets - Enterprises have been burnt by vendor solutions that promise migration with a tool like BPM or some silver-bullet methodology that promises a seamless migration. The truth is that disentangling your monolith's code and data is going to get messy. The entire business process will need to be disaggregated and visualized, and seams will need to be identified to create a blueprint for a target architecture. Other than Pivotal/VMware, there is no one else who has done this at enterprise scale. Our approach modernizes the monolith incrementally, with demonstrated business value in weeks, not years.

2. Over-Engineering - It is common to get distracted by technology choices and deployment options rather than focus on the difficult work of understanding the core domain: identifying the business capabilities and assessing whether they are core or supporting domains. Do what is sustainable and what your average (not rockstar) software engineers can support, and focus on the outcomes.

3. Pressure Cooker: When clients keep changing their priorities, lose faith in the progress, or micromanage the design, it subverts the process and the target architecture ends up looking like the old system. Breaking monoliths with Domain Driven Design is like landing a plane from 30,000 feet. You cannot skip phases and go straight to user story backlog generation with somebody else's domain model or DDD process. Don't short-circuit steps in the process. It is critical to follow the strategic and tactical patterns and land the plane with a gradual descent to an organized user story backlog.

Saturday, August 15, 2020

Eight Pitfalls Of Application Modernization (m11n) when Practicing Domain Driven Design (DDD)

1. High Ceremony OKRs

An application modernization effort is necessarily broad in scope. Overly constraining the goals and objectives often results in premature bias and could lead to solving the wrong problems. Driving out premature key results means that you are tracking and optimizing for the wrong outcomes. If you are uncertain of the goals of modernization, or if this is your first time refactoring and rewriting a critical legacy system, it is best to skip this step and come back to it later.


2. Mariana Trench Event Storming

There are multiple variants of Event Storming (ES). ES can be used for big-picture exploration, detailed system design, greenfield exploration, value stream mapping, journey mapping, event modeling, etc. Beware of going too deep into the existing system design with ES when modernizing a system. If the intent of ES is to uncover the underlying domains and services, it is counterproductive to look at ALL the existing constraints and hot spots in the system. Model just enough to get started, to understand and reveal the seams of the domain. If you go deep with Event Storming, you will bias or over-correct the design of the new system with the baggage and pain of the old.

The key to success with Event Storming is to elicit the business process, i.e. how the system wants to behave, and not the current hacked-up process. When there is no representation from the business, there is no point in doing Event Storming.

3. Anemic Boris

When modeling relationships between bounded contexts, it is critical to fully understand the persistence of data, the flow of messages, the visualization of the UI and the definition of APIs. If these interactions are not fully fleshed out, the Boris diagram becomes anemic, and this in turn reflects an anemic domain model.

4. Picking Thin Slices

As you start modeling the new system, design the end-to-end happy path of the user workflow first. Pick concrete scenarios and use cases that add value to the business - those that encounter the maximum pain. The idea here is to validate the new design cheaply with stickies instead of code, and not get stuck in error or edge cases. If you don't maintain speed and cycle through multiple end-to-end steel threads, you may be prematurely restricting the solution space.

5. Tactical Patterns - The Shit Sandwich

As you start implementing the new system and co-existing with the old, there are a few smells to watch for, the biggest one being the Shit Sandwich: you are stuck doing DDD on a domain that is sandwiched between upstream and downstream domains/services that cannot be touched, thereby overly constraining your decomposition and leading to anemic modeling, since you cannot truly decompose a thin slice. You spend all your time writing ACLs mapping data across domains.

So the top bun is the downstream service, the bottom bun is the upstream service, then there are two layers of cheese - the ACLs - and your core domain is the burger patty in the middle, and now you have the shit sandwich. Watch for this when you are called in to modernize ESBs and P2P integration layers. A minimal sketch of what one of those ACL layers looks like in code follows.
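
For the record, an ACL does not have to be heavyweight. Here is a minimal sketch of one translation layer; the upstream DTO, the domain types and the status codes are all hypothetical, and the only point is that the upstream's vocabulary gets mapped in exactly one place:

```java
import java.math.BigDecimal;

// A minimal anti-corruption layer sketch: translate the upstream's payload into
// the core domain's model so the upstream schema never leaks into the domain.
public class PolicyAcl {

    // Shape of the payload the untouchable upstream service sends us (hypothetical).
    public record UpstreamPolicyDto(String polNbr, String stsCd, String premAmt) {}

    // Our core domain's view of the same concept (hypothetical).
    public record Policy(String policyNumber, PolicyStatus status, BigDecimal premium) {}

    public enum PolicyStatus { ACTIVE, LAPSED, CANCELLED }

    public Policy toDomain(UpstreamPolicyDto dto) {
        return new Policy(
            dto.polNbr(),
            mapStatus(dto.stsCd()),
            new BigDecimal(dto.premAmt()));
    }

    // All knowledge of the upstream's cryptic status codes lives here and only here.
    private PolicyStatus mapStatus(String code) {
        return switch (code) {
            case "A" -> PolicyStatus.ACTIVE;
            case "L" -> PolicyStatus.LAPSED;
            default -> PolicyStatus.CANCELLED;
        };
    }
}
```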



6. Over Engineering

Engineers are guilty of this all the time.
Ooooh ... we modeled the system with Event Storming as events, ergo we should implement the new system in Kotlin with Event Sourcing on Kafka and deploy to Kubernetes.
Yikes!!
Do what is sustainable and what your average (not rockstar) software engineers can support. My colleague Shaun Anderson explains this best in The 3 Ks of the Apocalypse — Or how awesome technologies can get in the way of good solutions.


7. Pressure Cooker Stakeholder Management

When the stakeholders of the project keep changing their priorities, lose faith in the progress, or pre-decide the domains, it is time to push back and reassert control. The top-down process of Domain Driven Design is like landing a plane from 30,000 feet. You cannot skip phases and go straight to user story backlog generation with somebody else's domain model or DDD process. Don't short-circuit steps in the process. It is critical to follow the strategic and tactical patterns and land the plane with a gradual descent to an organized user story backlog.

8. Faith

The biggest way to fail with DDD when doing system design or modernization is to lose faith and start questioning the process. Another guaranteed failure mode is to procrastinate and never start the modeling activities, i.e. you are stuck in the analysis-paralysis phase. The only prerequisites to success are fearlessness and curiosity. You don't need formal training to be a facilitator. Start the journey, iterate, and improve as you go.

Wednesday, August 12, 2020

Mainframe Modernization Is Hard

Mainframe modernization is like a balloon payment on your 7-year ARM mortgage that has come due, or an unhedged call option that has been called: [bad-code-isnt-technical-debt-its-an-unhedged-call-option](https://www.higherorderlogic.com/2010/07/23/bad-code-isnt-technical-debt-its-an-unhedged-call-option/). All code is technical debt, and it is critical to understand the risk profile of your debt as you embark on a mainframe modernization project: [risk-profile-of-technical-debt](https://tanzu.vmware.com/content/intersect/risk-profile-of-technical-debt). Also see derivatives of technical debt for a detailed treatment of this topic.

Sticking with the financial analogy: your payment is due, your option has expired, and you owe a large amount. Our natural instinct is to find an easy way out, perhaps a payday loan, or swapping out one kind of debt for another. Unfortunately, none of these options work in the long term. To avoid bankruptcy we have to go through a debt restructuring program where we gradually retire and pay off the debt owed over time, or in extreme cases declare bankruptcy. So what does all of this have to do with mainframe modernization?

There are many enticing options when it comes to mainframe modernization: offloading work to cheaper processors on the mainframe, getting volume discounts from your mainframe provider, slapping REST APIs on top of the mainframe systems, COBOL-to-Java code generators, or outsourcing the refactoring and rewrite of code (debt) outside the company. These efforts are generally well intentioned and start well, but quickly get stuck in the mud because they don't scale or because the complexity and sustainability of the solution does not work out.

At VMware Pivotal Labs we acknowledge that mainframe modernization is hard. The implementation of the program gets worse before it gets better, as concurrent development work streams have to be maintained for both the legacy and the net-new system. Having helped multiple customers on this journey, we have come up with an iterative, phased approach to mainframe modernization that scales and yields ROI in days and weeks, not months and years.

1. Start with the end in mind. What is the critical business event or situation that has triggered the modernization? It is very important to understand why the modernization program is being funded so that we can create the right set of goals, objectives and key results. Are you doing this because you cannot add features fast enough to the critical system running on the mainframe? Are you doing this because you need a new digital 360-degree experience for your customers? What are the key business drivers for the modernization? This alignment needs to be driven by both business and technology executives and reinforced by all the product owners, product managers and tech leads. **The outcome of this phase is a clearly articulated set of goals and objectives with quantified key results that provide the journey markers to understand if the program is on track.**

2. After goal alignment it is time to take an inventory of the business processes of the critical systems running on the mainframe. The business domain has to be analyzed and broken down into discrete, independent business capabilities that can be modeled as microservices. This process of analyzing the core business domain and modeling its constituent parts is called Event Storming. Event Storming enables decomposing massive monolithic systems into smaller, more granular, independent software modules aka microservices. It allows for modeling new flows and ideas, synthesizing knowledge, and facilitating active group participation without conflict in order to ideate the next generation of a software system. Event Storming, a group collaborative modeling exercise, is used to understand the top constraints, conflicts and inefficiencies of the system and reveal the underlying bounded contexts. The seams of the current system tell us how the new distributed system should be designed. We also weave in aspects of XP here, like UCD, Design Thinking and interviews, to ensure that our understanding of the system stays grounded in reality. **The outcome of this phase is a set of service candidates, also called bounded contexts, that represent the business capabilities of the core, supporting and generic business domains.**

3. A critical system on the mainframe, like commercial loan processing or pharmacy management, has multiple end-to-end workflows. It is critical to understand all the key business flows across the Event Storm that provide a steel thread for modernization. We need to prioritize these key flows as they will drive out the system design. The first thin slice picked should be a happy-path flow that provides end-to-end value, demonstrates incremental progress and redoubles the faith in the whole process. **The outcome of this phase is a set of prioritized thin slices that encompass the Event Storm.**

4. We have all the Lego pieces; now it's a matter of putting them together with flows of messaging, data, APIs and UIs so that we fulfill the needs of the business. We now have to wire up all our domain services. We use a process called Boris, invented at Pivotal, for this phase of [SWIFT](https://www.swiftbird.us/). Boris provides a structured way to design synchronous, API-driven and asynchronous, event-driven service context interactions. We identify relationships between services to reveal the notional target system architecture and record them using SNAP. SNAP takes the understanding from the Boris diagram and captures the specific needs of every bounded context under the new proposed architecture. The SNAP exercise is done concurrently with the Boris exercise. We call out APIs, data needed, UIs, and risks that apply to each bounded context as the thin slice is modeled across services. **The outcome of this phase is a notional target architecture of your new system with external interaction modes mapped out in terms of messaging, data and APIs.**

5. In some ways this process is like bringing a plane at 30,000 feet to the ground. We are in full descent now and at the 10K-feet level. At this point we have a target architecture. It is critical now to understand how we will develop the old and new systems without disruption, i.e. change the engine while the plane is descending to the ground. We employ a set of key tactical patterns for modernization, like Anti-Corruption Layer, Facade and data-driven Strangler, to carve out a set of MVPs and user stories mapped to each MVP (a small co-existence sketch follows this step). These stories will realize the SNAP built earlier and implement the thin slices that were modeled. Creating a roadmap of all the quanta of work is critical to make sure we are going in the right direction with speed. **The outcome of this phase is a set of user stories for modernization implementing the tactical patterns of co-existence with the monolith and a set of MVPs to track the key results.**
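
To make the co-existence idea concrete, here is a minimal, hypothetical sketch in the spirit of the data-driven strangler: the new store owns writes, and a mirror keeps the legacy schema in sync so untouched mainframe jobs keep working. All the types here (Loan, LoanStore, LegacyLoanGateway) are illustrative, not a real API:

```java
// Co-existence sketch: the new store is authoritative, the legacy store is mirrored.
public class StranglerLoanStore implements LoanStore {

    private final LoanStore modernStore;          // new system of record
    private final LegacyLoanGateway legacyMirror; // keeps the legacy copy in sync

    public StranglerLoanStore(LoanStore modernStore, LegacyLoanGateway legacyMirror) {
        this.modernStore = modernStore;
        this.legacyMirror = legacyMirror;
    }

    @Override
    public void save(Loan loan) {
        modernStore.save(loan);     // new store is authoritative for writes
        legacyMirror.upsert(loan);  // mirror so legacy batch jobs still see the data
    }

    @Override
    public Loan findById(String loanId) {
        return modernStore.findById(loanId); // reads have already been cut over
    }
}

record Loan(String loanId, String status) {}

interface LoanStore {
    void save(Loan loan);
    Loan findById(String loanId);
}

interface LegacyLoanGateway {
    void upsert(Loan loan); // e.g. writes through to the legacy schema
}
```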

6. We now have a backlog of user stories and we are less than 1,000 feet from the ground. At this point it is important to identify the biggest spikes and risks to the technical implementation, like latency, performance, security and tenancy, and resolve them. We start building out contracts for our APIs so that other teams and dependencies can get unblocked (a sketch of a provider-side contract check follows this step). The stories are organized into epics at inception, product managers and engineers are allocated, and the first iteration begins. The feedback loop from product to engineering to business is set in motion. **This phase encompasses the first sprint or iteration of development. It is critical to establish demos, iteration planning meetings, retrospectives and feedback loops in this phase, as this will set the tone for the rest of the project.**
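
One lightweight way to pin an API contract early is a provider-side test. Here is a sketch using Spring's MockMvc; the /loans/{id} endpoint and its fields are hypothetical, and a real consumer-driven contract setup (e.g. Spring Cloud Contract or Pact) would go further:

```java
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.web.servlet.MockMvc;

import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.jsonPath;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

// A minimal provider-side contract check: pin the response shape of the new API
// so consuming teams can build against it while the implementation evolves.
@SpringBootTest
@AutoConfigureMockMvc
class LoanApiContractTest {

    @Autowired
    MockMvc mockMvc;

    @Test
    void loanResourceExposesAgreedFields() throws Exception {
        mockMvc.perform(get("/loans/{id}", "L-1001"))
            .andExpect(status().isOk())
            .andExpect(jsonPath("$.loanId").value("L-1001"))
            .andExpect(jsonPath("$.status").exists())
            .andExpect(jsonPath("$.principalAmount").isNumber());
    }
}
```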

The six steps of mainframe modernization outlined here are not implemented like a waterfall. The six steps are sometimes run multiple times for different areas of a large, complex domain and the results are stitched together. Steps or phases may be skipped altogether if we already know parts of the domain well. This six-step process is what we call SWIFT. It is not dogmatic. Do what works at velocity to modernize the system in increments with a target architecture map in hand. Mainframe modernization is hard and there is no easy way out. Internalize this and start the journey of a thousand miles swiftly with the first step.

Saturday, August 1, 2020

Designing Good APIs

Part 1 -- Principles of Designing & Developing APIs (for Product Managers, Product Designers, and Developers)

Part 2 -- Process of Designing APIs (for Product Managers, Product Designers, and Developers)

Part 3 -- Tips for managing an API backlog (for Non-Technical PMs)

Part 4 -- Developing, Architecting, Testing, & Documenting an API (for Developers)

API First Development Recipe

How To Document APIs

Shopify APIs

API Design 




Friday, July 24, 2020

Don't Use Netflix Eureka For Service Discovery

So… there are multiple issues with Eureka:
  1. Deprecated by Netflix and no longer maintained by Pivotal/VMware, so there is no long-term maintainer.
  2. The whole world seems to have moved on to either the service discovery provided natively by the platform, like Kube DNS (Kubernetes) or BOSH DNS (Cloud Foundry), or to service meshes (Istio) or service networking products like Consul. See replacing-netflix-eureka-with-kubernetes-services and polyglot-service-discovery-container-networking-cloud-foundry, and the sketch after this list.
  3. Eureka does not work with container-to-container networking.
  4. Stale service registry problem - Eureka and Ribbon don't react quickly enough to apps coming up and down due to aggressive caching; see making-service-discovery-responsive.
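
For illustration, here is what "let the platform do it" looks like on Kubernetes: a Service named payment-service (hypothetical, as are the namespace and path) gets a stable cluster DNS name, so the caller simply uses that name and lets kube-proxy spread traffic across healthy pods, with no client-side registry at all:

```java
import org.springframework.web.reactive.function.client.WebClient;

// A minimal sketch of leaning on the platform instead of Eureka: the caller talks
// to the Kubernetes Service DNS name; the platform handles registration, health
// and load balancing across pods.
public class PaymentClient {

    private final WebClient webClient = WebClient.builder()
        // Cluster-local DNS: <service>.<namespace>.svc.cluster.local
        .baseUrl("http://payment-service.payments.svc.cluster.local:8080")
        .build();

    public String getPaymentStatus(String paymentId) {
        return webClient.get()
            .uri("/payments/{id}/status", paymentId)
            .retrieve()
            .bodyToMono(String.class)
            .block(); // blocking for brevity in this sketch
    }
}
```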

Monday, June 22, 2020

The Eight Factors of Cloud Native Gemfire Applications

If, after moving the in-memory JVM cache to GemFire, your pain still hasn't gone away, take a look at this checklist of factors for getting your app to perform efficiently with GemFire. Thanks to my buddy Jeff Ellin, who is an expert ninja at GemFire and PCC.

TL;DR

Look at queries and data. If you have a read problem, then the query is not constructed correctly or the data is not partitioned correctly. If it's a write problem, it's probably a cluster tuning issue, a network issue or some badly implemented synchronous listeners. The best way to triage this is to take some stats files and load them up into VSD. GemFire keeps a lot of statistics you can visualize, and you can see the throughput of various operations.

Here are some of the other things you should probably look at.
  1. Data partitioning
  2. Non-indexed lookups
  3. Serialization
  4. Querying on the server: use PDX for OQL queries. If you aren't using PDX, the server needs to deserialize the object to do the query. Any time you query for data that isn't the key, you are using OQL (see the sketch after this list).
  5. Data colocation strategy (customer orders should be on the same partition as the customer record; reference data should be replicated).
  6. Leveraging transactions incorrectly (all objects must be in the same partition). Make your operations idempotent instead of relying on transactions.
  7. Excessive GC activity due to data changing too frequently.
  8. If networking sucks, performance will suck due to the amount of replication. In rare cases you may need to enable delta propagation if the objects being serialized are big; also read "When to avoid delta propagation". For each region where you are using delta propagation, choose whether to enable cloning using the delta propagation property cloning-enabled. Cloning is disabled by default; see Delta Propagation Properties. If you do not enable cloning, review all associated listener code for dependencies on EntryEvent.getOldValue. Without cloning, GemFire modifies the entry in place and so loses its reference to the old value. For delta events, the EntryEvent methods getOldValue and getNewValue both return the new value.
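
To illustrate factor 4 from the client side, here is a minimal sketch (not a tuned production config): the client keeps values in PDX form and runs an OQL query with a bind parameter. The locator host, the /Orders region and the customerId field are hypothetical:

```java
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.query.QueryService;
import org.apache.geode.cache.query.SelectResults;

// OQL + PDX sketch: query a non-key field without pulling whole domain objects apart.
public class OrderLookup {

    public static void main(String[] args) throws Exception {
        ClientCache cache = new ClientCacheFactory()
            .addPoolLocator("locator-host", 10334)   // placeholder locator
            .setPdxReadSerialized(true)              // keep results as PDX instances on read
            .create();

        QueryService queryService = cache.getQueryService();

        // OQL against a non-key field; with PDX-serialized values the predicate can be
        // evaluated without deserializing every candidate entry into its domain class.
        SelectResults<?> results = (SelectResults<?>) queryService
            .newQuery("SELECT o FROM /Orders o WHERE o.customerId = $1")
            .execute("C-42");

        System.out.println("Matching orders: " + results.size());
        cache.close();
    }
}
```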