About Me

Rohit is an investor, startup advisor and an Application Modernization Scale Specialist working at Google.

Wednesday, October 14, 2020

Why is AWS #1 in the Cloud?

I will let you in on a secret. It's not their gazillion managed services. It's not their five-year head start in the cloud. It's because of the AWS workload consumption engine. The AWS engine for driving consumption of cloud services is second to none.

AWS does a phenomenal job of sparking consumption. Their key assets:

1. 7Rs framework 

2. AWS Migration Hub (Migration Tracking, Migration tooling for discovery and planning, Migration Technology for app profiling and workload discovery) https://aws.amazon.com/migration-hub/ 

3. AWS Migration Acceleration Program https://aws.amazon.com/migration-acceleration-program/ 

4. AWS Cloud Adoption Framework https://aws.amazon.com/professional-services/CAF/ 

5. AWS Well Architected Framework https://aws.amazon.com/architecture/well-architected/ 

6. AWS Managed Services https://aws.amazon.com/managed-services/

Sparking consumption is a combination of these plays and needs everyone (Pre-sales, Solutions, PSO & Partners) to execute with assets and programs to match.


This is the bar set for all hyperscalers and wannabes :-)

Insights From State Of Spring 2020 Survey

From a strategy perspective, the State Of Spring 2020 survey https://tanzu.vmware.com/content/ebooks/state-of-spring-2020 - a survey of 1,024 Spring developers - is a must read. Here is the TL;DR: the top 10 insights as they pertain to the Kubernetes (Java => container runtime) GTM
-----------------------------------------------------------------------

  • The majority of the growth in Spring Boot applications will come from new projects (82%), followed by enhancements to existing apps (56%) and migration of legacy apps (53%).
  • By far the largest use case for Spring is now the development of internal and external APIs.
  • Almost all respondents (95%) will containerize their Spring Boot apps, with 65% already doing so and a further 30% planning to.
  • Of the 95% of Spring users that are containerizing their apps, 44% have already deployed on Kubernetes, and a further 31% plan to do so within the next 12 months. Migration of containerized Spring Boot apps to Kubernetes is well underway, and most plan to complete the migration in a 12-month time window
  • Other than core Spring and Spring Boot, Spring Security, Spring Data and Spring MVC are the top projects, with Spring WebFlux getting an honorable mention
  • Reactive (Project Reactor) and serverless (Spring Cloud Function) usage significantly lags the use of Spring for microservices
  • The Spring Cloud portfolio is popular: 1) Spring Cloud Services 2) Spring Cloud Gateway 3) Spring Cloud Data Flow
  • Respondents recognize that the scale, strength and diversity of the open source Spring ecosystem is one of the greatest assets of the platform.
  • Developers struggle to understand what all the components do and how to apply them, while a smaller group is looking for greater runtime efficiencies for their Spring-based applications, like reduced memory footprint and start time.
  • Enthusiasm for GraalVM is growing, with 14% already using it to some degree whilst another 26% plan to use it, enabling a reduction in memory usage and faster startup times.

Wednesday, August 19, 2020

Why Hybrid Cloud?

 # Why Hybrid Multi-cloud

On-prem, AWS, GCP (... Alibaba, IBM Cloud, ORCL)
1. Insurance against vendor lock-in
2. Leverage the power of hyper cloud providers
  - on-demand IaaS
  - value added services
3. Regulatory
4. Disaster Recovery across providers
5. ORCL, VMware, IBM legacy DC incumbency reasons
  - vmc for aws, vmc for gcp, vmc for azure
  - zero change VM migration
  - vmc control plane for VMs
6. 2-10% penetration of public cloud
7. Proliferation of platforms - SINGLE multi-cloud control plane
  - CF
  - Mesosphere
  - k8S (anthos, tmc, arc, crossplane, ..... )
  - SINGLE CONTROL PLANE

# Replication Edge <=> Central

  - Edge workloads will be 75% of cloud workloads
  - Call back home - Edge Architecture
  - Replicate -> offline-online
    - Data constraints
    - Network constraints
  - STATE Management
  - Cell Towers, POS, Autonomous Robots

# Stateful workloads - Databases

 - Operational pain - engineers
 - containers, VMs (legacy)
 - Operators
 - STATEFUL Sets
    - CSI
    - Portworx
 - Performance reasons for staying on a VM
 - self-service, faster changes, choice
 - Data resiliency, backup, DR (BCDR)
 - Gemfire replication (pattern)

# Best Practices on active-active for data

https://tanzu.vmware.com/content/white-papers/multi-site-pivotal-cloud-foundry-deployment-topology-patterns-and-practices

Monday, August 17, 2020

Modernizing PowerBuilder Apps

If you have a massive DataWindow system baked into a PowerBuilder client application persisting to an Oracle database, you are likely in a world of pain. This is a business-critical system that generates billions of dollars of revenue. Multiple attempts to modernize the system have cratered due to big-bang, vendor-product-driven, technology-infatuated soirees. You want to dip your toes into the modern cloud native, microservices, developer-friendly world but have been burnt twice already. You have no interest or inclination in fielding yet another rewrite-the-system-from-scratch-in-2-years sales pitch. What the hell do you do? Hit the bottle? Pay ORCL another $25M in licensing fees? Put a mask on and upgrade PowerBuilder to 2019 R2? Is there a way out of this black hole?

Yes. But it ain't easy.

First, acknowledge that this is a tough task. The short-cuts were already taken, which is why we are at this F*#$ed up point.

Here is a path that can be trodden. A couple of the smart guys in data, Alastair Turner and Gideon Low, have heavily influenced my thinking on this topic.

First, figure out the primary driver for modernization: cost or productivity. Depending on the answer, a different set of strategies needs to be followed.

Cost

Let's assume all your customer support professionals are well versed in the screens and actually love the DataWindow UI. The application is functional and can be enhanced with speed in production. The only issue is the licensing and cost associated with running PowerBuilder. In such a scenario migration is perhaps the better option, i.e. migrate all the data to Postgres. Check to see if your SAP or Appeon version of PowerBuilder supports PostgreSQL as a backend. You might be so bold as to consider migrating your DB to the cloud with AWS Database Migration Service.

Depending on the cost, you may choose to use code generators that auto-generate Java or C# code from PowerBuilder libraries. Both Appeon and Blu Age have such tools; however, buyer beware. Any tool that charges for modernization by LOC is immediately suspect. Code generators are like Vietnam - easy to get in, hard to get out.

Productivity

You want to develop new features and microservices and expose APIs to newer channels and other consumers of the service. Here you have a fork in the road. 

1. Upgrade to the latest GA version of PowerBuilder, 2019 R2, and begin an expensive project to RESTify existing DataWindows as web services. The limitation with using conversion tools is that you don't really get a chance to identify and fix various classes of important problems - you don't pay down your technical debt. This is like trading one form of debt for another, like replacing your high-interest-rate debt from Visa with a slightly lower-interest-rate debt from Capital One. What's in your wallet?

2. The RIGHT way. Start by cataloging the business use cases and rebuild each one. The legacy system's only real role is to validate that behavior hasn't changed. If you can't get the business rules from the business, you will need to reverse-engineer the stored procedures and persistence using tools like ER diagrams and SchemaSpy, or leverage the Oracle dependency catalog views to determine the object dependency tree. Visualization tools can't hurt, but SO MUCH is usually trapped in the stored procedure logic that their use can be as much a hindrance as anything. A good way to visualize the entire business process is to leverage techniques from the app world like Event Storming, or user journey, workflow and interaction mapping.
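One concrete, low-tech starting point: the Oracle data dictionary already knows the object dependency tree. Here is a minimal sketch, in plain JDBC, of walking ALL_DEPENDENCIES for a single table; the JDBC URL, credentials and the CUSTOMER_ORDERS table name are placeholders to adapt:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class DependencyWalker {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details - adapt to your environment
        // (requires the Oracle JDBC driver on the classpath).
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app_user", "secret")) {
            // ALL_DEPENDENCIES lists every object that references the table:
            // stored procedures, packages, views, triggers, etc.
            String sql = "SELECT owner, name, type FROM all_dependencies "
                       + "WHERE referenced_name = ? AND referenced_type = 'TABLE'";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "CUSTOMER_ORDERS"); // illustrative table name
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.printf("%s.%s (%s)%n", rs.getString("owner"),
                                rs.getString("name"), rs.getString("type"));
                    }
                }
            }
        }
    }
}
```

Recursing over this view from the tables of a candidate bounded context yields the dependency tree in which the business rules are buried, which is exactly the input the business-process mapping needs.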

There is no substitute for hard work, and no real substitute for case-by-case refactoring. Start with documenting the problem, visualizing it and identifying the starting points. Thereafter pick a particular steel thread, aka an end-to-end slice, and identify the right set of APIs. Leverage the tactical patterns from Sam Newman's book Monoliths To Microservices (chapter 4) for decomposing data and pulling out services.


Start the journey of 10,000 miles with the first step. Start small, iterate and demonstrate value to the business at the end of each iteration. This is the only way to be successful in the long term with any modernization of a critical, complex system at scale.

Good Luck !!


References

See how ZIM Shipping moved from .NET/PowerBuilder development to Spring Boot with Tanzu Application Service

Common Process Challenges When Breaking Monoliths

Three difficult challenges we often come across when our team works with clients as they try to break monoliths:

1. Silver Bullets - Enterprises have been burnt by vendor solutions that promise migration with a tool like BPM or some silver-bullet methodology that promises seamless migration. The truth is that disentangling your monolith's code and data is going to get messy. The entire business process will need to be disaggregated and visualized, and seams will need to be identified to create a blueprint for a target architecture. Other than Pivotal/VMware there is no one else who has done this at enterprise scale. Our approach modernizes the monolith incrementally, with demonstrated business value in weeks, not years.

2. Over Engineering - It is common to get distracted by technology choices and deployment options rather than focus on the difficult work of understanding the core domain: identifying the business capabilities and assessing whether they are core or supporting domains. Do what is sustainable and what your average (not rockstar) software engineers can support, and focus on the outcomes.

3. Pressure Cooker - When clients continuously keep changing their priorities, lose faith in the progress or micromanage the design, it subverts the process and the target architecture ends up looking like the old system. Breaking monoliths with Domain Driven Design is like landing a plane from 30,000 feet. You cannot skip phases and go straight to user story backlog generation with somebody else's domain model or DDD process. Don't short-circuit steps in the process. It is critical to follow the strategic and tactical patterns and land the plane with a gradual descent to an organized user story backlog.

Saturday, August 15, 2020

Eight Pitfalls Of Application Modernization (m11n) when Practicing Domain Driven Design (DDD)

1. High Ceremony OKRs

An application modernization effort is necessarily broad in scope. Overly constraining the goals and objectives often results in premature bias and could lead to solving the wrong problems. Driving out premature Key Results means that you are tracking and optimizing for the wrong outcomes. If you are uncertain of the goals of modernization, or if this is your first time refactoring and rewriting a critical legacy system, it is best to skip this step and come back to it later.


2. Mariana Trench Event Storming

There are multiple variants of Event Storming (ES). ES can be used for big picture exploration, detailed system design, greenfield exploration, value stream mapping, journey mapping, event modeling etc. Beware of going too deep into the existing system design with ES when modernizing a system. If the intent of ES is to uncover the underlying domains and services, it is counterproductive to look at ALL the existing constraints and hot spots in the system. Model just enough to get started, to understand and reveal the seams of the domain. If you go deep with Event Storming you will bias or over-correct the design of the new system with the baggage and pain of the old.

The key to success with Event Storming is to elicit the business process, i.e. how the system wants to behave, and not the current hacked-up process. When there is no representation from the business, there is no point in doing Event Storming.

3. Anemic Boris

When modeling relationships between bounded contexts it is critical to fully understand the persistence of data, the flow of messages, the visualization of the UI and the definition of the APIs. If these interactions are not fully fleshed out, the Boris diagram becomes anemic, and this in turn reflects an anemic domain model.

4. Picking Thin Slices

As you start modeling the new system, design the end-to-end happy path of the user workflow first. Pick concrete scenarios and use cases that add value to the business - those that encounter the maximum pain. The idea here is to validate the new design cheaply, with stickies instead of code, and not get stuck in error or edge cases. If you don't maintain speed and cycle through multiple end-to-end steel threads, you may be prematurely restricting the solution space.

5. Tactical Patterns - The Shit Sandwich

As you start implementing the new system and co-existing with the old, there are a few smells to watch for, the biggest one being the Shit Sandwich: when you are stuck doing DDD on a domain that is sandwiched between upstream and downstream domains/services that cannot be touched, thereby overly constraining your decomposition and leading to anemic modeling, since you cannot truly decompose a thin slice. You spend all your time writing ACLs mapping data across domains.

So the top bun is the downstream service, the bottom bun is the upstream service, then there are two layers of cheese - the ACLs - and your core domain is the burger patty in the middle. Now you have the shit sandwich. Watch for this when you are called in to modernize ESBs and P2P integration layers.

6. Over Engineering

Engineers are guilty of this all the time.
Ooooh ... we modeled the system with Event Storming as events ergo - we should implement the new system in Kotlin with Event Sourcing on Kafka and deploy to Kubernetes.
Yikes!!
Do what is sustainable and what your average (not rockstar) software engineers can support. My colleague Shaun Anderson explains this best in The 3 Ks of the Apocalypse - or how awesome technologies can get in the way of good solutions.


7. Pressure Cooker Stakeholder Management

When the stakeholders of the project continuously keep changing their priorities, lose faith in the progress, or pre-decide the domains, then it is time to push back and reassert control. The top-down process of Domain Driven Design is like landing a plane from 30,000 feet. You cannot skip phases and go straight to user story backlog generation with somebody else's domain model or DDD process. Don't short-circuit steps in the process. It is critical to follow the strategic and tactical patterns and land the plane with a gradual descent to an organized user story backlog.

8. Faith

The biggest way to fail with DDD when doing system design or modernization is to lose faith and start questioning the process. Another guaranteed failure mode is procrastinating and never starting the modeling activities, i.e. being stuck in the analysis-paralysis phase. The only prerequisites to success are fearlessness and curiosity. You don't need formal training to be a facilitator. Start the journey, iterate and improve as you go.

Wednesday, August 12, 2020

Mainframe Modernization Is Hard

Mainframe modernization is like a balloon payment on your 7-year ARM mortgage coming due, or an unhedged call option that has been called. [bad-code-isnt-technical-debt-its-an-unhedged-call-option](https://www.higherorderlogic.com/2010/07/23/bad-code-isnt-technical-debt-its-an-unhedged-call-option/). All code is technical debt, and it is critical to understand the risk profile of your debt as you embark on a mainframe modernization project. [risk-profile-of-technical-debt](https://tanzu.vmware.com/content/intersect/risk-profile-of-technical-debt). Also see derivatives of technical debt for a detailed treatment of this topic.

Sticking with the financial analogy: your payment is due, your option has expired and you owe a large amount. Our natural instinct is to find an easy way out - perhaps get a payday loan, or swap out one kind of debt for another. Unfortunately none of these options work in the long term. To avoid bankruptcy we have to go through a debt restructuring program where we gradually retire and pay out the debt owed over time, or in extreme cases declare bankruptcy. So what does all of this have to do with mainframe modernization?

There are many enticing options when it comes to mainframe modernization: offloading work to cheaper processors on the mainframe, getting volume discounts from your mainframe provider, slapping REST APIs on top of the mainframe systems, COBOL-to-Java code generators, or outsourcing the refactoring and rewrite of code (debt) outside the company. These efforts are generally well intentioned and start well, but quickly get stuck in the mud because they don't scale, or because the complexity and sustainability of the solution do not hold up.

At VMware Pivotal Labs we acknowledge that mainframe modernization is hard. The implementation of the program gets worse before it gets better, as concurrent development work streams have to be maintained for both the legacy and the net new. Having helped multiple customers on this journey, we have come up with an iterative, phased approach to mainframe modernization that scales and yields ROI in days and weeks, not months and years.

1. Start with the end in mind. What is the critical business event or situation that has triggered the modernization? It is very important to understand why the modernization program is being funded so that we can create the right set of goals, objectives and key results. Are you doing this because you cannot add features fast enough to the critical system running on the mainframe? Are you doing this because you need a new digital 360-degree experience for your customers? What are the key business drivers for the modernization? This alignment needs to be driven by both business and technology executives and reinforced by all the product owners, product managers and tech leads. **The outcome of this phase is a clearly articulated set of goals and objectives with quantified key results that provide the journey markers to understand if the program is on track.**

2. After goal alignment it is time to take an inventory of the business processes of the critical systems running on the mainframe. The business domain has to be analyzed and broken down into discrete, independent business capabilities that can be modeled as microservices. This process of analyzing the core business domain and modeling its constituent parts is called Event Storming. Event Storming enables decomposing massive monolith systems into smaller, more granular, independent software modules aka microservices. It allows for modeling new flows and ideas, synthesizing knowledge, and facilitating active group participation without conflict in order to ideate the next generation of a software system. Event Storming, a group collaborative modeling exercise, is used to understand the top constraints, conflicts and inefficiencies of the system and reveal the underlying bounded contexts. The seams of the current system tell us how the new distributed system should be designed. We also weave in aspects of XP here, like UCD, Design Thinking and interviews, to keep our understanding of the system real. **The outcome of this phase is a set of service candidates, also called bounded contexts, that represent the business capabilities of the core, supporting and generic business domains.**

3. A critical system on the mainframe, like commercial loan processing or pharmacy management, has multiple end-to-end workflows. It is critical to understand all the key business flows across the event storm that provide a steel thread for modernization. We need to prioritize these key flows as they will drive out the system design. The first thin slice picked should be a happy-path flow that provides end-to-end value, demonstrates incremental progress and redoubles the faith in the whole process. **The outcome of this phase is a set of prioritized thin slices that encompass the Event Storm.**

4. We have all the lego pieces; now it's a matter of putting them together with flows of messaging, data, APIs and UIs so that we fulfill the needs of the business. We now have to wire up all our domain services. We use a process called Boris, invented at Pivotal, for this phase of [SWIFT](https://www.swiftbird.us/). Boris provides a structured way to design synchronous API-driven and asynchronous event-driven service context interactions. We identify relationships between services to reveal the notional target system architecture and record them using SNAP. SNAP takes the understanding from the Boris diagram and captures the specific needs of every bounded context under the new proposed architecture. The SNAP exercise is done concurrently with the Boris exercise. We call out the APIs, data needed, UIs and risks that apply to each bounded context as the thin slice is modeled across services. **The outcome of this phase is a notional target architecture of your new system with external interaction modes mapped out in terms of messaging, data and APIs.**

5. In some ways this process is like bringing a plane at 30,000 feet to the ground. We are in full descent now and at the 10K-feet level. At this point we have a target architecture. It is critical now to understand how we will develop the old and new systems without disruption, i.e. change the engine while the plane is descending to the ground. We employ a set of key tactical patterns for modernization, like the Anti-Corruption Layer, Facade and data-driven Strangler (see the anti-corruption layer sketch after step 6), to carve out a set of MVPs and user stories mapped to each MVP. These stories will realize the SNAP built earlier and implement the thin slices that were modeled. Creating a road map of all the quanta of work is critical as we start, to make sure we are going in the right direction with speed. **The outcome of this phase is a set of user stories for modernization implementing the tactical pattern of co-existence with the monolith and a set of MVPs to track the key results.**

6. We now have a backlog of user stories and we are less than 1,000 feet from the ground. At this point it is important to identify the biggest spikes and risks to the technical implementation, like latency, performance, security, tenancy etc., and resolve them. We start building out contracts for our APIs so that other teams and dependencies may get unblocked. The stories are organized into epics at inception, product managers and engineers are allocated, and the first iteration begins. The feedback loop from product to engineering to business is set in motion. **This phase encompasses the first sprint or iteration of development. It is critical to establish demos, iteration planning meetings, retrospectives and feedback loops in this phase as this will set the tone for the rest of the project.**
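To ground the tactical patterns from step 5, here is a minimal anti-corruption layer sketch, assuming a hypothetical legacy loan record handed over by a mainframe adapter; every name is illustrative, and modern Java records are used for brevity:

```java
import java.math.BigDecimal;

// Legacy shape as the mainframe adapter hands it over
// (fixed-format fields, cryptic names) - illustrative only.
record LegacyLoanRecord(String cusNo, String lnTyp, long amtCents, String stsCd) {}

// The new bounded context's domain model.
enum LoanType { COMMERCIAL, CONSUMER }
enum LoanStatus { ACTIVE, DELINQUENT, CLOSED }
record Loan(String customerId, LoanType type, BigDecimal amount, LoanStatus status) {}

// The anti-corruption layer: the ONLY place that knows both vocabularies,
// so mainframe quirks stop here and never leak into the core domain.
class LoanAntiCorruptionLayer {
    Loan toDomain(LegacyLoanRecord rec) {
        return new Loan(
                rec.cusNo().trim(),
                "C".equals(rec.lnTyp()) ? LoanType.COMMERCIAL : LoanType.CONSUMER,
                BigDecimal.valueOf(rec.amtCents(), 2), // cents -> dollars
                switch (rec.stsCd()) {
                    case "A" -> LoanStatus.ACTIVE;
                    case "D" -> LoanStatus.DELINQUENT;
                    default  -> LoanStatus.CLOSED;
                });
    }
}
```

The payoff of quarantining translation code like this at the boundary is that when the mainframe is finally retired, the ACL is deleted and the domain model is untouched.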

The six steps of mainframe modernization outlined here are not implemented like a waterfall. The six steps are sometimes run multiple times for different areas of a large, complex domain and the results are stitched together. Steps or phases may be skipped altogether if we already know parts of the domain well. This six-step process is what we call SWIFT. It is not dogmatic. Do what works at velocity to modernize the system in increments with a target architecture map in hand. Mainframe modernization is hard and there is no easy way out. Internalize this and start the journey of a thousand miles swiftly with the first step.

Saturday, August 1, 2020

Designing Good APIs

Part 1 -- Principles of Designing & Developing APIs (for Product Managers, Product Designers, and Developers)

Part 2 -- Process of Designing APIs (for Product Managers, Product Designers, and Developers)

Part 3 -- Tips for managing an API backlog (for Non-Technical PMs)

Part 4 -- Developing, Architecting, Testing, & Documenting an API (for Developers)

API First Development Recipe

How To Document APIs

Shopify APIs

API Design 

Friday, July 24, 2020

Don't Use Netflix Eureka For Service Discovery

So … there are multiple issues with Eureka:
  1. Deprecated by Netflix and no longer maintained by Pivotal/VMware, so there is no long-term maintainer.
  2. The whole world seems to have moved on, either to the service discovery provided natively by the platform, like Kube DNS (Kubernetes) or BOSH DNS (Cloud Foundry), or to service meshes (Istio), or to a service networking product like Consul. See replacing-netflix-eureka-with-kubernetes-services and polyglot-service-discovery-container-networking-cloud-foundry
  3. Eureka does not work with container-to-container networking.
  4. Stale service registry problem - Eureka and Ribbon don't react quickly enough to apps coming up and down, due to aggressive caching; see making-service-discovery-responsive
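For example, on Kubernetes a Spring app can drop the Eureka client entirely and resolve peers through Kube DNS. A minimal sketch, assuming a Service named inventory-service listening on port 8080 in the same namespace (both names are placeholders):

```java
import org.springframework.web.reactive.function.client.WebClient;

public class InventoryClient {
    // Kube DNS resolves the Service name to a stable ClusterIP and
    // kube-proxy load-balances across ready pods - no client-side
    // registry, no Ribbon server list, no heartbeats.
    private final WebClient client = WebClient.create("http://inventory-service:8080");

    public String fetchStock(String sku) {
        return client.get()
                .uri("/stock/{sku}", sku)
                .retrieve()
                .bodyToMono(String.class)
                .block();
    }
}
```

Readiness and liveness probes then take over the role that Eureka heartbeats used to play, and the stale-registry problem from point 4 disappears.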

Monday, June 22, 2020

The Eight Factors of Cloud Native Gemfire Applications

If, after moving the in-memory JVM cache to GemFire, your pain still hasn't gone away, take a look at this checklist of factors for getting your app to perform efficiently with GemFire. Thanks to my buddy Jeff Ellin, who is an expert ninja at GemFire and PCC.

TL;DR

Look at queries and data. If you have a read problem, then the query is not constructed correctly or the data is not partitioned correctly. If it's a write problem, it's probably a cluster tuning issue, a network issue or some badly implemented synchronous listeners. The best way to triage is to grab some stats files that can be loaded up into VSD; GemFire keeps a lot of statistics you can visualize, including the throughput of various operations.

Here are some of the other things you should probably look at.
  1. Data Partitioning 
  2. Non index lookups 
  3. Serialization 
  4. Querying on the server: use PDX for OQL queries. If you aren't using PDX, the server needs to deserialize the object to run the query. Any time you query for data by anything other than the key, you are using OQL.
  5. Data colocation strategy (customer orders should be on the same partition as the customer record; reference data should be replicated)
  6. Leveraging Transactions Incorrectly (all objects must be in the same partition). Make your operations idempotent instead of relying on transactions. 
  7. Excessive GC activity due to data changing too frequently. 
  8. If networking sucks, performance will suck, due to the amount of replication. In rare cases you may need to enable Delta Propagation if the objects being serialized are big; also read when to avoid delta propagation. For each region where you are using delta propagation, choose whether to enable cloning using the delta propagation property cloning-enabled. Cloning is disabled by default; see Delta Propagation Properties. If you do not enable cloning, review all associated listener code for dependencies on EntryEvent.getOldValue. Without cloning, GemFire modifies the entry in place and so loses its reference to the old value. For delta events, the EntryEvent methods getOldValue and getNewValue both return the new value.
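To illustrate items 4 and 5 above, here is a minimal client-side sketch against the Apache Geode API (the open-source core of GemFire); the locator address, region name and package pattern are illustrative:

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.query.SelectResults;
import org.apache.geode.pdx.ReflectionBasedAutoSerializer;

public class OrderQueries {
    public static void main(String[] args) throws Exception {
        // PDX auto-serialization lets the server evaluate OQL without
        // deserializing the domain objects (item 4 above).
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator-host", 10334) // illustrative locator
                .setPdxSerializer(new ReflectionBasedAutoSerializer("com.example.*"))
                .setPdxReadSerialized(true)
                .create();

        // Region handle for puts/gets (not exercised in this snippet).
        Region<String, Object> orders = cache
                .<String, Object>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("Orders");

        // OQL query on a non-key field - this is where PDX pays off.
        SelectResults<?> results = (SelectResults<?>) cache.getQueryService()
                .newQuery("SELECT * FROM /Orders o WHERE o.customerId = $1")
                .execute("customer-42");
        System.out.println("matching orders: " + results.size());
        cache.close();
    }
}
```

Colocation (item 5) is a server-side concern: the partitioned Orders region is declared with colocated-with pointing at the Customers region, so a customer's orders live on the same member as the customer record and joins never cross the wire.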

Thursday, June 18, 2020

2020 A Year In Review

  • Leading the delivery of a one-of-a-kind Application Modernization Navigator for a Pharmacy Benefits Management system currently running on the mainframe.
  • Led the first two Remote Application Swift Navigators creating the Playbook for Remote Event Storming and other remote collaborative modeling practices.
  • GoToMarket Remote App Modernization Navigators https://tanzu.vmware.com/content/resources-for-remote-software-teams/how-to-conduct-a-remote-event-storming-session
  • Self Published Book on Practical Microservices 
  • Self Published Book on Emergent Trends on Modern Applications 
  • Improving the quality and number of recipes in our App Modernization Cookbooks and opening our tools like SNAP to the broader VMware community.
  • One of the lead influencers in MAPBU Services Marketing, measured in published articles, blog posts, internal documents and views
  • Top articles including How to Build Sustainable, Modern Application Architectures https://tanzu.vmware.com/content/practitioners-blog/how-to-build-sustainable-modern-application-architectures
  • CKA and CKAD certified in 2019. Conducted training for both Labs and AppTx App Services on how to get certified on Kubernetes.
  • Taught broader MAPBU organization on How To  Conduct Remote Scopings. 
  • Taught Multiple (> 5) Swift Modernization Workshops across Pivotal and VMware for Scaling App Modernization
  • Working with R&D to bring Spring transformation tooling to market. Created the Spring Bootifier with Tim Dalsing, which is now used in the implementation of the Tanzu Workbench Spring Bootifier transformer. Currently aligning with the Spring R&D effort on automated cloud native remediation of Java applications.
  • Created Kafka battle-card for App Services sales 
  • Anchored a one-of-a-kind Kafka real-time inventory engagement and created collateral for the App Modernization For Streaming Workloads deck
  • Creating App Services pipeline at our top customers - over 375 individual Travelers participants attended the Should This Be A Microservice webinar on Tanzu enablement, with over 2,500 views on LinkedIn
  • COVID-19 project - facilitator of remote Swift Event Storm and Boris exercises. This client is a small startup that worked with Labs Seattle 4 years ago on an app to connect in-home carers with Medicaid recipients. Since COVID-19 they've expanded their platform to connect essential workers with childcare providers and are experiencing huge demand for the service. They are currently rolling out to public hospitals in LA.
  • Working across discovery teams to formulate the Discovery scoping and workshop process. 

Tuesday, June 2, 2020

Modernization Myths - Microservices - Explained

Myth - “Microservices Is The Only True Way”

Why are microservices the default way of developing cloud native applications? Why does application modernization in the form of decomposing monoliths result in so many microservices? Why have microservices become the default deployment model for applications, in spite of enterprises struggling with the observability, complexity and performance of the resulting distributed systems? Heisenbugs abound when you mix sync and async flows, further compounded by the various forms of async event-driven architecture: thin events, thick events, event sourcing, CQRS etc.
Frustrated by the intractability and cognitive overload of monoliths, I was one of the first people to ride the monoliths-to-microservices wave. However, six years after the Microservices article from Martin Fowler and James Lewis came out, it is time to retrospect on reality. Have microservices got you down? Are you swimming in a mess of brittle microservices that break every night? Is the number of microservices going up, resembling the death-star architecture? It behooves us to travel upstream and examine the motivations for microservices. How do we peel back the onion and get back to sanity around microservices? How do we rationalize microservices?
So what is the path forward? Throw the baby out with the bathwater and swing back to monoliths? There ought to be a middle way where we can keep the best of the microservices advantages, like domain-based composition, enforced module boundaries, independent evolvability and better cognition. It is time to look at life after running microservices architectures in production and learn from the mistakes committed over the past five years.
There is a way that has emerged from working with a number of customers where the value of microservices has not been realized from application modernization, despite leveraging Domain Driven Design and doing everything right. This process adds sanity to the construction of microservices and provides guidelines and design heuristics on structuring them. We need to tackle this problem from a technical, business and social perspective, marrying concepts from DDD with Wardley business maps and sociotechnical architecture: rationalize microservices into modular monoliths based on technical and business heuristics, combining a mapping of microservices to core technical attributes, reduced by affinity mapping, with business domain context distillation. This workshop/process, called micro2monp, has simplified enterprise architectures and reduced the operational burden of microservices. You can find more details of the process here and in the six-factors post.

So how DO you rationalize your microservices? Here is a presentation that walks through all the factors that should be used for rationalizing microservices.

A Blueprint For Mainframe Modernization

At VMware Pivotal Labs we have cracked the code of mainframe modernization. Frustrated with low-fidelity code generators migrating COBOL to Java, losing business rules in the process? Tired of multi-year modernization projects with no end in sight? Take a look at the seven-step punch VMware Pivotal Labs employs to achieve concrete modernization business outcomes for high-value, critical mainframe OS/400 and OS/390 workloads.

Modernization in weeks not months and years
Blueprint For Mainframe Modernization VMWare Pivotal Labs


Monday, June 1, 2020

Explain VMware Tanzu To Me Like I am Eight

Me breaking down the VMware Tanzu Portfolio suite with my son Rushil Kelapure.

Modernizing applications feels like an overwhelming job. Maintaining dev and prod environment parity seems a Sisyphean task. Container builds vary wildly, creating snowflakes with every sprint. Even new, promising runtimes that come along face fragmentation as different teams adopt different versions. With a consistent approach, however, the layers between infrastructure and application code become manageable. The VMware Tanzu portfolio brings consistency to building, running and managing code. VMware is the only company that addresses the challenges of application modernization from both the application and the infrastructure perspective.

Tuesday, May 26, 2020

Modernization Myths Explained 1 & 2

In this blog post we go deeper into the top two myths of Application Modernization. An overview of all the top 10 myths can be found here


Myth 1 - “Application has to be cloud native to land on a PaaS”

The truth is that most Platforms as a Service run applications of varying cloud native characteristics just fine. Applications progress through a spectrum as they land and flourish in the cloud: from not running in the cloud, to running in the cloud, to running great in the cloud. A PaaS like Cloud Foundry has also evolved features like volume services and multi-port routing to help stateful and not-born-on-the-cloud applications run without changes on Cloud Foundry. In his blog series debunking Cloud Foundry myths, Richard Seroter authoritatively disproves the notion that Cloud Foundry can only run cloud-native applications.
Applications do not have to be classic 12-factor or 15-factor compliant to land on a PaaS. Applications evolve on the cloud native spectrum. The more cloud native idiomatic changes to an app, the more return on investment you get from the changes. The more cloud native you make the app, the higher the optionality you get, since it becomes cloud agnostic, allowing enterprises to exact maximum leverage from all the providers. The focus needs to be on the app, inside-out, to get the best returns. In general, the higher you are in the abstraction stack, the more performance gains you will get: architecture changes will yield 10x more benefit than JVM or GC tuning, which will yield 10x more benefit than tuning assembly code, and so on. If it is the database tier that you think is the problem, then you can put in multiple shock absorbers instead of tuning startup memory and app start times. Apps first, platform second :-)


Cloud Foundry Support For Stateful Applications
Myth 2 - “Applications have to be refactored to run them on Kubernetes”
It's a fallacy that applications need to be modified by developers before landing them on Kubernetes. In fact, an enterprise can get significant cost savings by migrating one-factor apps to Kubernetes. A one-factor app simply has the capability to restart with no harmful side-effects. James Watters, the cloud soothsayer, has posed the question in the cloud-native podcast: do you even have a 1-factor application?

Most business applications are not ready for refactoring but still want the cost advantages of running in the cloud. For apps where the appetite for change is zero, starting small, as in just restarting the application predictably, i.e. making it one-factor, can make it run on a container platform like Kubernetes. As you shift to declarative automation and scheduling, you will want the app to restart cleanly. There is an application-first movement of being able to do some basic automation of even your monolithic applications. Apps are the scarce commodity right now. With Kubernetes becoming more and more ubiquitous, all application portfolios need a nano-change mindset to adapt to the cloud. A sketch of what restarting cleanly looks like follows.
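What does one-factor look like in code? At a minimum the app drains in-flight work and releases resources when the platform sends SIGTERM, so that a restart has no harmful side-effects. A minimal sketch in plain Java; the worker pool stands in for whatever your app actually does:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OneFactorApp {
    public static void main(String[] args) {
        ExecutorService workers = Executors.newFixedThreadPool(4);

        // Kubernetes sends SIGTERM before killing a pod, and the JVM runs
        // shutdown hooks on SIGTERM - this is where we drain cleanly.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            workers.shutdown(); // stop accepting new work
            try {
                // finish in-flight work within the pod's grace period
                if (!workers.awaitTermination(20, TimeUnit.SECONDS)) {
                    workers.shutdownNow();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // close DB connections, flush buffers, release locks here
        }));

        // ... submit work to `workers` as usual ...
    }
}
```

That single discipline, predictable restart, is what lets a scheduler like Kubernetes move, scale and heal the app without developer intervention.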

Saturday, May 16, 2020

Top Ten Application Modernization Myths

Sometimes we tell little lies to ourselves. It is always good to take inventory of reality and introspect on what is true and what is not. Here are some of the little lies of application migration and modernization that I have observed over the last five years. 
  1. Application has to be 12/15-factor compliant to land on a PaaS. Apps can be modified along the cloud native spectrum. The more cloud native idiomatic changes to an app, the more ROI you get from the changes. See Myth #1 "Cloud Foundry can only run cloud-native, 12-factor apps." - FALSE https://tanzu.vmware.com/content/blog/debunking-cloud-foundry-myths
  2. Applications need to be modified by developers before landing them on Kubernetes (TKG). In fact an enterprise can get significant cost savings by migrating one-factor apps to Kubernetes. A one-factor app simply has the capability to restart with no harmful side-effects. See https://tanzu.vmware.com/content/intersect/vmware-tanzu-in-15-minutes - do you even have a 1-factor application?
  3. Once technical debt on an application becomes insurmountable, the only recourse is to rewrite it. Surgical strikes with an emphasis on understanding the core domain can lead to incremental modernization of the most valuable parts of a big, critical legacy system. A FULL rewrite is not the only option. See treating technical debt like financial debt https://tanzu.vmware.com/content/intersect/risk-profile-of-technical-debt and https://tanzu.vmware.com/content/webinars/may-6-tech-debt-audit-how-to-prioritize-and-reduce-the-tech-debt-that-matters-most
  4. There is a silver bullet for app migration. There is an increasing bevy of tools promising seamless migration of VMs to containers in the cloud. Remember, in life nothing is free; you get what you put in. Migration is highly contextual, and the OPEX and developer-efficiency returns depend on the workloads being ported. Migration of apps in VMs to Kubernetes StatefulSets, or automatic dockerization through buildpacks etc., should be evaluated against the desired objectives of the migration project.
  5. Microservices and event-driven architecture are ALWAYS the right architecture choice for app modernization. Sometimes the answer is to step back, simplify the domain and implement a modular monolithic system; sometimes the answer is to decompose the large system into a combination of microservices and functions. Understand the design and operational tradeoffs first before making the choice. Every tech choice, like eventing, APIs, streaming etc., has a spectrum. The fundamental job of an architect is to understand the sociotechnical factors and make the right choices from a process, people and implementation perspective. See https://tanzu.vmware.com/content/practitioners-blog/how-to-build-sustainable-modern-application-architectures
  6. Decomposing and rearchitecting an existing system can be done concurrently with forward development with little impact to existing release schedules. This is a dream. When working on two branches of an existing system, a forward development branch and a rearchitecture branch, the total output often gets worse before becoming better (WBB). This is because there is a period of time where dual maintenance, dual development and the coordination tax across two teams are levied without getting any of the benefits of modularization and refactoring. See The Capability Trap: Prevalence in Human Systems https://www.systemdynamics.org/assets/conferences/2017/proceed/papers/P1325.pdf https://rutraining.org/2016/05/02/dont-fall-into-the-capability-trap-does-your-organization-work-harder-or-smarter/
  7. The fundamental problems of app modernization are technical: if developers only had the rigor and discipline to write idiomatic code, all problems would be fixed and we wouldn't incur technical debt. Wrong. The fundamental problems of app modernization are team and people related. Incorrect team structure, wrong alignment of resources to core domains and messed-up interaction patterns are far more responsible for the snail's pace of feature addition than technical shortcomings. The answer is team re-organization based on the reverse Conway maneuver. See Team Topologies https://www.slideshare.net/matthewskelton/team-topologies-how-and-why-to-design-your-teams-alldaydevops-2017
  8. Mainframe modernization can be accelerated by using lift-n-shift tools like emulators or code generation tools. In our experience a complex mainframe modernization almost always involves a fundamental rethink of the problem being solved, and then rewriting a new system to address the core domain, divorced from the bad parts of the existing intermingled complex system. Theory of constraints and systems thinking help us reframe the system and implement a better, simpler one.
  9. Engineers, developers and technical architects tend to think from a technical nuts-and-bolts perspective (the “how”) and, therefore, tend to look at modern technologies such as Cloud Foundry, Spring Boot, Steeltoe, Kafka and containerization as the definition of a modern application. This misses the mark. The Swift Method pioneered by Pivotal helps bridge the gap in understanding between the non-technical, top-down way of thinking and the technical, bottom-up thought process. The end result is an architecture that maps to the way the system “wants to behave” rather than one dictated by the software frameworks du jour.
  10. AWS or Azure or GKE/GCP etc. provide an all-encompassing suite of tools, services and platforms to enable and accelerate modernization and migration of workloads. While it is true that the major cloud providers have ALL the bells and whistles to migrate workloads, the economics of app modernization tend towards the app and not the platform. The more cloud native you make the app, the higher the optionality you get, since it becomes cloud agnostic, allowing enterprises to exact maximum leverage from all the providers. The focus needs to be on the app, inside-out, to get the best returns. In general, the higher you are in the abstraction stack, the more performance gains you will get: architecture changes will yield 10x more benefit than JVM or GC tuning, which will yield 10x more benefit than tuning assembly code, and so on. If it is the database tier that you think is the problem, then you can put in multiple shock absorbers - 1. caches 2. queues 3. partitioning - first, instead of tuning the startup memory and app start times. Apps first, platform second :-)

Friday, April 24, 2020

Java Application Modernization Maturity Model

This is how I think about the Maturity Model for Java Transformers
--------------------------------------------------------------------------

1. Basic containerization of stateless apps to TKG - enabled by https://github.com/pivotal/kpack and Cloud Native Buildpacks. Deploy with vanilla manifests that may be helmified. / Basic containerization to TAS - using TAS buildpacks ... some apps require no changes when deploying with the Java Buildpack. **0 changes.**

2. TKG - basic containerization of stateful apps, possibly using K8s StatefulSets or persistent volumes. / TAS - extract state out, like session replication or databases (not sure how to do this yet). Some tools purport to do this, like Google Anthos and the POC the Tracker team is working on. **Minimal changes.**

3. Invasive Spring Boot transformer - high value, high degree of change and difficulty. These automate the transformation recipes to cloud native: Bootifier ones as well as simpler ones like Boot 2 -> Boot 3 and XML -> Java config migration (see the sketch after this list). **Invasive changes.**

4. Microservice generator - looks at the dynamic runtime of the application, determines seams and makes suggestions where apps can be decomposed, used as a starting point for Swift. **Monoliths2Microservices**
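To make level 3 concrete: the XML -> config migration is mechanical enough to automate. A before/after sketch with an illustrative pair of beans (PaymentService and HttpPaymentGateway are made-up names):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Before (legacy applicationContext.xml):
//   <bean id="paymentGateway" class="com.example.HttpPaymentGateway"/>
//   <bean id="paymentService" class="com.example.PaymentService">
//     <constructor-arg ref="paymentGateway"/>
//   </bean>

// After: the same wiring as Java configuration, which a transformer can
// generate and a Spring Boot application can pick up via component scan.
@Configuration
public class PaymentConfig {

    @Bean
    public HttpPaymentGateway paymentGateway() {
        return new HttpPaymentGateway();
    }

    @Bean
    public PaymentService paymentService(HttpPaymentGateway paymentGateway) {
        return new PaymentService(paymentGateway);
    }
}

// Minimal stand-ins so the sketch compiles on its own.
class HttpPaymentGateway {}
class PaymentService {
    private final HttpPaymentGateway gateway;
    PaymentService(HttpPaymentGateway gateway) { this.gateway = gateway; }
}
```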

Thursday, April 9, 2020

The Balance between Precision and Accuracy when it comes to Performance Benchmarking

Usually, when called in to a fire-fighting situation where the performance and scale of the system is going to shit, it is critical to understand the right treatment to apply to the problem. Understanding the precision vs. accuracy tradeoff is critical in prescribing the right medicine for your problems. So when should you go for accuracy and when should you pursue precision?

When you need to be in the right ballpark of the solution, within an order of magnitude, go for accuracy. When you need to be precise to the individual digit, strive for precision. In most situations accuracy comes first and precision comes later. Another critical tradeoff is performance vs. scale. Your application may scale great but perform like crap, and vice-versa. Scale needs to be achieved at the right level of end-user performance.

In general, the higher you are in the abstraction stack, the more performance gains you will get: architecture changes will yield 10x more benefit than JVM or GC tuning, which will yield 10x more benefit than tuning assembly code, and so on. If it is the database tier that you think is the problem, then you can put in multiple shock absorbers - 1. caches 2. queues 3. partitioning - first, instead of tuning the startup memory and app start times.
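As an example of the first shock absorber, a read-through cache in front of a hot database lookup is a few lines of wiring. A minimal sketch using the Caffeine library; loadFromDatabase is a placeholder for your real DAO call:

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.time.Duration;

public class CustomerLookup {
    record Customer(String id) {} // illustrative domain type

    // Bounded, expiring cache: absorbs repeated reads so the database
    // sees at most one load per key every five minutes.
    private final LoadingCache<String, Customer> cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(Duration.ofMinutes(5))
            .build(this::loadFromDatabase);

    public Customer byId(String id) {
        return cache.get(id); // loads on miss, serves from memory on hit
    }

    private Customer loadFromDatabase(String id) {
        // placeholder: the real implementation would hit the DB here
        return new Customer(id);
    }
}
```

The same shape applies to the other two absorbers: a queue decouples write bursts from the database, and partitioning spreads the key space so no single node becomes the constraint.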

Always understand the top constraint and the problem you are solving for. You should always prioritize solving the top constraint, and the best way to determine it is to take a holistic, system-level view: draw out a system map and look at all the queues and waits in the system.

Visualize the system as a bucket of wait queues, something like the picture below. An Event Storming exercise can help suss out this system map and visualize the API/data stream.

Here are the top five performance mitigation suggestions, based on past experience:
  1. Classpath bloat: check to see if the classpath is bloated. Spring Boot autoconfigures things you sometimes don't need, and a bloated classpath leads to misconfiguration of thread pools and libraries. As a remedy, put the dependencies (pom.xml or build.gradle) through a grinder; we have a 25-step checklist if you want details (see the autoconfiguration-exclusion sketch after this list). This will also reduce the overall memory footprint due to library misconfiguration. Developers load up a lot of dependencies and libraries in the app, and the full implications of memory bloat and runtime misconfiguration are not understood until later.
  2. Startup time: if you want your app to start up quickly, see this list: https://cloud.rohitkelapure.com/2020/04/start-your-spring-apps-in-milliseconds.html
  3. Memory: if the app is memory constrained, ensure that GC is tuned correctly and employ verbose GC tracing. See https://blog.heaphero.io/2019/11/18/memory-wasted-by-spring-boot-application/#4A7023 and https://heaphero.io/
  4. External integration tuning, including outbound DB, HTTP calls and messaging; connection pool tuning. See https://github.com/pbelathur/spring-boot-performance-analysis. In general, examine any outbound connection/integration to any database or messaging queue. Put circuit breakers and metrics on every outbound call to see the health of the outbound connection.
  5. Latency/response time analysis to see where and whether time is spent on the platform/app/network/disk. Use the latency-troubleshooter app to hunt down latency hogs. See https://docs.cloudfoundry.org/adminguide/troubleshooting_slow_requests.html and https://community.pivotal.io/s/article/How-to-troubleshoot-app-access-issues
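For item 1, one concrete lever is explicitly excluding autoconfigurations that a bloated classpath drags in. A minimal sketch, assuming the app happens to have a JMS client on the classpath that it never uses:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jms.JmsAutoConfiguration;

// Excluding unneeded autoconfiguration keeps Spring Boot from creating
// thread pools, connection factories and beans that nobody uses but
// everybody pays for in memory footprint and startup time.
@SpringBootApplication(exclude = { JmsAutoConfiguration.class })
public class LeanApplication {
    public static void main(String[] args) {
        SpringApplication.run(LeanApplication.class, args);
    }
}
```

Running the app with --debug prints the condition evaluation report, which shows exactly which autoconfigurations fired and why - a quick way to find exclusion candidates.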
For a detailed checklist see
HAPPY Performance Hunting!