The end of (Stormy) 2022 is coming

The beginning of Q4 is usually crunch time for most companies. I took my crash course on the subject during my tenure at IBM. There is no better school for “end of year” strain than a publicly traded company, and if it’s an international corporation traded on NASDAQ, even better. There is no tomorrow after December 31st.

All the milestones, releases, demos, and POCs that you committed to delivering throughout the year, and kept postponing to the next month, must now be delivered. All 2022-related investments, bonuses, and terminations are knocking on your door and, as always, time and resource availability are not your allies.

The end of 2022 is not business as usual. As Dorothy says to her dog in the 1939 film “The Wizard of Oz”: “Toto, I’ve got a feeling we’re not in Kansas anymore.” Recent management research and publications address the fact that traditional, textbook best practices of business strategy (and tactics) become futile against the fast pace of change and the continuous uncertainty that we’re facing.

October 2022 is bringing with it an enhanced degree of difficulty and uncertainty. The financial climate in general, and the high-tech industry in particular, is turbulent and unclear. This environment adds two crucial parameters to the equation. The first is an imminent demand to economize, which is corporate slang for cutting costs dramatically. This demand to stretch the “runway” is amplified by the second variable: the lack of visibility into 2023 budgets, the information that would enable us to take actions and make investments that will not become obsolete on January 1st. The new (business) world order requires us to be more agile, resilient, and decisive.

Paving a course of action starts with mapping out our situation: we take our 2022 “commitments” and available resources and transform them into two or three alternative high-level release plans. We should incorporate issues like the order of importance and levels of risk. Once the alternatives are articulated, we perform a crucial step: foretelling the worst-case scenario for each one. This step counters the planning fallacy, a common bias which implies that however long you think you need for a task, you will actually need longer, regardless of how many times you have done the task before or how deep your expert knowledge is. In regular times this phenomenon is potentially damaging. During Q4 it becomes critical, because there will be no overtime after December 31st. The process concludes with selecting the optimal release plan and constructing a detailed (sprint-level) version of it.

At this point, missions are prioritized and the list of required resources is established. The first choice is usually to divert resources from less crucial activities; if that is applicable, great. Nonetheless, we must double-check that we’re not sacrificing the “less short-term” (Q1 release dates are right around the corner) for the short-term demand. If this solution isn’t feasible, the next step is to check our recruitment pipeline for candidates in advanced stages who can be signed in the next week or so. The “usual” challenge with this path is that even if there are potential recruits, their notice period plus onboarding time (30–45 days) will preclude any real Q4 contribution. This year, “unusually”, there may be no open positions at all, so the chances are slimmer than usual.

Once internal alternatives are exhausted, we should consider an external team enhancement. To those of you who are puzzled by the inconsistency between “external” and “reduced cost,” the answer is: consider a team that is onboarded ASAP and dismissed at the end of the year, with no residual effect on other R&D commitments and no recruitment or notice costs. If you want to go the extra mile, you may even consider going offshore. The issue with offshore development at the end of the year is that offshore teams are usually one week short at the end of December, one very crucial week. On the other hand, there are hybrid Israeli-offshore solutions that enable us to attain both reduced costs and the additional working week of the 25th.

The team is established, the backlog is set, and the first sprint has commenced. There is an additional factor that can increase the probability of success: we should establish process transparency with our team. Maximizing their knowledge of the situation and its context will enable the team members to both adapt and step up to the occasion.

At times like these, I recommend embracing the words of wisdom of the legendary basketball coach and leadership guru John Wooden: “Do not let what you cannot do interfere with what you can do.”

Is it time for operational efficiency? Click for more information on how CodeValue can assist you with operational efficiency.

Published by Hanan Zakai

VP Customers & Division Manager @CodeValue

Architecture Next 2022

It has been 2 years since we’ve had the pleasure of meeting and greeting in person, and the long wait was worth it! Over 420 people gathered for a full day of invigorating talks, mingling, and some good food.

What was it all about?

Time brings change, and COVID has played its role as a catalyst. Its recent effects have led to an unprecedented shift to a remote-first work culture and to accelerated multi-cloud adoption. This applies not only to the tools and platforms a company harnesses to serve its business but also to the products and systems it builds, which face elevated requirements for hybrid and cross-platform scenarios.

In the software industry, we are expected to be agile and adapt to changes quickly and effectively. This isn’t related merely to development but spans the entire business, product, R&D, UX/UI, architecture, technology, development, and DevOps. 

The need for building software that can operate on and/or integrate with different platforms is rising steadily. New levels of productization, process, and automation are required to meet the new challenges at a high pace, as well as find effective ways to scale MVP products and R&D. 

Therefore, our job as leaders, architects, and engineers has become more and more challenging: we must figure out how to build and best use technologies, tools, and platforms to meet such diverse needs and scenarios.

This was the fifth consecutive year of CodeValue’s Architecture Next conference. Throughout the day we discussed revolutionary and innovative concepts, technologies, and tools, while showing how these can be leveraged and applied to make business, processes, and systems better.

General assembly

The conference day was launched with an introduction by SpeedValue’s & CodeValue’s CEO Tali Shem Tov and Chairman Ayal Zylberman. Tali & Ayal delivered a brief talk about the changes we have all experienced during the last 2 years and what those changes entailed for our industry.

Keynote: Architecture Stories from the Trenches – Alon Fliess

The keynote session was delivered by Alon Fliess, CodeValue’s Chief Architect, who shared with the audience his experiences from over 30 years of significant software development, design, and architecture projects for leading global and cutting-edge companies.

Software Architecture in the Multi-Cloud Era – Amir Zuker, Rotem Barda (Vayyar), Barak Mor

This talk was about building systems with multi-cloud in mind, focusing on architectural and technological concepts in addition to business aspects. One of the key principles for achieving this is to build the right abstractions.

How do you do that, though? There are so many options to choose from. What are they, and how do you implement them? In this session, we tackled these subjects head-on while sharing real, interesting demos in the process. Furthermore, we presented a real-world case study of one of our existing customers, Vayyar, and discussed our journey together in transforming their business, product, and technology towards multi-cloud.

Experts Panel

Alon Fliess, Amir Zuker, Hanan Zakai, Amit Kinor, Tomer Karasik, Nir Dobovizki, Eran Barghil.


Executive Track

The Perfect Host – An Ongoing Story

Alon Fliess, Chief Architect at CodeValue (MVP & Microsoft Regional Director).

Web3: From A to S (Security)

Tal Be’ery, CTO Zengo

From Offshore to Global Delivery

Tali Shem Tov, CEO & Co-owner, CodeValue & Esti Felba Hermesh, Director of Global Delivery, CodeValue.

Artificial Intelligence, Machine Learning (and when not to use them)

Nir Dobovizki, Senior Consultant and Software Architect & Backend Practice Lead @CodeValue


Technologies Track

Composable Components – Play Application Lego

Tomer Karasik, Technical Lead and Software Architect at CodeValue & Ilya Holtz, Senior Full Stack Developer at CodeValue

Building Modern IoT Data Pipelines

Alon Amsalem, Software Architect @CodeValue

Micro Front End – Web-Components in Practice

Yehuda Buzaglo, Senior frontend @CodeValue

Architecture Next 2021

Digital Transformation is one of the most profound changes happening in the technological world around us. More businesses understand that they must level up their tech strategy or be left behind. With a massive amount of cloud, AI/ML, and other emerging technologies, software professionals and decision-makers have difficulty keeping up to date.

How can we achieve Digital Transformation? How can we translate those high-level principles and fancy words to ideas and plans to implement in our software? This is what this year’s Architecture Next was for.

At Architecture Next 2021, we discussed revolutionary concepts and tools for the fourth consecutive year and showed you how they can be applied towards making your next software system a better one. We saw how you could implement Digital Transformation in your software systems and how you could utilize your software architecture to accomplish more.

General assembly

The conference day was launched with an introduction by CodeValue’s CEO, Tali Shem Tov. Tali delivered a brief talk about what Digital Transformation is and where it meets CodeValue’s offering.

Keynote: The IDF’s Journey to the Cloud

The keynote session was delivered by guest speaker “Merav”, an officer in the IDF’s Digital Transformation Directorate, who took us along the IDF’s journey to the cloud.


Executive Track

Digital Transformation – Buzzword or Reality

The first session on this track was given by Alon Fliess, Chief Architect at CodeValue (MVP & Microsoft Regional Director). In his session, Alon stated that there are only two types of organizations: those that have already realized that they are software shops and those that haven’t. This introductory session discussed the digital transformation revolution: what it is and what any organization should do about it. Alon covered the analysis process, the effect on products and services, and the human resource and technology perspectives.


Designing Products in the Digital Transformation Era

The second session was given by Eyal Livne, Senior User Experience Architect at CodeValue. In his talk, Eyal introduced the CodeValue workshop as the flagship ‘getting started’ method for initiating a successful digital transformation.


Application Evolution Strategy

Eran Stiller, CodeValue’s CTO, gave the third session on this track, in which he reviewed the technical methods we have for modernizing our software systems. He reviewed the questions that we should ask ourselves and the strategies that we can employ. Starting from lift & shift, through containerization, to cloud-native apps, he took us on a journey that’s relevant to any modern software stakeholder.


The IoT Transformation and What it Means to You

In the fourth session on this track, we had the pleasure of hearing Nir Dobovizki, a Senior Consultant and Software Architect at CodeValue. In his talk, Nir covered why IoT is as important as the hype says and what it means for your business.


What Can You Do When Your Release Plan is Being Concluded at the HR Office?

To conclude the Executive track, we heard Hanan Zakai, CodeValue’s Technology Division Manager, shed light on lessons learned from Andy Grove (Intel’s legendary former CEO), the competition between Netflix and Blockbuster, and the Challenger disaster. He used them to articulate the real recruitment challenge and its magnitude, and to establish the means to face it and even create new opportunities.

Modern Technologies For Digital Transformation Track

State in Stateless Serverless Functions

To kick off the “hands-on” track, Alex, a Software Architect at CodeValue, talked about how we can manage state in a stateless, serverless environment on Azure by utilizing Azure Durable Functions, and how we can use the ecosystem to build entire systems that are completely serverless.


How I Built a ML-human Hybrid Workflow Using Computer Vision

The second session in this track was given by Amir Shitrit, a Software Architect at CodeValue. In this talk, Amir demonstrated how he built business workflows that combine the efforts of humans and software to automate boring tasks, while compensating for the inaccuracy of ML with human intervention.


We Come in Peace: Hybrid Development with WebAssembly

Following Amir, Maayan Hanin, a prominent Software Architect at CodeValue, examined the relationship between WebAssembly, JavaScript, TypeScript, the browser, and other hosting environments.


Will the Real Public API Please Stand Up?

Amir Zuker is a Senior Software Architect and our Web & Mobile Division Leader. Amir concluded this track with a discussion about authoring public APIs between systems, be they different parts of the same distributed system, a full-blown real-world public API, or anything in between.


Panel – Architecture for Digital Transformation

The topping on the ice cream was the Digital Transformation experts panel, hosting our own experts: Alon, Eran, Amir, Maayan, Nir, Eyal & Hanan. In the panel, the experts talked about all things Digital Transformation and answered questions.

We are here for you

Need consulting or development services? Contact us via the form below. In the meantime, thank you and see you next year!

Surviving the talents’ recruitment challenge

Rapyd’s recruitment billboards, which popped up recently in prime locations, are another remarkable reminder of the mind-blowing resources invested in recruiting top development talent.

Are HR, talent acquisition, recruiters, and headhunters “doomed” to this Sisyphean mission? Probably yes, at least for the coming year or so. However, is there any way to ease the burden of the rock? Decrease the slope? Survive this never-ending “battle”? Maybe there is.

We should start with challenging some of the paradigms regarding talent recruitment.

We have yet to see this level of challenge. Not so sure? I recall my first year in the high-tech industry, back in 2000, before the first bubble burst. In the central parking area on Maskit St. in Herzliya Pituach, the cars’ windshields were covered with paper advertisements: “If you’re a software developer with three years of experience, come work with us for 30,000 ILS a month.” You can draw a straight line from the windshields to the billboards; the challenge of getting top talent has been here for decades and counting.

We will be able to recruit and retain our professionals for many years to come. Apparently not. The paradigm that a developer will stay for 5-6 years doesn’t hold up. Endless opportunities and generational shifts (Millennials, Generation Z…) have narrowed the average tenure in a developer’s position to somewhere between 2 and 3 years, and in certain lines of expertise, like DevOps, even less. Recent surveys showed that the share of developers who left their jobs doubled from 2008 to reach 15% in 2017, and it was estimated to exceed 20% at the start of 2021.

However, there are new factors. The boost Covid-19 gave to Digital Transformation propelled the already “smoking” Israeli high-tech ecosystem, as reflected in an outstanding number of around 50 Unicorns, a 400% increase from 2019. More jobs, more money, many more deadlines. On the other hand, remote and hybrid working models have enlarged developers’ potential working radius and enabled companies in central Tel Aviv and its surroundings to recruit residents of the periphery.

Bottom line, I don’t envy my talent acquisition friends.

In his book “Upstream: The Quest to Solve Problems Before They Happen”, Dan Heath addresses solutions for this type of challenge: one should exercise an “upstream mindset” that enables one to proactively diminish a problem rather than continuously react to its outcomes.

It’s a downstream state of mind to hectically source potential employees, offer them the moon and the stars, and then start looking for their replacements one or two years later. It’s an upstream activity to shift our resources to the roots of the situation. To analyze it, we should focus on one of the most crucial factors, if not the most crucial: no company wants to “grow” developers; everybody wants an experienced developer who will have an immediate impact on their source code. After all, a major investment usually brings grave stakeholder pressure and tight release plans. Upstream thinking points us to bridging the first and biggest gap: turning an entry-level developer into an efficient, productive, somewhat experienced contributor. With the right planning and resources, we can achieve a quicker “Time to Productivity”.

We exercised this way of thinking in CodeValue’s bootcamps: 10-12 entry-level developers, fresh academic graduates, whom we turned into smooth “coding machines”. The process starts with the Bootcamp’s screening project, which differs from the regular ones since we’re not looking for efficient, clean code but for top university alumni with sharp minds, a positive attitude, and extreme self-learning skills. To the few who passed the process we provided targeted, exercise-heavy training by our elite architects. After this phase concluded, we allocated them to projects in which a senior CodeValue developer led and mentored them. These bootcamps enable us to cut “time to productivity” and provide quality code contributors to our clients.

Throughout this time, we continued with most of our regular recruiting efforts, since the other part of the equation is keeping the fragile equilibrium between Bootcamp graduates and senior developers.

The cynic will ask: just four paragraphs above, you wrote that other companies will target these developers. That’s correct. However, lower starting salaries and a quicker time to productivity eventually improve the ratio of cost to contribution for each developer, and with the right contracting, engagement, and pinpointed employee retention we can pick our “fights” for the developers who best fit our DNA and standards.

And from a wider high-tech ecosystem perspective, the addition of dozens or hundreds of capable developers may somewhat rebalance the supply and demand curves, and hopefully return some sanity to the Israeli high-tech scene and give a few hours of relief to recruiting personnel.

Do you want to scale up your development team? Click for more information on CodeValue’s Dedicated Bootcamps.

Published by Hanan Zakai

Technology Division Manager @CodeValue

.NET Conf Israel 2020

Thank you for joining us at the CodeValue-sponsored local Israeli event following the global .NET Conf 2020.

.NET 5!

You heard it right. Released on November 10, 2020, .NET 5 is the next version of .NET. As the successor to .NET Core 3.1, this milestone release marks a significant step in the journey toward unifying the .NET platform across .NET Core, Xamarin, and Mono. Along with the exciting features coming in C# 9, these are thrilling times in the .NET space.

.NET Conf is an annual online event showcasing many of these advancements and capabilities. Following the global event, in December 2020, CodeValue hosted the local Israeli event, in Hebrew, where attendees were able to ask questions and get them answered. CodeValue experts highlighted the critical news and exciting stuff that .NET has to offer this year. See all 5 sessions from the event below and learn about the new release!


What’s New in C# 9

Moaid Hathot, Senior Architect and Consultant @ CodeValue, Azure MVP


Porting Projects to .NET 5

Nir Dobovizki – Senior Architect and Consultant @ CodeValue


C# Source Generators

Alon Fliess – Chief Architect @CodeValue, Azure MVP, Microsoft Regional Director


Blazor in .NET 5

Alex Pshul – Software Architect and Consultant @ CodeValue


Developing and Deploying Microservices with “Tye”

Eran Stiller – Chief Technology Officer @CodeValue, Azure MVP, Microsoft Regional Director


Panel – Q&A

Alon Fliess, Eran Stiller, Moaid Hathot, Alex Pshul, Nir Dobovizki

Want to stay up to date? Follow us on Social Media

Planning for Microservices

Recently, we hosted a half-day online event where our experts shared their understanding of what Microservices are all about.

Sometimes it feels like everybody is creating Microservices Architectures. Everyone’s building a new system with Microservices, decomposing old monoliths, and generally giving us the feeling that Microservices is the only way to go. But is it the only option? What should we consider when approaching Microservices? When should or shouldn’t we use Microservices? And if we do decide to take the approach, how should we handle Microservices?

In this half-day online event Alon, Eran & Tomer shared their understanding of what Microservices are all about, when we should use them, what we should avoid, and how to implement them correctly. If you’re a novice to Microservices, or even if you’ve already heard quite a bit about them, you’ll find these talks beneficial. This workshop was intended for decision-makers, software architects, DevOps architects, senior developers, and senior DevOps engineers.


To Microservice or Not to Microservice? How?

Alon Fliess, Chief Architect @CodeValue

Do more with less: the pain of the modern architect. High cohesion & low coupling, high availability & scale, ease of DevOps. Our systems need to support all these quality attributes while providing more functionality with fewer resources. We need to be agile, we need to embrace change, we need to have a better way! Micro-Service-Architecture (MSA) promises to cure the architect’s pains, but does it really deliver?

This lecture presents the essence of MSA: how it answers the main concerns of modern distributed systems, how to get started, and how to migrate current solutions to MSA by adopting an evolutionary migration path. It also covers what to be careful about and the signs that we are on the right track. We will talk about SA evolution, the CAP theorem and eventual consistency, MSA principles, hosting, containers, versioning, orchestrators, and decoupling business processes. By the end of this lecture, the participant will have a better understanding of why, when, and how to embrace MSA.


6 Lessons I Learned on My Journey from Monolith to Microservices

Eran Stiller, CTO @CodeValue

For the past couple of years, Microservices have been all the rage. We want to use Microservices, we want to decompose into Microservices, and we want Microservices to be a part of our world. While modern tools and platforms such as Docker, Kubernetes, Service Mesh, and the public cloud help in implementing and maintaining such systems, the reality is that many fail even before the first line of code is written.

This can happen for many reasons. Perhaps you chose a Microservices architecture for the wrong reasons? Maybe the organization wasn’t ready for it? Or, just possibly, the proposed architecture didn’t embrace the true meaning of Microservices?

As the CTO of CodeValue, I get to tackle these questions a lot. Join me in this session as I provide my perspective on transitioning from Monolith to Microservices through lessons learned in the real world while architecting and implementing multiple Microservices-based software systems at various customers.


A Recipe for Pickled Microservices

Tomer Shamam, Senior Software Architect @CodeValue

Microservices are essentially small, self-deployed apps that can be distributed and scaled. The best recipe to “pickle” microservices and harness their true power is to isolate them from one another by putting each inside a container. A container is a standard unit of deployment that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.

In this session, we will discuss app containers in general and learn how easy and beneficial it is to containerize microservices using Docker, the leading app container solution on the market.


Panel – Microservices Open Q&A

Alon Fliess, Eran Stiller & Tomer Shamam discuss and answer Microservices-related questions raised by the audience.


Architecture Next 2020

One full day, over 500 participants, 3 tracks, 13 lectures, and one cloud experts panel!
Wow, we had a blast.

The software development world is developing at a tremendous pace. New technologies and platforms are abundant, and things that were brand new a year ago sometimes suddenly seem like ancient history. The software architect’s job is to figure out how to best use these technologies and platforms to his/her advantage, and that job is getting harder and harder. At Architecture Next 2020 we discussed revolutionary concepts and tools and demonstrated how they can be applied towards making your next software system a better one.

Due to Coronavirus-related restrictions, this year’s conference, held for the third consecutive year, was all virtual. But, as usual, it was packed with great content and insightful speakers.

General assembly

The conference day was launched with an introduction to CodeValue’s new CEO, Tali Shem Tov. Tali delivered a brief talk about models & technology of the new era.

The keynote session was delivered by guest speaker Magnus Mårtensson, the Founder & CEO of Loftysoft, a Microsoft Azure Most Valuable Professional, and a Microsoft Regional Director. Based on his extensive experience of helping numerous customers (from small businesses to enterprises), Magnus highlighted some areas with important lessons and common challenges, to help target early optimization paths on the way to the cloud.

Keynote: The Cloud challenge is more than just technical – people are involved

Upon completion of the opening lecture, the day was divided into 3 different tracks:


Executive Track

The first session on this track was given by Alon Fliess, Chief Architect at CodeValue (MVP & Microsoft Regional Director). In his session, Alon elaborated on the essence of APM systems: the good, the bad, and a vision of their future.

APM – What Is It, and Why Do I Need It?


The second session was given by Erez Pedro, co-founder and head of product & UI/UX at CodeValue. In his talk, Erez demonstrated how, together, we evolve a system from a technical device into a full product, in a process that includes analysis and design with rapid prototyping.

Product Thinking 101


Nir Dobovizki, a Software Architect and Consultant at CodeValue, gave the third session on this track, in which he told us the tragic story of the microservices-based, modular, fully automatic, next-generation, totally buzzword-compliant, multi-satellite ground station that wasn’t.

In Space, No One Can Hear Microservices Scream – a Microservices Failure Case Study


To conclude the Executive track, we had the pleasure of hearing Alex Pshul, a software architect, consultant, speaker, and tech freak. Alex shared with us what can be learned from testing the execution of 300K messages per minute in a totally serverless system.

What We Learned by Testing Execution of 300K Messages/Min in a Serverless IoT System


Cloud & Back-End Track

To kick off the Cloud & Back-End track, Michael Donkhin, a Software Architect at CodeValue, talked about all things Java. He started with a retrospective of the Java platform’s history. Next came a review of some of the most popular frameworks around Java. Finally, Michael concluded with a review of ongoing efforts to improve the platform further and extend its reach, like Project Valhalla and GraalVM.

Java Turns 25 – How Is It Faring and What Is Yet to Come


The second session in this track was given by CodeValue’s Co-Founder & CTO, Eran Stiller. Eran has been recognized as a Microsoft Most Valuable Professional (MVP) on Microsoft Azure since 2016 and as a Microsoft Regional Director (MRD) since 2018. In his talk, Eran reviewed today’s most popular API formats and their relative strengths and weaknesses, from REST, through OpenAPI and gRPC, to the rising star of AsyncAPI.

API Design in the Modern Era


Following Eran came Moaid Hathot, a prominent Software Consultant at CodeValue. Moaid introduced Dapr and demonstrated how we can use it to build a distributed, cloud-native microservices application, using various programming languages and frameworks, that can run virtually anywhere.

Dapr: The Glue To Your Microservices


Ronen Levinson is a DevOps Engineer and Consultant at CodeValue. Ronen concluded the Cloud & Back-End track with a discussion of what OPA is. He explored OPA’s integrations with all levels of the cloud-native stack, along with on-stage demos.

Centralized Policy Governance With OPA


Front-End Track

Amir Zuker, a Co-Founder of CodeValue and its Web and Mobile division leader, is a senior software architect specializing in .NET and Web-related toolchain and technology stack. Amir opened the track with a session covering the emergence of WebAssembly into the app world while using Blazor and C#.

Building Web Apps With WebAssembly and Blazor


The second session in this track by Vitali Zaidman, a Web Architect and Blogger from Welldone Software, demystified the different approaches and discussed the trade-offs while exploring real-world examples.

Do You Need Server Side Rendering? What Are The Alternatives?


Eyal Ellenbogen was our third speaker on that track. Eyal is a Web Developer and Architect at CodeValue. In his session, he explored the process and the decisions involved in building a UI component toolkit and how to get it right the first time.

Building a UI Foundation for Scalability


Ending this track was Vered Flis, a Senior Software Engineer at CodeValue. In her session, she tackled the big questions head-on and unraveled different approaches and practices that will assist you in writing the highly performant web apps that are expected today.

Because Performance Matters!


Panel – Public Cloud, Hybrid Cloud, Israeli Cloud, Microservices, PaaS, SaaS, and Everything in Between

The topping on the ice cream was the Cloud experts panel, where our own cloud experts, Alon, Eran, Amir & Hanan, hosted Tomer Simon (Ph.D.), the National Technology Officer at Microsoft Israel. In the panel, the experts talked about all things Cloud and answered questions such as: How should you approach the move to the cloud? What are the risks of an on-prem requirement? Should you use PaaS & SaaS, or is IaaS king? Which cloud vendor should you use? And many more pressing issues.

We are here for you

Need consulting or development services? We’re here for you.

Micro Frontends Patterns

Thank you all who attended our webinar, delivered by Amir Zuker on Micro Frontends – “extending the microservice idea to frontend development”.

So, what does it really mean? Is it just another hype? should you consider it? How should one approach it?

These are just some of the questions one might ask when presented with this notion. Long story short: it’s possible! However, it is not for everyone, and especially not to the full degree.

View this session where Amir demystifies the concept of micro frontends and tackles the subject head-on.

Micro Frontends Pattern – Replay

gRPC Health Checks with .NET Core & Kubernetes

Where were we?

In a previous post, we saw what health checks are, why they are so important, and how we can incorporate them into our ASP.NET Core Web API application.

In this post, we are going to see how to use the same health checks, but with gRPC services rather than Web API.

One health check model to rule them all?

OK, so adding health checks to a Web API project is very easy, but what about our gRPC services? We could use regular HTTP endpoints just as we would for normal HTTP APIs. After all, gRPC is based on HTTP/2. However, while this is a viable option, it is far from ideal, for two reasons. First, being able to reach a service over HTTP/2 is not a sufficient indication of health, because we also want to ensure that the service can actually handle gRPC calls. Second, gRPC has an official health check protocol, and we would be wise to use it.

The official gRPC health check protocol is a simple service with two RPC methods: Check, for polling for the service’s health status in a pull manner, and Watch, for receiving a stream of health status notifications in a push manner.

The .proto file for the health check service looks like this:

syntax = "proto3";

package grpc.health.v1;

message HealthCheckRequest {
  string service = 1;
}

message HealthCheckResponse {
  enum ServingStatus {
    UNKNOWN = 0;
    SERVING = 1;
    NOT_SERVING = 2;
  }
  ServingStatus status = 1;
}

service Health {
  rpc Check(HealthCheckRequest) returns (HealthCheckResponse);

  rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}

The important thing to notice here is the HealthCheckRequest message, which contains the optional service parameter. The service parameter is used to indicate the service for which we’d like to query the health status. If not specified, the system (or our microservice) as a whole is assumed.

Also notice that there is no mention of whether the check should be for liveness or readiness. That’s because gRPC’s health check protocol is generic and has no concept of liveness and readiness, which are purely Kubernetes concepts. To solve this, we could use the service parameter as a means to communicate the check type we’d like to query, as we’ll see soon.

This gap between gRPC’s health check protocol and the health check model employed by ASP.NET Core poses a problem for us. While we are not obligated to implement the official gRPC health check protocol, we would be wise to do so for two reasons. One is that we’d like to conform to the de facto standard for health checks in gRPC services. The other is that the gRPC ecosystem already offers tools that know how to query a service’s health status based on that protocol, as we’ll see soon.

OK, so we’ve decided to go along with the official gRPC health check protocol. What do we actually need to do? In .NET Core 3.1, gRPC has first-class support, and by using the existing .NET Core health check service, all we need to do is map the health check service’s endpoint, just as we would for any other gRPC service:

endpoints.MapGrpcService<HealthServiceImpl>();

And we’re good to go. Almost.

While HealthServiceImpl is simple to use, it has two issues. The first, through no fault of its own, is that it conforms to the official gRPC health check protocol. That’s a problem because it works by maintaining an internal dictionary of services and their respective health statuses, which we need to update manually and periodically, rather than having it query registered checks upon being invoked.

The second problem is that it is completely unaware of the health check registrations offered natively by ASP.NET Core, as we saw in the HTTP Web API example in the previous post.
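
To make the first issue concrete, here is a minimal sketch of what keeping HealthServiceImpl up to date by hand involves. It assumes the Grpc.HealthCheck package; the ManualHealthStatusSketch class and the "Readiness" service name are hypothetical and serve only as illustration.

```csharp
using Grpc.HealthCheck;
using static Grpc.Health.V1.HealthCheckResponse.Types;

// Hypothetical helper showing that every status must be pushed in explicitly.
public static class ManualHealthStatusSketch
{
    public static HealthServiceImpl CreateWithInitialStatus()
    {
        var healthService = new HealthServiceImpl();

        // The empty service name stands for the server as a whole.
        healthService.SetStatus("", ServingStatus.Serving);

        // Assumed service name; it stays NotServing until some other code updates it.
        healthService.SetStatus("Readiness", ServingStatus.NotServing);

        return healthService;
    }
}
```

Something, such as a timer or a background task, would then have to call SetStatus again whenever the real state changes, and the same instance would presumably need to be registered as a singleton so that the mapped endpoint reports those updates.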

To deal with these problems, we can implement our own health check service by inheriting from the HealthBase class, just as HealthServiceImpl does, and override the Check method to invoke the existing ASP.NET Core HealthCheckService, which does make use of the registered health checks. It would look like this:

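using System;
using System.Threading.Tasks;
using Grpc.Core;
using Grpc.Health.V1;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using static Grpc.Health.V1.HealthCheckResponse.Types;

public class GrpcHealthCheckService : Health.HealthBase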
{
    private readonly HealthCheckService _healthCheckService;

    public GrpcHealthCheckService(HealthCheckService healthCheckService)
    {
        _healthCheckService = healthCheckService;
    }

    public override async Task<HealthCheckResponse> Check(HealthCheckRequest request, ServerCallContext context)
    {
        Func<HealthCheckRegistration, bool> GetHealthCheckPredicate()
        {
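            // Protobuf strings default to "" (never null), so empty entries are dropped
            // to get zero tags when no service is specified.
            string[] tags = request.Service?.Split(';', StringSplitOptions.RemoveEmptyEntries) ?? Array.Empty<string>();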

            static bool PassAlways(HealthCheckRegistration _) => true;

            if (tags.Length == 0)
            {
                return PassAlways;
            }

            bool CheckContainsTags(HealthCheckRegistration healthCheck) =>
                healthCheck.Tags.IsSupersetOf(tags);

            return CheckContainsTags;
        }

        var predicate = GetHealthCheckPredicate();

        var result = await _healthCheckService.CheckHealthAsync(predicate, context.CancellationToken);

        var status = result.Status == HealthStatus.Healthy ? ServingStatus.Serving : ServingStatus.NotServing;

        return new HealthCheckResponse
        {
            Status = status
        };
    }
}

In this implementation, we use the service parameter of the request to indicate the type of probe (or tags, in general) we’d like to check: liveness, readiness or, if unspecified, both. Also, note that we obtain the HealthCheckService instance in the constructor and delegate the health check request to it when invoked. The GetHealthCheckPredicate local method parses the service parameter as semicolon-separated tags and decides whether to filter the registered checks by those tags or, if no tags are specified, to simply use all registered checks.
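
The tag handling can also be isolated into a small, framework-free sketch, which makes it easy to try out on its own. The ServiceTagFilter class and its method names are hypothetical and not part of ASP.NET Core or gRPC. One detail worth handling explicitly: an unspecified service arrives as an empty string, and splitting an empty string without removing empty entries yields one empty tag rather than none.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that mirrors the semicolon-separated tag convention.
public static class ServiceTagFilter
{
    // Splits the gRPC "service" field into tags; null or "" yields no tags.
    public static string[] ParseTags(string service) =>
        (service ?? string.Empty).Split(';', StringSplitOptions.RemoveEmptyEntries);

    // A check is selected when no tags were requested,
    // or when it carries every requested tag.
    public static bool IsSelected(ISet<string> checkTags, string[] requestedTags) =>
        requestedTags.Length == 0 || checkTags.IsSupersetOf(requestedTags);
}
```

With a check tagged both "Liveness" and "Readiness" and another tagged only "Readiness", a request for "Liveness" selects only the first, while an empty service selects both.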

Now, all we need to do is register that service, just as we would for any other gRPC service:

app.UseEndpoints(endpoints =>
{
    endpoints.MapGrpcService<GrpcHealthCheckService>();
    endpoints.MapGrpcService<GreeterService>();
});

What’s missing?

At this point, we’ve managed to expose a health check gRPC endpoint and now we’re ready to wire it up to Kubernetes, but there’s one problem: we can’t. As of the time of writing this post, Kubernetes doesn’t support gRPC for health probe endpoint definition. Only command, HTTP, and TCP probes are supported.

To account for this limitation, we can use a well-known tool named grpc_health_probe, a simple command-line tool that can be used as a command probe. When run, it invokes the health check gRPC endpoint of our service, as specified in the probe definition, which looks like this:

livenessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:5000"]
  initialDelaySeconds: 10

Basically, what this means is that the grpc_health_probe command will be used as the liveness probe and when invoked, it will try to invoke the Check RPC method on a local service on port 5000. It’s also possible to pass a non-default value for the service argument if we wish to distinguish between liveness and readiness probes using the method described above.

The figure below demonstrates how this works.

One important detail to notice here is that in this example, the grpc_health_probe is being deployed in the same container as our service. This is very convenient, but not fully indicative of our service’s availability because it says nothing about our service being reachable from outside our container, where various challenges, like timeouts and blocked ports, might manifest.
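
To distinguish between liveness and readiness with this tool, the tag name can be passed through its -service flag. The fragment below is a sketch rather than a tested manifest: the "Liveness" and "Readiness" values assume that the registered health checks carry tags with those names, and the delays are arbitrary.

```yaml
# Sketch: one exec probe per check type, both on the assumed local port 5000.
livenessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:5000", "-service=Liveness"]
  initialDelaySeconds: 10
readinessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:5000", "-service=Readiness"]
  initialDelaySeconds: 5
```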

Closing thoughts

ASP.NET Core greatly simplifies the effort of defining and using health checks in HTTP Web APIs, but doesn’t yet offer the same convenience to gRPC services, which have only recently gained built-in support in .NET Core. To make things worse, Kubernetes itself doesn’t support gRPC endpoints as probes.

With relatively little effort, we were able to bridge the gap between the existing gRPC health service implementation for .NET Core and the health check model of ASP.NET Core and expose a health check endpoint that uses proven and battle-tested health check libraries.

We also saw how the grpc_health_probe tool can compensate for Kubernetes’s lack of support for gRPC health probes. With that, we have a working health check endpoint.

All of the sample code in this post can be found in this repository.

I hope you benefit from the ideas in this post and health to us all!

Implementing Health Checks in ASP.NET Core 3 & Kubernetes

What are health checks and why do we need them?

Glad you asked. When developing distributed applications, there is a multitude of reasons for your services to become unavailable. These reasons include, but are not limited to:

  • Various communication problems, such as connection or request timeouts, blocked ports, protocol version mismatches, and other network appliance failures.
  • Resource saturation problems, such as overloaded CPU, insufficient memory or disk space, and an overloaded network interface.
  • Cascading failures due to unavailable dependencies, such as a database or a message queue.

Each of these phenomena, separately or combined, might lead to your service becoming unavailable to process up-stream requests.

While some services are of lower priority, others might be mission-critical. If those become unavailable, something needs to be done. As we all know, the first step to fixing a problem is to become aware that the problem exists in the first place, and this is where health checks come into play.

A health check, as its name implies, is a check issued by a stakeholder against your service in order to determine whether your service is healthy (or available) or not. The way this is usually done is by having your service expose an end-point (e.g. TCP, HTTP or gRPC) and the stakeholder sending a request to that end-point. If a healthy and timely response was received, your service will be considered healthy. If, on the other hand, a response hasn’t been received, or if the response reported an unhealthy status, your service will be considered unhealthy.

What’s a stakeholder? By stakeholder, I mean another service or tool that needs to know your service’s health. In modern distributed systems, such stakeholders typically include APM/monitoring tools (e.g. New Relic, AppDynamics and Prometheus), which need to show your service’s status and alert if something is wrong; load balancers, which need to know whether or not to direct traffic to your service; and, last but not least, orchestration tools such as Kubernetes, as explained below. In this post, I’m going to focus on ASP.NET Core and on Kubernetes, since it’s the most popular container orchestrator out there.

Kubernetes health probes

When defining a pod in Kubernetes, it is possible to also specify three probes (the Kubernetes term for health checks) for your service: a liveness probe, a readiness probe, and a startup probe, which I’ll ignore for now.

A liveness probe is a check that Kubernetes uses in order to determine whether a pod is alive/available. If it’s not, depending on the pod’s restart policy, Kubernetes may decide to restart it.

A readiness probe, on the other hand, is a check that Kubernetes uses in order to determine whether your service is ready to accept traffic.

Both liveness and readiness probes can be specified in different ways, of which HTTP calls and command-line tools are the most common.

Here’s an example of a liveness probe implemented as an HTTP request:

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3
      timeoutSeconds: 2
      failureThreshold: 5

In this example, Kubernetes checks the liveness of the pod by issuing an HTTP GET request to the /healthz path on port 8080 with a custom header, waiting at most 2 seconds before declaring the check failed. After 5 consecutive failures, the pod will be considered unhealthy. Kubernetes performs this check every 3 seconds, after an initial delay of 3 seconds to account for cold startups, although startup probes can also be used for that purpose.

HTTP API health checks in ASP.NET Core

In ASP.NET Core HTTP APIs, we would use the built-in support for defining and exposing health check endpoints as can be seen in the following example:

using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

namespace HttpApiWithHealthChecks
{
    public class Startup
    {
        private const string Liveness = "Liveness";
        private const string Readiness = "Readiness";

        public Startup(IConfiguration configuration)
        {
            Configuration = configuration;
        }

        public IConfiguration Configuration { get; }

        public void ConfigureServices(IServiceCollection services)
        {
            services.AddControllers();

            string dbConnectionString = Configuration.GetConnectionString("OperationalDB");
            string redisConnectionString = Configuration.GetConnectionString("Cache");

            services.AddHealthChecks()
                .AddSqlServer(dbConnectionString, tags: new[] { Liveness, Readiness })
                .AddRedis(redisConnectionString, tags: new[] { Readiness });
        }


        public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
        {
            app.UseRouting();

            app.UseEndpoints(endpoints =>
            {
                endpoints.MapHealthChecks("/liveness", new HealthCheckOptions
                {
                    Predicate = check => check.Tags.Contains(Liveness)
                });

                endpoints.MapHealthChecks("/readiness", new HealthCheckOptions
                {
                    Predicate = check => check.Tags.Contains(Readiness)
                });

                endpoints.MapControllers();
            });
        }
    }
}

The important parts to note in this example are the definition of the health checks and the installation of the health checks within the request processing pipeline.

The health checks definition part looks like this:

services.AddHealthChecks()
    .AddSqlServer(dbConnectionString, tags: new[] { Liveness, Readiness })
    .AddRedis(redisConnectionString, tags: new[] { Readiness });

Here we have two check types: one for making sure our operational DB is reachable and a second one for ensuring our cache server is reachable. While the first check is relevant to both the readiness and liveness probes, the second one is relevant only to the readiness probe.
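
Checks do not have to come from a library. As a hedged sketch, a custom check can implement the IHealthCheck interface from Microsoft.Extensions.Diagnostics.HealthChecks. The DiskSpaceHealthCheck class, its threshold, and the injected free-space delegate below are hypothetical and not part of the sample above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical custom check: unhealthy when free disk space falls below a threshold.
public class DiskSpaceHealthCheck : IHealthCheck
{
    private readonly Func<long> _getFreeBytes;
    private readonly long _minimumFreeBytes;

    // The delegate keeps the check independent of how free space is measured.
    public DiskSpaceHealthCheck(Func<long> getFreeBytes, long minimumFreeBytes)
    {
        _getFreeBytes = getFreeBytes;
        _minimumFreeBytes = minimumFreeBytes;
    }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        long freeBytes = _getFreeBytes();

        HealthCheckResult result = freeBytes >= _minimumFreeBytes
            ? HealthCheckResult.Healthy($"{freeBytes} bytes free")
            : HealthCheckResult.Unhealthy($"Only {freeBytes} bytes free");

        return Task.FromResult(result);
    }
}
```

Such a check could then be registered next to the library checks with something like AddCheck("disk-space", new DiskSpaceHealthCheck(() => new DriveInfo("/").AvailableFreeSpace, 512L * 1024 * 1024), tags: new[] { Readiness }), where the name, path, and threshold are again assumptions.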

The health checks usage part looks like this:

endpoints.MapHealthChecks("/liveness", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains(Liveness)
});

endpoints.MapHealthChecks("/readiness", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains(Readiness)
});

Here we expose two HTTP health check endpoints: one for liveness, using all checks tagged with the “Liveness” tag, and one for readiness, using all checks tagged with the “Readiness” tag. These are the endpoints we would specify in our pod’s .yaml file, in the same way as the /healthz path in the example above.

Note how this service exposes the health check endpoints over the same port as the regular API. This is important because if the API port is blocked by a firewall, this will affect the health checks as well and that’s exactly what we want.
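
Wired into a pod definition, the two endpoints might look like the following fragment. This is a sketch under assumptions: port 8080 stands in for whatever port the API actually listens on, and the period is arbitrary.

```yaml
# Fragment of a container spec; the port and timing values are assumed.
livenessProbe:
  httpGet:
    path: /liveness
    port: 8080
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /readiness
    port: 8080
  periodSeconds: 10
```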

Summary

Health checks are an important pattern to employ when developing distributed applications. Among other systems, Kubernetes makes special use of them when starting containers and directing traffic to them.

ASP.NET Core offers a comprehensive model for defining and using health checks in HTTP Web APIs. Using existing NuGet libraries, such as those found in the AspNetCore.Diagnostics.HealthChecks repository, we can easily express our service’s health as the aggregated health of its various dependencies.

In a follow-up post, we’ll see how to incorporate health checks in gRPC services in ASP.NET Core. Stay tuned.