Microservices workflow orchestration

A recurring pattern in software architecture is the need to trigger a process or workflow that is implemented across multiple microservices, and then report the results to the user when the process completes.

In a previous project, I faced this issue when building a SaaS application in the Intelligent Document Processing (IDP) space. The application was supposed to take a collection of scanned pages, split it into documents, and perform several document understanding tasks on each document. The pipeline mixed per-page-bundle, per-page, and per-document processing steps.

I wanted to develop each step independently and to scale each step independently (e.g. page OCR consumes more resources than other tasks), so I designed a system around a message bus (RabbitMQ) and individual workers that pull requests from message queues.

Unfortunately, there aren’t a whole lot of easy-to-use solutions available for this type of design. When I google “rabbitmq workflow orchestration”, the most helpful link I get is an article that recommends using BPMN. That approach is rather centered on the Java ecosystem. For my use case I needed something that worked well in Python and was preferably language agnostic. I ended up building a custom solution for that company.

However, in design conversations about system architecture I often see scenarios where a similar tool would be desirable. That motivated me to start working on a new workflow orchestration project. It is still a work in progress, but it can already execute the kind of workflows that I’ve encountered in the past.

Workflows can be declared in simple YAML syntax as a set of steps with dependencies. Sub-tasks can be triggered (as in my original requirement of processing a collection of pages, with per-page and per-document steps), and a workflow can mix services written in different programming languages.
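To make the idea of "steps with dependencies" concrete, here is a minimal sketch of how a declared workflow could be turned into an execution order. It is not the project's actual schema or code: the step names and the dictionary layout are hypothetical, and the dictionary stands in for what a YAML file might look like once parsed. It uses the standard library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
# Step names are illustrative only, not the project's real configuration.
WORKFLOW = {
    "page_ocr": set(),
    "split_documents": {"page_ocr"},
    "classify_document": {"split_documents"},
    "extract_fields": {"split_documents"},
}


def execution_batches(workflow):
    """Group steps into batches; steps in the same batch can run in parallel."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(WORKFLOW)
for number, steps in enumerate(batches, start=1):
    print(number, steps)
```

In this sketch the two steps that only depend on `split_documents` end up in the same batch, which is the information an orchestrator needs to dispatch them concurrently.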

There are existing open source orchestration tools to manage batch workflows such as Apache Airflow and Argo Workflows. In these systems, for each batch job, a new process instance is created and passed command line arguments that specify the workflow parameters.

This project provides similar functionality for online microservices. For instance, in machine learning use cases it is common for loading the inference process to take on the order of 30 seconds to 1 minute, while processing a single user request takes on the order of 10-100 ms. A system that is designed to process tens or hundreds of requests per minute can’t afford to pay that startup cost on every request, as a batch approach would.
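A back-of-the-envelope calculation shows the size of the gap. The numbers below are assumptions picked from the ranges mentioned above (100 requests per minute, 30 s load time, 100 ms per request), and the model is deliberately crude: it only counts how many seconds of process time are consumed per minute of traffic.

```python
import math

REQUESTS_PER_MINUTE = 100  # assumed load, within "tens or hundreds"
LOAD_MS = 30_000           # assumed model load time (30 s)
PROCESS_MS = 100           # assumed per-request processing time (100 ms)


def busy_ms_per_minute(requests, startup_ms, process_ms):
    """Total process time consumed by one minute of traffic."""
    return requests * (startup_ms + process_ms)


# Batch style: every request pays the load time again.
batch_ms = busy_ms_per_minute(REQUESTS_PER_MINUTE, LOAD_MS, PROCESS_MS)
# Online style: the service is already loaded, only processing time counts.
online_ms = busy_ms_per_minute(REQUESTS_PER_MINUTE, 0, PROCESS_MS)

# Minimum number of processes that must be busy at once to keep up.
batch_workers = math.ceil(batch_ms / 60_000)
online_workers = math.ceil(online_ms / 60_000)

print(batch_ms, online_ms, batch_workers, online_workers)
```

Under these assumptions the batch style burns roughly 300 times more process time for the same traffic, almost all of it spent loading.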

Instead, all microservices are pre-loaded and managed as standard online services, except that instead of receiving REST requests from a load balancer, they receive requests from and post responses to an AMQP message queue. A central workflow-manager decides the next step in the workflow, so that logic does not have to be distributed across the individual services. Debugging is also simplified, because the workflow-manager tracks the state of each user request.
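The division of labour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: `WorkflowManager`, `run_worker`, and the step names are hypothetical, and in-process `queue.Queue` objects stand in for AMQP queues so the example runs without a broker.

```python
import queue

# Hypothetical workflow: step name -> steps it depends on.
STEPS = {"ocr": [], "classify": ["ocr"]}


class WorkflowManager:
    """Tracks the state of each request and decides which step runs next."""

    def __init__(self, steps, request_queues):
        self.steps = steps
        self.request_queues = request_queues  # one queue per step
        self.completed = {}   # request_id -> {step: result}
        self.dispatched = {}  # request_id -> steps already sent to a queue

    def start(self, request_id, payload):
        self.completed[request_id] = {}
        self.dispatched[request_id] = set()
        self._dispatch_ready(request_id, payload)

    def on_response(self, request_id, step, result):
        self.completed[request_id][step] = result
        self._dispatch_ready(request_id, result)

    def _dispatch_ready(self, request_id, payload):
        done = self.completed[request_id]
        sent = self.dispatched[request_id]
        for step, deps in self.steps.items():
            if step not in sent and all(dep in done for dep in deps):
                sent.add(step)
                self.request_queues[step].put((request_id, payload))

    def is_finished(self, request_id):
        return len(self.completed[request_id]) == len(self.steps)


def run_worker(step, handler, manager):
    """A worker only knows its own step: pull a request, post a response."""
    request_id, payload = manager.request_queues[step].get_nowait()
    manager.on_response(request_id, step, handler(payload))


queues = {name: queue.Queue() for name in STEPS}
manager = WorkflowManager(STEPS, queues)
manager.start("req-1", "scanned page")
run_worker("ocr", lambda text: text.upper(), manager)
run_worker("classify", lambda text: f"invoice:{text}", manager)
print(manager.completed["req-1"], manager.is_finished("req-1"))
```

Note that neither worker knows what comes after it. All sequencing lives in the manager, which is also the single place to look when inspecting the state of a request.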

I’m looking forward to getting some feedback. Drop me a line if you think it could be useful for a problem you are working on, or if you have feature requests.

4 thoughts on “Microservices workflow orchestration”

    • Rudrajit, thank you for your comment. I’ve updated the post to clarify. The short answer is that Airflow/Argo are designed to manage batch workflows: for each job, processes are started and stopped. For online services, the usual solution is a BPM orchestrator. I couldn’t find one that suited my needs, hence this project.

  1. Hi Pedro,
    Good work. Coincidentally, I am working on a similar platform for telecom Revenue Assurance, built around the notion of a Data Processing (DP) Chain with DPLinks as tasks. We used Kafka. Have you assessed Kafka vs RabbitMQ?

    • I’ve not used Kafka myself. As far as I can tell from the docs, an AMQP queue and a Kafka topic would play a similar role in this type of application. It should be rather straightforward to have the workflow-manager use Kafka as a message bus. I’ll add that to the list of topics to investigate.
