Our customer, a package delivery service provider, continuously invests in and evolves its internal analytics solutions to raise the quality of end-client delivery services and to empower business units with data-driven solutions such as incident root cause analysis, delivery network monitoring and optimization, missed package monitoring, and detection of trends that might lead to accidents or injuries.
Historically, the organization has had IT groups dedicated to individual business units to support analytical teams with data acquisition and storage. These IT groups operate autonomously within their business units, which leads to data silos, data duplication, increased storage costs and, ultimately, extra costs for analytical solutions at the organizational level.
The customer would like to enable a more cohesive environment in which data is shared across business departments and follows standardized data acquisition processes. The goal is a cost-effective, performant, secure and extensible analytical platform with integrated data that analysts can use for exploration and downstream systems can use for analytical needs.
Data silos, extensive data duplication and the absence of a standard format.
Lots of processes and coordination needed to acquire data, depending on source and format.
A massive landscape of batch and streaming data sources hosted on-prem within the organization.
Source application teams would like autonomy in on-boarding their data into the platform with minimal IT involvement.
The chosen approach is to set up an Azure-based data platform that leverages a Bring Your Own Data approach to consolidate data from streaming and batch data sources hosted on-prem into platform storage, which exposes data-at-rest and data-in-motion consumption endpoints to end users and downstream systems.
Below are the main highlights of the approaches used within the data platform:
Data publishing frameworks that support the Bring Your Own Data (BYOD) approach by providing libraries and codebases that teams owning on-prem source applications use to autonomously deliver batch data into platform data landing zones based on ADLS Gen2, or streaming data into a streaming gateway implemented as an on-prem Kafka cluster.
Streaming data ingestion that uses a Kappa architecture, on-prem Kafka clusters, Azure Event Hubs, Azure Functions and Spark Streaming to acquire and process continuously arriving messages from JMS-based data sources, followed by data harmonization that makes the data available to downstream systems both in motion and at rest.
Batch data ingestion that uses Azure Data Factory and Azure Databricks to start processing when data pushed by source applications arrives in the platform. It includes data ingestion and harmonization processes that make data available at rest to downstream processes.
As-a-service capabilities that automatically create configuration-driven data ingestion and harmonization processes within the platform for batch and streaming data pushed by source applications.
A centralized Data Lake based on ADLS Gen2 and a Data Warehouse based on Delta Lake, both organized around data domains and subdomains following data segregation principles and federated ownership.
A Data Catalog based on Azure Purview that automates the capture of technical metadata and supports enriching that metadata with business glossaries to increase data discoverability, trust and reusability within the organization.
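To make the idea of configuration-driven ingestion more concrete, the following is a minimal sketch of how a source description might be turned into an ordered list of processing steps. It is an illustration only, not the platform's actual implementation: the `SourceConfig` fields, the `landing_path` layout and the step names are all hypothetical, and the sketch uses plain Python rather than Azure Data Factory, Databricks or Kafka APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical description of a source on-boarded by its owning team."""
    name: str         # source application or feed name
    domain: str       # data domain used to organize storage
    subdomain: str    # data subdomain within the domain
    mode: str         # "batch" or "stream"
    data_format: str  # e.g. "csv" or "json"


def landing_path(cfg: SourceConfig) -> str:
    """Build an assumed landing-zone path organized by domain and subdomain."""
    return f"landing/{cfg.domain}/{cfg.subdomain}/{cfg.name}"


def build_pipeline(cfg: SourceConfig) -> List[str]:
    """Derive the ordered processing steps for a source from its configuration."""
    if cfg.mode == "batch":
        # Batch data is processed when files arrive and is served at rest only.
        return [
            f"watch:{landing_path(cfg)}",
            f"ingest:{cfg.data_format}",
            "harmonize",
            "publish:at-rest",
        ]
    if cfg.mode == "stream":
        # Streaming data is served both in motion and at rest.
        return [
            f"subscribe:{cfg.name}",
            f"ingest:{cfg.data_format}",
            "harmonize",
            "publish:in-motion",
            "publish:at-rest",
        ]
    raise ValueError(f"unsupported mode: {cfg.mode!r}")


if __name__ == "__main__":
    scans = SourceConfig("parcel_scans", "delivery", "tracking", "stream", "json")
    for step in build_pipeline(scans):
        print(step)
```

In a setup like this, on-boarding a new source would mean adding a configuration entry rather than writing a new pipeline, which is one way the minimal IT involvement described above could be achieved.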
An MVP version of the platform was delivered in 6 months.
The first business unit successfully migrated to the new platform, on-boarding data sources from its operational systems with minimal IT involvement.