Server in data center, image by Kvistholt Photography on Unsplash

Learn how to plan for disaster recovery for data services in Azure

0. Introduction

  • Regional disaster: Outage in multiple data centers in a region caused by a natural disaster, e.g. …

Using Azure DevOps, Databricks Spark, Cosmos DB Gremlin API and Azure Data Factory

A. Introduction

  • 1. Set up an Azure DevOps project for continuous deployment
  • 2. Deploy Azure resources of data pipeline using infrastructure as code
  • 3. Run and monitor data pipeline

The code from the project can be found here; the steps of the modern data pipeline are depicted below. …

Manage access to your app with identities, roles and permissions


  • 1. Azure AD login with user role: basic or user role: premium
  • 2. Access to MS Graph using delegated permissions of signed-in user
  • 3. Access to backend using application permissions and app role
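
The three steps above all depend on reading roles from a token. The following is a minimal sketch, assuming the token has already been validated and decoded into a dictionary and that the app roles arrive in its `roles` claim. The role names `basic` and `premium` come from the list above; `ROLE_FEATURES`, the feature names and the helper functions are hypothetical and only serve as illustration.

```python
from typing import Any, Dict, List

# Hypothetical mapping from app role to the features it unlocks.
ROLE_FEATURES: Dict[str, List[str]] = {
    "basic": ["view_profile"],
    "premium": ["view_profile", "call_backend"],
}


def get_roles(claims: Dict[str, Any]) -> List[str]:
    """Return the roles found in decoded token claims (empty if absent)."""
    roles = claims.get("roles", [])
    # Tolerate a single role given as a plain string.
    if isinstance(roles, str):
        return [roles]
    return list(roles)


def has_role(claims: Dict[str, Any], role: str) -> bool:
    """Check whether the decoded claims contain the given role."""
    return role in get_roles(claims)


def allowed_features(claims: Dict[str, Any]) -> List[str]:
    """Collect the features of all roles, without duplicates, in first-seen order."""
    features: List[str] = []
    for role in get_roles(claims):
        for feature in ROLE_FEATURES.get(role, []):
            if feature not in features:
                features.append(feature)
    return features
```

This sketch deliberately skips signature and audience validation; in a real app those checks happen before any claim is trusted.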

The code of the project can be found here; the architecture is shown below. …

Learn to use identities and tokens in web apps and Azure SQL

1. Introduction

  • 1a: User logs in to web app and acquires a token
  • 1b: User calls a REST API to request a dataset
  • 2: Web app uses claims in token to verify user access to dataset
  • 3: Web app retrieves data from Azure SQL. Web app can be configured such that either a) the managed identity of the app or b) the signed-in user identity is used for authentication to the…

Learn to write graph data in Cosmos DB using Gremlin and then to read/analyze data in Azure Databricks with GraphFrames.

0. Introduction

0. Architecture (image by Author)

In the remainder of this blog, the following is done:

  • OLTP: write graph data to Cosmos DB using the Gremlin API and Python
  • OLAP: read data from Cosmos DB, analyze data in Azure…

Learn to create an Azure Function using a custom Docker image to run a Selenium web scraper in Python

A. Introduction

  • Create and deploy a Docker image as an Azure Function with Selenium
  • Scrape websites periodically and store results

The architecture of the web scraper is depicted below.

A. Architecture to build a Selenium web scraper (image by Author)

In the remainder of this blog, the steps to deploy and run your web scraper in Azure Functions are discussed. For details on how to secure your Azure Functions, see this blog. For details on how to create a custom Docker image with OpenCV in Azure Functions, see here and the Dockerfile here. …

Learn how to automatically back up your data lake using blob snapshots and Data Factory incremental backups

1. Azure Storage backup - Introduction

  • Snapshot creation: When a blob is added or modified, a snapshot of its current state is created. Because of the nature of blobs, this is an efficient O(1) operation. Snapshots can be restored quickly; however, restoring cannot always be done (e.g. …

Secure Azure Functions with Azure AD, Key Vault and VNETs. Then connect to Azure SQL using firewall rules and Managed Identity of Function.

A. Azure Functions Security - Introduction

  • Init: Retrieve state from Storage Account
  • Request: Endpoint is called by another application/user
  • Processing: Data is processed using other Azure resources
  • Response: Result is returned to the caller

The pattern is depicted below, in which data is retrieved from Azure SQL and returned to the application/user. …

1. Introduction

  • Azure Active Directory (AAD) access control to data and endpoints
  • Managed Identity (MI) to avoid key management processes
  • Virtual Network (VNET) isolation of data and endpoints

In the remainder of this blog, it is discussed how an ADFv2 pipeline can be secured using AAD, MI, VNETs and firewall rules. For more details on security of Azure Functions, see my other blog. …

1. Introduction

  • Business metadata: Data owner, data source, privacy level
  • Technical metadata: Schema name, table name, field name/type
  • Operational metadata: Timestamp, size of data, lineage
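
As an illustration of the three categories above, here is a minimal sketch of how such metadata could be modelled in Python. All class and field names are hypothetical and simply follow the list; the flattening into string key/value pairs assumes the target store accepts metadata in that form.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class BusinessMetadata:
    data_owner: str
    data_source: str
    privacy_level: str


@dataclass
class TechnicalMetadata:
    schema_name: str
    table_name: str
    fields: Dict[str, str]  # field name -> field type


@dataclass
class OperationalMetadata:
    timestamp: str  # ISO 8601 string
    size_bytes: int
    lineage: str


@dataclass
class DatasetMetadata:
    business: BusinessMetadata
    technical: TechnicalMetadata
    operational: OperationalMetadata

    def to_flat_dict(self) -> Dict[str, str]:
        """Flatten all groups into prefixed string key/value pairs."""
        flat: Dict[str, str] = {}
        for group, values in asdict(self).items():
            for key, value in values.items():
                flat[f"{group}_{key}"] = str(value)
        return flat


# Example values are made up for illustration.
example = DatasetMetadata(
    business=BusinessMetadata("finance team", "erp export", "confidential"),
    technical=TechnicalMetadata("sales", "orders", {"id": "int"}),
    operational=OperationalMetadata("2020-01-01T00:00:00Z", 1024, "raw -> curated"),
)
flat_example = example.to_flat_dict()
```

Prefixing each key with its group keeps business, technical and operational fields distinguishable after flattening.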

In the remainder of this blog, it is discussed how an Azure Data Lake can be set up and how metadata is added. For more details on how to secure data orchestration in your Azure Data Lake, see my follow-up blog here. For a solution to prevent data loss in your data lake using snapshots and incremental backups, see this blog. …


René Bremer

Data Solution Architect @ Microsoft, working with Azure services such as ADFv2, ADLSgen2, Azure DevOps, Databricks, Function Apps and SQL. Opinions here are mine.

