Microsoft DP-200 / 201 links by David Papkin

This David Papkin page contains links for the Microsoft Azure DP-200 course.

Azure Bot

Bots provide an experience that feels less like using a computer and more like dealing with a person – or at least an intelligent robot.

About Azure Bot Service

Quickstart – Create a bot with Azure Bot Service

Azure Cosmos DB

Azure Cosmos DB documentation

Azure Cosmos DB introduction

Azure Cosmos DB Quickstart

Azure Cosmos DB vs Azure SQL

Calculating Cosmos DB Request Units (RU) for CRUD and Queries

CosmosDB Capacity Calculator



Different types of Databases

Document store – document-oriented database systems are characterized by their schema-free organization of data.
Graph DBMS – represent data in graph structures as nodes and edges, which are relationships between nodes. They allow easy processing of data in that form, and simple calculation of specific properties of the graph, such as the number of steps needed to get from one node to another node.
Key-value store – the simplest form of database management system. They can only store pairs of keys and values, as well as retrieve values when a key is known.

These simple systems are normally not adequate for complex applications. On the other hand, it is exactly this simplicity that makes such systems attractive in certain circumstances. For example, resource-efficient key-value stores are often applied in embedded systems or as high-performance in-process databases.

Wide column store (column base) – store data in records with an ability to hold very large numbers of dynamic columns. Since the column names as well as the record keys are not fixed, and since a record can have billions of columns, wide column stores can be seen as two-dimensional key-value stores.

Azure Cosmos DB – a fully managed database service with turnkey global distribution and transparent multi-master replication.

Azure SQL – relational DB
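The data models described above can be sketched with plain Python structures. This is only an illustration of the shapes involved, not how any of these database products is implemented; all names (`kv_store`, `documents`, `wide_column`, `graph`, `steps_between`) and sample values are hypothetical.

```python
from collections import deque

# Key-value store: a value can only be retrieved when its key is known.
kv_store = {"session:42": "alice"}

# Document store: schema-free records; two documents need not share fields.
documents = [
    {"id": 1, "name": "Alice", "tags": ["admin"]},
    {"id": 2, "name": "Bob", "address": {"city": "Singapore"}},
]

# Wide column store: a two-dimensional key-value store,
# record key -> (column name -> value), with columns varying per record.
wide_column = {
    "user:1": {"name": "Alice", "login:2020-01-01": "ok"},
    "user:2": {"name": "Bob"},
}

# Graph: nodes and directed edges stored as an adjacency list.
graph = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}


def steps_between(graph, start, goal):
    """Return the fewest edges from start to goal (breadth-first search), or None."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None
```

For example, `steps_between(graph, "A", "D")` counts the steps needed to get from node A to node D, which is the kind of graph property a graph DBMS makes easy to compute.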

Azure Databricks

What is Azure Databricks?

Azure Databricks is the latest Azure offering for data engineering and data science. Databricks’ greatest strengths are its zero-management cloud solution and the collaborative, interactive environment it provides in the form of notebooks.

Databricks is powered by Apache Spark and offers an API layer where a wide span of analytics languages can be used to work as comfortably as possible with your data: R, SQL, Python, Scala and Java. The Spark ecosystem also offers a variety of perks such as Streaming, MLlib, and GraphX.

Data can be gathered from a variety of sources, such as Blob Storage, ADLS, and from JDBC-accessible databases using Sqoop.
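As a rough sketch of how these storage sources are addressed from a notebook, the helpers below build the URI strings commonly used for Blob Storage (`wasbs://`) and Data Lake Storage Gen2 (`abfss://`). The function names and the account, container, and path values are hypothetical; credentials and cluster configuration are assumed to be set up separately.

```python
def blob_uri(account, container, path):
    """Build a wasbs:// URI for a path in an Azure Blob Storage container."""
    return f"wasbs://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def adls_gen2_uri(account, filesystem, path):
    """Build an abfss:// URI for a path in a Data Lake Storage Gen2 file system."""
    return f"abfss://{filesystem}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical account and container names, for illustration only.
raw_events = blob_uri("mystorageacct", "raw", "/events/2020/01.json")
curated = adls_gen2_uri("mystorageacct", "curated", "events/")
```

In a Databricks notebook, a string like `raw_events` would typically be passed to a Spark reader such as `spark.read.json(raw_events)`, assuming the cluster has access to the storage account.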

Tutorial: Extract, transform, and load data by using Azure Databricks

These notebooks show how to convert JSON data to Delta Lake format, create a Delta table, append to the table, optimize the resulting table, and finally use Delta Lake metadata commands to show the table history, format, and details.

Manage Notebooks

You can manage notebooks using the UI, the CLI, and by invoking the Workspace API. This topic focuses on performing notebook tasks using the UI. For the other methods, see Databricks CLI and Workspace API.
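For the Workspace API route mentioned above, a minimal sketch of building a request that lists the objects under a workspace folder might look like this. It targets the `GET /api/2.0/workspace/list` endpoint with a bearer token; the helper name, host, token, and folder path are placeholders, and the request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def build_workspace_list_request(host, token, path):
    """Build (but do not send) a GET request for the Workspace API list endpoint."""
    query = urllib.parse.urlencode({"path": path})
    url = f"{host.rstrip('/')}/api/2.0/workspace/list?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder workspace URL, token, and folder; replace with real values.
request = build_workspace_list_request(
    "https://adb-123.azuredatabricks.net/", "my-token", "/Users/someone@example.com"
)
```

Passing `request` to `urllib.request.urlopen` would send it; the response is expected to be JSON describing the notebooks and folders at that path.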

Introductory Notebooks

Connecting to SQL Databases using JDBC

JDBC – Introduction

JDBC To Other Databases

Read and Write Apache Parquet file in Spark

How to view Apache Parquet file in Windows?

Azure Data Catalog

What is Azure Data Catalog?

Azure Data Catalog documentation

Azure Data Factory

Introduction to Azure Data Factory

Azure Data Factory

Create Azure Data Factory from Cloudshell

Azure Data Lake

Azure Data Lake is an on-demand, scalable, cloud-based storage and analytics service. It can be divided into two connected services, Azure Data Lake Store (ADLS) and Azure Data Lake Analytics (ADLA). ADLS is a cloud-based file system that allows the storage of any type of data with any structure, making it ideal for the analysis and processing of unstructured data.

Azure Data Lake Analytics

Azure Data Lake Analytics is a distributed, parallel job platform that runs U-SQL scripts in the cloud. The syntax is based on SQL with a twist of C#, a general-purpose programming language introduced by Microsoft in the early 2000s.

Azure Data Lake Storage

Azure Data Lake Storage Gen1 (previously known as Azure Data Lake Store) is an enterprise-wide hyper-scale repository for big data analytics workloads. Data Lake Storage Gen1 lets you capture data of any size, type, and ingestion speed. The data is captured in a single place for operational and exploratory analytics.

Data Lake Storage Gen2 is the result of converging the capabilities of Microsoft's two existing storage services, Azure Blob storage and Azure Data Lake Storage Gen1.

Azure Data Lake Storage Docs

Quickstart: Analyze data in Azure Data Lake Storage Gen2 by using Azure Databricks

What is Apache Hadoop?

Azure HDInsight

Azure HDInsight is a cloud service that allows cost-effective data processing using open-source frameworks such as Hadoop, Spark, Hive, Storm, and Kafka, among others.

Using Apache Sqoop, we can import and export data to and from a multitude of sources, but the native file system that HDInsight uses is either Azure Data Lake Store or Azure Blob Storage.

What is Azure HDInsight?

Azure HDInsight documentation

Azure SQL Data Warehouse

What is Azure SQL Data Warehouse?

SQL Data Warehouse Documentation

Quickstart: Create and query an Azure SQL Data Warehouse in the Azure portal

What is Polybase

Polybase Tutorial

Azure Stream Analytics

What is Azure Stream Analytics?

Azure Stream Analytics documentation

Azure Event Hub Stream Analytics and Power BI demo (Lab 6 concepts)

Comparison of Databricks vs HDInsight vs Data Lake Analytics

Cloud Analytics on Azure: Databricks vs HDInsight vs Data Lake Analytics

Updated DP-200 Labs

Demo video useful for Lab6b

Azure Event Hub Stream Analytics and Power BI Demo

Extract Lab6B into E:\Allfiles\Labfiles\Starter\DP-200.6 folder.

New DP-200 Lab 6B

Extract Lab7 into E:\Allfiles\Instructions folder.

Updated DP-200 Lab 7


Lambda Architecture implementation using Microsoft Azure

Big data architectures

Azure Cosmos DB: Implement a lambda architecture on the Azure platform

Databricks Lambda Architecture


Dynamic Data Masking

AXA feedback

End of David Papkin page containing links on Microsoft Azure DP-200 course.

Helpful Azure learning links

Microsoft Azure Forums  The Azure forums are very active. You can search the threads for a
specific area of interest. You can also browse categories like Azure Storage, Pricing
and Billing, Azure Virtual Machines, and Azure Migrate.

Azure Architecture Center  Gain access to the Azure Application Architecture Guide,
Azure Reference Architectures, and the Cloud Design Patterns.

Microsoft Learning Community Blog  Get the latest information on the certification
tests and exam study groups.  Channel 9 provides a wealth of informational videos, shows, and

Azure Tuesdays With Corey  Corey Sanders answers your questions about
Microsoft Azure – Virtual Machines, Web Sites, Mobile Services, Dev/Test etc.

Azure Fridays  Join Scott Hanselman as he engages one-on-one with the engineers
who build the services that power Microsoft Azure as they demo capabilities,
answer Scott’s questions, and share their insights.

Microsoft Azure Blog  Keep current on what’s happening in Azure, including what’s
now in preview, generally available, news & updates, and more.

End of David Papkin Microsoft Azure page.
