This David Papkin page contains links for the Microsoft Azure DP-200 course.
Bots provide an experience that feels less like using a computer and more like dealing with a person – or at least an intelligent robot.
Azure Cosmos DB
Different types of Databases
Document store – document-oriented database systems, characterized by their schema-free organization of data.
Graph DBMS – represent data in graph structures as nodes and edges, which are relationships between nodes. They allow easy processing of data in that form, and simple calculation of specific properties of the graph, such as the number of steps needed to get from one node to another node.
Key-value store – the simplest form of database management system. They can only store pairs of keys and values, and retrieve a value when its key is known.
These simple systems are normally not adequate for complex applications. On the other hand, it is exactly this simplicity that makes such systems attractive in certain circumstances. For example, resource-efficient key-value stores are often used in embedded systems or as high-performance in-process databases.
Wide column store (column-based) – store data in records that can hold very large numbers of dynamic columns. Since the column names as well as the record keys are not fixed, and since a record can have billions of columns, wide column stores can be seen as two-dimensional key-value stores.
Azure Cosmos DB – a fully managed database service with turnkey global distribution and transparent multi-master replication.
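The data models listed above can be sketched with plain Python structures. This is only an illustration of the shapes involved, not how any particular database implements them; all names and sample values (`kv_store`, `documents`, `wide_rows`, `graph`, `steps_between`) are made up for the example. The graph part uses a breadth-first search to count the number of steps from one node to another.

```python
from collections import deque

# Key-value store: values can only be stored and fetched by a known key.
kv_store = {}
kv_store["session:42"] = "alice"

# Document store: schema-free records, so each document may have different fields.
documents = [
    {"id": 1, "name": "Alice", "tags": ["admin"]},
    {"id": 2, "name": "Bob", "address": {"city": "Oslo"}},
]

# Wide column store: row key -> {column name -> value}. Columns vary per row,
# which is why it can be viewed as a two-dimensional key-value store.
wide_rows = {
    "user:1": {"name": "Alice", "login:2020-01-01": "ok"},
    "user:2": {"name": "Bob"},
}

# Graph: nodes and directed edges, stored as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}


def steps_between(graph, start, goal):
    """Return the minimum number of edges from start to goal, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


print(steps_between(graph, "A", "E"))
```

In this toy graph the shortest route from A to E is A, B, D, E, so the call prints 3.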
Azure Databricks
Azure Databricks is an Azure offering for data engineering and data science. Its greatest strengths are its zero-management cloud solution and the collaborative, interactive environment it provides in the form of notebooks.
Databricks is powered by Apache Spark and offers an API layer that supports a range of analytics languages, so you can work with your data in whichever is most comfortable: R, SQL, Python, Scala, and Java. The Spark ecosystem also offers components such as Streaming, MLlib, and GraphX.
Data can be gathered from a variety of sources, such as Blob Storage, ADLS, and relational databases using Sqoop (which connects over JDBC).
These notebooks show how to convert JSON data to Delta Lake format, create a Delta table, append to the table, optimize the resulting table, and finally use Delta Lake metadata commands to show the table history, format, and details.
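The workflow those notebooks walk through can be outlined as a sequence of SQL statements. The sketch below only builds the statement strings; it assumes a Databricks runtime with Delta Lake where each one would be run with `spark.sql(...)`. The helper name `delta_workflow_sql`, the table name, and the file paths are hypothetical placeholders, not taken from the linked notebooks.

```python
def delta_workflow_sql(table, json_path, new_json_path):
    """Build, in order, the SQL statements for a simple JSON-to-Delta workflow."""
    return [
        # Convert the JSON data to Delta format and create the Delta table.
        f"CREATE TABLE {table} USING DELTA AS SELECT * FROM json.`{json_path}`",
        # Append more rows to the table.
        f"INSERT INTO {table} SELECT * FROM json.`{new_json_path}`",
        # Compact small files in the resulting table.
        f"OPTIMIZE {table}",
        # Metadata commands: table history, then format and other details.
        f"DESCRIBE HISTORY {table}",
        f"DESCRIBE DETAIL {table}",
    ]


# Hypothetical table name and paths, for illustration only.
statements = delta_workflow_sql("events", "/data/events.json", "/data/events_new.json")
for stmt in statements:
    print(stmt)  # in a Databricks notebook: spark.sql(stmt)
```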
You can manage notebooks using the UI, the CLI, and by invoking the Workspace API. This topic focuses on performing notebook tasks using the UI. For the other methods, see Databricks CLI and Workspace API.
Azure Data Catalog
Azure Data Factory
Azure Data Lake
Azure Data Lake is an on-demand, scalable, cloud-based storage and analytics service. It can be divided into two connected services, Azure Data Lake Store (ADLS) and Azure Data Lake Analytics (ADLA). ADLS is a cloud-based file system that allows the storage of any type of data with any structure, making it ideal for the analysis and processing of unstructured data.
Azure Data Lake Analytics
Azure Data Lake Analytics is a distributed, parallel job platform that allows the execution of U-SQL scripts in the cloud. The syntax is based on SQL with a twist of C#, a general-purpose programming language developed by Microsoft.
Azure Data Lake Storage
Azure Data Lake Storage Gen1 (previously known as Azure Data Lake Store) is an enterprise-wide hyper-scale repository for big data analytics workloads. Data Lake Storage Gen1 lets you capture data of any size, type, and ingestion speed. The data is captured in a single place for operational and exploratory analytics.
Data Lake Storage Gen2 is the result of converging the capabilities of Microsoft's two existing storage services, Azure Blob storage and Azure Data Lake Storage Gen1.
Azure HDInsight is a cloud service that allows cost-effective data processing using open-source frameworks such as Hadoop, Spark, Hive, Storm, and Kafka, among others.
Using Apache Sqoop, we can import and export data to and from a multitude of sources, but the native file system that HDInsight uses is either Azure Data Lake Store or Azure Blob Storage.
Azure SQL Data Warehouse
Azure Stream Analytics
Comparison of Databricks vs HDInsight vs Data Lake Analytics
Updated DP-200 Labs
Demo video useful for Lab6b
Extract Lab6B into E:\Allfiles\Labfiles\Starter\DP-200.6 folder.
Extract Lab7 into E:\Allfiles\Instructions folder.
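If you prefer to script the extraction rather than unzip by hand, a minimal sketch with Python's standard `zipfile` module could look like the following. The helper name `extract_lab` and the archive file names in the comments are assumptions; use whatever names the downloaded lab files actually have.

```python
import zipfile
from pathlib import Path


def extract_lab(archive, target_dir):
    """Extract a lab archive into target_dir, creating the folder if needed.

    Returns the list of member names contained in the archive.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
        return zf.namelist()


# Example calls (archive names are hypothetical):
# extract_lab("Lab6B.zip", r"E:\Allfiles\Labfiles\Starter\DP-200.6")
# extract_lab("Lab7.zip", r"E:\Allfiles\Instructions")
```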
Helpful Azure learning links
Microsoft Azure Forums – The Azure forums are very active. You can search the threads for a
specific area of interest. You can also browse categories like Azure Storage, Pricing
and Billing, Azure Virtual Machines, and Azure Migrate.
Azure Architecture Center – Gain access to the Azure Application Architecture Guide,
Azure Reference Architectures, and the Cloud Design Patterns.
Microsoft Learning Community Blog – Get the latest information on the certification
tests and exam study groups.
https://channel9.msdn.com/ – Channel 9 provides a wealth of informational videos, shows, and
Azure Tuesdays With Corey – Corey Sanders answers your questions about
Microsoft Azure – Virtual Machines, Web Sites, Mobile Services, Dev/Test etc.
Azure Fridays – Join Scott Hanselman as he engages one-on-one with the engineers
who build the services that power Microsoft Azure as they demo capabilities,
answer Scott’s questions, and share their insights.
Microsoft Azure Blog – Keep current on what’s happening in Azure, including what’s
now in preview, generally available, news & updates, and more.