This topic compares options for data storage for big data solutions - specifically, data storage for bulk data ingestion and batch processing, as opposed to analytical data stores or real-time streaming ingestion.

What are your options when choosing data storage in Azure? There are several options for ingesting data into Azure, depending on your needs.

Azure Storage is a managed storage service that is highly available, secure, durable, scalable, and redundant. Microsoft takes care of maintenance and handles critical problems for you. Azure Storage is the most ubiquitous storage solution Azure provides, due to the number of services and tools that can be used with it.

There are various Azure Storage services you can use to store data. The most flexible option for storing blobs from a number of data sources is Blob storage. Blobs can hold pictures, documents, HTML files, virtual hard disks (VHDs), big data such as logs, database backups - pretty much anything. Blobs are stored in containers, which are similar to folders. A container provides a grouping of a set of blobs. A storage account can contain an unlimited number of containers, and a container can store an unlimited number of blobs.

Azure Storage is a good choice for big data and analytics solutions, because of its flexibility, high availability, and low cost. It provides hot, cool, and archive storage tiers for different use cases. For more information, see Azure Blob Storage: Hot, cool, and archive storage tiers.

Azure Blob storage can be accessed from Hadoop (available through HDInsight). HDInsight can use a blob container in Azure Storage as the default file system for the cluster. Through a Hadoop distributed file system (HDFS) interface provided by a WASB driver, the full set of components in HDInsight can operate directly on structured or unstructured data stored as blobs. Azure Blob storage can also be accessed via Azure Synapse Analytics using its PolyBase feature.

Other features that make Azure Storage a good choice are disaster recovery and high availability options, and Azure role-based access control (Azure RBAC) to control access using Azure Active Directory users and groups.

Azure Data Lake Store is an enterprise-wide hyperscale repository for big data analytic workloads. Data Lake enables you to capture data of any size, type, and ingestion speed in one single secure location for operational and exploratory analytics.

Data Lake Store does not impose any limits on account sizes, file sizes, or the amount of data that can be stored in a data lake. Data is stored durably by making multiple copies, and there is no limit on the duration of time that the data can be stored in the Data Lake. In addition to making multiple copies of files to guard against any unexpected failures, Data Lake spreads parts of a file over a number of individual storage servers.
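To make the WASB access path mentioned above more concrete, here is a minimal sketch of how a blob is typically addressed from Hadoop tooling. It assumes the commonly documented URI layout `wasb[s]://<container>@<account>.blob.core.windows.net/<path>`. The helper name `wasb_uri` and the account and container names are hypothetical, chosen only for illustration; this is not part of any Azure SDK.

```python
def wasb_uri(account: str, container: str, blob_path: str = "", secure: bool = True) -> str:
    """Build a WASB-style URI for a blob, or for the container root if no path is given.

    Hypothetical helper for illustration; it only formats a string and does
    not contact Azure or validate that the account or container exists.
    """
    if not account or not container:
        raise ValueError("account and container are required")
    # "wasbs" is the TLS-encrypted variant of the scheme; "wasb" is unencrypted.
    scheme = "wasbs" if secure else "wasb"
    # Drop any leading slash so the path is not doubled after the host part.
    path = blob_path.lstrip("/")
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path}"


if __name__ == "__main__":
    # Hypothetical storage account "mystorageacct" with a container named "logs".
    print(wasb_uri("mystorageacct", "logs", "2024/01/app.log"))
```

A URI built this way could then be passed to an HDInsight job as an input or output location, which mirrors the container-and-blob grouping described above: the container appears before the `@`, the storage account after it, and the blob name as the path.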