In the past few days, several people have asked me to explain the difference between Data Lakes and Data Warehouses (and the use cases for these, especially when you can build analytics around Transactional Databases directly). So, I decided to write a blog post to capture the concepts and use cases around these.

All of these have a storage aspect, but to keep the post simple, we will assume that each is a database, although the underlying database technologies usually differ.

What is a Data Warehouse?

A Data Warehouse is a collection of business data organized in a “structured” format to make…


Introduction:

Oracle Autonomous Data Warehouse (ADW) is one of the fastest cloud data warehousing platforms on the market today. As cloud platforms have expanded and matured, so has the need to integrate them with a variety of data sources. Oracle provides several integration utilities to source data from and load data into ADW, but a lot of integration and analysis work is also done in Python, R, and similar languages, which give engineers and analysts inexpensive alternatives for connecting to ADW.

I usually use Python and PySpark for my data analysis, data engineering, and ML needs. There is a difference between the configuration setup of…


MARKET BASKET ANALYSIS (PRODUCT AFFINITY)

Photo by Scott Evans on Unsplash

Objective

  1. Cover the core concepts of Market Basket Analysis.
  2. Identify frequent itemsets.
  3. Learn association rule mining.

Theoretical Problem Statement

Acme Incorporated is a small local grocery store, and the owners want to:

  1. Identify the best-selling products.
  2. Identify cross-selling opportunities. For example, do customers who buy Kidney Beans also buy Yogurt?
  3. Establish an ongoing product placement initiative.

These problem statements capture the essence of Market Basket Analysis. In this blog, we have kept the problem statement very straightforward. Still, we can apply the same principles and techniques to a variety of related and more complex problems, such as Twitter feed mining, Netflix/Spotify recommendations, and Census data.
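
To make the three questions above concrete, here is a minimal sketch in plain Python. It assumes a tiny, made-up list of baskets (not the dataset used in this post) and hypothetical helper names `support` and `rule_metrics`; it only illustrates how best sellers, frequent pairs, and the support, confidence, and lift of a rule such as Kidney Beans -> Yogurt can be computed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions (NOT the post's dataset); each set is one basket.
transactions = [
    {"Kidney Beans", "Yogurt", "Eggs"},
    {"Kidney Beans", "Yogurt", "Milk"},
    {"Kidney Beans", "Eggs"},
    {"Yogurt", "Milk"},
    {"Kidney Beans", "Yogurt", "Eggs", "Milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return {"support": both, "confidence": confidence, "lift": lift}


# 1. Best-selling products: number of baskets each item appears in.
best_sellers = Counter(item for basket in transactions for item in basket)

# 2. Frequent pairs: item pairs whose support meets an assumed threshold.
min_support = 0.4  # assumption chosen for this toy example
items = sorted({item for basket in transactions for item in basket})
frequent_pairs = {}
for pair in combinations(items, 2):
    pair_support = support(pair, transactions)
    if pair_support >= min_support:
        frequent_pairs[pair] = pair_support

# 3. Association rule: do customers who buy Kidney Beans also buy Yogurt?
metrics = rule_metrics({"Kidney Beans"}, {"Yogurt"}, transactions)

print(best_sellers.most_common())
print(frequent_pairs)
print(metrics)
```

A lift above 1 suggests the two products are bought together more often than chance would predict, while a lift near or below 1 suggests little or no affinity. For real datasets, a dedicated Apriori or FP-Growth implementation scales far better than this brute-force pair scan.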

Dataset

Vipul Bhatia
