Results for "data-engineering"

6 / 6 posts



What Are Data Warehouse, Data Lake, Data Mining, Data Mart and Metadata

Why a Data Warehouse? (Why is a data warehouse needed?) Today, a company's data is stored across multiple sources, such as: • SQL Server da…

ai data data mart

Match in tags

Mar 19, 2026 7 min read

SQL

What Is a Data Lake

A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, wit…

azure cloud data-engineering

Match in tags

Mar 19, 2026 2 min read

PySpark

What Are Resilient Distributed Datasets (RDDs)

Resilient Distributed Datasets (RDDs) are a core data structure of Apache Spark. They represent an immutable, distributed collection of objects that can be proc…

ai artificial-intelligence data-engineering

Match in tags

Mar 19, 2026 3 min read

PySpark

Spark Transformations, Actions, Lazy Evaluation and the DAG

Apache Spark RDDs support two types of operations: transformations and actions. A transformation is a function that produces a new RDD from existing RDDs but …

Apache Spark azure cloud

Match in tags

Mar 19, 2026 4 min read

PySpark

Schema and Handling Corrupt Data in PySpark

A schema in PySpark (and generally in data processing) defines the structure of a DataFrame, including the names and data types of each column. It serves a…

comma separate data-engineering database

Match in tags

Mar 19, 2026 4 min read

PySpark

What Are Managed and External Tables in Spark

In Apache Spark, both managed and external tables are used to store data. However, there are significant differences in how Spark manages the data for …

azure data-engineering data-science

Match in tags

Mar 19, 2026 3 min read