Overview
The dataset contains 50,000 records with 11 attributes that describe BMW’s sales, pricing, and model characteristics across multiple regions and fuel technologies between 2010 and 2024. It has no missing values, so it can be used for statistical and predictive modeling without an imputation step, although completeness alone does not guarantee that the recorded values are accurate.
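The record count and the missing-value claim are easy to verify before any modeling. The sketch below is one minimal way to do that with pandas; the helper name `completeness_report` and the three-row sample frame are illustrative assumptions, not part of the original analysis, and the sample values are made up.

```python
import pandas as pd


def completeness_report(df: pd.DataFrame) -> dict:
    """Return row/column counts and a summary of missing cells."""
    missing_per_column = df.isna().sum()
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing_cells": int(missing_per_column.sum()),
        "columns_with_missing": missing_per_column[missing_per_column > 0].index.tolist(),
    }


# Tiny illustrative frame (made-up values, NOT the real dataset).
# One price is deliberately missing so the report has something to flag.
sample = pd.DataFrame({
    "Model": ["X5", "3 Series", "i4"],
    "Year": [2018, 2012, 2023],
    "Price_USD": [82000, 41000, None],
})

report = completeness_report(sample)
print(report)
```

On the full file, the same call would be expected to report 50,000 rows, 11 columns, and zero missing cells if the description above holds.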
Dataset Composition
| Attribute | Type | Description | Example Values |
|---|---|---|---|
| Model | Categorical (11 unique) | BMW models (Series 1–8, X-series, i-series) | “X5”, “3 Series”, “i4” |
| Year | Numeric | Model year (2010–2024) | 2018 |
| Region | Categorical (6 unique) | Global market regions | Asia, Europe, Africa, North America |
| Color | Categorical (6 unique) | Dominant car colors | Black, White, Blue, Red |
| Fuel_Type | Categorical (4 unique) | Type of fuel technology | Petrol, Diesel, Hybrid, Electric |
| Transmission | Categorical (2 unique) | Transmission mode | Automatic, Manual |
| Engine_Size_L | Numeric | Engine capacity in liters | 1.5 – 5.0 L |
| Mileage_KM | Numeric | Cumulative mileage (usage) | 3 – 199,996 km |
| Price_USD | Numeric | Retail or resale price in USD | 30,000 – 119,998 |
| Sales_Volume | Numeric | Annual sales volume per model | 100 – 9,999 |
| Sales_Classification | Categorical (2 unique) | Market performance tier | High, Low |
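The composition table doubles as a lightweight schema. As a hedged sketch, the expectations can be transcribed into two dictionaries and checked against a DataFrame; `check_schema`, the dictionary names, and the sample rows are hypothetical and only illustrate the idea. The check uses "at most" for category counts so it also works on samples that do not contain every category.

```python
import pandas as pd

# Expectations transcribed from the composition table.
EXPECTED_UNIQUE = {
    "Model": 11, "Region": 6, "Color": 6,
    "Fuel_Type": 4, "Transmission": 2, "Sales_Classification": 2,
}
EXPECTED_RANGE = {
    "Year": (2010, 2024),
    "Engine_Size_L": (1.5, 5.0),
    "Mileage_KM": (3, 199_996),
    "Price_USD": (30_000, 119_998),
    "Sales_Volume": (100, 9_999),
}


def check_schema(df: pd.DataFrame) -> list:
    """Return human-readable problems; columns absent from df are skipped."""
    problems = []
    for col, max_unique in EXPECTED_UNIQUE.items():
        if col in df.columns and df[col].nunique() > max_unique:
            problems.append(f"{col}: more than {max_unique} distinct values")
    for col, (low, high) in EXPECTED_RANGE.items():
        if col in df.columns:
            outside = int(((df[col] < low) | (df[col] > high)).sum())
            if outside:
                problems.append(f"{col}: {outside} value(s) outside [{low}, {high}]")
    return problems


# Made-up sample rows; the 2009 model year is deliberately out of range.
sample = pd.DataFrame({
    "Fuel_Type": ["Petrol", "Diesel", "Electric"],
    "Transmission": ["Automatic", "Manual", "Automatic"],
    "Year": [2018, 2009, 2023],
    "Price_USD": [82000, 41000, 65000],
})

problems = check_schema(sample)
print(problems)
```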
Key Statistical Highlights
| Metric | Mean | Std. Dev. | Interpretation |
|---|---|---|---|
| Year | 2017 | ±4.32 | The data cover both early ICE and modern EV years, enabling trend analysis across transition phases. |
| Engine Size (L) | 3.25 | ±1.01 | BMW’s lineup remains performance-oriented, with an average engine size of about 3.25 L. |
| Mileage (KM) | 100,307 | ±57,942 | Wide spread: includes nearly new through high-mileage cars, which suits depreciation or wear analysis. |
| Price (USD) | 75,034 | ±25,998 | Typical luxury bracket; interquartile range (~$52K–$98K) confirms premium positioning. |
| Sales Volume | 5,067 | ±2,856 | Average annual unit sales per record indicate strong global brand activity. |
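The mean, standard deviation, and interquartile range quoted in the table can be reproduced with a few pandas calls. The following is a minimal sketch under stated assumptions: the helper `summarize` and the five-row sample are illustrative only, and the sample numbers are made up rather than taken from the dataset.

```python
import pandas as pd


def summarize(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Mean, sample standard deviation (ddof=1), quartiles, and IQR per column."""
    stats = df[columns].agg(["mean", "std"]).T
    stats["q1"] = df[columns].quantile(0.25)
    stats["q3"] = df[columns].quantile(0.75)
    stats["iqr"] = stats["q3"] - stats["q1"]
    return stats


# Made-up sample values, NOT the real dataset.
sample = pd.DataFrame({
    "Price_USD": [30000, 50000, 70000, 90000, 110000],
    "Engine_Size_L": [1.5, 2.0, 3.0, 4.0, 5.0],
})

stats = summarize(sample, ["Price_USD", "Engine_Size_L"])
print(stats.round(2))
```

Note that pandas computes the sample standard deviation by default; the source does not say which convention its table uses, though at 50,000 rows the difference is negligible.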
Interpretation & Analytical Value
- Data Completeness: Zero missing values mean the dataset can be used for predictive modeling (e.g., price prediction, sales forecasting) without imputation.
- Balanced Representation: With multiple models, colors, and fuel types roughly evenly distributed, it supports comparative market analytics without heavy bias.
- Temporal Coverage: The range (2010–2024) captures BMW’s transition from combustion engines to hybrid and electric, providing a useful lens to study technological evolution and market adaptation.
- Policy & Industry Relevance: Governments and automotive economists can use such structured data to:
  - Assess green mobility adoption.
  - Track market penetration of EVs.
  - Support fiscal policy design (e.g., carbon taxation, import duties).
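Tracking EV market penetration over the 2010–2024 window comes down to computing each fuel type's share of sales volume per year. The sketch below shows one way to do this with a pivot table; the function name `fuel_share_by_year` and the sample rows are assumptions for illustration, and the values are made up.

```python
import pandas as pd


def fuel_share_by_year(df: pd.DataFrame) -> pd.DataFrame:
    """Share of total Sales_Volume per Fuel_Type within each Year (rows sum to 1)."""
    volume = df.pivot_table(
        index="Year",
        columns="Fuel_Type",
        values="Sales_Volume",
        aggfunc="sum",
        fill_value=0,
    )
    # Divide each row by its own total to turn volumes into shares.
    return volume.div(volume.sum(axis=1), axis=0)


# Made-up sample rows, NOT the real dataset.
sample = pd.DataFrame({
    "Year": [2010, 2010, 2024, 2024, 2024],
    "Fuel_Type": ["Petrol", "Electric", "Petrol", "Electric", "Electric"],
    "Sales_Volume": [900, 100, 500, 300, 200],
})

shares = fuel_share_by_year(sample)
print(shares)
```

Weighting by `Sales_Volume` rather than counting rows is a design choice here: a record count would only show how often each fuel type appears in the file, not how many units it represents.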
Acknowledgments
- Data Source: BMW Sales Data (2010–2024), processed within the DatalytIQs Academy Analytics Framework.
- Technical Tools: Python (pandas, matplotlib, seaborn, scipy.stats).
- Contributors:
  - Collins Odhiambo Owino: Lead Analyst & Author, DatalytIQs Academy
  - Kaggle Open Data Contributors: Data structure and design inspiration
  - BMW Group Market Insights: Contextual reference for interpretation
Author’s Note
Written by Collins Odhiambo Owino
Founder & Lead Researcher, DatalytIQs Academy
Empowering learners and professionals in Mathematics, Economics, and Finance through data-driven insight.