Standards mapping for Arkansas Data Science: Data Science with Python
Total Standards: 78
Mapped: 36
Completion: 46%
1.1.1
Identify the key stages of a data science project lifecycle.
1.2 Gathering Data
1.1.2
Identify key roles and their responsibilities in a data science team (e.g., business stakeholders define objectives; data engineers build pipelines; data scientists develop models; domain experts provide expertise).
1.1 What is Data Science?
4.1 Data Science for Business
4.7 Business Report
1.1.3
Define and create project goals and deliverables (e.g., problem statements, success metrics, expected outcomes, final reports, summary presentations).
Apply masking operations to filter and select data.
2.3 Importing and Filtering Data
2.4 Conditional Filtering
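As an illustration of the masking standard above, here is a minimal sketch assuming pandas. The table and its column names (`name`, `score`) are made up for the example and do not come from the course.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "score": [88, 72, 95, 60],
})

# A mask is a boolean Series: True where the condition holds.
mask = df["score"] >= 80

# Indexing with the mask keeps only the rows marked True.
high_scores = df[mask]

# Masks combine with & (and) and | (or); each condition needs parentheses.
mid_scores = df[(df["score"] >= 70) & (df["score"] < 90)]
```

Storing the mask in its own variable makes it easy to inspect before filtering, which helps when a condition selects more or fewer rows than expected.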
3.3.7
Handle missing and invalid data values using appropriate methods (e.g., removal, imputation, interpolation).
2.5 Data Cleaning
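The three methods named in the standard above (removal, imputation, interpolation) can be sketched with pandas as follows. The `readings` values are invented for the example, and mean imputation is only one of several possible imputation choices.

```python
import pandas as pd

# Hypothetical readings with two missing values (None becomes NaN).
readings = pd.Series([1.0, None, 3.0, None, 5.0])

# Removal: drop the missing entries entirely.
removed = readings.dropna()

# Imputation: replace missing entries with the mean of the known values.
imputed = readings.fillna(readings.mean())

# Interpolation: estimate each gap from its neighbors (linear by default).
interpolated = readings.interpolate()
```

Which method is appropriate depends on the data: interpolation assumes the values are ordered (for example, over time), while mean imputation does not.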
3.3.8
Identify and handle outliers using statistical methods.
3.7 Trends and Correlations
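One common statistical method for the outlier standard above is the interquartile range (IQR) rule. The sketch below uses only the Python standard library; the function name `iqr_outliers`, the sample data, and the conventional multiplier of 1.5 are assumptions for illustration, not something the standard prescribes.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements with one suspicious value.
data = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(data)  # [95]
```

Once identified, outliers can be removed, capped, or investigated, depending on whether they are errors or genuine extreme observations.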
3.4.1
Examine data structures using preview and summary methods (e.g., head, info, shape, describe).
1.7 Pandas DataFrames
3.4.2
Create new data frames by merging or joining two data frames.
4.4 Combining Datasets
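A minimal sketch of merging two data frames, assuming pandas. Both tables and the key column `student_id` are hypothetical.

```python
import pandas as pd

# Two hypothetical tables that share a key column.
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cara"],
})
scores = pd.DataFrame({
    "student_id": [1, 2, 4],
    "score": [88, 72, 90],
})

# Inner join: keep only keys present in both tables.
inner = students.merge(scores, on="student_id", how="inner")

# Left join: keep every student; unmatched scores become NaN.
left = students.merge(scores, on="student_id", how="left")
```

The choice of `how` decides what happens to unmatched rows, so it is worth checking the row count of the result against both inputs.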
3.4.3
Sort and group records based on conditions and/or attributes.
1.2 Gathering Data
2.4 Conditional Filtering
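Sorting and grouping might look like the following sketch, assuming pandas. The `sales` table and its columns are invented for the example.

```python
import pandas as pd

# Hypothetical sales records.
sales = pd.DataFrame({
    "region": ["East", "West", "East", "West", "East"],
    "amount": [100, 250, 300, 50, 200],
})

# Sort records by an attribute, largest first.
by_amount = sales.sort_values("amount", ascending=False)

# Group records by an attribute and aggregate each group.
totals = sales.groupby("region")["amount"].sum()

# Group only the records that meet a condition.
big_counts = sales[sales["amount"] >= 100].groupby("region").size()
```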
3.4.4
Create functions to synthesize features from existing variables (e.g., mathematical operations, scaling, normalization).
1.9 Using Functions
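The feature-synthesis standard above mentions mathematical operations, scaling, and normalization. The sketch below shows one possible function for each, using only the standard library. The function names are hypothetical, and body mass index is used purely as an example of a derived feature.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def bmi(weight_kg, height_m):
    """Derive a new feature from two existing variables."""
    return weight_kg / height_m ** 2

scaled = min_max_scale([10, 20, 30])  # [0.0, 0.5, 1.0]
```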
4.1.1
Generate histograms and density plots to display data distributions.
1.6 Measures of Spread
2.6 Exploring with Visualizations
4.1.2
Create box plots and violin plots to show data spread and quartiles.
4.1.3
Construct Q-Q plots to assess data normality.
4.2.1
Generate scatter plots and pair plots to show relationships between variables.
2.6 Exploring with Visualizations
3.3 Data Visualizations
3.4 Line and Bar Charts
3.8 Linear Regression
3.9 Explore Bivariate Data
4.2.2
Generate correlation heatmaps to display feature relationships.
4.2.3
Plot decision boundaries to visualize data separations.
4.3.1
Generate bar charts and line plots to compare categorical data.
3.4 Line and Bar Charts
4.3.2
Create heat maps to display confusion matrices and tabular comparisons.
4.3.3
Plot ROC curves and precision-recall curves to evaluate classifications.
4.4.1
Generate line plots to show trends over time.
3.4 Line and Bar Charts
4.4.2
Create residual plots to analyze prediction errors.
4.4.3
Plot moving averages and trend lines.
3.7 Trends and Correlations
3.8 Linear Regression
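A simple moving average, as named in the standard above, can be sketched in plain Python. The function name `moving_average` and the sample series are assumptions for illustration; plotting the result is left to whichever charting library the course uses.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical daily measurements smoothed over 3 days.
smoothed = moving_average([1, 2, 3, 4, 5], window=3)  # [2.0, 3.0, 4.0]
```

The output is shorter than the input by `window - 1` points, because the first full window only closes at the third value.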
4.5.1
Draw conclusions by interpreting statistical measures (e.g., p-values, confidence intervals, hypothesis test results).
4.5.2
Evaluate model performance using appropriate metrics and visualizations (e.g., R-squared, confusion matrix, residual plots).
4.5.3
Identify patterns, trends, and relationships in data visualizations (e.g., correlation strength, outliers, clusters).
1.1 What is Data Science?
2.6 Exploring with Visualizations
3.3 Data Visualizations
3.7 Trends and Correlations
4.5.4
Draw actionable insights from analysis results.
5.1.1
Describe the key characteristics of Big Data (e.g., Volume, Velocity, Variety, Veracity).
5.1.2
Identify real-world applications of Big Data across industries (e.g., healthcare, finance, retail, social media).
2.2 Big Data and Bias
5.1.3
Analyze case studies of successful and unsuccessful Big Data implementations across industries (e.g., recommendation systems, fraud detection, predictive maintenance).
5.1.4
Identify common Big Data platforms and tools (e.g., Hadoop for distributed storage, Spark for data processing, Tableau for visualization, MongoDB for unstructured data).
5.2.1
Describe how organizations store structured and unstructured data.
5.2.2
Compare different types of data storage systems (e.g., data warehouse, data lakes, databases).
6.1.1
Contrast supervised and unsupervised learning.
6.1.2
Differentiate between classification and regression problems.
6.1.3
Evaluate model performance using appropriate metrics (e.g., accuracy, precision/recall, mean squared error, R-squared).
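The metrics listed in the standard above can be written out directly from their definitions. This is a sketch in plain Python with hypothetical function names and made-up labels; libraries such as scikit-learn provide equivalent, more robust implementations.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical classification labels (1 = positive class).
labels = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]
acc = accuracy(labels, predicted)  # 0.6
```

Accuracy, precision, and recall apply to classification; mean squared error and R-squared apply to regression.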
6.2.1
Perform linear regression for prediction problems.
3.8 Linear Regression
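Simple linear regression with one predictor can be sketched using the ordinary least squares formulas. The function name `fit_line` and the data points are assumptions for illustration, not the course's own implementation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Use the fitted line to predict a new value.
prediction = slope * 5 + intercept  # 11.0
```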
6.2.2
Perform multiple regression for prediction problems.
6.2.3
Perform logistic regression for classification tasks.
6.2.4
Implement Naive Bayes Classification using probability concepts.
6.2.5
Perform k-means clustering using distance metrics.
6.3.1
Apply standard methods to split data into training and testing sets.
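A shuffled hold-out split is one standard method. The sketch below uses only the standard library; the function name `split_rows`, the 80/20 ratio, and the seed are assumptions chosen for the example.

```python
import random

def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly, then hold out a fraction for testing."""
    rng = random.Random(seed)   # fixed seed makes the split repeatable
    shuffled = list(rows)       # copy so the caller's data is not reordered
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 10 row identifiers.
train, test = split_rows(range(10))
```

Shuffling before splitting avoids a test set that only contains the last rows of an ordered file, and fixing the seed lets results be reproduced.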