1. Business Problem Solving
Identify clear hypotheses or research questions:
• Identify multiple relevant business problems and opportunities for improvement through data analysis.
• Formulate well-defined hypotheses or research questions.
Apply appropriate data analysis techniques:
• Use techniques that match the data and business problem, with clear rationale for their selection.
Provide clear and impactful insights:
• Deliver clear, impactful insights that facilitate decision-making.
• Translate data analysis results into actionable recommendations.
2. Data Preparation
Identify data quality issues:
• Identify the most relevant data quality issues and explain clearly how each one affects the analysis.
Implement data cleaning/wrangling techniques:
• Consistently apply appropriate data cleaning techniques to all identified data quality issues.
• Handle null values and duplicates, drop unnecessary columns, manipulate strings, and format data.
• Address missing data and fully justify the imputation strategy.
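
The cleaning steps above can be sketched with pandas. This is a minimal illustration, not a prescribed workflow: the DataFrame, its column names, and the choice of median imputation are all assumptions made for the example, and a real project would justify the imputation strategy against its own data.

```python
import pandas as pd

# Hypothetical raw data: column names and values are invented for illustration.
raw = pd.DataFrame({
    "customer": [" Alice ", "BOB", "BOB", "carol", None],
    "amount": [10.0, None, None, 30.0, 20.0],
    "signup": ["2024-01-05", "2024-02-10", "2024-02-10", "2024-03-15", "2024-04-01"],
    "notes": ["", "", "", "", ""],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply each cleaning step once, in a fixed order, and return a new frame."""
    out = (
        df.drop(columns=["notes"])         # drop a column that carries no information
          .drop_duplicates()               # remove exact duplicate rows
          .dropna(subset=["customer"])     # a row without an identifier cannot be used
          .copy()
    )
    # String manipulation: trim whitespace and normalize capitalization.
    out["customer"] = out["customer"].str.strip().str.title()
    # Formatting: parse date strings into a datetime column.
    out["signup"] = pd.to_datetime(out["signup"])
    # Imputation (assumed strategy): fill missing amounts with the median,
    # which is less sensitive to outliers than the mean.
    out["amount"] = out["amount"].fillna(out["amount"].median())
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned)
```

Keeping the steps in one function makes it easy to apply the same cleaning consistently and to explain each step next to the code that performs it.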
3. Data Analysis
Apply EDA techniques:
• Employ sophisticated EDA techniques to analyze data, validate hypotheses, draw conclusions, and provide unique insights.
• Demonstrate comprehensive understanding of the data’s characteristics, patterns, and relationships.
• Utilize a range of numerical measures and graphical methods according to data type.
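
As a rough sketch of choosing numerical measures by data type, the example below summarizes numeric columns with descriptive statistics and a correlation, and a categorical column with frequency counts. The dataset and column names are hypothetical, and only numerical measures are covered here.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration.
df = pd.DataFrame({
    "region": ["North", "South", "North", "East", "South", "North"],
    "units": [12, 7, 15, 9, 8, 14],
    "revenue": [120.0, 75.0, 160.0, 85.0, 90.0, 150.0],
})

# Numeric columns: central tendency, spread, and quartiles.
numeric_summary = df[["units", "revenue"]].describe()

# Categorical column: frequency of each category.
region_counts = df["region"].value_counts()

# Relationship between two numeric columns (Pearson correlation by default).
correlation = df["units"].corr(df["revenue"])

# Relationship between a categorical and a numeric column.
median_revenue_by_region = df.groupby("region")["revenue"].median()

print(numeric_summary)
print(region_counts)
print(median_revenue_by_region)
```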
Use inferential statistics:
• Use inferential statistics, such as hypothesis tests with p-values, to check for significant correlations and to test for normality.
• Apply appropriate data transformations to ensure normality when necessary.
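
One possible way to carry out these checks is sketched below with SciPy. The data is synthetic, the 0.05 significance level is an assumed convention, and the log transform is only one common option for right-skewed data. The Shapiro-Wilk and Spearman tests are example choices, not the only appropriate ones.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

# Synthetic, right-skewed data invented for illustration only.
spend = rng.lognormal(mean=3.0, sigma=0.8, size=200)
visits = 0.5 * np.log(spend) + rng.normal(scale=0.3, size=200)

ALPHA = 0.05  # assumed significance level

# Shapiro-Wilk test: the null hypothesis is that the sample is normally distributed.
# A p-value below ALPHA is evidence against normality.
_, p_raw = stats.shapiro(spend)

# Repeat the test after a log transform to see whether the evidence
# against normality weakens.
_, p_log = stats.shapiro(np.log(spend))

# Spearman rank correlation does not assume normality, so it can be used
# on the untransformed data.
rho, p_corr = stats.spearmanr(spend, visits)

print(f"Shapiro p-value (raw): {p_raw:.4g}")
print(f"Shapiro p-value (log): {p_log:.4g}")
print(f"Spearman rho: {rho:.3f}, p-value: {p_corr:.4g}")
print("Correlation significant:", p_corr < ALPHA)
```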
4. Data Visualization and Communication
Use appropriate data visualization techniques:
• Create interactive and informative visualizations using Python libraries or visualization tools such as Tableau or Power BI.
• Communicate insights effectively with well-designed visualizations.
Develop a clear and effective dashboard:
• Ensure the dashboard is clear, well-organized, visually appealing, and supports decision-making.
• Define, measure, and plot KPIs and metrics with great attention to detail.
5. Coding and Data Analysis
Develop proficiency in data analysis:
• Apply rigorous methods for predictive analysis using machine learning or descriptive analysis using statistics and SQL.
• Perform data preprocessing, address all issues, and justify the preprocessing steps.
• Apply appropriate machine learning models, ensure model assumptions are checked, and perform hyperparameter tuning.
• Evaluate model performance using appropriate metrics and make necessary adjustments.
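
The preprocessing, tuning, and evaluation steps can be sketched with scikit-learn as follows. The dataset is synthetic, and the model, parameter grid, and F1 metric are assumptions chosen for illustration. Checking model assumptions is not shown and would depend on the model selected.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data, for illustration only.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)

# Hold out a test set that is never seen during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Putting preprocessing inside the pipeline means the scaler is fit only on
# the training folds during cross-validation, which avoids data leakage.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning: 5-fold cross-validated grid search over the
# regularization strength C (grid values are an assumption).
param_grid = {"model__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Final evaluation on the held-out test set with the same metric.
test_f1 = f1_score(y_test, search.predict(X_test))

print("Best parameters:", search.best_params_)
print(f"Cross-validated F1: {search.best_score_:.3f}")
print(f"Test F1: {test_f1:.3f}")
```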
Use SQL and Python:
• Use SQL for data manipulation and advanced queries, and combine it with Python for comprehensive data analysis.
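
A small sketch of combining SQL with Python, using the standard-library sqlite3 module and an in-memory database. The tables, columns, and revenue threshold are hypothetical. A real project would connect to its own database and might load results into pandas instead.

```python
import sqlite3

# In-memory database with hypothetical tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'North');
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 80.0), (4, 3, 20.0);
""")

# SQL does the joining, aggregation, and filtering.
# The threshold is passed as a parameter rather than formatted into the string.
QUERY = """
SELECT c.region, COUNT(o.id) AS n_orders, SUM(o.amount) AS revenue
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.region
HAVING SUM(o.amount) >= ?
ORDER BY revenue DESC;
"""

MIN_REVENUE = 50.0  # assumed threshold
rows = conn.execute(QUERY, (MIN_REVENUE,)).fetchall()
conn.close()

# Python takes over for further analysis of the query result.
revenue_by_region = {region: revenue for region, _, revenue in rows}
top_region = max(revenue_by_region, key=revenue_by_region.get)

print(rows)
print("Top region:", top_region)
```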
6. Clean and Modular Code
Write clean, modular, and efficient code:
• Ensure no unused code is present.
• Write functions that are modular and reusable, saved in .py files.
• Apply naming conventions consistently.
• Avoid hard-coded values and global variables; use config files instead.
• Organize files and folders appropriately.
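
The points above can be illustrated with a reusable function whose threshold comes from configuration instead of being hard-coded. This is a sketch under assumptions: the config keys, the function name flag_outliers, and the z-score rule are invented for the example. In a project the JSON would be read from a config file and the function would live in its own .py module.

```python
import json
from dataclasses import dataclass
from statistics import mean, pstdev

# Stand-in for the contents of a config file (for example config.json, a
# hypothetical name). Kept inline so the sketch is self-contained.
CONFIG_TEXT = '{"outlier_z_threshold": 2.0, "target_column": "amount"}'


@dataclass(frozen=True)
class Config:
    """Typed, read-only view of the configuration values."""
    outlier_z_threshold: float
    target_column: str


def load_config(text: str) -> Config:
    """Parse JSON text into a Config object."""
    return Config(**json.loads(text))


def flag_outliers(values: list[float], threshold: float) -> list[bool]:
    """Return True for each value whose absolute z-score exceeds threshold.

    The threshold is a parameter, so the function holds no hard-coded values
    and depends on no global state.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [False] * len(values)  # all values identical: nothing to flag
    return [abs(v - mu) / sigma > threshold for v in values]


config = load_config(CONFIG_TEXT)
amounts = [10, 12, 11, 13, 12, 11, 10, 12, 11, 500]
flags = flag_outliers(amounts, config.outlier_z_threshold)
print(flags)
```

Changing the threshold then means editing the config, not the code, and the function can be imported and tested on its own.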
7. Version Control with Git and GitHub
Track changes in the source code:
• Make at least one commit per project day with clear and precise descriptions.
• Use atomic commits and separate branches for development.