Smart Infrastructure
General Methodology
- Business understanding: What is the problem to solve?
- What is holding Londoners back from cycling?
- Analytics approach: How can I use data to answer Q1?
- The data shows that only 2% of trips are made by bicycle. Why? Try to learn the causes from the data.
- Types of analytics:
- Descriptive: What happened
- Diagnostic: Why it happened
- Predictive: What will happen
- Prescriptive: How to make it happen
- Data requirements: What existing data do I need to analyze the problem?
- Cyclists’ casualties data
- City data
- Cycle thefts
- Data collection: collect new data
- Try to collect data using sensors
- Data understanding: Verify if the data collected can solve the problem
- Using tools like univariate analysis, pairwise correlation, and histograms
- Data preparation (loop back to data collection): Determine whether the data is usable as is or whether preparation must be done first
- Possible problems:
- Structural error
- Merging of data
- Outlier analysis
- Redundancy
- Collected data contains observations (values) and attributes / features (keys). Attributes can be:
- Continuous or Discrete
- Numeric or nominal (labels like "London" or "Beijing")
- Modeling: Visualizing the data to answer questions
- Using ML: split the dataset into training, validation, and test sets
- Train: to fit the model
- Validate: provides an unbiased evaluation of the model while training (tuning hyper-parameters)
- Test: provides an evaluation of the final model fit
- Evaluation: Does the model answer the question, or is a change needed?
- Deployment: Using the model in practice
- Feedback (loop back to modeling): Use feedback and new data to re-train or fine-tune the model where needed, and answer the initial question.
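
The data understanding tools listed above (univariate analysis, histograms, pairwise correlation) can be sketched with the Python standard library alone. This is only an illustration: the function names and the `daily_trips` / `rainfall_mm` values are hypothetical and made up for the example, not real London data.

```python
import math
import statistics
from collections import Counter


def univariate_summary(values):
    """Describe a single numeric attribute on its own."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
    }


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long attributes."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / (spread_x * spread_y)


# Hypothetical observations: cycle trips counted per day and rainfall that day.
daily_trips = [120, 135, 150, 90, 60, 180, 200]
rainfall_mm = [2, 1, 0, 8, 12, 0, 0]

print(univariate_summary(daily_trips))
print(histogram(daily_trips, bin_width=50))
print(pearson(daily_trips, rainfall_mm))
```

A strongly negative coefficient here would only suggest that rain and cycling move in opposite directions in this sample; it would not by itself establish a cause.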
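
The train / validate / test split from the modeling step can be sketched as below. It assumes a simple random split with a 60/20/20 ratio; the ratio, the seed, the `split_dataset` name, and the `trips` stand-in data are all assumptions chosen for illustration, and real projects often use a library helper or a time-based split instead.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]                    # used to fit the model
    validate = shuffled[n_train:n_train + n_val]  # used while tuning hyper-parameters
    test = shuffled[n_train + n_val:]             # held back for the final evaluation
    return train, validate, test


# Hypothetical stand-in for 100 trip records.
trips = list(range(100))
train, validate, test = split_dataset(trips)
print(len(train), len(validate), len(test))
```

Keeping the test set untouched until the end is what lets it serve as a check on the final model, since no tuning decision has been made against it.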