# Smart Infrastructure

- [Smart Infrastructure](#smart-infrastructure)
  - [General Methodology](#general-methodology)

## General Methodology

1. Business understanding: What is the problem to solve?
   - What is holding Londoners back from cycling?
1. Analytics approach: How can I use data to answer the question from step 1?
   - The data shows that only 2% of trips are made by bicycle. Why? Try to learn the causes from the data.
   - Types of analytics:
     - Descriptive: What happened?
     - Diagnostic: Why did it happen?
     - Predictive: What will happen?
     - Prescriptive: How can we make it happen?
1. Data requirements: What existing data do I need to analyze the problem?
   - Cyclists’ casualties data
   - City data
   - Cycle thefts
1. Data collection: Collect new data
   - Try to collect data using sensors
1. Data understanding: Verify whether the collected data can solve the problem
   - Using tools like _univariate_ analysis, _pairwise correlation_, and histograms
1. Data preparation (loop back to data collection): Decide whether the data is usable as is or must be prepared first
   - Possible problems:
     - Structural errors
     - Merging of data
     - Outlier analysis
     - Redundancy
   - Collected data contains observations (values) and attributes / features (keys), which can be:
     - Continuous or discrete
     - Numeric or nominal (labels like "London" or "Beijing")
1. Modeling: Visualizing the data to answer questions
   - Using ML: split the dataset into train, validation, and test sets
     - Train: to fit the model
     - Validate: provide an unbiased evaluation while training (tuning hyper-parameters)
     - Test: provide an evaluation of the final model fit
1. Evaluation: Does the model answer the question, or is a change needed?
1. Deployment: Using the model in practice
1. Feedback (loop back to modeling): Use feedback and new data to possibly re-train or fine-tune the model, and answer the initial question.
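
The train / validate / test split mentioned in the modeling step can be sketched in a few lines of Python. This is only an illustration, not a method prescribed by these notes: the function name `split_dataset`, the 60/20/20 proportions, the fixed seed, and the placeholder `observations` list are all assumptions made for the example.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and split them into train, validation and test lists.

    The fractions and the seed are illustrative defaults (assumptions),
    not values taken from the notes.
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")

    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]                     # used to fit the model
    validate = shuffled[n_train:n_train + n_val]   # used while tuning hyper-parameters
    test = shuffled[n_train + n_val:]              # held out for the final evaluation
    return train, validate, test


# Hypothetical observations: ten placeholder records standing in for real data.
observations = list(range(10))
train, validate, test = split_dataset(observations)
print(len(train), len(validate), len(test))
```

Shuffling before slicing keeps the three subsets from inheriting any ordering in the collected data (for example, records sorted by date or borough), and keeping the test set untouched until the end is what lets it serve as the final evaluation.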