EBU6504_smart_arch_notes/1-intro-to-smart-infra.md
2025-01-07 18:19:28 +08:00

# Smart Infrastructure
<!--toc:start-->
- [Smart Infrastructure](#smart-infrastructure)
- [General Methodology](#general-methodology)
<!--toc:end-->
## General Methodology
1. Business understanding: What is the problem to solve?
- What is holding Londoners back from cycling?
1. Analytics approach: How can I use data to answer the question from step 1?
- The data shows that only 2% of trips are made by bicycle. Why? Try to learn
the causes from the data.
1. Data requirements: What existing data do I need to analyze the problem?
- Cyclist casualty data
- City data
- Cycle thefts
1. Data collection: Collect new data
- Try to collect data using sensors
1. Data understanding: Verify if the data collected can solve the problem
- Using tools such as _univariate analysis_, _pairwise correlation_, and histograms
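
The tools named in the data understanding step can be sketched with the Python standard library alone. This is a minimal illustration, not the course's method: the variable names `cycle_lane_km` and `daily_cyclists`, their values, and the helper functions are all made up for the example.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev

# Made-up toy values for illustration only, not real measurements.
cycle_lane_km = [1, 2, 3, 4, 5]
daily_cyclists = [10, 20, 30, 40, 50]


def univariate_summary(values):
    """Univariate analysis: describe one attribute on its own."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
    }


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def pearson(xs, ys):
    """Pairwise (Pearson) correlation between two numeric attributes."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


summary = univariate_summary(daily_cyclists)
bins = histogram(daily_cyclists, bin_width=20)  # {0: 1, 20: 2, 40: 2}
r = pearson(cycle_lane_km, daily_cyclists)      # close to 1.0: the toy data is perfectly linear
```

A correlation near 1 or -1 suggests an attribute is worth keeping for the question at hand, while a histogram with a long tail hints at outliers to examine during data preparation.
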
1. Data preparation (loop back to data collection): Check whether the data is
usable as is, or whether preparation must be done first.
- Possible problems:
  - Structural errors
  - Merging of data
  - Outlier analysis
  - Redundancy
- The collected data contains observations (values) and attributes/features
(keys). Attributes can be:
  - Continuous or discrete
  - Numeric or nominal (labels like "London" or "Beijing")
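
The preparation problems listed above can each be handled with a small helper. The sketch below is one possible approach under stated assumptions: the records, the field names (`borough`, `thefts`, `lane_km`), and the 1.5 × IQR outlier rule are illustrative choices, not something the notes prescribe.

```python
from statistics import quantiles

# Hypothetical records; field names and values are made up.
raw = [
    {"borough": "Camden ", "thefts": 12},
    {"borough": "camden", "thefts": 12},        # duplicate once labels are cleaned
    {"borough": "Hackney", "thefts": 15},
    {"borough": "Islington", "thefts": 14},
    {"borough": "Lambeth", "thefts": 13},
    {"borough": "Newham", "thefts": 11},
    {"borough": "Southwark", "thefts": 16},
    {"borough": "Westminster", "thefts": 120},  # suspiciously large value
]
lane_km = {"camden": 8.0, "hackney": 11.5}      # second source to merge in


def fix_structure(rows):
    """Structural errors: normalise inconsistent nominal labels."""
    return [{**r, "borough": r["borough"].strip().lower()} for r in rows]


def drop_duplicates(rows):
    """Redundancy: keep the first copy of each identical record."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def iqr_outliers(values, k=1.5):
    """Outlier analysis: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


def merge(rows, extra, field):
    """Merging: attach a value from another source, None if missing."""
    return [{**r, field: extra.get(r["borough"])} for r in rows]


clean = drop_duplicates(fix_structure(raw))
outliers = iqr_outliers([r["thefts"] for r in clean])  # [120]
merged = merge(clean, lane_km, "lane_km")
```

Flagged outliers are candidates for inspection rather than automatic deletion, and the `None` values left by the merge show where going back to data collection may be needed.
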
1. Modeling: Visualizing the data to answer questions
- Using ML: split the dataset into train, validation, and test sets
  - Train: used to fit the model
  - Validate: provides an unbiased evaluation during training (for tuning
hyper-parameters)
  - Test: provides an evaluation of the final model fit
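
A minimal sketch of the three-way split described above, using only the standard library. The function name `train_validate_test_split` and the 60/20/20 proportions are assumptions chosen for illustration; the notes do not fix a ratio.

```python
import random


def train_validate_test_split(rows, validate_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows, then cut them into train, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible

    n_test = int(len(shuffled) * test_frac)
    n_validate = int(len(shuffled) * validate_frac)

    test = shuffled[:n_test]
    validate = shuffled[n_test:n_test + n_validate]
    train = shuffled[n_test + n_validate:]
    return train, validate, test


# Ten placeholder observations split 6 / 2 / 2.
train, validate, test = train_validate_test_split(range(10))
```

Shuffling before cutting avoids a split that follows the order in which the data was collected, and keeping the test set untouched until the end is what makes its evaluation of the final model trustworthy.
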
1. Evaluation: Does the model answer the question, or is a change needed?
1. Deployment: Using the model in practice
1. Feedback (loop back to modeling): Use feedback and new data to re-train or
fine-tune the model if needed, and to answer the initial question.