diff --git a/pages/总复习2023t1.md b/pages/总复习2023t1.md
index 9e2eed3..87205ff 100644
--- a/pages/总复习2023t1.md
+++ b/pages/总复习2023t1.md
@@ -1132,11 +1132,13 @@
   - LATER advantages and disadvantages of distributed databases
 - DONE XML
   - LATER XML definition and basic concepts #flashcard
+    id:: 648974ba-afab-457e-9633-488450e9e16f
     - eXtensible Markup Language
     - A meta-language (i.e. a language for describing other languages) that enables designers to create their own customised tags to provide functionality not available with HTML.
   - LATER Relational model versus XML #flashcard
+    id:: 648974ba-d417-4eef-be28-46cd5894c5c7
     - SQL
       - is a special-purpose programming language
       - You can: manage data in a relational databases.
@@ -1145,6 +1147,7 @@
       - You can: design ways of describing information (text or data), usually for storage, transmission, or processing by a program (you can use it in combination with a programming language).
       - It says nothing about what you should do with the data (although your choice of element names may hint at what they are for).
   - LATER Well-formed XML, Valid XML #flashcard
+    id:: 648974ba-fb70-4207-8010-a8ddda35ccf7
     - Adheres to basic structural requirements
       - Single root element
       - Matched tags, proper nesting
       - Unique attributes within elements
@@ -1155,40 +1158,50 @@
   - LATER Practice reading and writing XML, XSD
 - DONE Data Mining
   - concept #flashcard
+    id:: 648974ba-bf4c-4046-b7ce-510596ad421a
     - The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions.
   - different applications #flashcard
+    id:: 648974ba-7440-4ac2-8730-b33e9f50570c
     - Retail / Marketing
     - Banking
     - Insurance
     - Medicine
   - basic techniques
     - predictive modelling, #flashcard
+      id:: 648974ba-a007-420c-87db-1a029c1a39e6
       - uses observations to form a model of the important characteristics of some phenomenon
     - database segmentation, #flashcard
+      id:: 648974ba-18a0-474e-96de-6a824969d0ec
       - Uses unsupervised learning to discover homogeneous subpopulations in a database to improve the accuracy of the profiles.
     - link analysis, #flashcard
+      id:: 648974ba-0868-469f-9b8f-94a44163c87f
       - Establishing links, called associations, between the individual records, or sets of records, in a database.
     - deviation detection. #flashcard
+      id:: 648974ba-a77e-47ba-9f0d-6ed14e880333
       - Identifies outliers, which express deviation from some previously known expectation and norm.
 - DONE NoSQL
   - the motivation for NoSQL #flashcard
+    id:: 648974ba-91af-424f-b392-928e947740de
     - By giving up ACID constraints, one can achieve much higher performance and scalability.
   - explain the concepts of NoSQL #flashcard
+    id:: 648974ba-370b-44a8-9474-5b58d1d0dd28
     - NoSQL databases (aka "not only SQL") are non-tabular databases and store data differently than relational tables. NoSQL databases come in a variety of types based on their data model. The main types are document, key-value, wide-column, and graph. They provide flexible schemas and scale easily with large amounts of data and high user loads.
   - explain the application areas of NoSQL #flashcard
+    id:: 648974ba-b39b-47b7-8b9f-ca9250bef8ba
     - NoSQL is an alternative, non-traditional DB technology to be used in large scale environments where (ACID) transactions are not a priority.
   - CAP theorem: #flashcard
+    id:: 648974ba-910d-42ae-89a9-5017194f6827
     - There are 3 main properties for distributed management:
       1. Consistency → A data item has the same value at the same time (to ensure coherency).