Friday, September 28, 2012

Part 4.2 Relational Data Modeling



4.2 Relational Data Modeling

     4.2.1     Relational Data Model Concepts
     4.2.2     ER to Relational Mapping Algorithm
     4.2.3     Relational Database Design
     4.2.3.1 Relational Database Design Approach
     4.2.3.2 Normalization
    
4.2.1 Relational Data Model Concepts

Relation Data Model

  • A relational data model is a representational or implementation or record-based data model
  • The model uses the concept of a mathematical relation - which looks somewhat like a table of values
  • The model has its theoretical basis in set theory and first order predicate logic

 4.2.2 ER to Relational Mapping Algorithm




 
Logical Design


Based upon the conceptual data model

Four key steps

1. Develop a logical data model for each known user interface for the application using normalization principles.

2. Combine normalized data requirements from all user interfaces into one consolidated logical database model

3. Translate the conceptual E-R data model for the application into normalized data requirements

4. Compare the consolidated logical database design with the translated E-R model and produce one final logical database model for the application



Step 1 – (Identify Strong Entities)

Step 2 -- (Identify Weak Entities)
Step 3 -- (Identify Relationships)
Step 4 -- (Identify dependent relationships)
Step 5 -- (Identify dependent relationships if required)
Step 6 -- (Identify multi dependent relationships)
Step 7 -- (Identify for any new sub tables)
Step 8 – (Specialization)


4.2.3 Relational Database Design

4.2.3.1: Relational Database Design Approaches

Database Design Approaches


Bottom-up (Design by synthesis)

  • The basic relationship among individual attributes is used as starting point to build up relations
  • Not very popular in practice
  • Suffers from the problem of collecting a large number of binary attribute relationships as the starting point
Top-down (Design by Analysis)

  • Starts with a number of groupings of attributes into relations that have already been obtained from conceptual design and mapping activities
  • Design by Analysis is then applied to the relations individually and collectively, leading to further decomposition until all desirable properties are met

4.2.3.2 Normalization

Normalization

  • Normalization of data can be looked upon as a process of analyzing the given relational schemas based on their functional dependencies and primary keys to
          - minimize redundancy
          - minimize insertion, deletion, and modification anomalies

Normal Form

  • Normal form of a relation refers to the highest normal form condition that it meets
  • Normal forms, when considered in isolation from other factors, do not guarantee a good design. The normalization through decomposition must confirm the following two properties for a good database design:
          -  The lossless join or nonadditive join property (Mandatory)

          -  The dependency preservation property (Desirable)


 

Wednesday, September 26, 2012

Part 4.1 Conceptual Data Modeling

4. Data Modeling

4.1 Conceptual Data Modeling
      4.1.1 Introduction to Conceptual Data Model
      4.1.2 Stages in Conceptual Modeling
      4.1.3 Components of a Conceptual Data Model
      4.1.4 ER Modeling Basic Concepts
      4.1.5 Enhanced ER Modeling
      4.1.6 Guidelines for ER Modeling
4: 4.1 Conceptual Data Model
Conceptual Data Model
A conceptual schema or conceptual data model is a map of concepts and their relationships. This describes the semantics of an organization and represents a series of assertions about its nature. Specifically, it describes the things of significance to an organization (entity classes), about which it is inclined to collect information, and characteristics of (attributes) and associations between pairs of those things of significance (relationships).
A conceptual data model identifies the highest-level relationships between the different entities.
Features of conceptual data model include:
Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.
4.1.2 Stages in Conceptual Modeling

Main stages in conceptual modeling are as follows:

--  Identification of requirements (done in previous lesson)
--  Designing of solutions
--  Evaluation of solutions
4.1.3 Components of a Conceptual Data Model
  
4.1.4 ER Modeling Basic Concepts
ER Model

The ER model is a conceptual model
Describes data as entities, relationships and attributes
No standard notation for displaying ER diagrams
(We will choose one among several alternatives for this presentation)
Entity
An entity is a “thing” in the real world with an independent existence
An entity may be an object with a physical existence
For example, person, car, house
An entity may be an object with a conceptual existence
For example, company, job, university-course

 
4.1.5 Enhanced ER Modeling
  
Enhanced ER Model, Includes all the modeling concepts of the ER model

In addition, it includes the following concepts:

  • Subclass & super-class
  • Specialization & generalization
  • Category
  • Attribute & relationship inheritance
Subclass & Super-class


  • In many cases an entity type has numerous sub-groupings of its entities that are meaningful and need to be represented explicitly because of their significance to the database application. Each of these sub-groupings are called subclass.
  • The entity type on which these subclasses are formed is called super-class
  • The relationship between a super-class and any one of its subclasses is called a super-class/subclass, a class/subclass or an IS-A (or IS-AN) relationship, e.g., a SECRETARY IS-AN EMPLOYEE
  • A member entity of a subclass represents the same real-world entity as some member of the super-class, but in a distinct specific role
Specialization (Top to Bottom Approach)


Specialization is the process of defining a set of subclasses of an entity type


In addition, it allows us to do the following:
  •  Establish additional specific attributes with each subclasses
  •  Establish additional specific relationship types between each subclass and other entity types or other subclasses
Generalization (Bottom to Top Approach)


Generalization refers to the process of defining a generalized entity type from the given entity types


Generalization process can be viewed as being functionally the inverse of the specialization process   
 


  A specialization lattice with multiple inheritance for a UNIVERSITY database.





Monday, September 24, 2012

Part 3: How to Data Model



3.1: Identify Entity Types

Entity

Object that can be observed and classified by its properties and characteristics

Business definition with a clear boundary

Characterized by a noun


Example: Product, Employee

3.2: Identify Attributes

Attributes


Characteristics and properties of entities

Example :

Book Id, Description, book category are attributes of entity “Book”

Attribute name should be unique and self-explanatory

Primary Key, Foreign Key, Constraints are defined on Attributes

Relationship


3.3: Identify Relationships

Relationship between entities - structural interaction and association

described by a verb

Cardinality

1-1
1-M
M-M


Example : Books belong to Ambarish


3.4: Apply Naming Conventions

Naming Standards


1. Be correct, that is, both functionally and technically accurate.
2. Be clear, avoiding the use of vague terms such as “handle” or "process.”
3. Be concise, using the fewest number of words possible, avoiding articles and needless prepositions.
4. Be unique, avoiding wording similar to that of any other name.
5. Be atomic, representing only a single concept.
6. Contain only letters of the alphabet, numbers, and word separators.
7. Follow the specified format for names
8. Reflect common terminology used throughout Federal Student Aid.
9. Use complete names wherever possible instead of abbreviations or acronyms.
10. Use only approved abbreviations or acronyms when the data modeling tool restricts the length of the name.


3.5: Assign Keys


Keys

Super Key
Primary Key
Candidate Key
Secondary Key

Super Key

A super key is a column or set of columns that uniquely identifies a row within a table.

Given table: EMPLOYEES{employee_id, firstname, surname, sal}

Possible superkeys are:

{employee_id}
{employee_id, firstname}
(employee_id, firstname, surname, sal}

Only the the minimal superkey - {employee_id} - will be considered as a candidate key.










Friday, September 21, 2012

Part 2: Data Model



Advanced Data Modeling and Architecture --  Table of Content


Part 1: Data Modeling Overview



2.1 : Definition of a Data Model


Model
         Representation of a set of business requirements in a standard structured framework understood by the users

Data Model
 
A collection of concepts that can be used to describe the structure of a database
         Most data models include a set of basic operations for query and updates on the database
         It is becoming more common to include concepts in the data model to specify the dynamic aspect or behavior defined operations
Data Model Instances

   Conceptual Model

   Logical Model

   Physical Model


2.2 : Data Model Types

Database Model
  A database model is a theory or specification describing how a  database is structured and used. Several such models have been suggested. Common models include
–  Flat model
–  Hierarchical model
–  Network model
–  Relational model:
–  Object-relational model
–  Dimensional Model














2.3 Data Modeling Stages & Deliverables



2.4: Main Phases of Database Design





2.5: Data Modeling In DWH Environment


























2.6: Data Modeling Approaches





2.7: Data Modeling Life Cycle





2.8: Data Models & Zachman Framework





















Monday, September 10, 2012

Part 1: Data Modeling Overview




1.1: Prerequisites for this Session

Pre-Requisites

  • Should be aware of DBMS ConceptsS
  • Should be aware of Data warehouse concepts
  • Should be aware of Data base concepts (any database)
  • Should have 2+ years of experience on any ETL Tool

1.2 : Expectations from Session 




1.3: Scope of the Session
 

1. Data Model Overview
2. Data Models
3. How to Model Data
4. Data Modeling
Conceptual Data Model
Relational Data Modeling
Dimensional Data Modeling
Logical Data Modeling
Physical Data Modeling
5. Data Warehouse Database Administration and Performance Improvements
6. Data Modeling for Data Warehouse Environment
7. Introduction to Erwin Tool

Data Model Overview




1.4: Introduction to Data Models

Fundamentals of Database Systems
 
Data
A known fact that can be recorded and that have implicit meaning

Database
  A collection of related data with the following implicit properties
A Database represents some aspect of real world, sometimes called the Universe of Discourse (UoD)
A database is a logically coherent collection of data with some inherent meaning
A Database is designed, built, and populated with data for a specific purpose
                                   
 Database Management System (DBMS)
A collection of programs that enables users to create and maintain a database
A general purpose software system that facilitates the process of defining, constructing and manipulating database for various applications
Database System
Database and DBMS software together forms a database system

What is Data Modeling?

Data modeling is a technique for exploring the data structures needed to support an organization’s information need.
It would be a conceptual representation or a replica of the data structure required in the database system.
A data model focuses on which data is required and how the data should be organized.
At the conceptual level, the data model is independent of any hardware or software constraints.
 Why Use Data Modeling?
        Leverage
Data model serves as a blueprint for the database system.
Conciseness
Data model functions as an effective communication tool for discussions with the users.
Data Quality
        Data model acts as a bridge from real-world information to database storing relevant data content. 

 What Makes a Good Data Model?
Completeness
Ensure that every piece of information required for a System is recorded and maintained.
Non-Redundant
One fact should be recorded only once. Repetition may result in inconsistency and increased storage requirements.
Adherence to Business Rules
The collected data is to be recorded by considering all business rules. It should not violate any rule.
 Data Reusability
Design a data structure to ensure re-usability.
Stability and Flexibility
A model needs to be flexible enough to adopt to new changes without forcing the programmer to re-write the code.
Elegance
A data model should neatly present the required data in the least possible number of groups or tables.
Communication
A model should present the data in a manner understandable to all stakeholders.
Integration
A good model is compatible with the existing and future systems.
1.5: Sample ER Data Model

1.6: Challenges, Benefits &  Opportunities
Challenges
Organizations today have vast quantities of data. Although this data contains information that is useful to the business, it can be extremely difficult to gather and report on this information. There are several key challenges that need to be addressed:
Ø  Discovering, collecting, and transforming data into a single source of record.
Ø  Ensuring that data is relevant and accurate for business reporting.
Ø  Storing historical data in a format that enables fast searches across large amounts of data.
Opportunities
If the data that a business holds can be unlocked and provide meaningful information to business users, there are several opportunities:
Ø   Business users can have access to relevant information to enable them to    make informed decisions.
Ø   Fast queries make data more accessible and historical data enables trends to be identified.
Ø   Data can be assessed at every level -- from an individual purchase to the total sales of a multinational corporation.
 
Benefits
Designing data warehouses correctly by using a data model will help meet many of today’s data challenges. Key benefits include:

Ø    Designing structures specifically to enable fast querying for business-centric reporting.
   
Ø   Ensuring that business requirements are met, and reports are accurate and meaningful.
Ø  Documenting source and target systems correctly to aid development, ensure effective version control, and enhance understanding of the systems.
 1.7: Why data modeling Tool
 
Why Data modeling tool
Ø   Improves productivity among developers when database designs are divided, shared, and reused
Ø   Establishes corporate modeling standards
Ø  Creates good documentation (metadata) in variety of useful formats
Ø   Ensures consistency, reuse, and integration of enterprise data
Ø   Enable creation the data model in one notation and conversion it to another notation without losing the meaning of the model
Ø    Saves time by accelerating the creation of high-quality, high-performance physical database from logical model
Ø    Conserves resources and improves accuracy by synchronizing model and database
1.8: Requirements for a good data modeling Tool
Requirements for a good data modeling tool
Ø   Diagram Notation
  Both ER & dimensional modeling notation must be available
Ø   Reverse Engineering
   Creation of a model based on the source data in the operational environment as well as from other external sources of data
Ø   Forward Engineering
    Creation of the data definition language (DDL) for the target tables in the data warehouse databases
Ø    Source to Target Mapping
     Linking of source data in the operational systems and external sources to the data in the databases in the target data warehouse
Ø    Data Dictionary or Repository
     Contains metadata that describes the data model

1.9: Overview of Data Model Tools

 
Data Modeling tools available in market
AllFusion ERwin Data Modeler (www.ca.com) An industry-leading   data modeling solution that can help you create and maintain databases,   data warehouses and enterprise data models
PowerDesigner(www.sybase.com) Designs and generates the database   schema through true bi-level (conceptual and physical) relational   database modeling supports data warehouse specific modeling   techniques
Oracle Designer (www.oracle.com)
ER/Studio (www.embarcadero.com)
IBM VisualAge DataAtlas (www.software.ibm.com)
Popkin System Architect (www.popkin.com)
CAST DB-Builder (www.castsoftware.com)
For Data Warehouse Data Modeling tools, please refer to
http://www.databaseanswers.org/modelling_tools.htm

Summary
 
In this module, you learned about the following:
Data Model Overview
Prerequisites for this Course
Expectations from this training
Scope of the training
Introduction to Data Models
Fundamentals of database Systems
Definition of a Model
What is Data Modeling
Why use Data Modeling
What Makes a Good Data Model
Sample Data Models
Data Model Tools