• No results found

DATABASE DESIGN I -

N/A
N/A
Protected

Academic year: 2022

Share "DATABASE DESIGN I -"

Copied!
42
0
0

Loading.... (view fulltext now)

Full text

(1)

DATABASE DESIGN I - 1DL300

Fall 2011"

An introductory course on database systems

http://www.it.uu.se/edu/course/homepage/dbastekn/ht11

Erik Zeitler

Uppsala Database Laboratory

Department of Information Technology, Uppsala University, Uppsala, Sweden

(2)

Introduction to the Relational Model 


Elmasri/Navathe ch 3,8


Padron-McCarthy/Risch ch 5, 6"

Anne - Kathrin Peters Room 2404

Department of Information Technology Uppsala University, Uppsala, Sweden

course member in course

persnr

gives

n m n

1

trainer name

id

name

max personnr

address

e-mail

tel-nr street

nr post code

city

payed

price

member in trainer

pnr id tid pnr

Conceptual Schema Conceptual

Design

Logical Design

Conceptual Model e.g. UML, ERM

Implementation Model (Data Model of DBMS here RM)

Logical Schema

(3)

Overview"

1.  Key aspects of last lecture (L1) 2.  Goal for today’s lecture (L2)

3.  Important terms (Erik didn’t present)

4.  Comparison and discussion of ER (assignment)

1.  In groups

2.  With audience

5.  ERM à RM – how to?

6.  Little assignment for L3

(4)

Key aspects of lecture L1"

•  Why lecture?

–  Databases everywhere –  Databases mainly tables –  Problems might occur

•  Systematic design process

–  Overall: ERM à RM –  ERM components

(5)

Outline of a database system"

Database System (DS)

Applications

procedures / statements

Users’

interactive queries

DB schema DB

Database Management System (DBMS)

“DS: DBMS & DB”

(6)

“Just do a table!???”"

Member Street PCod City E-Mail Tel-Nr Course Max Jan

Johannson

Kyrkogårds gatan 10

75313 Uppsala johannson@g mail.com

076429384 Aquafit 50

Kalle Berglund

Eklundshov svägen 1

75237 Uppsala k.berglund@h otmail.com

0713284938 Soccer 25

Anna

Andersson

Kungsgatan 20

75332 Uppsala andersson@g mail.com

074628394 Climbing 8

Anna Anderson

Kungsgatan 20

75332 Uppsala anderson@gm ail.com

074628394 Aquafit 50

Anna

Andersson

Kungsgatan 20

75332 Uppsala andersson@g mail.com

074628394 Swimming 30

Insertion Update

Anomalies on… Deletion

(7)

Solution: Split"

Name Street PCode City E-Mail Tel-Nr Jan

Johannson

Kyrkogårdsg atan 10

75313 Uppsala johannson@

gmail.com 0764293 84

Kalle Berglund

Eklundshovs vägen 1

75237 Uppsala k.berglund@

hotmail.com 0713284 938

Anna

Andersson

Kungsgatan 20

75332 Uppsala andersson@

gmail.com 0746283 94

Name Max Aquafit 50 Soccer 25 Climbing 8 1

2 3

1 2 3 Id

Id m_id k_id

1 1

2 2

3 1

3 2

3 3

Member:

Course: Goes_to:

(8)

course

Systematic Design Process"

Sport club

member in course

persnr

gives

n m n

1

trainer name

id

name

max personnr

address

e-mail

tel-nr street

nr post code

city

payed

price

member in trainer

pnr id tid pnr

Miniworld

Conceptual Model e.g. UML, ERM

Implementation Model (Data Model of DBMS

here RM) Conceptual

Schema Conceptual

Design

Logical Schema Logical

Design

Requirements Collection & Analysis Requirements

Physical Design

Internal Schema

Member, courses…

(9)

ER-notation

(Elmasri/Navathe Figure 7.14)"

(10)

Assignment"

1.  Compare your ERMs

2.  Discuss differences: Are all different versions possible?

3.  Formulate questions.

(11)

Requirements for Social Network"

Mark Zuckerberg and his fellow students brainstorm the following requirements:

•  For every user a profile is created. There, the following data should be stored –  name of the user (first, last and username)

–  favorite Bands (a list of Bands that the user likes) –  political views

–  looking for (e.g. friendship, relationship, same sex relationship, open relationship,…)

•  A user (profile) can be friends with other users. The other user needs to agree or decline.

•  A user (profile) can have a pin board. There, the user can make pin board entries- either for his/

herself or for other users. These entries have a text and/or a date of creation. These entries can have a photo and they can be commented. The comments have a position (order) on the pin board, depending on the time, that it was created.

•  When a photo is uploaded, the url through what it can be reached is stored, a description of the picture and the date and time. Users can upload a photo for their profile.

•  A user (profile) can create groups. These have a description and pin board entries. Groups are public (open) or not (closed). If they are not, users need to register for a membership, the creator of the group then agrees on or denies the membership. Hence, a “status” of membership (member, not replied, declined) needs to be stored.

•  A user (profile) can create events. Events have a description, date, time and they are public or not.

For each event, the user can invite other users. The answer of the invited user (participating, not participating, maybe, not answered) needs to be stored. An invitation is a requirement for seeing non-public events.

•  Locations (street, nr, postal code, city, country) are needed several times, at least to store the users’

place of living and the location of an event.

(12)

N-ary relationships or relationships of degree >2"

•  A relationship type, R, can mathematically be defined as:

R ⊆ E

1

× E

2

× … × E

n

where each E

j

is an entity type.

•  An instance r ∈ R is written as (e

1

, e

2

, ..., e

n

) or <e

1

, e

2

, ..., e

n

>

(as a tupel) where e

j

∈ E

j

•  Example from Social Network DB:

–  Profile (E1) creates Pinboard entry (E2) for Profile (E3) –  Profile (E1) creates Pinboard entry (E2) for Group (E3)

pin board

entry has group

1

n 1

profile has

1 n

1

(13)

course

Systematic Design Process


GOAL for L2"

Sport club

member in course

persnr

gives

n m n

1

trainer name

id

name

max personnr

address

e-mail

tel-nr street

nr post code

city

payed

price

member in trainer

pnr id tid pnr

Miniworld

Conceptual Model e.g. UML, ERM

Implementation Model (Data Model of DBMS

here RM) Conceptual

Schema Conceptual

Design

Logical Schema Logical

Design

Requirements Collection & Analysis Requirements

Physical Design

Internal Schema

Member, courses…

(14)

The Relational Model (RM)"

•  Introduced by Dr. Edgar F. Codd (1924-2003) in 1970

•  Many (in practice, all) DBMSs are based on the RM

•  Relations (tables) with atomic values (1

st

NF) only data structure!

•  Query using high-level languages: simple but expressive

•  Operations on Relations (Tables)

–  Queries: Selection, Projection, Cartesian product, set operations (based on relational algebra)

–  Manipulation: Insertion, Deletion, Update

•  Closeness: Operations apply to relations (tables), produce new relations

Q 1 ( ) à Q 2 ( ) à

(15)

Relations as mathematical objects"

•  Let A

1

, A

2

, …, A

n

be attributes with Domains D

1

, D

2

, …, D

n

•  Relation: subset of the Cartesian product of domains D

1

, … D

n

(set of values), i.e. R ⊆ D

1

× D

2

× … × D

n

•  D

1

× D

2

× ... × D

n

: set of all ordered n-tuples (v

1

,v

2

,...,v

n

), v

i

∈ D

i

for all i.

•  Example: n=2, D1={1,2} and D2={P, Q, R} one gets the product sets:

D1

×

D2 = {(1,P),(1,Q),(1,R),(2,P),(2,Q),(2,R)}

D2

×

D1 = {(P,1),(P,2),(Q,1),(Q,2),(R,1),(R,2)}

•  For example, we have the relations:

R1 ⊆ (D1

×

D2) R1= {(1,Q),(1,R),(2,P)}

R2 ⊆ (D1 × D2) R2= {(2,P),(2,Q),(2,R)}

R3 ⊆ (D2

×

D1) R3= {(P, 2), (Q,1)}

•  Members of a relation are called tuples

Members of relations with degree n: n-tuples.

D1,

A1 A2 1 Q 1 R 2 P

(16)

First Normal Form Assumption"

Only simple / atomic attribute values à no composite or multiple values!

The theory for the relational model is based on

the first normal form assumption!

(17)

Steps of Translation: 


From E-R model to relational model"

1.  Translation of Entity types and their attributes

1.  Entity types

2.  Weak entity types

2.  Translation of relationships

1.  1-1 Relationship 2.  1-N Relationship 3.  M-N Relationship

3.  Translation of multivalued attributes and relationships (degree > 2)

1.  Multivalued attributes 2.  Multivalued relationships

(18)

Criteria"

•  Information from ER à RM

•  Data integrity

•  No (or little) redundancy

•  Small nr of tables

Furthermore, it should be logical, understandable

(19)

Translation of ER using Example"

(20)

Translation of strong entities & their attributes"

id description public

or:

Group (id, description, public) sometimes with data types:

Group(id: int, description: string, public: bool)

A strong entity type à table with the same attributes

•  Each attribute gets its own column

•  Key attributes à choose primary key column(s) for the table.

•  Composite attributes are represented by their simple components.

Example: group schema

(21)

Terms: Relation schema, Relation instance"

group!

•  A1, A2, . . ., An are attributes

•  R = (A1, A2, . . ., An) is a relation schema

–  Example: group-schema = (id, description, public)

•  r(R) is a relation on the relation schema R

–  Example: group(id, description, public)

–  Relation instance: The current values of a relation, specified by a table.

•  An element t of r is a tuple - represented by a row in a table customer

a tuple an attribute

a relation

id description public 1 “rocker” 0

2 “dancer” 1 3 “sportfreak” 0

(22)

Keys"

•  Relations are sets à all tuples in the relation are different

•  Superkey: A subset k of the attributes in a relation schema R, i.e. k ⊆ R, that has the characteristic that if the tuples

t1, t2 ∈ r(R) and t1 ≠ t2 then

t1[k] ≠ t2[k]

i.e. the value of k in t1 ≠ the value of k in t2

•  Candidate Key: minimal superkey, i.e. there is no other superkey k' such that k' ⊂ k

•  Primary Key (or simply key): used to identify entities

•  Foreign Key: to refer to a tuples in another relation (later)

(23)

Group(id, description, public)

Profile (id, first name, last name,

username, political views, looking for)

note: Attribute “favourite Bands” is multivalued à later!

id first name

last name

user- name

political views

looking for

Example entity “profile”"

id

(24)

Profile (id, first name, lase name, username, political views, looking for)

Group(id, description, public)

g_id p_id status

Member(gId, pId, status)

Primary Key???

Translation of relationships & their attributes"

•  Always an option: Create a new table

–  One column for every primary key attribute of the participating entity types –  One column for each attribute of the relationship

–  Determine primary Key

•  Sometimes, a separate table for the relationship is not needed!!!

(25)

Profile (id, first name, lase name, username, political views, looking for)

Translation of relationship types, N:M"

•  Always new table

•  Primary key is composed of primary key of participating entities

•  Descriptive Attributes of relationship type

id first name

last name

user- name

political views

looking for

Group(id, description, public)

id description public

g_id p_id status

Member(gId, pId, status)

(26)

Translating relationship . . . cont. . ."

•  M-N Relationship types

–  Always a separate table with columns for the primary keys of the two

participating entity types, and any descriptive attributes of the relationship type.

R! E2!

a1!

pk1!

a2!

pk2!

M! N!

E1!

pk1! a1! f k1! f k2! pk2! a2!

primary key is composed of fk1 and fk2

(27)

Foreign Keys"

Foreign Key: one or several attributes of a relation, so that for every value of the foreign key that is not NULL: it matches one candidate key value of another relation

–  The values in a fk-column point to unique row in the other table, and thus implement the relationship.

–  Referential integrity: for every foreign key value, there must be

an entity

(28)

Null values"

•  An attribute may have the null value

•  null is used

–  when the value of an attribute is unknown

–  when a certain attribute does not have a value –  when an attribute is not applicable

•  Example:

students(personnr, name, birthday, …, home phone,visa_status)

•  Every occurrence of null is unique.

⇒ Two occurrences of null is not considered equal even if they are represented by the same symbol

•  Try to avoid null-values!

(29)

g_id p_id

id first name

last name

user- name

political views

looking for

Translation of relationships, type 1:N "

Primary Key???

Creates(gId, pId)

à Can be merged together with Group:

Group(id, description, public, p_id)

id description public p_id

Group(id, description, public)

id description public

Profile (id, first name, lase name, username, political views, looking for)

(30)

id first name

last name

user- name

political views

looking for

g_id

Map Relationships, Type 1:n cont. "

Why not save group_id in profile?

(31)

•  Rule: 1-N Relationship types

–  Include the primary key of the 1-side as a foreign key on the N-side , (i.e. the foreign key column is placed on the entity on the N-side).

–  Alternatively, an extra table (R) is created whose primary key is a foreign key composed by the primary key from the N-side.

pk1! a1! pk2! a2! f k1!

Alt 1:

Alt 2:

pk1! a1! f k1! f k2! pk2! a2!

R! E2!

a1!

pk1!

a2!

pk2!

1! N!

E1!

Map Relationships, Type 1:n cont. "

(32)

id first name

Last name

user- name

political views

looking for

url

Translation of relationships, type 1:1 "

url p_id

Primary Key???

has(url, pId)

à Either column can be primary key

Photo(url, description, datetime, p_id)

url description datetime pId

Alternative 1:

Profile (id, first name, lase name, username, political views, looking for)

id first name

last name

user- name

political views

looking for

Photo(url, description, datetime)

url description datetime

Alternative 2:

Profile (id, first name, last name, username, political views, looking for, url)

Alternative 1 is better. Why?

(33)

Translation of relationships, type 1:1 (cont.) "

3 Options, to map relationship types between T and S

•  Foreign Key approach:

–  Choose one relation (S) and include the primary key of the other relation (T) as a foreign key.

–  Choice of S: good if total participation

•  Cross-reference

–  Extra table

–  When nr or entities that are related to an entity of the other type is little

•  Merged relation approach:

–  include T and S into one table –  Possible if both entities are total

(34)

Mapping of n-ary Relationship Types"

1.  Make new table

2.  Decide on primary key

HAS(pinboardentry_id, group_id, creator_id)

PINBOARDENTRY(id, text, datetime, group_id, creator_id)

(35)

Weak Entity types"

•  Weak entity:

–  Table with attributes as columns

–  Primary key composed: candidate key of weak entity type and primary key of owner entity type

à “owner relationship” is mapped already

Comment(text, datetime, pinentryId, profileid)

(36)

pk a1 pk k a2

R!

Translating entity types cont. . ."

a1 pk

a2

k

1! N!

•  Weak entity types - a weak entity type becomes a table that includes a column for the primary key of the identifying strong entity type .

E1

(37)

Mapping of multivalued Attributes"

•  Think of it as weak entity:

–  Extra table

–  Attribute A as one column –  Primary key:

•  Attribute A

•  Primary key of owning entity

bands n likes 1 profile name

bands(name, pId)

(38)

Summary of Translations"

ERM RM

Entity type Entity relation (Table)

Simple Attributes Attribute in Relation (Column) Candidate key

attribute

Choose one as primary key (if more than one) Composite Attributes Set of simple component attributes

Multivalued Attributes Extra relation

n:m Relation Relation, primary key

1:n Relation either include in “n-Entity”, else extra relation 1:1Relation Include in in an entity, or E

1

, E

2

, R form one

relation, or extra relation

n-ary Relation Extra table (, merge?)

(39)

Summary: Determining keys from E-R types"

•  Strong entity type. The primary key of the entity type

•  Weak entity type. union of the primary key of the strong entity type and the partial key of the weak entity type

•  Relationship type. The union of the primary keys of the related entity types becomes a super key of the relation.

–  n:m: above super key is also the primary key

–  1:n: primary key of the many entity type becomes primary key

–  1:1: the relation s primary key can be that of either entity type.

(40)

Translation of ER using Example"

(41)

Relational schemas for the example"

•  Schemas for the entity types

PROFILE(id, first name, last name, username, looking for, political views, photo_url) BANDS(name, p_id)

PHOTO(url, description)

PINBOARDENTRY(id, text, datetime)

COMMENT(text, datetime, pinboardentry_id) GROUP(id, description, creator_id)

•  Schemas for relationship types (M:N)

MEMBER(group_id, profile_id)

Schemas for ternary relationship types (N:1:1)

HAS(pinboardentry_id, group_id, creator_id)

(42)

Assignment for next lecture: 


Translation of rest of social network ER - model"

References

Related documents

One key feature of the method is the elicitation of personal constructs in the form of verbal descriptions of sound, subsequently used for development of assessment

Application Developer Framework Another Neat Tool Application Programming Interface Bean Managed Persistence Container Managed Persistence Component Object Model Distributed

The professor we interviewed, who is also a candidate for an administrator, has specific requirements regarding to the search functionality, and to be able to

De olika arbetsgrupperna kundtjänst, kundsupport, försäljare och butik behöver få systemet anpassat efter just deras användningsområde, genom att varje arbetsgrupp får en

Construct validity reflects the extent to which the operational measures represent the study subject. In the present study, practitioners’ views are measured on a numerical

In order to create a long-term successful offshore outsourcing, it is of essence for companies to have guidance in how to establish and maintain an effective and

Studiens syfte är att undersöka förskolans roll i socioekonomiskt utsatta områden och hur pedagoger som arbetar inom dessa områden ser på barns språkutveckling samt

That my aim is to explore a discourse around heteronormativity and family related to governance by a state within a power structure that limits LGB people,