DATABASE DESIGN I - 1DL300
Fall 2011"
An introductory course on database systems
http://www.it.uu.se/edu/course/homepage/dbastekn/ht11
Erik Zeitler
Uppsala Database Laboratory
Department of Information Technology, Uppsala University, Uppsala, Sweden
Introduction to the Relational Model
Elmasri/Navathe ch 3,8
Padron-McCarthy/Risch ch 5, 6"
Anne - Kathrin Peters Room 2404
Department of Information Technology Uppsala University, Uppsala, Sweden
course member in course
persnr
gives
n m n
1
trainer name
id
name
max personnr
address
tel-nr street
nr post code
city
payed
price
member in trainer
…
pnr id tid pnr
Conceptual Schema Conceptual
Design
Logical Design
Conceptual Model e.g. UML, ERM
Implementation Model (Data Model of DBMS here RM)
Logical Schema
Overview"
1. Key aspects of last lecture (L1) 2. Goal for today’s lecture (L2)
3. Important terms (Erik didn’t present)
4. Comparison and discussion of ER (assignment)
1. In groups
2. With audience
5. ERM à RM – how to?
6. Little assignment for L3
Key aspects of lecture L1"
• Why lecture?
– Databases everywhere – Databases mainly tables – Problems might occur
• Systematic design process
– Overall: ERM à RM – ERM components
Outline of a database system"
Database System (DS)
Applications
procedures / statements
Users’
interactive queries
DB schema DB
Database Management System (DBMS)
“DS: DBMS & DB”
“Just do a table!???”"
Member Street PCod City E-Mail Tel-Nr Course Max Jan
Johannson
Kyrkogårds gatan 10
75313 Uppsala johannson@g mail.com
076429384 Aquafit 50
Kalle Berglund
Eklundshov svägen 1
75237 Uppsala k.berglund@h otmail.com
0713284938 Soccer 25
Anna
Andersson
Kungsgatan 20
75332 Uppsala andersson@g mail.com
074628394 Climbing 8
Anna Anderson
Kungsgatan 20
75332 Uppsala anderson@gm ail.com
074628394 Aquafit 50
Anna
Andersson
Kungsgatan 20
75332 Uppsala andersson@g mail.com
074628394 Swimming 30
Insertion Update
Anomalies on… Deletion
Solution: Split"
Name Street PCode City E-Mail Tel-Nr Jan
Johannson
Kyrkogårdsg atan 10
75313 Uppsala johannson@
gmail.com 0764293 84
Kalle Berglund
Eklundshovs vägen 1
75237 Uppsala k.berglund@
hotmail.com 0713284 938
Anna
Andersson
Kungsgatan 20
75332 Uppsala andersson@
gmail.com 0746283 94
Name Max Aquafit 50 Soccer 25 Climbing 8 1
2 3
1 2 3 Id
Id m_id k_id
1 1
2 2
3 1
3 2
3 3
Member:
Course: Goes_to:
course
Systematic Design Process"
Sport club
member in course
persnr
gives
n m n
1
trainer name
id
name
max personnr
address
tel-nr street
nr post code
city
payed
price
member in trainer
…
pnr id tid pnr
Miniworld
Conceptual Model e.g. UML, ERM
Implementation Model (Data Model of DBMS
here RM) Conceptual
Schema Conceptual
Design
Logical Schema Logical
Design
Requirements Collection & Analysis Requirements
Physical Design
Internal Schema
Member, courses…
ER-notation
(Elmasri/Navathe Figure 7.14)"
Assignment"
1. Compare your ERMs
2. Discuss differences: Are all different versions possible?
3. Formulate questions.
Requirements for Social Network"
Mark Zuckerberg and his fellow students brainstorm the following requirements:
• For every user a profile is created. There, the following data should be stored – name of the user (first, last and username)
– favorite Bands (a list of Bands that the user likes) – political views
– looking for (e.g. friendship, relationship, same sex relationship, open relationship,…)
• A user (profile) can be friends with other users. The other user needs to agree or decline.
• A user (profile) can have a pin board. There, the user can make pin board entries- either for his/
herself or for other users. These entries have a text and/or a date of creation. These entries can have a photo and they can be commented. The comments have a position (order) on the pin board, depending on the time, that it was created.
• When a photo is uploaded, the url through what it can be reached is stored, a description of the picture and the date and time. Users can upload a photo for their profile.
• A user (profile) can create groups. These have a description and pin board entries. Groups are public (open) or not (closed). If they are not, users need to register for a membership, the creator of the group then agrees on or denies the membership. Hence, a “status” of membership (member, not replied, declined) needs to be stored.
• A user (profile) can create events. Events have a description, date, time and they are public or not.
For each event, the user can invite other users. The answer of the invited user (participating, not participating, maybe, not answered) needs to be stored. An invitation is a requirement for seeing non-public events.
• Locations (street, nr, postal code, city, country) are needed several times, at least to store the users’
place of living and the location of an event.
N-ary relationships or relationships of degree >2"
• A relationship type, R, can mathematically be defined as:
R ⊆ E
1× E
2× … × E
nwhere each E
jis an entity type.
• An instance r ∈ R is written as (e
1, e
2, ..., e
n) or <e
1, e
2, ..., e
n>
(as a tupel) where e
j∈ E
j• Example from Social Network DB:
– Profile (E1) creates Pinboard entry (E2) for Profile (E3) – Profile (E1) creates Pinboard entry (E2) for Group (E3)
pin board
entry has group
1
n 1
profile has
1 n
1
course
Systematic Design Process
GOAL for L2"
Sport clubmember in course
persnr
gives
n m n
1
trainer name
id
name
max personnr
address
tel-nr street
nr post code
city
payed
price
member in trainer
…
pnr id tid pnr
Miniworld
Conceptual Model e.g. UML, ERM
Implementation Model (Data Model of DBMS
here RM) Conceptual
Schema Conceptual
Design
Logical Schema Logical
Design
Requirements Collection & Analysis Requirements
Physical Design
Internal Schema
Member, courses…
The Relational Model (RM)"
• Introduced by Dr. Edgar F. Codd (1924-2003) in 1970
• Many (in practice, all) DBMSs are based on the RM
• Relations (tables) with atomic values (1
stNF) only data structure!
• Query using high-level languages: simple but expressive
• Operations on Relations (Tables)
– Queries: Selection, Projection, Cartesian product, set operations (based on relational algebra)
– Manipulation: Insertion, Deletion, Update
• Closeness: Operations apply to relations (tables), produce new relations
Q 1 ( ) à Q 2 ( ) à
Relations as mathematical objects"
• Let A
1, A
2, …, A
nbe attributes with Domains D
1, D
2, …, D
n• Relation: subset of the Cartesian product of domains D
1, … D
n(set of values), i.e. R ⊆ D
1× D
2× … × D
n• D
1× D
2× ... × D
n: set of all ordered n-tuples (v
1,v
2,...,v
n), v
i∈ D
ifor all i.
• Example: n=2, D1={1,2} and D2={P, Q, R} one gets the product sets:
D1
×
D2 = {(1,P),(1,Q),(1,R),(2,P),(2,Q),(2,R)}D2
×
D1 = {(P,1),(P,2),(Q,1),(Q,2),(R,1),(R,2)}• For example, we have the relations:
R1 ⊆ (D1
×
D2) R1= {(1,Q),(1,R),(2,P)}R2 ⊆ (D1 × D2) R2= {(2,P),(2,Q),(2,R)}
R3 ⊆ (D2
×
D1) R3= {(P, 2), (Q,1)}• Members of a relation are called tuples
Members of relations with degree n: n-tuples.
D1,
A1 A2 1 Q 1 R 2 P
First Normal Form Assumption"
Only simple / atomic attribute values à no composite or multiple values!
The theory for the relational model is based on
the first normal form assumption!
Steps of Translation:
From E-R model to relational model"
1. Translation of Entity types and their attributes
1. Entity types
2. Weak entity types
2. Translation of relationships
1. 1-1 Relationship 2. 1-N Relationship 3. M-N Relationship
3. Translation of multivalued attributes and relationships (degree > 2)
1. Multivalued attributes 2. Multivalued relationships
Criteria"
• Information from ER à RM
• Data integrity
• No (or little) redundancy
• Small nr of tables
Furthermore, it should be logical, understandable
Translation of ER using Example"
Translation of strong entities & their attributes"
id description public
or:
Group (id, description, public) sometimes with data types:
Group(id: int, description: string, public: bool)
A strong entity type à table with the same attributes
• Each attribute gets its own column
• Key attributes à choose primary key column(s) for the table.
• Composite attributes are represented by their simple components.
Example: group schema
Terms: Relation schema, Relation instance"
group!
• A1, A2, . . ., An are attributes
• R = (A1, A2, . . ., An) is a relation schema
– Example: group-schema = (id, description, public)
• r(R) is a relation on the relation schema R
– Example: group(id, description, public)
– Relation instance: The current values of a relation, specified by a table.
• An element t of r is a tuple - represented by a row in a table customer
a tuple an attribute
a relation
id description public 1 “rocker” 0
2 “dancer” 1 3 “sportfreak” 0
Keys"
• Relations are sets à all tuples in the relation are different
• Superkey: A subset k of the attributes in a relation schema R, i.e. k ⊆ R, that has the characteristic that if the tuples
t1, t2 ∈ r(R) and t1 ≠ t2 then
t1[k] ≠ t2[k]
i.e. the value of k in t1 ≠ the value of k in t2
• Candidate Key: minimal superkey, i.e. there is no other superkey k' such that k' ⊂ k
• Primary Key (or simply key): used to identify entities
• Foreign Key: to refer to a tuples in another relation (later)
Group(id, description, public)
Profile (id, first name, last name,
username, political views, looking for)
note: Attribute “favourite Bands” is multivalued à later!
id first name
last name
user- name
political views
looking for
Example entity “profile”"
id
Profile (id, first name, lase name, username, political views, looking for)
Group(id, description, public)
g_id p_id status
Member(gId, pId, status)
Primary Key???
Translation of relationships & their attributes"
• Always an option: Create a new table
– One column for every primary key attribute of the participating entity types – One column for each attribute of the relationship
– Determine primary Key
• Sometimes, a separate table for the relationship is not needed!!!
Profile (id, first name, lase name, username, political views, looking for)
Translation of relationship types, N:M"
• Always new table
• Primary key is composed of primary key of participating entities
• Descriptive Attributes of relationship type
id first name
last name
user- name
political views
looking for
Group(id, description, public)
id description public
g_id p_id status
Member(gId, pId, status)
Translating relationship . . . cont. . ."
• M-N Relationship types
– Always a separate table with columns for the primary keys of the two
participating entity types, and any descriptive attributes of the relationship type.
R! E2!
a1!
pk1!
a2!
pk2!
M! N!
E1!
pk1! a1! f k1! f k2! pk2! a2!
primary key is composed of fk1 and fk2
Foreign Keys"
Foreign Key: one or several attributes of a relation, so that for every value of the foreign key that is not NULL: it matches one candidate key value of another relation
– The values in a fk-column point to unique row in the other table, and thus implement the relationship.
– Referential integrity: for every foreign key value, there must be
an entity
Null values"
• An attribute may have the null value
• null is used
– when the value of an attribute is unknown
– when a certain attribute does not have a value – when an attribute is not applicable
• Example:
students(personnr, name, birthday, …, home phone,visa_status)
• Every occurrence of null is unique.
⇒ Two occurrences of null is not considered equal even if they are represented by the same symbol
• Try to avoid null-values!
g_id p_id
id first name
last name
user- name
political views
looking for
Translation of relationships, type 1:N "
Primary Key???
Creates(gId, pId)
à Can be merged together with Group:
Group(id, description, public, p_id)
id description public p_id
Group(id, description, public)
id description public
Profile (id, first name, lase name, username, political views, looking for)
id first name
last name
user- name
political views
looking for
g_id
Map Relationships, Type 1:n cont. "
Why not save group_id in profile?
• Rule: 1-N Relationship types
– Include the primary key of the 1-side as a foreign key on the N-side , (i.e. the foreign key column is placed on the entity on the N-side).
– Alternatively, an extra table (R) is created whose primary key is a foreign key composed by the primary key from the N-side.
pk1! a1! pk2! a2! f k1!
Alt 1:
Alt 2:
pk1! a1! f k1! f k2! pk2! a2!R! E2!
a1!
pk1!
a2!
pk2!
1! N!
E1!
Map Relationships, Type 1:n cont. "
id first name
Last name
user- name
political views
looking for
url
Translation of relationships, type 1:1 "
url p_id
Primary Key???
has(url, pId)
à Either column can be primary key
Photo(url, description, datetime, p_id)
url description datetime pId
Alternative 1:
Profile (id, first name, lase name, username, political views, looking for)
id first name
last name
user- name
political views
looking for
Photo(url, description, datetime)
url description datetime
Alternative 2:
Profile (id, first name, last name, username, political views, looking for, url)
Alternative 1 is better. Why?
Translation of relationships, type 1:1 (cont.) "
3 Options, to map relationship types between T and S
• Foreign Key approach:
– Choose one relation (S) and include the primary key of the other relation (T) as a foreign key.
– Choice of S: good if total participation
• Cross-reference
– Extra table
– When nr or entities that are related to an entity of the other type is little
• Merged relation approach:
– include T and S into one table – Possible if both entities are total
Mapping of n-ary Relationship Types"
1. Make new table
2. Decide on primary key
HAS(pinboardentry_id, group_id, creator_id)
PINBOARDENTRY(id, text, datetime, group_id, creator_id)
Weak Entity types"
• Weak entity:
– Table with attributes as columns
– Primary key composed: candidate key of weak entity type and primary key of owner entity type
à “owner relationship” is mapped already
Comment(text, datetime, pinentryId, profileid)
pk a1 pk k a2
R!
Translating entity types cont. . ."
a1 pk
a2
k
1! N!
• Weak entity types - a weak entity type becomes a table that includes a column for the primary key of the identifying strong entity type .
E1
Mapping of multivalued Attributes"
• Think of it as weak entity:
– Extra table
– Attribute A as one column – Primary key:
• Attribute A
• Primary key of owning entity
bands n likes 1 profile name
bands(name, pId)
Summary of Translations"
ERM RM
Entity type Entity relation (Table)
Simple Attributes Attribute in Relation (Column) Candidate key
attribute
Choose one as primary key (if more than one) Composite Attributes Set of simple component attributes
Multivalued Attributes Extra relation
n:m Relation Relation, primary key
1:n Relation either include in “n-Entity”, else extra relation 1:1Relation Include in in an entity, or E
1, E
2, R form one
relation, or extra relation
n-ary Relation Extra table (, merge?)
Summary: Determining keys from E-R types"
• Strong entity type. The primary key of the entity type
• Weak entity type. union of the primary key of the strong entity type and the partial key of the weak entity type
• Relationship type. The union of the primary keys of the related entity types becomes a super key of the relation.
– n:m: above super key is also the primary key
– 1:n: primary key of the many entity type becomes primary key
– 1:1: the relation s primary key can be that of either entity type.
Translation of ER using Example"
Relational schemas for the example"
• Schemas for the entity types
PROFILE(id, first name, last name, username, looking for, political views, photo_url) BANDS(name, p_id)
PHOTO(url, description)
PINBOARDENTRY(id, text, datetime)
COMMENT(text, datetime, pinboardentry_id) GROUP(id, description, creator_id)
• Schemas for relationship types (M:N)
MEMBER(group_id, profile_id)