Learning the Relational Model – Basic Set Theory (Part I)

This blog series on Relational Model is my attempt to understand the foundation of relational model of database. My intention is to learn and understand how Edgar F Codd designed the relational model, the basis of the model and how this is implemented in the relational database.

Introduction

Edgar F Codd built the foundation of relational model of database above two fundamental mathematical concepts viz. Set theory and predicate logic. The basic set theory is explained in this part of the series.

Sets and Relations

A set is a collection of distinct objects. For example a fruit basket is a collection of fruits and can be represented as B = {Apple, Orange, Grape}. Set of employees in a particular firm E = {Alice, Bob, Charlie}. There can be empty sets as well. For example an empty fruit basket, B = {}.

We know that the algebra of numbers include many operators like addition (+), subtraction (-), multiplication (X), division (/) etc. Likewise algebra of sets includes its own collection of useful operators. Like the operators of arithmetic, some of the set operators combine two sets and yield a set. These are the set operators.

  1. Union

Let A and B be sets with the same universe U. The union of A and B, denoted AB, is the set containing those elements of U that are either elements of A or elements of B (or elements of both). The union of {1, 2, 3} and {2, 3, 4} is the set {1, 2, 3, 4}

  1. Intersection

Let A and B be sets with the same universe U. The intersection of A and B, denoted A∩B, is the set containing those elements of U that are both elements of A and elements of B. The intersection of {1, 2, 3} and {2, 3, 4} is the set {2, 3}

  1. Set Difference

Let A and B be sets with the same universe U. The set difference of A and B, denoted A\B, is the set containing those elements of U that both elements of A and non-elements of B. The set difference {1,2,3} \ {2,3,4} is {1}

  1. Cartesian Product

Let A and B be sets with the same universe U. The Cartesian product of A and B, denoted by A X B, is the set whose members are all possible ordered pairs (a,b) where a is a member of A and b is a member of B. The Cartesian product of {1, 2} and {red, white} is {(1, red), (1, white), (2, red), (2, white)}.

  1. Power Set

Power set of a set A is the set whose members are all possible subsets of A. For example, the power set of {1, 2} is { {}, {1}, {2}, {1,2} }

 How can we apply this to data? What it says is everything can be collected to a set. That is a set of employees, a set of customers, a set of orders, a set of addresses, a set of products etc. If we can define an entity as a set of data, then we can also apply the set operators! For example, we can derive a set of vehicle by applying the union operator on a set of cars and set of bikes. We can find a set of both English speaking and Hindi speaking people by applying the intersect operator on the set of English speaking people and the set of Hindi speaking people. Since set operations are also associative and commutative we can combine the sets in many combinations and get different result sets (I am not going in detail to the associative and commutative properties of sets, since we have already studied this for numbers in high school classes).

Multiset

Based on the definition, set do not allow duplicates. However the idea of multiset has been introduced to allow more than one similar object (or duplicate object). Multiset is also called bag. Multiset and operations on multisets are very much similar to set. I will explain the practical importance of multiset later in this series.

From Set to Relation

A relation in mathematics is an association between various objects. Relation can also be represented by a set. A relation is a set of ordered pair and the pair has certain association. For example, let us consider a set of employee, E = {Alice, Bob, Charlie}, and a set of department D = {Sales, Purchase, Marketing}. So a relationship “is of department”, R can be represented as R = {(Bob, Marketing), (Alice, Sales), (Charlie, Sales)} and this can be read in English as ‘Bob is of department Marketing’, ‘Charlie is of department Sales’ etc. There are two elements in the result set, employee name and department name, a set of double. This can be triple, quadruple, quintuple, sextuple, septuple, etc. to n-tuple. This is generalized to a term called tuple. So in the above relation (Bob, Marketing) is a tuple and (Alice, Sales) is another tuple and so (Charlie, Sales).

Next

In the next part of the series we will see the predicate logic, the next fundamental concept of relational model.

 

Advertisements

Leave a comment

Filed under Basics, Theory

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s