Some Practice Problems
The questions appear in this font
Relational Model and SQL Overview
Beer SQL and Relational Algebra
Consider the beer drinkers' database with the tables
FREQUENT(Drinker, Bar)
SERVES(Bar, Beer)
LIKES(Drinker, Beer)
SELLS(Bar, Beer, Amount)
- Write a view PAIRS with schema (Drinker, Friend). The table contains all pairs of Drinker (D) and Friend (F), where F is a Drinker who
- Likes at least one beer that D also likes, and
- Frequents in at least one bar that D also frequents and this bar serves one or more of the beers they both like
- Write a relational algebra expression for the view. (Recall a relational algebra expression for a view is identical to the expression for the query defining the view.) The expression you will write must have the following properties
- No
cartesian products
- Every selection or join operator must be associated with an atomic condition, i.e., a condition of the form A=B or A!=B, but no condition of the form, say, C1 AND C2.
Beer Aggregation
Find the Bar that sells the highest total Amount of Beer. Hint: You can use a nested query in the HAVING clause of an SQL query.
Keys
Is it possible to have two primary keys on a relation?
Storage
Sequential Vs Random Access
Consider a table R which is stored in consecutive pages of the disk. We know that reading pages of the disk in the sequential order in which they are stored is at least ten times faster than reading pages in some random order. Give two reasons why we may still prefer to read in random order instead of reading sequentially.
Indexing
Pointers
Present scenarios where record pointers perform better than block pointers and vice versa.
Dense Vs Sparse Indexes
When is it possible to use a sparse index?
Describe a scenario where it is preferrable to use a dense index instead of a sparse. Assume the index is on a primary key. Make an analysis of the gain that is realized by choosing a dense index.
No More Indexes Please…
Provide reasons why we should not keep an index on an attribute A of some relation R.
Estimating the Cost of Operations on a Dense Index
Consider a relation R(K,A,B,…) where the attribute K is an integer key. The relation is sorted along the key K, i.e., K is a primary key. Then there is a multilevel dense index on K. Assume that the top level of the multilevel index always has just one page.
Each tuple has a block pointer pointing to the next tuple. Also, each index entry has a pointer to the next entry. As usual, the system attempts to "pack" consecutive tuples in contiguous blocks and uses the overflow when this is not possible.
Assume that n tuples are inserted in R.
- The size of a disk page is d.
- The size of a tuple of R is r.
- The size of each K value is k bytes.
- Block pointers are used and each one is p bytes long.
- No use of main memory buffers.
- Records (tuples and index entries) do not span blocks.
Questions
- Assume that initially the tuples with keys m*i, i=1,…,n and the corresponding index entries are stored in consecutive blocks, no overflow is needed and space utilization is 66%. Display the state of the indexes and the relation assuming p=6, r=44, k=20, d=160, m=1000, n=30.
- Assume that at every minute j there's a burst of inserting 100 tuples with keys m*j+k, k=1,…,l in ascending order. If you were the administrator how often would you reorganize this database?
The Weird Properties of C+ Trees
The C+ trees are a variation of B+ trees. They have the following properties:
- Each page/node of the C+ tree has m to 3m/2 elements (m is even). The root may have any number of elements.
- m > 2
- has more than one levels
- All other properties of the C+ tree are identical with the B+ tree.
Write an O(logn) procedure for insertion of a new element e into a C+ tree.