Database
Database related
A collection of database functions and operations. (Currently not maintained)
Database Functions¶
MySQL functions
MySQL functions | HorseIR builtins | Description |
---|---|---|
date_add | date_add | Returns the new date after given an interval |
date_sub | date_sub | Returns the difference in days between two date values |
NOW | time_stamp | 2014-11-22 12:45:34 |
CURDATE | 2014-11-22 |
|
CURTIME | 12:45:34 |
See more about MySQL reference.
Important Database Operations¶
A couple of important database operations are listed as follows.
Join¶
A join may have various kinds of joins which can be implemented with different efficient algorithms. We consider the following joins which are included in our implementation. By default, a typical hash join algorithm is implemented.
SQL joins (StackOverflow)
- Nature join
- Inner join (Equi-join)
- Outer join
- Outer join less
- Left join
- Left join less
- Right join
- Right join less
Order¶
There are a couple of built-in functions designed for handling ordering
problem. If only one order condition is specified, either asc
or desc
is
used. Otherwise, a more general function sort
is used for multiple-column
sorts.
For example, ORDER BY salary DESC
can be represented as follows.
// t0 is a partial or full set of the salary column
t1:i64 = desc(t0);
A list of integers (indices) is returned and saved into t1
;
Group by¶
This clause aggregates items with the same value.
For example, GROUP BY lastname
// t0 is a partial or full set of the lastname column
t1:list<i64> = @group_by(t0);
The returned t1
is a list of lists. The internal list has integers (indices)
which point to their original position in the list t0
.
Limit / Top¶
Limit (or top) is useful in selecting a limited amount of items out of a long list.
For example, LIMIT 10
takes the first 10 elements.
// t0 is a partial or full set of the column
// @take is a primitive function
t1:? = @take(t0, 10:i32);
Another similar function, limit_range
for range queries, can select items
within a range (m,n)
.
// t0 is a partial or full set of the column
t1:i64 = @substract(n,m); // range size
t2:i64 = @range(t0); // iota(t0)
t3:i64 = @add(t2, m); // addition
t4:? = @index(t0, t3); // indexing
Implementing Database Operations Using Array Operations¶
References:
- Zhou, Jingren and Kenneth A. Ross. “Implementing database operations using SIMD instructions.” SIGMOD Conference (2002).
- Govindaraju, Naga K., Brandon Lloyd, Wei Wang, Ming C. Lin and Dinesh Manocha. “Fast computation of database operations using graphics processors.” SIGGRAPH Courses (2004).
Scan-like Operators¶
Problem description:
Supposedly, we have two vectors x
{x1,x2,...,xn} and y
{y1,y2,...,ym},
where n
and m
are the number of elements in x
and y
respectively. A
match operation is to find out the same element y[i]
in x
. There are a
couple of senarios we should consider as follows.
Return the first match
The values of x
elements should be distinct.
t0:list<i32> = index_of(x,y);
For example,
x:i64 = 25 36 15
y:i64 = 36 17 25
t0:list<i64> = @index_of(x,y);
t1:i64 = @print(t0);
> 1 3 0
We see
- y[0] (36) is the 2nd element in
x
(return 1) - y[1] (17) is not in
x
(return 3, the length ofx
) - y[2] (25) is the 1st element in
x
(return 0)
Return all matches
t0:list<bool> = outer(@eq,x,y);
t1:list<i64> = each(@where,t0);
For example,
x:i32 = 25 36 36;
y:i32 = 36 17 25;
t0:list<bool> = @outer(@eq,x,y);
(0 1 1;0 0 0;1 0 0)
t1:list<i64> = @each(@where, t0);
t2:i64 = @print(t1);
(1 2; ;0)
We see
(0 1 1;0 0 0;1 0 0)
is the result of comparing (eq
)x
andy
in an outer-product wayeach(where, t0)
operateswhere
in the each list oft0
Aggregation¶
Basic operations for aggregation
- count
- min and max
- sum and avg
We provide a general function reduce
to aggregate a list T
by an
aggregation function f
. Its syntax is reduce(@f, T)
.
For example,
t0:list<i64> = (1 1 1;2 2);
t1:list<i64> = @reduce(@count, t0);
t2:i64 = @print(t1);
> (3;2)
Aggregation is a perfect candidate for data parallelism despite it may take different aggregation functions.
Examples for database operations¶
GROUP BY¶
Table
CustomerID : i32
CustomerName : str
ContactName : str
Address : str
City : str
PostalCode : str
Country : sym
SQL
SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country
ORDER BY COUNT(CustomerID) DESC;
HorseIR
modeul _default{
c0:dict<sym,i32> = column(Customers:table, `CustomerID:sym);
c1:dict<sym,str> = column(Customers:table, `CustomerName:sym);
c2:dict<sym,str> = column(Customers:table, `ContactName:sym);
c3:dict<sym,str> = column(Customers:table, `Address:sym);
c4:dict<sym,str> = column(Customers:table, `City:sym);
c5:dict<sym,str> = column(Customers:table, `PostalCode:sym);
c6:dict<sym,sym> = column(Customers:table, `Country:sym);
t0:list<i32> = value(c6);
t1:list<i32> = unique(t0);
t2:list<list<i32>,list<i32>> = outer(`equal:fun, t1, t0);
t3:list<i32> = reduce(`plus:fun, t2);
t4:list<i32> = reduce(`plus:fun, t2);
t5:list<i32> = desc(t4);
c7:dict<sym,i32> = dict(`count_customerid:sym;, t3); //group by
c8:dict<sym,i32> = c7[t5];
c9:dict<sym,sym> = c6[t5];
z0:list<?> = list(c8,c9);
z:table = createTable(z0);
}
ORDER BY¶
Note: the same table Customers
in GROUP BY.
SQL
SELECT * FROM Customers
ORDER BY Country ASC, CustomerName DESC;
HorseIR
modeul _default{
c0:dict<sym,i32> = column(Customers:table, `CustomerID:sym);
c1:dict<sym,str> = column(Customers:table, `CustomerName:sym);
c2:dict<sym,str> = column(Customers:table, `ContactName:sym);
c3:dict<sym,str> = column(Customers:table, `Address:sym);
c4:dict<sym,str> = column(Customers:table, `City:sym);
c5:dict<sym,str> = column(Customers:table, `PostalCode:sym);
c6:dict<sym,sym> = column(Customers:table, `Country:sym);
t0:list<i32> = value(c6);
t1:list<str> = value(c1);
t2:list<?> = list(t0, t1);
t3:list<bool>= list(1:bool, 0:bool); // ASC:true; DESC: false
t4:list<i32> = order(t2, t3);
c7:dict<sym,i32> = c0[t4];
c8:dict<sym,i32> = c1[t4];
c9:dict<sym,i32> = c2[t4];
c10:dict<sym,i32> = c3[t4];
c11:dict<sym,i32> = c4[t4];
c12:dict<sym,i32> = c5[t4];
c13:dict<sym,i32> = c6[t4];
z0:list<?> = list(c7,c8,c9,c10,c11,c12,c13);
z:table = createTable(z0);
}
ELI code for order by
x <- 1 2 3 2 2
y <- 5 3 7 1 2
<x,[1.5]y
1 4 5 2 3
x[1 4 5 2 3]
1 2 2 2 3
y[1 4 5 2 3]
5 1 2 3 7
MIN and MAX¶
Table Products
ProductID : i32
ProductName : str
SupplierID : i32
CategoryID : i32
Unit : str
Price : f64
SQL
SELECT MIN(Price) AS SmallestPrice
FROM Products;
HorseIR
modeul default{
import Builtin.*;
def main():table{
c0:i32 = column(Products:table, `ProductID:sym);
c1:str = column(Products:table, `ProductName:sym);
c2:i32 = column(Products:table, `SupplierID:sym);
c3:i32 = column(Products:table, `CategoryID:sym);
c4:str = column(Products:table, `Unit:sym);
c5:f64 = column(Products:table, `Price:sym);
t0:f64 = min(c5);
z0:sym = `SmallestPrice:sym;
z1:list<f64> = list(t0)
z:table = @table(z0, z1);
return z;
}
}