SQL++ versus SQL

The most important difference between SQL++ and SQL is the data model. Other notable differences relate to the projection, selection, and filtering of data.

Data Model

In a relational database, data is constrained to tables with uniform structure. The following example consists of two tables, Employee and Employers, with Name as the primary key.

EMPLOYEE:
Name  | SSN | Wage
-----------------------------------------------------------------------
Jamie | 234 | 123
Steve | 123 | 456

SCHEMA:
Name -> String of width 100
SSN -> Number of width 9
Wage -> Number of width 10

EMPLOYERS:
-----------------------------------------------------------------------
 Name_Key  | Company   | Start | End
 Jamie     | Yahoo     | 2005  | 2006
 Jamie     | Oracle    | 2006  | 2012
 Jamie     | Couchbase | 2012  | NULL

By contrast, SQL++ handles data as free-form documents, gathered into large collections called keyspaces. There is no uniformity; nor is there any logical proximity shared by objects of the same data shape within the keyspace. The relational data from the above example might therefore appear as follows:

 (HRData keyspace)
 {
     'Name': 'Jamie'
     'SSN': 234
     'Wage': 123
     'History':
      [
       ['Yahoo', 2005, 2006],
       ['Oracle', 2006, 2012],
     ]
 },

 {
     'Name': Steve
     'SSN':  123,
     'Wage': 456,
 }

Data Projection

In a relational database, when an SQL query is run, a set of rows is returned; each row consisting of one or more columns, and the columns being the same for each row. A header can be retrieved, to obtain metadata about each column. The SQL projection may also contain arbitrary expressions, which are returned as fields in the result set.

Query:

SELECT Name, Company
    FROM Employee, Employers
    WHERE Name_Key = Name

Result:

	Name  | Company
	----------------
	Jamie | Oracle
	Jamie | Yahoo
	Jamie | Couchbase
	----------------

With SQL++, an arbitrary expression in the projection may contain collection operators or construction operators, which enable you to reshape the data in ways that go beyond merely adding an extra field. Like SQL, SQL++ allows fields to be renamed using the AS keyword.

Query:

 SELECT Name, History, {'FullTime': true} AS 'Status'
      FROM HRData

Result:

	{
	'Name': 'Jamie',
	'History':
		[
		['Yahoo', 2005, 2006],
		['Oracle', 2006, 2012],
		['Couchbase', 2012, null]
		],
	'Status': { 'FullTime': true }
	}
	{
	'Name': 'Steve',
	'Status': { 'FullTime': true }
	}

Data Selection

In SQL++, the FROM clause is used to select between data sources (keyspaces). If HRData is a keyspace, the following statement selects the Name attribute from all documents in the HRData keyspace that have a Name attribute defined.

SELECT Name FROM HRData

Each document can also regard itself as a data source, and run a query over its nested elements. Such nested elements are addressed using the dot (.) operator to descend a level, and the square-bracket ( [ ] ) operator to index into an array element.

  SELECT FullTime FROM HRData.Status
{
     'FullTime': true
}

The selected fields can also be renamed using the AS operator, as in SQL:

SELECT firstjob FROM HRData.History[0] AS firstjob
{
     'firstjob': ['Yahoo', 2005, 2006]
}

SELECT firstjob[2] FROM HRData.History[0] AS firstjob
{
     'firstjob[2]': 2006
}

Data Filtering

SQL++ supports the WHERE clause, but with slight differences from SQL.

Similarly to SQL, the dot ( . ) and the square bracket ( [] ) operators can be used to access nested elements as they are used in SELECT clauses.

SQL++ data can be irregularly shaped: hence, undefined values are recognized as distinct from null. SQL++ provides a complementary set of operators like IS MISSING, in addition to standard operators like IS NULL. New conversions, for example from non-zero integer-values to Boolean value true, are also supported.

Most standard SQL functions (for example, LOWER()) are defined. In addition to the standard filtering-predicates, SQL++ provides new operators to work with arrays in documents: ANY, SOME, and EVERY. ANY and SOME evaluate a condition for each element, and return true if any element meets the condition. EVERY also evaluates a condition for each element; except that it returns true only if all elements matched the condition.