Evolution of SQL in applications

SQL is low-level code

In this post I will show how applications have used SQL and how that has evolved over time, and then show how that related to modern no-SQL approaches.

All the source code for these examples is on Github and is in self-contained projects which run from the command line.

In all these examples, we have a single entity, the Customer, which has fields of name and age, and will show how to search for customers by age. The advantages of the higher-level APIs become even greater as the object models become more complex.

In all cases the Customer entity looks like:

public class Customer {
    
    private long id;
    
    private String name;
    private int age;
    
    // getters, setters, constructors
    
}

Setting up the database and project is simple. We are using Postgres in this example. Become the Postgres user and create the owner, database, table, and data. The database name will be test, owner will be test, and password will be test.

$ cd DatabaseEvolution
$ createuser --pwprompt test
Enter password for new role: 
Enter it again: 
$ createdb -O test test
$ psql -h localhost -U test test < create-tables.sql 
Password for user test: 
NOTICE:  table "customer" does not exist, skipping
DROP TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
INSERT 0 1

All the examples run from the command line.

In the beginning: JDBC

JDBC is the first approach Java has taken to accessing databases in a standard way. JDBC created standard interfaces for creating connections, running queries, and getting results. It also included the PreparedStatement, which make it easy to avoid SQL injection attacks.

What it didn’t do was create a standard query language or language for defining tables. It also didn’t create any way to map SQL rows to Java objects. The application must perform the mapping, one field at a time.

Get a list of all customers over the age of 26:

public final class Sql {

    private static final Logger LOG = Logger.getLogger(Sql.class.getName());
    
    public static void main(String[] args) throws Exception {
        final Connection connection = getConnection("jdbc:postgresql://localhost/test?user=test&password=test");
        final PreparedStatement preparedStatement = 
                connection.prepareStatement("select name, age from customer where age > ? order by age asc");
        preparedStatement.setInt(1, 26); // look for age over 26
        final ResultSet rs = preparedStatement.executeQuery();
        final List<Customer> results = new ArrayList<>();
        while(rs.next()) {
            final Customer customer = new Customer(rs.getString(1), rs.getInt(2));
            results.add(customer);
        }
        LOG.info("Here is the result: " + results);
    }
}

Execute the project as follows:

$ cd OldJDBC
$ mvn exec:java

It works. Major problems are, too much coding, embedded vendor-specific SQL queries, and manual mapping of table columns to object fields. Another key problem is that mappings and queries are not validated by the compiler, but rather are only tested at runtime. Changing an object field definition will result in updates to queries and mappings in numerous locations, and some will only show errors at runtime.

These problems were solved with object-relational mapping (ORM) approaches.

Java Persistence Architecture

The major problems with JDBC were:

  • Columns in tables should map to fields in classes
  • Vendor-specific query language, such as SQL dialects, should not be necessary
  • Ideally, queries should be validated at compile time, not run time

Several systems solved these problems, such as Java Data Objects (JDO), and then Hibernate, and others. Finally the Java Persistence Architecture (JPA) was created to provide a standard approach to ORM.

The most popular implementation of JPA is implemented by Hibernate, which in turn uses JDBC to perform database access.

JPA uses annotations on entity classes to help map to tables. JPA allows queries to be constructed in either its own universal SQL-looking query language called Java Persistence Query Language, or to be constructed in a type-safe and validated way using the CriteriaBuilder API. In this example I’m using CriteriaBuilder to show how validation looks. Here is the entire class:

package chiralsoftware.javajpa;

import java.util.List;
import java.util.logging.Logger;
import javax.persistence.EntityManager;
import static javax.persistence.Persistence.createEntityManagerFactory;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;

/**
 * Demonstrate the same query using JPA
 */
public final class Jpa {

    private static final Logger LOG = Logger.getLogger(Jpa.class.getName());

    public static void main(String[] args) throws Exception {
        // Create an EntityManager
        // This loads the necessary properties from META-INF/persistence.xml
        final EntityManager entityManager = createEntityManagerFactory("jpa-test").
                createEntityManager();
        // Using a CriteriaBuilder for building queries with type safety
        final CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
        
        // Ready for queries
        final CriteriaQuery<Customer> criteriaQuery = 
                criteriaBuilder.createQuery(Customer.class);
        final Root<Customer> customerRoot = criteriaQuery.from(Customer.class);
        criteriaQuery.select(customerRoot).
                where(criteriaBuilder.greaterThan(customerRoot.get("age"), 25)).
                orderBy(criteriaBuilder.asc(customerRoot.get("age")));
        
        final List<Customer> results = 
                entityManager.createQuery(criteriaQuery).getResultList();
        
        LOG.info("Here are the results: " + results);
    }

}

Run this project the same way as the JDBC project:

$ cd JavaJPA
$ mvn clean compile exec:java

The Customer object has gained two annotations: one to mark it as an @Entity and another to specify an @Id field. Other than that, no changes. Hibernate reads database access configuration values from META-INF/persistence.xml in the classpath.

In this example code, we are not writing any SQL by hand. This code runs on any supported database, which includes all the main databases in use today (Postgres, MySQL, SQL Server, Oracle and others). Queries using the CriteriaBuilder are validated and type safe.

One problem is that queries are long and complex. Why type in so much code to simply find all the objects where age > 25?  The query would be shorted if we used the JPA query language, but then we wouldn’t have the same level of compile-time query validation.

Spring Data repositories

The Spring Framework exists to reduce the amount of code developers need to write to get their projects done. JPA is much better than using JDBC and vendor-specific SQL and hand-made mapping of results to objects. However, it still seems like too much coding.

Spring Data repositories are like magic.

Setup involves configuring database access and a few other properties in the short application.properties file, in the classpath root.

The Customer class needs the same two annotations as were used in the JPA example, because Spring Data JPA is based on JPA.

The magic happens when we define a query. We do it by creating an interface which extends one of Spring Data’s interfaces. This is our complete repository interface:

package chiralsoftware.springdatajpa;

import java.util.stream.Stream;
import org.springframework.data.domain.Pageable;
import org.springframework.data.repository.PagingAndSortingRepository;

public interface CustomerRepository extends PagingAndSortingRepository<Customer, Long> {

    Stream<Customer> findByAgeGreaterThan(int i, Pageable pageable);
    Stream<Customer> findByAgeGreaterThan(int i);

}

Two methods are defined, one taking an extra parameter of a Pageable. Both return a Stream<Customer> object.

Now all we need to perform a query is to let Spring autoinject our repository interface:

@Autowired
private CustomerRepository customerRepository;

and then do a query:

final Stream<Customer> customers = customerRepository.findByAgeGreaterThan(24);
customers.forEach((c) -> LOG.info("Customer: " + c));

Now our query is extremely simple. It’s one line, it’s type safe, it’s easy to read, and it integrated naturally into the rest of Java 8’s stream API. This whole query can go on one line, and we can add sorting and pagination too:

customerRepository.findByAgeGreaterThan(24, of(0, 2,  by("age").ascending())).forEach((c) -> LOG.info("Customer: " + c));

This project is a Spring Boot CLI project and has a different way of executing:

$ cd SpringDataJpa/
$ mvn spring-boot:run

Spring Boot will show a large amount of log messages, and will show a SQLFeatureNotSupportedException exception from Postgres which can be ignored.

That shows all the advantages we’re looking for:

  • No code modifications to access any vendor’s database
  • Type and query safety at compile time
  • Very concise coding style
  • Natural integration with the streams API

We can combine the streams API with repositories to make natural, very concise, and safe operations like this:

final double average = customerRepository.
    findByAgeGreaterThan(24).
    mapToDouble(Customer::getAge).
    average().
    orElse(NaN);
LOG.info("Average age of customers over age 24: " + average);

We have used Spring Data JPA repository and streams to find the average age of customers over age 24 in a very concise way. Real applications would have more complex examples, but this shows how expressive the system is.

How it works

First, only an interface is created. No implementation is created. Spring Data creates the implementation dynamically based on the interface and injects the instance based on the @Autowired annotation.

Spring Data scans the method names to create queries. findBy... will return a collection or stream. Age is recognized as a field in the Customer class. GreaterThan is a query term. Spring Data turns all of these terms into JPA queries and then executes them. Spring Data also uses information from the PageRequest to determine the range desired, sort fields, and sort order. The resulting SQL query is:

select customer0_.id as id1_0_, customer0_.age as age2_0_, customer0_.name as name3_0_ 
from customer customer0_ 
where customer0_.age>? 
order by customer0_.age asc limit ?

We have the type safety and validation of JPA’s Criteria API, but in a concise and natural coding style.

Beyond SQL

Spring Data’s design works with both JPA and no-SQL storage systems. In particular, Spring Data has extensive support for MongoDB. Modifying the Spring Data project to switch from Postgres to Mongo is easy. Super easy. Edit the pom.xml file and change spring-boot-starter-data-jpa to spring-boot-starter-data-mongodb. Add the necessary Customer objects to Mongo:

$ mongo < create-objects.js

Now run the project again the same way:

$ mvn spring-boot:run

and it will perform the same queries but using Mongo. Not only is our application independent of SQL vendors, it’s independent of SQL itself. If we’re packing this as a commercial application we could allow it to support both options.

Overview

We can view the evolution of database access in a table:

Vendor-indepdentCompile-time validation
Concise queries
SQL or no SQL
JDBC
JPA query language
XX
JPA Criteria queries
XX
Spring Data repositories
XXXX
Posted in Uncategorized