CS145 - Introduction to Databases
Bulk-Loading Data into SQLite Databases

Overview

Bulk-loading refers to the process of loading data specified in lines of a file directly into a database system, rather than executing a large series of INSERT statements. We first describe the data file format for bulk-loading into SQLite databases, then the loading process.

The Data File

The data file consists of a sequence of lines, each one specifying one tuple to be loaded into an existing table of the database. Each line lists values for the attributes of the table, in the order the attributes were declared when the table was created. Any string can be used as the separator between attribute values; by default, it is "|".

As an example, suppose we are bulk-loading into a table Student(ID,name), where ID is an integer and name is a string. If we specify "|" as our separator, our data file (call it students.dat) might look like:

    123|Alice
    456|Bob
    789| Carol
As the result of loading file students.dat, the following tuples are inserted into table Student:
   (123,'Alice')
   (456,'Bob')
   (789,' Carol')
Warning: Notice that the third line of students.dat has a blank after separator "|". This blank is not ignored by the loader, i.e., string "Carol" is loaded with a leading blank, as demonstrated by the third tuple above. It is a common (and frustrating!) mistake to leave blanks before or after separators and then wonder why string values are not matching as you expect.

Loading

Bulk-loading can be done via the SQLite special command .import, as follows:

    .separator <separator>
    .import <loadFile> <tableName>
For example, if we want to load a file name.dat into table Name, where attributes are comma-separated, we would execute:
    .separator ,
    .import name.dat Name
Keep in mind that you should use an attribute separator string that will never appear in your data, so separators aren't confused with data fields.

Note that primary keys and other constraints on the table are enforced during bulk-loading; if any tuples violate these constraints, the load will fail.

Loading NULL Values

Unfortunately, there is no way to specify null values in a load file such that SQLite will actually load a null value. Thus, loading data with null values must be done in a two-step process: first load the data with some stand-in value for null (e.g. 'NULL'), then update the table to replace instances of 'NULL' with null.

Consider a Name table with attributes first, middle, and last, and suppose we want to bulk-load the names "Mary J. Blige" and "Lady Gaga". We create our load file name.dat as:

    Mary|J.|Blige
    Lady|NULL|Gaga
where "|" is our attribute separator. We then load this data into table Name and set null values properly by executing:
    .separator |
    .import name.dat Name
    update Name set middle = null where middle = 'NULL';