SQL*Loader:
SQL*Loader is a command line Oracle utility that dumps the Oracle DB with records from file(s). The components of SQL*Loader are
Data File: is the source for the data. For instance, a comma separated file or a tab delimited file or a fixed length file.
Control File: contains the control info for the load process. Has the name of the data file that needs to be loaded to the DB. Also the mapping, as to which column in the file relates to which column in the DB is found here. It can also have the data in it, without using a separate file for data.
Log File: contains the report log of the SQL*Loader operation performed. It has info of the errors, if any, the status of the load, etc.
Bad File: contains the erroneous records which were not loaded nto the database by the SQL*Loader.
Discard File: contains the records rejected by constraints specified. These records are the ones that were not selected at all for loading.
Types of Load:
The SQL*Loader loads data into the DB in two ways, viz.,
Conventional PathThe SQL*Loader loads the data by generating SQL Insert statements. When SQL*Loader uses conventional path, it competes with all other processes for buffer resources which may slow down the process significantly. Also a considerable time is spent in generating the SQL statements and the Oracle engine executing them. SQL*Loader, by default uses the conventional path loading.
Direct PathThe SQL*Loader bypasses all the logic discussed in Conventional Path, and loads the data directly into the Oracle data files. It also writes in the index segment, if the columns have indexes. However, there are some limitations using the direct path load. Few of them are
§ Triggers are not supported.
§ Referential integrity constraints are not supported.
§ Clustered tables are not supported.
§ Loading of remote objects is not supported.
§ LONG columns must be specified last.
Insert Modes:
The SQL*Loader inserts the data in 4 modes, viz.,
§ Insert: mode can be used only when the table is empty while running the SQL*Loader.
§ Append: mode is used when data needs to be appended into the target table which may or may not have records while running the SQL*Loader.
§ Truncate: mode, truncates the target table and then inserts the records.
§ Replace: mode deletes the target tables. However, when choice is to be made between using truncate and replace, go with truncate because truncate is faster than delete. Delete involves, writing the deleted rows in rollback segments to restore data while rollback is issued and a database redo log entry will be written for each database block affected by the issued delete statement.
Sample Control Files:
(For comma delimited file)
LOAD DATA
INFILE *
TRUNCATE
INTO TABLE dept_sqlldr
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY "'"
(src FILLER,
deptno,
dname,
loc)
BEGINDATA
source, deptno,dname,loc
'Mike',10,'Accounts','
'John',20,'Transport','
'Margret',30,'Operations','
In the above control file, the data file is not specified, whereas a '*' is given after INFILE, which means the source data is in the control file itself. The data is found after the keyword BEGINDATA. Another way is to give the name of the file with the path within single quotes, if the path is not the working folder. In the example above, the FILLER clause is used. When a column in the file is not required to be loaded to the database, then the filler clause is used to stop that column from being loaded into the database. Here rhe column 'source' is not loaded.
(For tab delimited file)
LOAD DATA
INFILE *
TRUNCATE
INTO TABLE dept_sqlldr
FIELDS TERMINATED BY '\t'
OPTIONALLY ENCLOSED BY "'"
TRAILING NULLCOLS
(deptno "TRIM (:deptno)",
dname "TO_CHAR (:dname)",
loc "SUBSTR (:loc, 1, 3)",
dt "SYSDATE")
BEGINDATA
Deptnodnameloc
10'Accounts''
20'Transport''
30'Operations''
Any function, Oracle built-in functions or user defined functions can be called. The TRAILING NULLCOLS option is used when there are no source columns in the file to populate some columns of the table that appear last or if the trailing columns in the file are NULL columns which are to be moved to the database. In this example, when SQL*Loader attempts to populate the 'dt' column in the table, it would not find any column in the file, even though it has a function 'SYSDATE' specified. Adding the option would resolve this problem. If a column is to be passed as an argument to a function, it is passed as a bind variable as shown in the above example.
(For fixed length file)
LOAD DATA
INFILE '\\machine123\c\data_file.txt'
TRUNCATE
INTO TABLE dept_sqlldr
WHEN deptno <> '20'
(deptno POSITION (1:3),
dname POSITION (4:13),
loc POSITION (14:22))
Content of data_file.txt:
dno dname loc
10 Accounts
20 Transport
30 Operations Bangalore
Note in the above example, how the WHEN clause is used in a control file. The records rejected by this constraint are loaded in the discard file. Here the data is not in the control file and hence the data file name, with its path is mentioned in the control file. The SQL*Loader can access files from a network location too.
How to call SQL*Loader from the DOS prompt?
c:\> SQLLDR scott/tiger@oracle9i CONTROL = '\\machine123\c\sqlldr_control.ctl' LOG = '\\machine123\c\sqlldr_log.txt' BAD = '\\machine123\c\sqlldr_bad.bad' DISCARD = '\\machine123\c\sqlldr_discard.txt' DIRECT = TRUE SKIP = 1
Thus the SQL*Loader is called and all the files are specified for the SQL*Loader to create in the locations specified. The bad and discard files are created only when there are bad or discarded records respectively. DIRECT = TRUE is given to make the SQL*Loader perform a Direct Path load. SKIP is used to skip the logical record number to be skipped (in case of first row being column names). All the other parameters, except the connection info and the control file, need not be specified. If required, can be specified in the control file itself. Hence, the essential syntax is,
c:\> SQLLDR scott/tiger@oracle9i '\\machine123\c\sqlldr_control.ctl'
The file specified, is understood as the control file. The keyword CONTROL is not required here. The log file is automatically created with the name of the control file and with extension 'log' in the path from where the SQL*Loader is run. The bad file is also automatically created with the name of the data file and with the extension 'bad' in the path where the data file exists. The discard file is not created unless specified. If DIRECT = TRUE is not specified, the SQL*Loader performs the load using Conventional Path load (default).
SQL*Loader can also be used to load the data from more than one file to a target table and from one data file to more than one table. The sample control files are below.
LOAD DATA
INFILE 'c:\first_file.txt'
INFILE 'c:\second_file.txt'
INFILE 'c:\third_file.txt'
APPEND
INTO TABLE dept_sqlldr
(deptno POSITION (1:3),
dname POSITION (4:15),
loc POSITION (16:25))
In this sample control file, it is assumed that the record format in all the three data files are fixed length. It is essential that all the source files need to have the same record format.
LOAD DATA
INFILE 'c:\one_data_file.dat'
REPLACE
INTO TABLE emp10
WHEN empno = '10'
(empno POSITION (1:5),
ename POSITION (6:20),
sal POSITION (21:25),
dept POSITION (26:30))
INTO TABLE emp20
WHEN empno = '20'
(empno POSITION (1:5),
ename POSITION (6:20),
sal POSITION (21:25),
dept POSITION (26:30))
INTO TABLE dept
WHEN empno = ''
(deptno POSITION (26:30),
dname POSITION (31:45),
loc POSITION (46:50))
Thus from one file, more than one table are populated. The data file here has the data of both employees (for department 10 and 20) and department. When 'empno' field is null, then the record is considered to be a record to be inserted into 'dept' table.
Hence, the basic operations of the SQL*Loader are briefed to get a fair idea about the utility.