SQL database import errors: line breaks

When importing huge plain text files or comma separated files, different errors may occur while using the SQL Server import wizard. One of such errors is line breaks in the source files.

SQL Server import and export wizard expects a line break before a new record begins. However, when importing plain text that contains (larger) text columns, you might walk into a record that contains an unexpected line break in the text column. The larger the number of records to be imported, the bigger the chance that a record causes an error.

A line break in a plain text file is a special character. Using the SQL Server import wizard, you can set the line break character(s). But if you import columns that contain lengthier texts, like order descriptions, material descriptions or memo fields, the chances are that these line break character(s) are used in these source files.

In this article, I will first give an example of such a file and which errors you will walk into during the process of import. Next I will give a solution using the SQL DataTool, which uses an algorithm to find such unexpected line breaks. It can even handle multiple line breaks in the same field, and other line break scenarios.

Click here to go directly to the solution.

Import texts with line breaks

First of all, I prepared a file that contains the line breaks. The file contains 4 columns (INT_FIELD, TEXT_FIELD, TEXT_FIELD2, and DESCRIPTION). And the file has 5 records (not counting the header line), starting on lines 2, 4, 5, 7 and 9. The file contains different scenarios of line breaks in a plain text file. The screenshot below shows this prepared text file.

Import file with line breaks - SQL database

When trying to import this file using the SQL Server Import and Export wizard errors will occur. It is not be possible to import this file correctly into SQL Server using the standard available tools.

Import using correct datatypes

First of all, we need to select the 'line breaks.txt' plain text file. This is the file shown in the screenshot above. The line break (Header row delimiter) is {CR}{LF}, which is the default value for line breaks in the SQL Server Import and Export wizard.

Select datasource - SQL Server import wizard

When the settings are entered as above, SQL Server Import and Export wizard assumes all columns to be string - or VARCHAR - columns. However, the INT_FIELD column is an integer (or numeric) column. So in the Advanced section of this screen, the datatype of the INT_FIELD has been changed to be a numeric field.

Column as numeric

During the actual import process an error message pops up. This error breaks the import process. No data is imported in this procedure.

Line break errors

Opening the error message will show us the following text. The bold text in the error message explains what went wrong.

Messages
Error 0xc02020a1: Data Flow Task 1: Data conversion failed. The data conversion for column "INT_FIELD" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
 (SQL Server Import and Export Wizard)
 
Error 0xc0209029: Data Flow Task 1: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR.  The "Source - line breaks_txt.Outputs[Flat File Source Output].Columns[INT_FIELD]" failed because error code 0xC0209084 occurred, and the error row disposition on "Source - line breaks_txt.Outputs[Flat File Source Output].Columns[INT_FIELD]" specifies failure on error. An error occurred on the specified object of the specified component.  There may be error messages posted before this with more information about the failure.
 (SQL Server Import and Export Wizard)
 
Error 0xc0202092: Data Flow Task 1: An error occurred while processing file "C:\Temp\line breaks.txt" on data row 3.
 (SQL Server Import and Export Wizard)
 
Error 0xc0047038: Data Flow Task 1: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.  The PrimeOutput method on Source - line breaks_txt returned error code 0xC0202092.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.
 (SQL Server Import and Export Wizard)

As you can see in the errors, the errors are caused by wrong datatypes. This is something that can be solved by importing all columns as text. This is what is done in the next section: Import as string.

As we will see later, this does not solve the issue of the unexpected line breaks.

Import as string

Ideally we want the INT_FIELD to be imported as a numeric, or integer, column. As seen above, this will result in an error and stops the import. So what if we just import all data as text, ignoring the INT_FIELD datatype.

All columns are strings, except for the INT_FIELD column, which is an integer. But for this scenario, all fields have been set as string, as shown in the screenshot below.

Import numeric as string

When the datatype of the INT_FIELD column is changed to a string, the data can be previewed in the Columns section of this screen. As you can see in this preview, the import is not as we expected it to be. I previously described the plain text file, and it contains 5 lines. The preview contains 10 lines.

Data preview string import

When we finish this import scenario, we will not get an error. The import is successful... At least, that is what the SQL Server Import and Export wizard wants us to believe.

Import as string success

But when we look at the imported data, it does not look as expected. The data is imported as was shown in the data preview, but this was not the way we wanted it to be imported. Instead of 5 lines, we have a table with 10 lines.

Imported data is wrong

Maybe even worse, the design of the table is all wrong. The INT_FIELD is not imported as an integer, but as a VARCHAR field with the length of 50. This is what we have defined in the Advanced section of the data import, but not as it should be.

Imported datatypes are wrong

We end up with some data in the database. But the data and table are not usable. We cannot use it for calculations, nor can we use the imported texts for (e.g.) reporting. Next we will see how we can use the SQL DataTool to help us with this task.

Import flat file with line break issues

As always, we first need the data model. This data model is used to import the data, and while importing, validate the data. Where possible, SQL DataTool will interpret and correct the data. When you have a file that has unexpected line breaks all over the file, this is a very handy feature.

So first the data model. It is already explained in the manual how to setup a data model. The data model for the file that has been prepared is as follows:

Import linebreaks datamodel example

As you can see in the image above, the field INT_FIELD will be imported as an integer. So now we try to import the file. I have selected the file, set the right data model to be used and entered the right separator character.

Import settings - import linebreaked file

When I hit the button 'Get data', almost instantaneous a popup shows that the import was successful.

Import linebreaks plain text successful

When we look at the data and the table, we see that it is imported as we wanted. The data is available in the table and usable. And the data is available in a format (the datatypes) that we expect and we can work with.

Successfully imported data

Imported datatypes as expected

As you can see, SQL DataTool helps you importing files that are impossible to import using the SQL Server Import and Export wizard. The SQL DataTool is built to import data correctly, even when these files will result in errors or wrongly imported data using other methods.

The video below shows a 'How to' to import plain text files with line breaks.

To get your trial version of the SQL DataTool, go to the download section.