Applies to: ✅ Warehouse in Microsoft Fabric
This article describes how to troubleshoot ingestion failures in T-SQL ingestion patterns.
Ingestion into a warehouse by using COPY INTO, BULK INSERT, or the OPENROWSET function in CTAS, INSERT, UPDATE, and MERGE statements can fail for several reasons: source file values might not match the table schema, required values might be missing, or ingestion options might be misconfigured.
This troubleshooting guide uses the rejected-row diagnostics information to resolve failures, capture row-level errors, and inspect rejected rows with error metadata.
By examining the error files generated by COPY INTO and other ingestion commands, you can pinpoint exactly which rows failed to ingest and why. This information helps you identify data quality issues or adjust ingestion settings, fix the source data, and rerun the load with confidence.
Important
These instructions apply only to ingesting CSV or JSONL files by using Transact-SQL commands (COPY INTO, BULK INSERT, and DML with the OPENROWSET function). Rejected-rows output files aren't generated for external ingestion tools (such as pipelines), for Parquet files, or when ingesting data from the SQL analytics endpoint.
Create the target table
Before running ingestion commands, create a destination table with strict types and NOT NULL constraints so you catch conversion and data-quality problems early.
In your Warehouse workspace, open your warehouse.
On the Home tab, select New SQL query.
Run the following statement:
DROP TABLE IF EXISTS dbo.TaxiTrips;
GO
CREATE TABLE dbo.TaxiTrips
(
    vendorID int NOT NULL,
    startLat float NOT NULL,
    startLon float NOT NULL,
    endLat float NOT NULL,
    endLon float NOT NULL,
    passengerCount int NOT NULL,
    tripDistance float NOT NULL,
    fareAmount float NOT NULL,
    mtaTax float NOT NULL,
    totalAmount float NOT NULL
);
You can use multiple supported methods, such as COPY INTO or other Transact-SQL ingestion statements. Choose the ingestion method that best fits your data source, format, and automation requirements. The following COPY INTO example illustrates a common pattern for loading data from external files into a table.
COPY INTO [dbo].[TaxiTrips]
FROM 'https://{storage-path}.blob.core.windows.net/Files/yellow/'
WITH ( FILE_TYPE = 'CSV' );
This statement can fail to ingest data if the source files don't match the destination table schema. Common causes include mismatched column counts, incompatible data types, or values that can't be stored in the target table. If ingestion encounters values that can't be converted to the destination schema, the statement returns an error similar to the following:
Msg 13812, Level 16, State 1, Line 2
Bulk load data conversion error (type mismatch or invalid character for the specified codepage)
for row starting at byte offset 0, column 1 (vendorID).
Underlying data description:
file 'https://....blob.core.windows.net/Files/yellow/tripdata.csv'.
This error indicates that one or more rows can't be converted to the destination column types.
Investigate errors with MAXERRORS and ERRORFILE
Use the following options to continue ingestion when the number of row-level errors is below a defined threshold and to store diagnostic details in a specified location.
- MAXERRORS sets the maximum number of tolerated row-level failures during ingestion.
- ERRORFILE specifies where the database writes rejected rows and error details.
COPY INTO [dbo].[TaxiTrips]
FROM 'https://{storage-path}.blob.core.windows.net/Files/yellow/'
WITH (
FILE_TYPE = 'CSV',
MAXERRORS = 10,
ERRORFILE = 'https://{storage-path}.blob.core.windows.net/Files/yellow/'
);
Important
Configure ERRORFILE under the same storage location used for source-file reads, not in a different storage account. The identity used to access source data must also have permissions to create folders and files in the configured error path.
The load operation succeeds only when the number of rejected rows is lower than MAXERRORS. When errors are captured, the ingestion operation writes:

- error.jsonl for structured diagnostics
- row.csv for rejected source rows
Locate and query rejected rows
The database writes error information to a structured folder hierarchy under the configured error location. These folders help you trace a specific execution and correlate diagnostics to one ingestion statement:
ERRORFILE/
+-- _rejectedrows/
    +-- <timestamp>/
        +-- <statement_id>/
            +-- error.jsonl
            +-- row.csv or row.jsonl
Use OPENROWSET to read the structured diagnostics in error.jsonl so you can identify which value failed, which destination column was affected, and where the failing row originated:
SELECT *
FROM OPENROWSET(
BULK 'https://{storage-path}.blob.core.windows.net/Files/yellow/_rejectedrows/*/*/error.jsonl'
);
The result set typically includes one row per rejected record, for example:
| Error | Column | ColumnName | Value | IsOutputted | File | ErrorRowLocation |
|---|---|---|---|---|---|---|
| Data conversion error | 1 | vendorID | vendorID | 1 | https://.../yellow/tripdata.csv | 0 |
| NULL in non-nullable column | 1 | vendorID | NULL | 1 | https://.../yellow/ytripdata.csv | 399 |
| Data conversion error | 6 | passengerCount | N/A | 1 | https://.../yellow/yellow_tripdata.csv | 519 |
The error.jsonl file contains one JSON object per line. Each object includes the properties listed in the preceding table. The following table describes each property in detail.
| Column | Description |
|---|---|
| Error | Provides the error message that explains why the value was rejected during ingestion. |
| Column | Specifies the index of the column in the source CSV file that contains the value that couldn't be ingested. Column indexing starts at 1 for the first column. |
| ColumnName | Specifies the name of the destination table column in which the value couldn't be stored. |
| Value | Source value that couldn't be converted or validated. |
| IsOutputted | Indicates whether the row from the source file that contains the reported error is also written to the rejected-rows output file (row.csv or row.jsonl). A value of 1 (or true in JSONL files) means the row is written to the rejected-rows file, and a value of 0 (or false in JSONL files) means it isn't. |
| File | Identifies the source file from which the rejected row originated. This value helps you trace the rejected data back to the original input file for investigation. |
| ErrorRowLocation | Byte offset position in the source file where the failure occurred. |
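When many rows are rejected, it can help to aggregate the diagnostics before inspecting individual rows. The following query is a sketch that groups error.jsonl entries by error message and destination column; the column names come from the table above, and the alias diagnostics is illustrative:

```sql
SELECT [Error], ColumnName, COUNT(*) AS RejectedCount
FROM OPENROWSET(
    BULK 'https://{storage-path}.blob.core.windows.net/Files/yellow/_rejectedrows/*/*/error.jsonl'
) AS diagnostics
GROUP BY [Error], ColumnName
ORDER BY RejectedCount DESC;
```

A summary like this quickly shows whether failures cluster in one column (a schema mismatch) or spread across many (a malformed file).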
Review rejected rows
After you review the structured diagnostic information, you can inspect the original source data that the database couldn't ingest. The rejected rows output contains copies of the source records, preserved exactly as they appeared in the input files. The rejected rows diagnostics generate files that contain only the records that failed ingestion:
- If you ingest CSV files by using COPY INTO (FILE_TYPE = 'CSV'), the rejected output includes a row.csv file. This file matches the source file structure and contains the original CSV rows with invalid values.
- If you ingest JSONL files by using OPENROWSET (FORMAT = 'JSONL'), the rejected output includes a row.jsonl file. This file preserves the original JSON objects that caused ingestion failures.
Use these files to validate the root cause of the errors, such as malformed values, unexpected NULL values, or header rows that were incorrectly parsed as data.
SELECT *
FROM OPENROWSET(
BULK 'https://{storage-path}.blob.core.windows.net/Files/yellow/_rejectedrows/*/*/row.csv'
);
The row.csv schema matches the source CSV shape, and it contains only rows that failed ingestion.
Example rejected-row output:
| C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 |
|---|---|---|---|---|---|---|---|---|---|
| vendorID | startLat | startLon | endLat | endLon | passengerCount | tripDistance | fareAmount | mtaTax | totalAmount |
| NULL | 40.7484 | -73.9857 | 40.7549 | -73.9840 | 2 | 1.40 | 9.00 | 0.50 | 13.20 |
| 1 | 40.7216 | -74.0047 | 40.7359 | -74.0036 | N/A | 1.80 | 11.00 | 0.50 | 15.90 |
Based on this diagnostic information, you can identify the following ingestion issues:
- The header row in the source file is mistakenly parsed as a data row. To resolve this issue, the COPY INTO statement should use the FIRSTROW = 2 option.
- A row in the source file contains a NULL value in the vendorID column (C1), but the corresponding column in the destination TaxiTrips table is defined as NOT NULL.
- A row in the source file contains an invalid value (N/A) in the passengerCount column that can't be converted to the destination int column.
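For example, the header-row issue can be avoided by skipping the first line of each source file. The following sketch extends the earlier COPY INTO statement; FIRSTROW = 2 tells the command to start reading at the second line:

```sql
COPY INTO [dbo].[TaxiTrips]
FROM 'https://{storage-path}.blob.core.windows.net/Files/yellow/'
WITH (
    FILE_TYPE = 'CSV',
    FIRSTROW = 2,          -- skip the header row in each source file
    MAXERRORS = 10,
    ERRORFILE = 'https://{storage-path}.blob.core.windows.net/Files/yellow/'
);
```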
Note
The same process applies when you examine rejected rows from JSONL input. Use the row.jsonl file to inspect the rejected records.
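For JSONL input, the rejected records can be read with the same wildcard pattern used for error.jsonl earlier (a sketch; adjust the path to your configured error location):

```sql
SELECT *
FROM OPENROWSET(
    BULK 'https://{storage-path}.blob.core.windows.net/Files/yellow/_rejectedrows/*/*/row.jsonl'
);
```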
Fix ingestion issues and re-ingest data
After you identify the cause of ingestion failures, correct the issue and re-ingest the affected data. The remediation approach depends on where the error originates.
Fix the destination table schema
If the source data doesn't conform to the destination table schema, update the table definition. Common fixes include changing column data types or removing restrictive constraints such as NOT NULL.
In some scenarios, you might need to drop and recreate the destination table before re-ingesting the data.
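For example, if the source data legitimately contains missing passenger counts, one option is to recreate the table with a relaxed constraint on that column. This is a sketch based on the table definition used earlier; whether to allow NULL here depends on your data model:

```sql
DROP TABLE IF EXISTS dbo.TaxiTrips;
GO
CREATE TABLE dbo.TaxiTrips
(
    vendorID int NOT NULL,
    startLat float NOT NULL,
    startLon float NOT NULL,
    endLat float NOT NULL,
    endLon float NOT NULL,
    passengerCount int NULL,   -- relaxed: source files sometimes omit this value
    tripDistance float NOT NULL,
    fareAmount float NOT NULL,
    mtaTax float NOT NULL,
    totalAmount float NOT NULL
);
```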
Correct source data and re-ingest files
If ingestion fails due to invalid or inconsistent values in the source files, correct those values and re-ingest the data. For example, replace placeholder values such as N/A with empty values or valid defaults.
COPY INTO [dbo].[TaxiTrips]
FROM 'https://{storage-path}.blob.core.windows.net/Files/yellow/tripdata_corrected.csv'
WITH ( FILE_TYPE = 'CSV' );
When re-ingesting corrected data, use an explicit file path that points to the new file containing only corrected data, rather than a folder path that references the original files. This approach prevents re-ingesting rows that were previously loaded successfully and avoids duplicate data.
Reprocess rejected rows by using a staging table
You can load rejected rows into a staging table, fix the data by using Transact-SQL data modification statements, and then re-ingest the corrected rows.
The following CREATE TABLE AS SELECT statement loads rejected rows into a table for further processing:
CREATE TABLE TaxiTrip_RejectedRows AS
SELECT *
FROM OPENROWSET(
BULK 'https://{storage-path}.blob.core.windows.net/Files/yellow/_rejectedrows/*/*/row.csv'
);
After you correct the data, insert the cleaned rows into the destination table.
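As a sketch, assuming the rejected rows landed in TaxiTrip_RejectedRows with string columns named C1 through C10 as in the example output (adjust to the names the CTAS actually inferred), the cleanup and re-insert might look like this. TRY_CAST returns NULL for values such as N/A instead of raising an error:

```sql
INSERT INTO dbo.TaxiTrips
SELECT
    TRY_CAST(C1 AS int),               -- vendorID
    TRY_CAST(C2 AS float),             -- startLat
    TRY_CAST(C3 AS float),             -- startLon
    TRY_CAST(C4 AS float),             -- endLat
    TRY_CAST(C5 AS float),             -- endLon
    COALESCE(TRY_CAST(C6 AS int), 1),  -- passengerCount; placeholder default for values like N/A
    TRY_CAST(C7 AS float),             -- tripDistance
    TRY_CAST(C8 AS float),             -- fareAmount
    TRY_CAST(C9 AS float),             -- mtaTax
    TRY_CAST(C10 AS float)             -- totalAmount
FROM dbo.TaxiTrip_RejectedRows
WHERE TRY_CAST(C1 AS int) IS NOT NULL; -- skip rows (such as the header) that still can't convert
```

The COALESCE default and the WHERE filter are illustrative; choose defaults and filters that match your data-quality rules.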