Hi Laurent Delaquis,
It looks like you're encountering a problem when trying to load a parquet file into a database using Azure Data Factory (ADF). The error indicates the source data contains more columns than the defined schema expects, which usually points to a mismatch between the file's structure and the pipeline's configuration. Here's a step-by-step approach you can try:
Check the Column Count: Since you're receiving an error regarding more columns than expected, double-check the schema and ensure that the number of columns in the source parquet file matches the defined schema in ADF.
File Inspection: You mentioned that opening the CSV version in Excel shows no extra columns, but a CSV export won't necessarily reflect the parquet file's actual schema. It's worth inspecting the parquet file directly with a tool that can read parquet metadata, such as a Synapse Analytics serverless SQL pool or a small script.
ADF Pipeline Configuration:
- Ensure that the dataset schema defined in ADF exactly matches that of the parquet file.
- Look for any changes in the incoming files; since you mentioned new sites were added, make sure any new attributes are accounted for in the schema.
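Given that new sites were added, a quick diff between the columns your ADF dataset expects and the columns the file actually contains will surface any new attributes. The column lists here are purely illustrative; substitute your dataset schema and the names read from the parquet file:

```python
# Illustrative column lists -- replace with your ADF dataset schema
# and the column names read from the actual parquet file.
expected = ["site_id", "site_name", "reading", "timestamp"]
actual = ["site_id", "site_name", "reading", "timestamp", "region"]

# Preserve file order so the report matches the parquet layout.
extra = [c for c in actual if c not in expected]
missing = [c for c in expected if c not in actual]
print("Extra columns in file:", extra)
print("Missing from file:", missing)
```

Any name in `extra` is a candidate for the "more columns than defined" error and needs to be added to (or explicitly excluded from) the dataset schema and column mapping.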
Here's a set of follow-up questions that might help clarify the situation:
- Have you modified the schema in the target database recently?
- Are there any specific transformations applied within the ADF pipeline that could affect the column count?
- Is there a specific pattern to the rows that return errors, or do they seem random?
I hope this helps! Feel free to reach out with any more details, and we can dig deeper into the issue together. Good luck!