CSV Upload for Bulk Jobs

The second step of the Bulk Processing wizard is uploading your CSV file. This guide covers how to format your CSV for best results and how to map your columns to the operation's inputs.

Step 2 of the bulk job wizard showing the CSV upload zone and column mapper below it

CSV format requirements

Before uploading, make sure your file meets these requirements:

  • Encoding: UTF-8. Other encodings (Latin-1, Windows-1252) may cause character corruption, especially with accented characters or non-Latin scripts. Most spreadsheet tools export UTF-8 by default.
  • Header row: The first row must contain column names. These names appear in the column mapper so you can link them to the operation's inputs.
  • No merged cells: Standard tabular format only. Merged cells, pivot tables, and multi-level headers are not supported.
  • Row limit: Maximum 10,000 rows per job (excluding the header row). Files with more rows are rejected at upload.
  • File size: Maximum 50 MB.

Uploading the file

  1. In step 2 of the wizard, drag your CSV file onto the upload zone, or click Choose file.
  2. Hubrix parses the file and displays a preview of the first 5 rows.
  3. Review the preview to confirm columns are read correctly.
  4. Click Next to proceed to configuration.

If columns are split incorrectly in the preview, your CSV likely uses a delimiter other than a comma (for example, semicolons or tabs). Open the file in a text editor and save it with comma delimiters before uploading.

Column mapping

In step 3 (Configure), you map your CSV columns to the operation's input fields:

  • For Prompt: you map columns to variables in your prompt template. For example, if your prompt says Summarise this: {{description}}, you map the description variable to your CSV's description column.
  • For Classify / Extract / Translate: you map the primary text column to the operation's input.
  • For Agent: you map the input column that is sent to the agent as the user message.

Unmapped columns are passed through unchanged to the output file — they appear as-is in the results alongside the AI-generated output columns.
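Conceptually, the mapping works like per-row template substitution: each row's mapped column values are filled into the operation's input. The sketch below illustrates the idea for a Prompt operation using the {{variable}} form shown above; the function and mapping names are illustrative, not a Hubrix API:

```python
import csv
import io
import re

def render_prompt(template, mapping, row):
    """Fill {{variable}} slots in the template from a CSV row via the column mapping."""
    def replace(match):
        variable = match.group(1)
        column = mapping[variable]   # e.g. variable "description" -> column "description"
        return row[column]
    return re.sub(r"\{\{(\w+)\}\}", replace, template)

csv_text = "id,description\n1,Quarterly sales rose 4%\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
prompt = render_prompt("Summarise this: {{description}}",
                       {"description": "description"}, rows[0])
# prompt == "Summarise this: Quarterly sales rose 4%"
```

The id column is not mapped, so in a real job it would simply be copied through to the output file next to the generated columns.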

Tips for clean data

  • Remove empty rows — empty rows at the end of the file are processed (and consume credits) as blank inputs. Trim trailing empty rows before uploading.
  • Escape commas in values — values containing commas must be wrapped in double quotes. Most spreadsheet exports handle this automatically.
  • Keep text columns short — very long text in a single cell (more than 4,000 tokens) may be truncated by the AI model. Split very long documents into multiple rows if needed.

To process a new CSV with the same settings as a completed job, create a new job and reuse those settings. Currently, you cannot re-upload to or append rows to an already-completed job.
