Understanding File Size and Data Limitations in Flatfile

Last updated: December 12, 2025

When working with large datasets in Flatfile, it's important to understand the system's limitations and best practices for optimal performance.

General Limitations

  • Maximum recommended records per sheet: 1 million

  • API rate limit: 20 requests per second (see the rate limiting documentation for a full breakdown)

  • Maximum enum options: ~5,000 values (for optimal performance)

Best Practices for Large Files

When working with large datasets, follow these guidelines:

Use Bulk Operations

Instead of processing records individually, use bulk operations:

  • Use bulkRecordHook instead of recordHook for validations and transformations

  • Process records in chunks of 5,000-10,000 rows for optimal performance

  • Consider using the streaming JSONL API endpoint for datasets over 500,000 records
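The chunking guideline above can be sketched as a small helper. This is a generic illustration, not a Flatfile API; the chunk size and record shape are placeholders you would adapt to your own data:

```typescript
// Split a large record set into fixed-size chunks so each bulk
// operation stays within the recommended 5,000-10,000 row range.
function chunkRecords<T>(records: T[], chunkSize = 10_000): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    chunks.push(records.slice(i, i + chunkSize));
  }
  return chunks;
}

// Example: 25,000 records become three bulk calls instead of
// 25,000 individual ones.
const sample = Array.from({ length: 25_000 }, (_, i) => ({ id: i }));
console.log(chunkRecords(sample).length); // 3
```

Each chunk can then be passed to a single bulk insert or update call, which keeps request counts well under the rate limit.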

Handle Rate Limiting

When making multiple API calls, implement retry logic to handle rate limiting:

  • Add delays between batch operations

  • Handle 429 (Too Many Requests) errors gracefully

  • Increase delay periods if receiving multiple rate limit errors
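One way to implement the retry guidance above is exponential backoff: double the delay after each consecutive 429. This is a generic sketch; the error-shape check (`err?.status === 429`) and delay values are assumptions to adapt to your HTTP client:

```typescript
// Retry an async operation with exponential backoff when the API
// responds with HTTP 429 (Too Many Requests).
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      const isRateLimit = err?.status === 429;
      const lastAttempt = attempt === maxAttempts - 1;
      if (!isRateLimit || lastAttempt) throw err;
      // Double the delay on each consecutive rate-limit error.
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error("unreachable");
}
```

Non-429 errors are rethrown immediately, so only genuine rate limiting triggers the backoff.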

Performance may degrade significantly when working with sheets containing over 1 million records. Consider splitting very large datasets into multiple sheets or workbooks.

Common Issues and Solutions

Timeouts During Processing

If you experience timeouts when processing large files:

  • Reduce batch sizes to 5,000 records

  • Implement progress tracking using job acknowledgments

  • Consider using the streaming API for better performance
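Batch-size reduction and progress tracking can be combined in one loop. In this sketch, `processBatch` and `reportProgress` are hypothetical placeholders: in a real Flatfile job handler, progress updates would go through the jobs API rather than a local callback:

```typescript
// Process records in small batches, reporting percent complete
// after each batch so long-running jobs show visible progress.
async function processWithProgress<T>(
  records: T[],
  batchSize: number,
  processBatch: (batch: T[]) => Promise<void>,
  reportProgress: (percent: number) => void
): Promise<void> {
  for (let start = 0; start < records.length; start += batchSize) {
    await processBatch(records.slice(start, start + batchSize));
    const done = Math.min(start + batchSize, records.length);
    reportProgress(Math.round((done / records.length) * 100));
  }
}
```

Smaller batches also shorten each unit of work, which reduces the chance of hitting a single long-running request timeout.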

Enum Field Limitations

For fields with many possible values:

  • Keep enum options under 5,000 values

  • Consider using a text field with custom validation instead of an enum
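The text-field alternative can be sketched as a lookup against a `Set`, which stays fast no matter how large the value list grows and avoids rendering thousands of enum options. The field name and country-code data here are illustrative, not from the Flatfile API:

```typescript
// Validate a free-text field against a large set of allowed values.
// A Set gives O(1) membership checks; `allowedCountries` stands in
// for a list that would be too large for an enum.
const allowedCountries = new Set(["US", "CA", "MX", "GB", "DE"]);

function validateCountry(value: string): string | null {
  // Return an error message for invalid values, or null when valid.
  return allowedCountries.has(value.toUpperCase())
    ? null
    : `"${value}" is not a recognized country code`;
}
```

A validation function like this can be attached to the field inside a bulk record hook, flagging invalid rows with the returned message.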