File Size and Processing Limits in Flatfile

Last updated: September 23, 2025

When working with large data files in Flatfile, keep the following limits and best practices in mind to ensure good performance.

File Size Limits

  • Maximum recommended rows per file: 1 million records

  • While it's technically possible to process more than 1 million records, performance will be significantly impacted above this threshold

Performance Best Practices

API Rate Limits

When making API calls to process records:

  • Maximum API call rate: 20 calls per second

  • Use bulk operations instead of processing individual records

  • Implement retry logic with backoff for rate limit (429) errors
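The retry-with-backoff advice above can be sketched as follows. This is a minimal illustration, not Flatfile SDK code; the `withBackoff` helper and the `status` field on the thrown error are assumptions about how your HTTP client surfaces a 429 response.

```typescript
// Minimal retry-with-backoff sketch for rate-limit (429) errors.
// `ApiCall` stands in for any API request; nothing here is part of the
// Flatfile SDK itself.
type ApiCall<T> = () => Promise<T>;

async function withBackoff<T>(
  call: ApiCall<T>,
  maxRetries = 5,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      // Only retry rate-limit responses, and only up to maxRetries times.
      if (err?.status !== 429 || attempt >= maxRetries) throw err;
      // Exponential backoff with jitter: ~500ms, ~1s, ~2s, ...
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * baseDelayMs;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Wrapping each bulk call in a helper like this keeps the retry policy in one place instead of scattering it across every request site.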

Batch Processing

For optimal performance when processing large datasets:

  • Use bulk record hooks instead of individual record hooks

  • Recommended batch size: 5,000-10,000 records per batch

  • Consider using the streaming JSONL API endpoint for very large datasets (2M+ records)

Pro Tip: When possible, make API calls after space creation or once per commit instead of per record to reduce the total number of calls.

Data Retrieval Limits

When retrieving processed data from Flatfile:

  • The sheet.allData() method has a hard limit of 10,000 records per call

  • For files larger than 10,000 records, you must implement pagination to retrieve all data

  • Batch records during data egress for larger datasets

Important: Even if you successfully upload and process a file larger than 10,000 records, you cannot retrieve all of them in a single sheet.allData() call. Pagination is required for complete data retrieval.
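The pagination requirement for retrieving more than 10,000 records can be sketched like this. `fetchPage` is a hypothetical placeholder for whatever paginated record-retrieval call your integration uses; the actual Flatfile method name and parameters may differ.

```typescript
// Pagination sketch for retrieving more than 10,000 records.
// `fetchPage` is a hypothetical placeholder, not a real Flatfile SDK call.
type Rec = { id: string };
type FetchPage = (pageNumber: number, pageSize: number) => Promise<Rec[]>;

async function fetchAllRecords(
  fetchPage: FetchPage,
  pageSize = 10_000, // stay at or below the 10,000-record per-call limit
): Promise<Rec[]> {
  const all: Rec[] = [];
  for (let page = 1; ; page++) {
    const batch = await fetchPage(page, pageSize);
    all.push(...batch);
    // A short (or empty) page means we have reached the end.
    if (batch.length < pageSize) break;
  }
  return all;
}
```

For very large sheets, prefer processing each page as it arrives rather than accumulating everything in memory as this sketch does.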

JSONL Streaming API

For very large datasets, consider the streaming JSONL API endpoint, which allows you to:

  • Stream 2M+ records in a single request

  • Process multiple sheets in a workbook in one request

  • Use a more efficient key-value data format
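JSONL is simply one JSON object per line, which is what makes it easy to stream. A minimal sketch of serializing and parsing it follows; the field names are illustrative, and this does not show the Flatfile endpoint's exact payload shape.

```typescript
// JSONL sketch: one JSON object per line, streamable line by line.
// Field names are illustrative; this is not the Flatfile endpoint's spec.
type Row = Record<string, string | number | null>;

function toJsonl(rows: Row[]): string {
  return rows.map((row) => JSON.stringify(row)).join("\n");
}

function fromJsonl(text: string): Row[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0) // skip blank lines
    .map((line) => JSON.parse(line) as Row);
}
```

Because each line is independent, a consumer can parse records as they arrive instead of buffering one giant JSON array.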

If you encounter timeouts or performance issues with large files, try breaking the data into smaller batches or implementing the streaming API approach.