File Uploads, Storage, and Processing

Handling file uploads safely and efficiently is critical. Large files, malicious uploads, storage scalability—each requires careful handling.

Two Approaches to File Uploads

Server-Side Upload

Client uploads to your API. Your server writes to disk or cloud storage.

Advantages: your server handles validation and security checks. Disadvantages: upload load on your application server, file size limits on your server, slow for large files.
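
As a rough illustration of the file size concern, the sketch below copies an incoming upload stream to its destination in chunks and aborts once a cap is exceeded, so an oversized upload never sits fully in memory. The names `save_stream` and `UploadTooLarge` and the 5 MB default are assumptions made for this example, not part of any framework.

```python
import io
from typing import BinaryIO

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumed cap; choose one per upload type
CHUNK_SIZE = 64 * 1024


class UploadTooLarge(Exception):
    """Raised when the incoming stream exceeds the allowed size."""


def save_stream(src: BinaryIO, dst: BinaryIO, max_bytes: int = MAX_UPLOAD_BYTES) -> int:
    """Copy src to dst in chunks, aborting once max_bytes is exceeded.

    Returns the number of bytes written.
    """
    written = 0
    while True:
        chunk = src.read(CHUNK_SIZE)
        if not chunk:
            return written
        written += len(chunk)
        if written > max_bytes:
            # Stop before writing the chunk that crosses the limit.
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        dst.write(chunk)
```

Here `dst` could be a temporary file or a buffer that is later handed to cloud storage; the caller is expected to discard the partial output when `UploadTooLarge` is raised.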

Presigned URLs (Client-Direct Upload)

Your server issues a presigned URL (a temporary URL with specific permissions). The client uploads directly to S3/cloud storage. Your server never handles the file.

Advantages: offloads upload to cloud storage (better at scale), reduces server load, allows large files. Disadvantages: requires client-side upload logic, and is a security risk if misconfigured (for example, long-lived URLs or overly broad permissions).

For production, presigned URLs are usually better. For prototypes or simple cases, server-side upload is fine.

Object Storage: S3 and Alternatives

Don't store files on your server's disk. Disks have limited space, don't scale, are hard to back up, and are single points of failure.

Use object storage: AWS S3, Cloudflare R2, Google Cloud Storage, Azure Blob Storage. These are designed for files at scale.

Object storage is cheap, scales without capacity planning, stores data redundantly, and can be replicated across regions. For any production application, use object storage.

File Processing

Images often need processing: resizing to thumbnail, converting formats, removing metadata. Do this asynchronously via background jobs, not during upload.

Flow: user uploads image → save to storage → queue background job → job processes image → save outputs to storage. User gets immediate response; processing happens later.
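
The flow above can be sketched with an in-memory queue and a worker thread. This is a minimal illustration only: the `storage` dictionary stands in for object storage, `queue.Queue` stands in for a durable job queue, and `make_thumbnail` is a placeholder rather than real image processing.

```python
import queue
import threading

# In-memory stand-ins (assumptions for illustration): a real system would use
# object storage and a durable job queue instead.
storage = {}
jobs = queue.Queue()


def make_thumbnail(data: bytes) -> bytes:
    """Placeholder for real processing (resize, convert, strip metadata)."""
    return data[:16]


def handle_upload(key: str, data: bytes) -> dict:
    """Save the original, enqueue processing, and respond immediately."""
    storage[f"originals/{key}"] = data
    jobs.put(key)
    return {"key": key, "status": "processing"}


def worker() -> None:
    """Process queued uploads until a None sentinel arrives."""
    while True:
        key = jobs.get()
        if key is None:
            jobs.task_done()
            break
        original = storage[f"originals/{key}"]
        storage[f"thumbnails/{key}"] = make_thumbnail(original)
        jobs.task_done()
```

The upload handler returns as soon as the original is stored and the job is queued; the thumbnail appears in storage whenever the worker gets to it.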

For on-demand processing, use image CDNs (Cloudflare Images, Imgix) that serve transformed images without pre-processing.

Security: Validating Uploads

Never trust file uploads. Validate:

  • File type: Check by content, not extension. A .jpg extension doesn't mean it's a JPEG—it could be a script.
  • File size: Limit to reasonable sizes. 5MB for avatars, 100MB for documents. Prevent resource exhaustion.
  • Malware: Scan uploads for viruses. Use ClamAV or cloud scanning services.
  • Execution: Store uploads outside your web root or serve with Content-Disposition: attachment so they're not executed as code.
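
A minimal sketch of the type and size checks, assuming a small allow-list of formats identified by their leading "magic" bytes. The allow-list, the per-type limits, and the function names are choices made for this example. A signature match is only a first filter and does not prove the rest of the file is well-formed.

```python
from typing import Optional

# Leading byte signatures for a few common formats.
MAGIC_NUMBERS = {
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/gif": (b"GIF87a", b"GIF89a"),
    "application/pdf": (b"%PDF-",),
}

MB = 1024 * 1024
# Assumed limits, following the 5MB / 100MB guidance above.
MAX_SIZES = {
    "image/jpeg": 5 * MB,
    "image/png": 5 * MB,
    "image/gif": 5 * MB,
    "application/pdf": 100 * MB,
}


def sniff_type(data: bytes) -> Optional[str]:
    """Return the detected MIME type based on content, or None if unknown."""
    for mime, signatures in MAGIC_NUMBERS.items():
        if data.startswith(signatures):
            return mime
    return None


def validate_upload(data: bytes) -> str:
    """Reject unknown types and oversized files; return the detected type."""
    mime = sniff_type(data)
    if mime is None:
        raise ValueError("unrecognized or disallowed file type")
    if len(data) > MAX_SIZES[mime]:
        raise ValueError(f"{mime} upload exceeds {MAX_SIZES[mime]} bytes")
    return mime
```

Note that the file extension is never consulted: a script renamed to `photo.jpg` fails `sniff_type` because its first bytes do not match any allowed signature.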

Serving Uploads Safely

If you serve uploaded files, set correct headers:

  • Content-Type: application/octet-stream (download, don't execute)
  • Content-Disposition: attachment; filename="..." (suggests download)

Without these headers, a browser might execute an uploaded .js file as JavaScript or render an uploaded .html file as HTML.
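
One way to build these headers is sketched below. The helper name `download_headers` is hypothetical. Two details go beyond the list above and are additions for this example: the filename is percent-encoded using the `filename*` form so user-supplied names cannot inject extra headers, and `X-Content-Type-Options: nosniff` asks the browser not to guess a different content type.

```python
from urllib.parse import quote


def download_headers(filename: str) -> dict:
    """Build response headers that force a download of an uploaded file."""
    # Keep only the final path component of the user-supplied name.
    base = filename.replace("\\", "/").rsplit("/", 1)[-1] or "download"
    # Percent-encode everything outside unreserved characters, which also
    # neutralizes quotes, CR and LF.
    encoded = quote(base, safe="")
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": f"attachment; filename*=UTF-8''{encoded}",
        "X-Content-Type-Options": "nosniff",
    }
```

The returned dictionary can be attached to the response in whatever web framework is in use.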

File Metadata: Don't Lose It

Store file metadata in the database: user_id (who uploaded it), filename, size, upload date, content type, URL. This metadata is as important as the file itself.

When a user deletes a file, the database record tells you what was stored and whether it is safe to remove the underlying object from storage.
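
A minimal sketch of such a metadata table, using SQLite for brevity. The table and column names are assumptions for this example; `storage_key` plays the role of the URL, and the `deleted_at` column is one possible way to keep a record after the user deletes a file.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS uploads (
    id           INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL,
    filename     TEXT NOT NULL,
    size_bytes   INTEGER NOT NULL,
    content_type TEXT NOT NULL,
    storage_key  TEXT NOT NULL UNIQUE,
    uploaded_at  TEXT NOT NULL,
    deleted_at   TEXT
)
"""


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def record_upload(db, user_id, filename, size_bytes, content_type, storage_key) -> int:
    """Insert a metadata row for a stored object and return its id."""
    cur = db.execute(
        "INSERT INTO uploads (user_id, filename, size_bytes, content_type,"
        " storage_key, uploaded_at) VALUES (?, ?, ?, ?, ?, ?)",
        (user_id, filename, size_bytes, content_type, storage_key, _now()),
    )
    db.commit()
    return cur.lastrowid


def mark_deleted(db, upload_id: int) -> None:
    """Soft-delete: keep the row so the storage object can be cleaned up later."""
    db.execute("UPDATE uploads SET deleted_at = ? WHERE id = ?", (_now(), upload_id))
    db.commit()
```

A cleanup job could then remove objects from storage for rows whose `deleted_at` is set, instead of deleting blindly.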

Image CDNs and Optimization

Image CDNs (Cloudflare Images, Imgix, Cloudinary) serve transformed images without preprocessing. Upload once, request on-demand with transformation parameters: size, format, quality.

This is more efficient than pre-generating every variant up front: only the sizes and formats that are actually requested get produced.
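
To illustrate the "request with transformation parameters" idea, the sketch below builds a URL for a transformed image. The host `images.example.com` and the parameter names `w`, `fm`, and `q` are placeholders for this example; each image CDN defines its own URL scheme, so check the provider's documentation for the real one.

```python
from urllib.parse import urlencode

CDN_BASE = "https://images.example.com"  # hypothetical CDN host


def transformed_url(path, width=None, fmt=None, quality=None) -> str:
    """Build a URL asking the CDN for a resized or re-encoded variant."""
    params = {}
    if width is not None:
        params["w"] = width  # illustrative parameter name
    if fmt is not None:
        params["fm"] = fmt  # illustrative parameter name
    if quality is not None:
        params["q"] = quality  # illustrative parameter name
    query = urlencode(params)
    url = f"{CDN_BASE}/{path.lstrip('/')}"
    return f"{url}?{query}" if query else url
```

The same stored original can then be referenced at several sizes from a page template, each with a different `width`, without any variant having been generated ahead of time.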

Presigned URLs: How They Work

Your server generates a presigned URL with a signature and expiry time. The client uses that URL to upload to S3. S3 verifies the signature and allows upload.

Example flow:

  1. Client requests upload: POST /upload with filename
  2. Server validates request (is user authenticated? is filename valid?)
  3. Server generates presigned URL with 15-minute expiry
  4. Server returns presigned URL to client
  5. Client uploads to S3 using presigned URL
  6. S3 stores file; server is notified (e.g. via an S3 event notification, or the client confirming the upload)
  7. Server updates database with file metadata

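This is secure when configured correctly: presigned URLs are temporary and only permit upload to the specified path, and the server validates the request before issuing the URL. Because the server never sees the file contents, content checks (type, size, malware) have to run after the upload completes.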
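
The signature-and-expiry idea can be illustrated with a simplified HMAC scheme. This is a teaching sketch, not the signing algorithm S3 actually uses (AWS Signature Version 4), and `SECRET_KEY`, `STORAGE_BASE`, `presign_put`, and `verify` are all hypothetical names. In practice you would call your SDK's presigning helper, such as `generate_presigned_url` on the boto3 S3 client.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"server-side-secret"  # hypothetical; never sent to the client
STORAGE_BASE = "https://storage.example.com"  # hypothetical storage endpoint


def _sign(method: str, path: str, expires: int) -> str:
    """Sign the method, path, and expiry so none of them can be altered."""
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign_put(path: str, ttl_seconds: int = 900, now=None) -> str:
    """Issue a URL that allows a PUT to exactly this path until it expires."""
    issued = now if now is not None else time.time()
    expires = int(issued) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign("PUT", path, expires)})
    return f"{STORAGE_BASE}{path}?{query}"


def verify(method: str, url: str, now=None) -> bool:
    """What the storage side does: check expiry, then recompute the signature."""
    parsed = urlparse(url)
    qs = parse_qs(parsed.query)
    try:
        expires = int(qs["expires"][0])
        signature = qs["signature"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    expected = _sign(method, parsed.path, expires)
    # Constant-time comparison to avoid leaking signature bytes via timing.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Changing the path, the HTTP method, or the expiry invalidates the signature, which is why a presigned URL can only be used for the upload the server approved.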

Warning
Presigned URLs must be time-limited. A URL valid for a year is almost as insecure as not having signing. Expiry of 15 minutes to 1 hour is reasonable.

Backup and Retention

Object storage should be protected against regional failure and accidental deletion, for example with cross-region replication. Set retention policies (how long are old versions kept?). Plan for disaster recovery.

For critical files, implement versioning so you can recover old versions if needed.

The Principle

Files are integral to most applications. Handle them safely: validate uploads, store securely, serve correctly, maintain metadata. Don't reinvent—use object storage and image CDNs. The infrastructure to handle files at scale is complex; use managed services.