Elixir Phoenix Upload Management + S3


A feature that keeps popping up in every software project is file management. From the simple task of uploading a user profile image to displaying a carousel that illustrates a recipe or a product, you'll have to decide where the files live and how the application keeps track of them.

More often than not, uploaded files are related to some objects in the application: for instance, users, products, reports or blog posts.

When developing software this translates into saving binary data somewhere (the file), AND keeping track of the file and its association to an object in a database.

Data Model

Let's start with data. I'll make the assumption that you are somewhat familiar with relational databases and some flavour of SQL.

Overview

When it comes to the type of relationship between objects, we can identify two classic cases.

  • For a user avatar, there's only one avatar attached to a user.

  • For images illustrating a product, many images might be linked to one product.

Please note that we could have the document itself reference whichever model owns it, but that would translate poorly to SQL. Abstracting over the model type is an object-oriented concept, not one you will find natively in an SQL relational database management system (RDBMS).

This leaves us with only a few ways to cleanly relate a document to an object:

  • a specific "via-table" for many-to-many relationships (a table that contains foreign keys to the two tables you need to join). This is the case for product images.

  • or, a foreign key to document for belongs-to / one-to-many relationships. This is the case for user avatar.

In this article we are going to concentrate on the latter: the belongs-to relationship.

Documents

The documents table should minimally contain attributes that allow the identification of the file uploaded into the system, like a title or filename.

mix phx.gen.schema Documents.Document documents title:string 
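For orientation, the generator should leave you with a schema module along the lines of the sketch below (together with a migration that creates the documents table). The file location lib/my_app/documents/document.ex and the MyApp prefix are assumptions based on the generator's usual conventions; your generated file may differ in details.

```elixir
defmodule MyApp.Documents.Document do
  use Ecto.Schema
  import Ecto.Changeset

  schema "documents" do
    # the only attribute we asked the generator for
    field :title, :string

    timestamps()
  end

  # casts and validates the attributes coming from outside the application
  def changeset(document, attrs) do
    document
    |> cast(attrs, [:title])
    |> validate_required([:title])
  end
end
```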

User's avatar

Expanding on the user's avatar scenario above, let's start by modifying the user's schema you usually have in user.ex.

We will create a simple association to zero or one document model. If there's already a document associated with the user, we'll recycle the record and update it with the new file info.

schema "users" do
  ...

  # if the association is to be replaced, update the existing link
  belongs_to(:avatar, MyApp.Documents.Document, on_replace: :update)
end
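A belongs_to association expects a foreign key column on the users table, avatar_id in this case. The migration is not shown in this article, so here is a minimal sketch of what it could look like. The module name and the on_delete: :nilify_all choice are assumptions, not requirements.

```elixir
defmodule MyApp.Repo.Migrations.AddAvatarToUsers do
  use Ecto.Migration

  def change do
    alter table(:users) do
      # assumption: when a document row is deleted, the user simply loses its avatar
      add :avatar_id, references(:documents, on_delete: :nilify_all)
    end

    # index the foreign key so lookups by avatar stay cheap
    create index(:users, [:avatar_id])
  end
end
```

Run it with mix ecto.migrate after generating the documents migration, so that the documents table exists before users references it.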

File Storage

Files might be stored on the local file system of the Elixir Phoenix server or, alternatively, in a dedicated object store such as an AWS S3 bucket.

To handle both scenarios elegantly, we will rely on the Waffle library.

The file system integration is rather straightforward. We will therefore focus on the object store implementation.

  • To emulate AWS S3 on our local machine we will use MinIO.

  • To simplify installation and configuration, we'll be using Docker.

Docker Compose

Let's create a docker-compose.yml file to define our services:

version: "3.9"

services:
  database:
    container_name: db-postgre-my-app
    image: postgres:14.4-alpine
    ports:
      - 5432:5432
    environment:
      - POSTGRES_DB=postgres 
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - PGDATA=/var/lib/postgresql/data/pgdata
    volumes:
      - ./.postgres-data:/var/lib/postgresql/data

  minio:
    container_name: minio-s3-my-app
    image: minio/minio
    command: server /data --console-address ":9001"
    hostname: minio
    ports: 
      - 9000:9000
      - 9001:9001
    # environment: (these are defaults)
      # MINIO_ROOT_USER: minioadmin
      # MINIO_ROOT_PASSWORD: minioadmin    
    volumes:
      - ./.minio:/data

Our application's needs in terms of services are covered here with PostgreSQL for the database and MinIO for the S3 emulation.

With the docker daemon running, start the services defined above:

docker compose up

Note: the volumes exported for persistence will create two folders, .postgres-data and .minio (don't forget to add them to your .gitignore).

MinIO

MinIO is a lightweight object storage server released under the GNU AGPLv3 license. Its API is compatible with the AWS S3 cloud storage service.

You can access MinIO with the AWS CLI.

aws --endpoint-url=http://127.0.0.1:9000 --no-verify-ssl s3 ls
# create an alias to avoid the verbosity
alias awslocal='aws --endpoint-url=http://127.0.0.1:9000 --no-verify-ssl '
awslocal s3 ls
# An error occurred (InvalidAccessKeyId) when calling the ListBuckets operation: The Access Key Id you provided does not exist in our records.

We need to create a service account to access MinIO via AWS CLI or Elixir Phoenix.

Open http://127.0.0.1:9001 in your browser to create one:


The key and secret you just created will be used to configure your AWS CLI. They'll also be used in the next steps to configure Waffle.

Configure AWS CLI

aws configure
# ... fill in key, secret and region
# test with:
awslocal s3 ls
# (nothing)

Note: the last call returns nothing because there is no bucket to list yet.

AWS CLI as a profile

If you use multiple AWS profiles (sets of key + secret), you might want to add the credentials to your $HOME/.aws/credentials file like this:

[minio]
aws_access_key_id = kuJJTvYgNUCyDO7b
aws_secret_access_key = tx2R6919ARSz1Xe74o1wAeyPDQHGzKOk

and your $HOME/.aws/config

[profile minio]
output = text
region = ap-southeast-1

You can then update your awslocal alias to include the profile name

alias awslocal='AWS_PROFILE=minio aws --endpoint-url=http://127.0.0.1:9000 --no-verify-ssl '

Create the Bucket

It's now time to create a test-bucket, or, in AWS API lingo: make a bucket.

awslocal s3 mb s3://test-bucket
awslocal s3 ls
# 2022-07-31 10:31:44 test-bucket
awslocal s3 ls --recursive s3://test-bucket
# (nothing, as the bucket is empty)

Your bucket is empty and ready to store files.

At this point, let me introduce Waffle to you.

Waffle

Waffle is a flexible file upload library for Elixir with straightforward integrations for Amazon S3 and ImageMagick.

Let's add the dependencies to our project. Edit mix.exs and add

# waffle
{:waffle, "~> 1.1"},
# waffle + S3:
{:ex_aws, "~> 2.1.2"},
{:ex_aws_s3, "~> 2.0"},
{:hackney, "~> 1.9"},
{:sweet_xml, "~> 0.6"},

Get the deps (mix deps.get) and configure Waffle in config/dev.exs using the credentials you used for the AWS CLI and the endpoint defined via docker-compose.

config :waffle,
  storage: Waffle.Storage.S3,
  bucket: "test-bucket",
  asset_host: System.get_env("AWS_ENDPOINT", "http://127.0.0.1:9000")

config :ex_aws,
  json_codec: Jason,
  region: System.get_env("AWS_REGION", "ap-southeast-1"),
  access_key_id: System.get_env("AWS_KEY", "kuJJTvYgNUCyDO7b"),
  secret_access_key: System.get_env("AWS_SECRET", "tx2R6919ARSz1Xe74o1wAeyPDQHGzKOk"),
  s3: [
    scheme: "http://",
    host: "127.0.0.1",
    port: 9000,
  ]
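Before going further, it can be worth checking that ex_aws actually reaches MinIO with this configuration. The snippet below is a minimal sketch to paste into iex -S mix. It assumes the test-bucket created earlier and only uses ExAws.S3.list_objects/1 and ExAws.request/1.

```elixir
# list the content of the bucket through the configuration above
case "test-bucket" |> ExAws.S3.list_objects() |> ExAws.request() do
  {:ok, %{body: %{contents: contents}}} ->
    # an empty bucket yields an empty list
    IO.puts("bucket reachable, #{length(contents)} object(s)")

  {:error, reason} ->
    # wrong credentials or a stopped container end up here
    IO.inspect(reason, label: "could not reach the bucket")
end
```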

Waffle uses definitions to specify how and where to save a file. The first step after configuration is to create one for our avatar picture.

Let's create the avatar definition:

mix waffle.g avatar uploaders/avatar.ex 
# creates uploaders/avatar.ex

Filename and Path

It's good practice to sanitise filenames and group file types together. When uploading files, it is also wise to guarantee filename uniqueness, ensuring that existing files are not overwritten by the uploaded one.

Let us start by partitioning our storage by entity type. As the avatar is the user's profile image, we're going to put it in a folder dedicated to users, and more specifically to the user whose avatar image is uploaded: users/<user_id>
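To make the idea of sanitisation concrete, here is a small sketch in plain Elixir. The module and function names are hypothetical and not part of Waffle. It strips any directory part, lowercases the name and replaces everything that is not a letter or a digit with a dash.

```elixir
defmodule FilenameSketch do
  # Hypothetical helper: returns a lowercase, dash-separated name that keeps the extension.
  def sanitise(name) do
    ext = name |> Path.extname() |> String.downcase()

    base =
      name
      # drop any directory part, e.g. "../../etc/passwd" becomes "passwd"
      |> Path.basename()
      |> Path.rootname()
      |> String.downcase()
      # collapse every run of unexpected characters into a single dash
      |> String.replace(~r/[^a-z0-9]+/, "-")
      |> String.trim("-")

    # fall back to a generic name when nothing usable is left
    if base == "", do: "file" <> ext, else: base <> ext
  end
end

IO.puts(FilenameSketch.sanitise("My Holiday Photo (1).JPG"))
# my-holiday-photo-1.jpg
```

Later in this article the stored name is generated rather than derived from the uploaded one, which sidesteps most of these concerns.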

defmodule MyAppWeb.Uploaders.Avatar do
  use Waffle.Definition
  ...
  # partition avatars as users/{ID}
  def storage_dir(_version, {_file, %{id: id} = _scope}), do: "users/#{id}"

  # use the filename provided within the scope by the document, prefix with version
  def filename(version, {_file, %{document_name: filename} = _scope}), do: "#{version}-#{filename}"
end

Note the version argument in every function. Waffle supports multiple versions of one uploaded file (e.g. original and thumbnail). We will use the version to prefix the name of the stored file.

e.g.

s3://test-bucket/users/12/original-hello.jpg
s3://test-bucket/users/12/thumbnail-hello.jpg
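The way these keys come together can be mimicked in plain Elixir. The sketch below does not call Waffle at all: AvatarKeySketch and object_key/2 are hypothetical names, and the sketch assumes (as the example keys above suggest) that the extension of the uploaded file is appended to the name returned by filename/2.

```elixir
defmodule AvatarKeySketch do
  # same two functions as in the definition, without the Waffle dependency
  def storage_dir(_version, {_file, %{id: id}}), do: "users/#{id}"
  def filename(version, {_file, %{document_name: name}}), do: "#{version}-#{name}"

  # hypothetical helper: directory + versioned name + extension of the uploaded file
  def object_key(version, {file_name, _scope} = args) do
    Path.join(
      storage_dir(version, args),
      filename(version, args) <> Path.extname(file_name)
    )
  end
end

scope = %{id: 12, document_name: "hello"}

IO.puts(AvatarKeySketch.object_key(:original, {"./my_image.jpg", scope}))
# users/12/original-hello.jpg
IO.puts(AvatarKeySketch.object_key(:thumbnail, {"./my_image.jpg", scope}))
# users/12/thumbnail-hello.jpg
```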

It's time to tackle the big one: the document context.

Document Context

This is the application context and should be located at lib/my_app/documents.ex

Filename

I mentioned filename uniqueness and sanitisation previously. Let's add readability to the list.

It helps to be able to figure out what a file is from its name alone when listing a bucket.

Something like this

/users/12/original-17668d0f-3684-4c25-a2e7-1f80fbc7573f.jpg

only tells us that this is an original file attached to a user.

We have no way of knowing if the file is the one used by an avatar or if it is, for example, the user's profile background picture. As such, we should be a tad more specific, like this

/users/12/original-avatar-17668d0f-3684-4c25-a2e7-1f80fbc7573f.jpg

This function will provide a unique but human-readable filename:

@doc """
Generates a new unique filename: {assoc}-{UUID}

e.g. "avatar-17668d0f-3684-4c25-a2e7-1f80fbc7573f"
"""
@spec new_filename(atom()) :: String.t()
def new_filename(assoc), do: Atom.to_string(assoc) <> "-" <> Ecto.UUID.generate()

The version will be handled by Waffle as a prefix. The function above generates the filename that will be saved in our database's documents table.
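A quick way to convince yourself of the shape of the generated name is to try it in iex -S mix. This is only a sketch. It assumes new_filename is a public function of the MyApp.Documents context, and it relies on Ecto.UUID.cast/1 to check that the suffix is a well-formed UUID.

```elixir
name = MyApp.Documents.new_filename(:avatar)

# the association name is used as a readable prefix
true = String.starts_with?(name, "avatar-")

# what follows the prefix is a valid UUID
{:ok, _uuid} =
  name
  |> String.replace_prefix("avatar-", "")
  |> Ecto.UUID.cast()

# two calls never return the same name
true = MyApp.Documents.new_filename(:avatar) != name
```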

Store - The Specific Version

The specific code for uploading an avatar image and associating the document to the user is:

def upload_user_avatar(path, user) do
  # the association must be loaded before we can inspect or replace it
  user = Repo.preload(user, :avatar)

  # if there's an existing document attached to the user, delete the file from the store
  document =
    case user.avatar do
      nil ->
        %Document{}

      document ->
        # our storage_dir and filename functions ignore the first element
        # of the tuple, so any placeholder will do
        MyAppWeb.Uploaders.Avatar.delete({"any", %{id: user.id, document_name: document.title}})
        document
    end

  # get a unique name for the new file
  filename = new_filename(:avatar)

  # upload the new file to the store (MinIO / File System)
  with {:ok, _stored} <-
         MyAppWeb.Uploaders.Avatar.store({path, %{id: user.id, document_name: filename}}) do
    # attach the document to the user through the avatar association
    user
    |> Ecto.Changeset.change()
    |> Ecto.Changeset.put_assoc(:avatar, Ecto.Changeset.change(document, %{title: filename}))
    |> Repo.update()
  end
end

The argument passed to Waffle is a tuple containing a file (a path or a %Plug.Upload{} structure) and a user-defined scope, i.e. {path, whatever_i_need_to_define_my_file}.
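In a Phoenix application the file usually arrives as a %Plug.Upload{} in the params of a controller action. The sketch below shows one way to hand it over to upload_user_avatar. The controller name, the "avatar" param, the current_user assign and the redirect target are all assumptions to adapt to your application.

```elixir
defmodule MyAppWeb.AvatarController do
  use MyAppWeb, :controller

  alias MyApp.Documents

  # Hypothetical action: assumes a form with a file input named "avatar"
  # and an authentication plug that stored the user in conn.assigns.
  def update(conn, %{"avatar" => %Plug.Upload{} = upload}) do
    user = conn.assigns.current_user

    # the upload struct carries both the temporary path and the original filename
    case Documents.upload_user_avatar(upload, user) do
      {:ok, _user} ->
        conn
        |> put_flash(:info, "Avatar updated")
        |> redirect(to: "/")

      {:error, _reason} ->
        conn
        |> put_flash(:error, "Could not save the avatar")
        |> redirect(to: "/")
    end
  end
end
```

Passing the whole struct rather than its path lets the store see the original filename, and therefore its extension.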

The problem with the approach above is that we will need to provide one such function for each object that accepts the upload of a file. Hence, we should make our upload function generic!

Documentable behaviour

Every model that accepts an association with documents should be able to provide the Waffle definition module used to store a file, delete it and get its URL.

This is the perfect use case for a behaviour.

Let's create the Documentable behaviour:

defmodule MyApp.Documents.Documentable do
  @callback get_def(assoc :: atom()) :: module()
end

Let's add the Documentable behaviour to our User

defmodule MyApp.Accounts.User do
  @behaviour MyApp.Documents.Documentable
  ...
  
  def get_def(:avatar), do: MyAppWeb.Uploaders.Avatar
end
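The dispatch this behaviour enables can be tried out without Phoenix or Waffle. In the sketch below every module is a hypothetical stand-in: Sketch.FakeAvatarUploader plays the role of the Waffle definition and merely returns the name it would have stored.

```elixir
defmodule Sketch.Documentable do
  @callback get_def(assoc :: atom()) :: module()
end

defmodule Sketch.FakeAvatarUploader do
  # stand-in for a Waffle definition: pretends to store the file
  def store({path, scope}), do: {:ok, scope.document_name <> Path.extname(path)}
end

defmodule Sketch.User do
  @behaviour Sketch.Documentable
  defstruct [:id, :avatar]

  @impl true
  def get_def(:avatar), do: Sketch.FakeAvatarUploader
end

user = %Sketch.User{id: 12}

# the struct knows its module, the module knows which uploader handles :avatar
uploader = user.__struct__.get_def(:avatar)

{:ok, stored} = uploader.store({"./my_image.jpg", %{id: user.id, document_name: "avatar-123"}})
IO.puts(stored)
# avatar-123.jpg
```

Adding a second documentable model then only requires another get_def/1 clause in that model's module; the calling code stays unchanged.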

And now, we party!

Store - The Generic Version

Let's take the previous upload function and make it generic.

def upload_file_to_obj_by_assoc(path, obj, assoc) do
  # get the waffle definition from the obj's module
  waffle_def_module = obj.__struct__.get_def(assoc)

  # the association must be loaded before we can inspect or replace it
  obj = Repo.preload(obj, assoc)

  document =
    case Map.get(obj, assoc) do
      nil ->
        %Document{}

      document ->
        # generic way of running a function on a module
        # delete the file, NOT the database record
        apply(waffle_def_module, :delete, [waffle_scope(obj, document)])
        document
    end

  # upload the NEW file
  filename = new_filename(assoc)

  with {:ok, _stored} <-
         apply(waffle_def_module, :store, [{path, %{id: obj.id, document_name: filename}}]) do
    # update the document
    obj
    |> Ecto.Changeset.change()
    |> Ecto.Changeset.put_assoc(assoc, Ecto.Changeset.change(document, %{title: filename}))
    |> Repo.update()
  end
end

def waffle_scope(_obj, nil), do: nil
def waffle_scope(obj, document), do: {"any", %{id: obj.id, document_name: document.title}}

Try it

iex -S mix phx.server
# ...
iex> alias MyApp.Repo
iex> alias MyApp.Accounts.User
iex> alias MyApp.Documents
iex> u = User |> Repo.get(1) |> Repo.preload(:avatar)   # get a user by ID, with its avatar
iex> Documents.upload_file_to_obj_by_assoc("./my_image.jpg", u, :avatar)

What's Next?

There are some upgrades, not covered here, that could certainly optimise the process:

  • Verify whether a specific file has been uploaded before, possibly under another name.

  • Take care of many-to-many associations.

  • Configure Waffle and ex_aws to be able to access the MinIO endpoint from anywhere within a local-area network.

  • Carry out all the tests to validate the document management module.