Selenium with Python 3.12 on AWS Lambda: How to Package It in a Lambda Layer

I’ve been trying to run Selenium with Python 3.12 on AWS Lambda. I already have it working with a Docker container image, but I’d like to know whether the same can be done with Lambda layers. I’ve spent hours trying to build a layer that includes headless Chrome (Chromium), chromedriver, and the shared libraries they need. However, I keep running into errors and into Lambda’s size limits, even when I upload the layer through S3.
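
For context, the Lambda quotas documented at the time of writing are 50 MB for a zipped package uploaded directly and 250 MB unzipped for the function code plus all attached layers; going through S3 only lifts the first of these. A minimal sketch for checking built layer zips against the unzipped figure follows. The function names are hypothetical, and the check ignores the size of the function code itself.

```python
import zipfile

# Assumed quota: 250 MB unzipped for the function code plus all attached layers.
UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def unzipped_size(zip_path):
    """Return the total uncompressed size, in bytes, of all entries in a zip."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_lambda(zip_paths, limit=UNZIPPED_LIMIT_BYTES):
    """Return True if the given layer zips together stay within the limit."""
    return sum(unzipped_size(path) for path in zip_paths) <= limit
```

Calling `fits_in_lambda` with the paths of both layer zips gives a quick yes/no before attempting an upload.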

I have a Makefile I was working with:

SHELL:=/bin/bash

## ENV NAMES
LAYER_VERSION=dev
PYTHON_VERSION = 3.12

SRC_DIR := $(shell pwd)/src
TESTS_DIR := $(shell pwd)/tests
PACKAGES_DIR := $(shell pwd)/layer

# Define layer names
LAYER_NAME_PYTHON := headless_chrome_python
LAYER_NAME_BINARIES := headless_chrome_binaries

RUNTIME=python$(PYTHON_VERSION)
SELENIUM_VER=4.25.0
CHROME_VER=129.0.6668.100
DRIVER_URL=https://storage.googleapis.com/chrome-for-testing-public/$(CHROME_VER)/linux64/chromedriver-linux64.zip
CHROME_URL=https://storage.googleapis.com/chrome-for-testing-public/$(CHROME_VER)/linux64/chrome-headless-shell-linux64.zip

# Local layer directories
LOCAL_LAYER_DIR_PYTHON=$(PWD)/build/$(LAYER_NAME_PYTHON)
LOCAL_LAYER_DIR_BINARIES=$(PWD)/build/$(LAYER_NAME_BINARIES)

LOCAL_LAYER_REL_DIR_PYTHON=build/$(LAYER_NAME_PYTHON)
LOCAL_LAYER_REL_DIR_BINARIES=build/$(LAYER_NAME_BINARIES)

OUT_DIR_PYTHON=/out/$(LOCAL_LAYER_REL_DIR_PYTHON)/python/lib/$(RUNTIME)/site-packages

TEST_DOCKER_IMAGE_BASE_NAME = test-lambda
TEST_VERSION = 0.0.1
TEST_DEFAULT_FUNCTION = lambda_test.lambda_handler

define generate_runtime
    # Create necessary directories
    mkdir -p $(1)/lib
    mkdir -p $(1)/lib64
    mkdir -p $(2)

    # Download chrome driver binary
    curl -SL $(DRIVER_URL) -o chromedriver.zip && \
        unzip chromedriver.zip -d $(2) && \
        rm chromedriver.zip || echo "Failed to download or extract chromedriver"

    # Download headless chrome binary
    curl -SL $(CHROME_URL) -o headless-chromium.zip && \
        unzip headless-chromium.zip -d $(2) && \
        rm headless-chromium.zip || echo "Failed to download or extract headless chrome"

    # Install libraries needed by chromedriver and headless chrome into the layer
    docker run --rm --platform linux/amd64 -v $(1):/lambda/opt amazonlinux:2023 \
    bash -c "\
        dnf install -y yum-utils upx --releasever=2023.4.20240416 && \
        dnf install -y --installroot=/lambda/opt --releasever=2023.4.20240416 \
            --setopt=install_weak_deps=False \
            --setopt=tsflags='nodocs nocontexts noscripts notriggers' \
            --setopt=override_install_langs=en_US.utf8 \
            atk cups-libs gtk3 libXcomposite alsa-lib \
            libXcursor libXdamage libXext libXi libXrandr libXScrnSaver \
            libXtst pango at-spi2-atk libXt xorg-x11-server-Xvfb \
            xorg-x11-xauth dbus-glib dbus-glib-devel nss mesa-libgbm jq unzip && \
        dnf clean all && \
        rm -rf /lambda/opt/var/cache/dnf && \
        rm -rf /lambda/opt/usr/share/{man,doc,info,gtk-doc,locale} && \
        rm -rf /lambda/opt/usr/lib/{pkgconfig,cmake,gio,systemd} && \
        rm -rf /lambda/opt/usr/include && \
        rm -rf /lambda/opt/usr/lib64/{pkgconfig,cmake} && \
        find /lambda/opt/ -type f -executable -exec strip --strip-all {} \; || true && \
        upx --lzma /lambda/opt/chromedriver-linux64/chromedriver && \
        upx --lzma /lambda/opt/chrome-headless-shell-linux64/chrome-headless-shell || true"
endef

define zip_layer
    mkdir -p $(PACKAGES_DIR) && pushd $(1) && zip -r $(PACKAGES_DIR)/layer-$(2)-$(LAYER_VERSION).zip . && popd
endef

define unzip_layer
    mkdir -p $(PACKAGES_DIR)/layer-$(1) && \
        pushd $(PACKAGES_DIR) && unzip layer-$(1)-$(LAYER_VERSION).zip -d layer-$(1) && popd
endef

define merge_layers
    mkdir -p $(PACKAGES_DIR)/merged-layer
    cp -r $(PACKAGES_DIR)/layer-$(LAYER_NAME_PYTHON)/* $(PACKAGES_DIR)/merged-layer/
    cp -r $(PACKAGES_DIR)/layer-$(LAYER_NAME_BINARIES)/* $(PACKAGES_DIR)/merged-layer/
endef

# List all targets
.PHONY: list
list:
    @$(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' | sort | egrep -v -e '^[^[:alnum:]]' -e '^$@$$'

## Run all pre-commit hooks
.PHONY: precommit
precommit:
    pre-commit run --all

## Lint your code using pylint
.PHONY: lint
lint:
    python -m pylint --version
    python -m pylint $(SRC_DIR) $(TESTS_DIR)

## Format your code using black
.PHONY: black
black:
    python -m black --version
    python -m black $(SRC_DIR) $(TESTS_DIR)

## Run unit tests using unittest
.PHONY: test-unit
test-unit:
    python -m unittest -v tests.selenium_lib

## Run ci part
.PHONY: ci
ci: precommit lint test-unit

## Build Python dependencies layer
.PHONY: build-python
build-python: clean-python
    # Create build environment for Python layer
    mkdir -p $(LOCAL_LAYER_REL_DIR_PYTHON)/python
    # Add the selenium library
    docker run --rm -v $(PWD):/out public.ecr.aws/sam/build-python3.12:latest \
    bash -c "pip install selenium==$(SELENIUM_VER) -t $(OUT_DIR_PYTHON)"
    $(call zip_layer,$(LOCAL_LAYER_REL_DIR_PYTHON),$(LAYER_NAME_PYTHON))

## Build binaries layer
.PHONY: build-binaries
build-binaries: clean-binaries
    # Create build environment for binaries layer
    mkdir -p $(LOCAL_LAYER_DIR_BINARIES)/lib
    mkdir -p $(LOCAL_LAYER_DIR_BINARIES)/lib64
    mkdir -p $(LOCAL_LAYER_REL_DIR_BINARIES)
    $(call generate_runtime,$(LOCAL_LAYER_DIR_BINARIES),$(LOCAL_LAYER_REL_DIR_BINARIES))
    $(call zip_layer,$(LOCAL_LAYER_REL_DIR_BINARIES),$(LAYER_NAME_BINARIES))

## Build both layers
.PHONY: build
build: build-python build-binaries

## Clean build folders
.PHONY: clean
clean: clean-python clean-binaries
    # Clean layer directory
    rm -rf layer

.PHONY: clean-python
clean-python:
    rm -rf $(LOCAL_LAYER_REL_DIR_PYTHON)

.PHONY: clean-binaries
clean-binaries:
    rm -rf $(LOCAL_LAYER_REL_DIR_BINARIES)

## Expand compressed layer files
.PHONY: .expand-layer
.expand-layer:
    $(call unzip_layer,$(LAYER_NAME_PYTHON))
    $(call unzip_layer,$(LAYER_NAME_BINARIES))

## Run test integration suite
.PHONY: test-integration
test-integration: .expand-layer
    $(call merge_layers)
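    # Run docker at recipe time (not at expansion time) so the merged layer exists first;
    # the handler's printed result is then used as the exit status.
    # Note: lambci/lambda may not publish a python3.12 tag; swap in an image that does if the pull fails.
    res=$$(docker run --rm -v $(TESTS_DIR):/var/task -v $(PACKAGES_DIR)/merged-layer:/opt lambci/lambda:$(RUNTIME) $(TEST_DEFAULT_FUNCTION)) && exit $$res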

## Create and test the new layer version
.PHONY: all
all: precommit lint build test-integration
    # Deploy the release version
    echo "PUBLISHED!"
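
Once the two zips built above are attached as layers, a function would consume them roughly as sketched below. This is a sketch under assumptions, not a tested setup: it assumes Lambda extracts the layers under /opt and that the two Chrome for Testing zips unpack into `chromedriver-linux64/` and `chrome-headless-shell-linux64/`. The helper names and the Chrome flags chosen are illustrative.

```python
import os

# Assumed layout: layers are extracted under /opt, and the two downloaded zips
# unpack into the sub-directories used below (verify against the built layer).
OPT_ROOT = "/opt"


def chrome_paths(opt_root=OPT_ROOT):
    """Return (chrome_binary, chromedriver_binary) paths inside the layer."""
    chrome = os.path.join(opt_root, "chrome-headless-shell-linux64", "chrome-headless-shell")
    driver = os.path.join(opt_root, "chromedriver-linux64", "chromedriver")
    return chrome, driver


def chrome_arguments(tmp_dir="/tmp"):
    """Flags often used for Chrome in Lambda, where only /tmp is writable."""
    return [
        "--no-sandbox",
        "--disable-gpu",
        "--disable-dev-shm-usage",
        f"--user-data-dir={tmp_dir}/chrome-user-data",
    ]


def build_driver():
    """Create a Selenium Chrome driver that uses the binaries from the layer."""
    # Imported lazily so the helpers above can be exercised without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    chrome, driver = chrome_paths()
    options = Options()
    options.binary_location = chrome
    for argument in chrome_arguments():
        options.add_argument(argument)
    return webdriver.Chrome(service=Service(executable_path=driver), options=options)


def lambda_handler(event, context):
    """Open the URL given in the event and return the page title."""
    driver = build_driver()
    try:
        driver.get(event.get("url", "https://example.com"))
        return {"title": driver.title}
    finally:
        driver.quit()
```

One thing to check if Chrome fails to start with missing shared-library errors: dnf's `--installroot` places libraries under `usr/lib64` inside the layer, while Lambda's default `LD_LIBRARY_PATH`, as far as documented, covers `/opt/lib` but not `/opt/usr/lib64`, so the function's environment may need to add that directory.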

I’d really appreciate any input or pointers to useful resources if you’re familiar with this area. I’m relatively new to software engineering, so feel free to tell me if I’m overlooking something obvious. Although I got this working with Docker, I’d prefer Lambda layers because they make modifying my functions simpler and faster.

I also tried building everything inside a Docker container and zipping the resulting library files and Python packages for Lambda deployment, but I hit the size constraints with that approach too. I suspect the image is too large or bloated with components Chrome doesn’t actually need. Any advice on trimming it down, or on alternative approaches, would be appreciated.
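
To see what is actually taking up the space, a small helper that lists the largest files in a layer directory can be useful. The sketch below is illustrative only; `largest_files` is a hypothetical name, and the directory in the usage block is the binaries build directory from the Makefile.

```python
import os


def largest_files(root, top=10):
    """Return the `top` largest regular files under `root` as (size, path) pairs."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so library aliases are not counted twice
            sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    # Directory name taken from the Makefile's binaries layer build path.
    for size, path in largest_files("build/headless_chrome_binaries"):
        print(f"{size / 1024 / 1024:8.1f} MB  {path}")
```

Running it against the unpacked layer shows which libraries or binaries are worth removing or compressing first.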
