I’ve been trying to run Selenium with Python 3.12 on AWS Lambda using a Lambda layer. I already have this working with a Docker container image, but I’d like to know whether the same can be done with layers alone. I’ve spent hours trying to build a layer that bundles headless Chrome (Chromium), chromedriver, and the required shared libraries and Python packages. However, I keep running into errors and into Lambda’s layer size limits, even when I upload the layer archive through S3.
I have a Makefile I was working with:
SHELL:=/bin/bash
## ENV NAMES
LAYER_VERSION=dev
PYTHON_VERSION = 3.12
SRC_DIR := $(shell pwd)/src
TESTS_DIR := $(shell pwd)/tests
PACKAGES_DIR := $(shell pwd)/layer
# Define layer names
LAYER_NAME_PYTHON := headless_chrome_python
LAYER_NAME_BINARIES := headless_chrome_binaries
RUNTIME=python$(PYTHON_VERSION)
SELENIUM_VER=4.25.0
_VER=129.0.6668.100
DRIVER_URL=https://storage.googleapis.com/chrome-for-testing-public/$(_VER)/linux64/chromedriver-linux64.zip
CHROME_URL=https://storage.googleapis.com/chrome-for-testing-public/$(_VER)/linux64/chrome-headless-shell-linux64.zip
# Local layer directories
LOCAL_LAYER_DIR_PYTHON=$(PWD)/build/$(LAYER_NAME_PYTHON)
LOCAL_LAYER_DIR_BINARIES=$(PWD)/build/$(LAYER_NAME_BINARIES)
LOCAL_LAYER_REL_DIR_PYTHON=build/$(LAYER_NAME_PYTHON)
LOCAL_LAYER_REL_DIR_BINARIES=build/$(LAYER_NAME_BINARIES)
OUT_DIR_PYTHON=/out/$(LOCAL_LAYER_REL_DIR_PYTHON)/python/lib/$(RUNTIME)/site-packages
TEST_DOCKER_IMAGE_BASE_NAME = test-lambda
TEST_VERSION = 0.0.1
TEST_DEFAULT_FUNCTION = lambda_test.lambda_handler
define generate_runtime
	# Create the directories the layer needs
	mkdir -p $(1)/lib $(1)/lib64 $(2)
	# Download the chromedriver binary; stop the build if this fails
	curl -fSL $(DRIVER_URL) -o chromedriver.zip && \
	unzip -o chromedriver.zip -d $(2) && \
	rm chromedriver.zip || { echo "Failed to download or extract chromedriver"; exit 1; }
	# Download the headless chrome binary; stop the build if this fails
	curl -fSL $(CHROME_URL) -o headless-chromium.zip && \
	unzip -o headless-chromium.zip -d $(2) && \
	rm headless-chromium.zip || { echo "Failed to download or extract headless chrome"; exit 1; }
	# Install libraries needed by chromedriver and headless chrome into the layer.
	# tsflags is single-quoted so it does not end the double-quoted bash -c string.
	# The archives unpack into chromedriver-linux64/ and chrome-headless-shell-linux64/,
	# so upx is pointed at those paths.
	docker run --rm --platform linux/amd64 -v $(1):/lambda/opt amazonlinux:2023 \
	bash -c "\
	dnf install -y yum-utils upx --releasever=2023.4.20240416 && \
	dnf install -y --installroot=/lambda/opt --releasever=2023.4.20240416 \
	--setopt=install_weak_deps=False \
	--setopt=tsflags='nodocs nocontexts noscripts notriggers' \
	--setopt=override_install_langs=en_US.utf8 \
	atk cups-libs gtk3 libXcomposite alsa-lib \
	libXcursor libXdamage libXext libXi libXrandr libXScrnSaver \
	libXtst pango at-spi2-atk libXt xorg-x11-server-Xvfb \
	xorg-x11-xauth dbus-glib dbus-glib-devel nss mesa-libgbm jq unzip && \
	dnf clean all && \
	rm -rf /lambda/opt/var/cache/dnf && \
	rm -rf /lambda/opt/usr/share/{man,doc,info,gtk-doc,locale} && \
	rm -rf /lambda/opt/usr/lib/{pkgconfig,cmake,gio,systemd} && \
	rm -rf /lambda/opt/usr/include && \
	rm -rf /lambda/opt/usr/lib64/{pkgconfig,cmake} && \
	find /lambda/opt/ -type f -executable -exec strip --strip-all {} \; || true && \
	upx --lzma /lambda/opt/chromedriver-linux64/chromedriver && \
	upx --lzma /lambda/opt/chrome-headless-shell-linux64/chrome-headless-shell || true"
endef
define zip_layer
	# zip does not create missing parent directories, so make sure the output dir exists
	mkdir -p $(PACKAGES_DIR) && \
	pushd $(1) && zip -r $(PACKAGES_DIR)/layer-$(2)-$(LAYER_VERSION).zip . && popd
endef

define unzip_layer
	mkdir -p $(PACKAGES_DIR)/layer-$(1) && \
	pushd $(PACKAGES_DIR) && unzip -o layer-$(1)-$(LAYER_VERSION).zip -d layer-$(1) && popd
endef

define merge_layers
	mkdir -p $(PACKAGES_DIR)/merged-layer
	cp -r $(PACKAGES_DIR)/layer-$(LAYER_NAME_PYTHON)/* $(PACKAGES_DIR)/merged-layer/
	cp -r $(PACKAGES_DIR)/layer-$(LAYER_NAME_BINARIES)/* $(PACKAGES_DIR)/merged-layer/
endef
# List all targets
.PHONY: list
list:
	@$(MAKE) -pRrq -f $(lastword $(MAKEFILE_LIST)) : 2>/dev/null | awk -v RS= -F: '/^# File/,/^# Finished Make data base/ {if ($$1 !~ "^[#.]") {print $$1}}' | sort | egrep -v -e '^[^[:alnum:]]' -e '^$@$$'

## Run all pre-commit hooks
.PHONY: precommit
precommit:
	pre-commit run --all

## Lint your code using pylint
.PHONY: lint
lint:
	python -m pylint --version
	python -m pylint $(SRC_DIR) $(TESTS_DIR)

## Format your code using black
.PHONY: black
black:
	python -m black --version
	python -m black $(SRC_DIR) $(TESTS_DIR)

## Run unit tests using unittest
.PHONY: test-unit
test-unit:
	python -m unittest -v tests.selenium_lib

## Run ci part
.PHONY: ci
ci: precommit lint test-unit

## Build Python dependencies layer
.PHONY: build-python
build-python: clean-python
	# Create build environment for Python layer
	mkdir -p $(LOCAL_LAYER_REL_DIR_PYTHON)/python
	# Add the selenium library
	docker run --rm -v $(PWD):/out public.ecr.aws/sam/build-python3.12:latest \
	bash -c "pip install selenium==$(SELENIUM_VER) -t $(OUT_DIR_PYTHON)"
	$(call zip_layer,$(LOCAL_LAYER_REL_DIR_PYTHON),$(LAYER_NAME_PYTHON))

## Build binaries layer
.PHONY: build-binaries
build-binaries: clean-binaries
	# Create build environment for binaries layer
	mkdir -p $(LOCAL_LAYER_DIR_BINARIES)/lib
	mkdir -p $(LOCAL_LAYER_DIR_BINARIES)/lib64
	mkdir -p $(LOCAL_LAYER_REL_DIR_BINARIES)
	$(call generate_runtime,$(LOCAL_LAYER_DIR_BINARIES),$(LOCAL_LAYER_REL_DIR_BINARIES))
	$(call zip_layer,$(LOCAL_LAYER_REL_DIR_BINARIES),$(LAYER_NAME_BINARIES))

## Build both layers
.PHONY: build
build: build-python build-binaries

## Clean build folders
.PHONY: clean
clean: clean-python clean-binaries
	# Clean layer directory
	rm -rf layer

.PHONY: clean-python
clean-python:
	rm -rf $(LOCAL_LAYER_REL_DIR_PYTHON)

.PHONY: clean-binaries
clean-binaries:
	rm -rf $(LOCAL_LAYER_REL_DIR_BINARIES)

## Expand compressed layer files
.PHONY: .expand-layer
.expand-layer:
	$(call unzip_layer,$(LAYER_NAME_PYTHON))
	$(call unzip_layer,$(LAYER_NAME_BINARIES))
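
## Run test integration suite
.PHONY: test-integration
test-integration: .expand-layer
	$(call merge_layers)
	# Run the container as an ordinary recipe line. Make expands shell and eval
	# functions before any recipe line executes, so the earlier eval form started
	# the container before the merge step had run.
	# The container exit status now decides whether this target passes.
	# Note: lambci/lambda is archived and may not publish a python3.12 tag.
	docker run --rm -v $(TESTS_DIR):/var/task -v $(PACKAGES_DIR)/merged-layer:/opt lambci/lambda:$(RUNTIME) $(TEST_DEFAULT_FUNCTION)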

## Create and test the new layer version
.PHONY: all
all: precommit lint build test-integration
	# Deploy the release version
	echo "PUBLISHED!"
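
The Makefile refers to a `lambda_test.lambda_handler` test function that is not shown here. A minimal sketch of what such a handler might look like follows. Everything in it is an assumption for illustration: the helper names `chrome_arguments` and `build_driver` are hypothetical, the `/opt/...` paths assume the layer keeps the directory names that the Chrome for Testing archives unpack into, and the `url` event key is made up. The flags listed are ones commonly used to run Chrome in a sandboxed environment where only `/tmp` is writable; they may need tuning.

```python
import os

# Assumed locations: Lambda extracts layer contents under /opt, and the
# Chrome for Testing archives unpack into these sub-directories.
CHROME_BINARY = "/opt/chrome-headless-shell-linux64/chrome-headless-shell"
CHROMEDRIVER = "/opt/chromedriver-linux64/chromedriver"


def chrome_arguments(tmp_dir="/tmp"):
    """Return Chrome flags for a sandboxed host where only tmp_dir is writable."""
    return [
        "--no-sandbox",             # Chrome's own sandbox usually cannot start here
        "--disable-gpu",            # no GPU is available
        "--disable-dev-shm-usage",  # /dev/shm may be small or missing
        "--user-data-dir=" + os.path.join(tmp_dir, "user-data"),
        "--disk-cache-dir=" + os.path.join(tmp_dir, "cache"),
    ]


def build_driver():
    """Create a Chrome WebDriver that uses the binaries shipped in the layer."""
    # Imported lazily so chrome_arguments() works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.binary_location = CHROME_BINARY
    for argument in chrome_arguments():
        options.add_argument(argument)
    service = Service(executable_path=CHROMEDRIVER)
    return webdriver.Chrome(service=service, options=options)


def lambda_handler(event, context):
    """Open a page and return its title; raises if Chrome cannot start."""
    driver = build_driver()
    try:
        driver.get(event.get("url", "https://example.com"))
        return {"title": driver.title}
    finally:
        driver.quit()
```

The `chrome-headless-shell` binary is expected to run headless on its own; if a full Chrome build were used instead, a `--headless` flag would probably need to be added to the list.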
I’d really appreciate any input or guidance if you’re familiar with this area or can point me to useful resources. I’m relatively new to software engineering, so feel free to let me know if I’m overlooking something obvious. While I managed to get this working with Docker, I prefer using Lambda layers as it simplifies and speeds up the process of modifying my functions.
I tried building the layer contents inside a Docker container and zipping the required library files and Python packages for Lambda deployment, but the result ran into Lambda’s size constraints. I suspect the image or the resulting package is bloated with unnecessary components. Any advice on slimming it down, or on alternative approaches, would be appreciated.
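
To see where the size goes before uploading, the unpacked layer directories can be measured locally. The sketch below is only an illustration: the function names are hypothetical, and the 250 MB figure is Lambda's documented quota for the unzipped deployment package (function code plus all layers), which still applies when the archive is uploaded through S3.

```python
import os

# Lambda quota: function code plus all layers must fit in 250 MB once unzipped.
UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def file_sizes(root):
    """Return (size_in_bytes, path) for every regular file under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                sizes.append((os.path.getsize(path), path))
    return sizes


def directory_size(root):
    """Total size in bytes of all regular files under root."""
    return sum(size for size, _path in file_sizes(root))


def largest_files(root, count=10):
    """The `count` biggest files under root, largest first."""
    return sorted(file_sizes(root), reverse=True)[:count]


def fits_in_lambda(*roots, limit=UNZIPPED_LIMIT_BYTES):
    """True if the combined unzipped size of all given directories is within limit."""
    return sum(directory_size(root) for root in roots) <= limit
```

For the Makefile above, this could be called as `fits_in_lambda("build/headless_chrome_python", "build/headless_chrome_binaries")`, with `largest_files("build/headless_chrome_binaries")` pointing at the libraries that are worth trimming first.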