"pre-import" python dependencies in docker image

  • Thread starter: AJP (Guest)

I am building a Python 3.10 Docker image in a GitHub Actions job (ubuntu-latest runner), which is uploaded by the serverless.com CLI to form an AWS Lambda function. It has many Python dependencies. With a clean install of the dependencies on my local Mac, importing the main source file (calc.py) can take 20 seconds. Similarly, when the Lambda function first starts, importing calc.py can take 20 seconds. If I comment out all of the dependencies, or leave them in but make a subsequent call to the Lambda function, calc.py is imported very quickly (< 1 second).
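
To put numbers on the import cost described above, a small timing helper along these lines may help. It is only a sketch: `time_import` is a hypothetical helper name, and `colorsys` is a stand-in for `src.calc`. Running `python -X importtime -c "import src.calc"` is another way to see which dependency dominates the total.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to (re)import module_name in this process."""
    # Drop the cached module object so its top-level code runs again.
    # Submodules that are already loaded stay cached, so use a fresh
    # interpreter when a true cold-start figure is needed.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "colorsys" is a placeholder; in the setup above it would be "src.calc".
    print(f"colorsys: {time_import('colorsys'):.4f}s")
```

Comparing the figure from a local container run with the one logged on the first Lambda invocation should show whether the 20 seconds is spent in the import itself.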

I want to run an initial import in the Docker container to get the Python dependencies to "pre-compile", so that when AWS Lambda starts the image it is "warmed up" and ready to go instead of taking 20+ seconds to serve its first request.
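
In Python terms, "pre-compiling" usually means writing `.pyc` bytecode files ahead of time, which the standard-library `compileall` module can do. The sketch below shows that step on a throwaway directory; `precompile`, `has_bytecode`, and `calc_demo.py` are hypothetical names used for illustration. Note that pip normally byte-compiles installed packages already, and that bytecode only skips the source-to-bytecode step: module-level code still runs on every cold import, so this may not account for the whole delay.

```python
import compileall
import importlib.util
import pathlib
import tempfile


def precompile(directory: str) -> bool:
    """Byte-compile every .py file under directory; True if all succeeded."""
    # force=True recompiles even when an up-to-date .pyc already exists;
    # quiet=1 prints errors only.
    return bool(compileall.compile_dir(directory, force=True, quiet=1))


def has_bytecode(source_file: str) -> bool:
    """Report whether a cached .pyc exists for source_file."""
    pyc_path = importlib.util.cache_from_source(source_file)
    return pathlib.Path(pyc_path).exists()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for src/calc.py.
        module = pathlib.Path(tmp) / "calc_demo.py"
        module.write_text("def process_payload(payload):\n    return payload\n")
        print("before:", has_bytecode(str(module)))
        print("compiled:", precompile(tmp))
        print("after:", has_bytecode(str(module)))
```

In a Dockerfile the equivalent would be a `RUN python -m compileall src` step after the `COPY`, assuming the source directory is named `src` as above.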

I have the following Dockerfile, but the line RUN python -c "from src.calc import process_payload" does not seem to have any effect on the AWS Lambda function, despite working in my local environment.

Code:
# Using the SHA hash as suggested by: https://snyk.io/blog/best-practices-containerizing-python-docker/ (section "1. Use explicit and deterministic Docker base image tags for containerized Python applications")
# This is the SHA hash for the `public.ecr.aws/lambda/python:3.10` image
# You can get it by running `docker pull public.ecr.aws/lambda/python:3.10` and then `docker images --digests | grep "public.ecr.aws/lambda/python.*3.10"`
FROM public.ecr.aws/lambda/python:3.10@sha256:7688a9c4c1a27ea3bfcf12df55cc958e188bb15eea81b463750fe42722e0de87

RUN mkdir src

# Seeing the following warning:
#       Matplotlib created a temporary cache directory at /tmp/matplotlib-4sw1aw8x because the default path (/home/sbx_user1051/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
# https://stackoverflow.com/questions/9827377/setting-matplotlib-mplconfigdir-consider-setting-mplconfigdir-to-a-writable-dir#comment120692932_9947889
# Make a non-temporary directory
RUN mkdir /matplotlib_cache
ENV MPLCONFIGDIR="/matplotlib_cache"

COPY ./requirements.txt ./

# Install system dependencies using yum
RUN yum -y update && \
    yum -y install ffmpeg libSM libXext mesa-libGL && \
    yum clean all && \
    rm -rf /var/cache/yum

RUN pip install -r requirements.txt

COPY ./src/ ./src/

# https://github.com/matplotlib/matplotlib/pull/16374#issuecomment-580549298
RUN python -c "import matplotlib"
# Still trying to get code to precompile
RUN python -c "from src.calc import process_payload"

CMD ["src/main.lambda_handler"]

Any advice on how to "pre-import" the code and dependencies in the Docker image so that the AWS Lambda function starts quickly? Please let me know if you need more info.
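
One assumption that may be worth testing: on AWS Lambda the container image filesystem is read-only apart from /tmp, so a cache directory created at build time (such as /matplotlib_cache in the Dockerfile above) may not be writable at run time, in which case Matplotlib falls back to a temporary directory as in the quoted warning. Whether this explains the slow first request is not established here. The sketch below, with the hypothetical helper `writable_report`, could be called from the handler to log which directories the running function can write to.

```python
import os
import tempfile


def writable_report(paths):
    """Map each path to True if it is a directory the current user may write to."""
    return {p: os.path.isdir(p) and os.access(p, os.W_OK) for p in paths}


if __name__ == "__main__":
    candidates = [
        # Falls back to the path used in the Dockerfile above if unset.
        os.environ.get("MPLCONFIGDIR", "/matplotlib_cache"),
        tempfile.gettempdir(),
    ]
    for path, ok in writable_report(candidates).items():
        print(f"{path}: {'writable' if ok else 'not writable'}")
```

If the cache directory shows as not writable inside Lambda but writable in a local `docker run`, that difference would be a reasonable place to start looking.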
 
