Thursday 17 October 2024

How to Install Apache Airflow on Windows

Installing Apache Airflow on Windows takes a few extra steps, because Airflow targets POSIX systems such as Linux and macOS and does not run natively on Windows. With the help of Windows Subsystem for Linux (WSL) or Docker, however, you can still get Airflow running on your Windows machine. This article walks through both methods.

Method 1: Install Apache Airflow Using Windows Subsystem for Linux (WSL)

Step 1: Install WSL

  1. Open PowerShell as an administrator and run the following command to enable WSL:

    powershell
    wsl --install

    This command installs WSL and sets up a default Linux distribution (usually Ubuntu).

  2. Restart your machine when prompted, then open Ubuntu from the Start menu to complete the setup.
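
To confirm that the Ubuntu terminal you opened is really running under WSL, a small check like the following can help. It is only a sketch: `is_wsl` is a hypothetical helper name, and it relies on the assumption that WSL kernels report "microsoft" in /proc/version. The optional file argument exists only so the check can be tried against a sample file.

```shell
# Sketch: detect whether the current shell is running inside WSL.
# Assumption: WSL kernels include "microsoft" in their version string.
# The optional argument is a file to inspect instead of /proc/version.
is_wsl() {
    grep -qi "microsoft" "${1:-/proc/version}" 2>/dev/null
}

if is_wsl; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL"
fi
```

Run it inside the Ubuntu terminal, not in PowerShell.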

Step 2: Update and Install Prerequisites in WSL

  1. Update your package lists:

    bash
    sudo apt update
  2. Install Python and pip:

    bash
    sudo apt install python3 python3-pip -y
  3. Install other necessary dependencies:

    bash
    sudo apt install libmysqlclient-dev libssl-dev libffi-dev -y
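
Before moving on, it can be worth checking that the tools installed above are actually on your PATH. The snippet below is a minimal sketch; `missing_commands` is a hypothetical helper name, not part of Airflow or Ubuntu.

```shell
# Sketch: print every command name from the arguments that is NOT on PATH.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# Check the two tools that Step 3 depends on.
missing="$(missing_commands python3 pip3)"
if [ -n "$missing" ]; then
    echo "Still missing: $missing"
else
    echo "python3 and pip3 are available"
fi
```

If anything is reported as missing, rerun the apt commands from this step before installing Airflow.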

Step 3: Install Apache Airflow

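  1. Set shell variables for the Airflow version, your Python version, and the matching constraints file. Run each export on its own line, because the constraint URL is built from the first two variables. Airflow 2.5.0 publishes constraint files for Python 3.7 to 3.10, so the URL only resolves if python3 is one of those versions:

    bash
    export AIRFLOW_VERSION=2.5.0
    export PYTHON_VERSION="$(python3 --version | cut -d ' ' -f 2 | cut -d '.' -f 1-2)"
    export CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"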
  2. Install Airflow using pip:

    bash
    pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
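
To see how the constraint URL is assembled, the same logic can be written as two small functions and run with fixed inputs. This is an illustrative sketch: `python_minor_version` and `constraint_url` are hypothetical helper names, and the sample string "Python 3.10.12" stands in for the output of `python3 --version`.

```shell
# Sketch: reduce a string like "Python 3.10.12" to "3.10",
# using the same cut pipeline as the export command above.
python_minor_version() {
    echo "$1" | cut -d ' ' -f 2 | cut -d '.' -f 1-2
}

# Sketch: build the constraints URL from an Airflow version ($1)
# and a Python major.minor version ($2).
constraint_url() {
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

# Fixed inputs so the result is predictable.
constraint_url "2.5.0" "$(python_minor_version "Python 3.10.12")"
# prints https://raw.githubusercontent.com/apache/airflow/constraints-2.5.0/constraints-3.10.txt
```

Printing the URL before running pip makes it easy to spot an empty or unexpected version in either position.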

Step 4: Initialize and Start Apache Airflow

  1. Initialize the Airflow database:

    bash
    airflow db init
  2. Create an admin user. Because no --password flag is given, the command prompts you to choose a password:

    bash
    airflow users create --username admin --firstname Admin --lastname User --role Admin --email admin@example.com
  3. Start the web server:

    bash
    airflow webserver --port 8080
  4. In a new terminal, start the scheduler:

    bash
    airflow scheduler
  5. You can now access Airflow at http://localhost:8080.
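
The web server can take a little while to start answering requests. One way to wait for it from a script is a small retry loop, sketched below. `retry` is a hypothetical helper, and the usage line assumes that curl is installed and that the web server was started on port 8080 as above; Airflow 2.x serves a status page at /health.

```shell
# Sketch: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage (assumes curl is installed and Airflow listens on port 8080):
#   retry 30 2 curl -fsS http://localhost:8080/health
```

With those example values, the loop gives the server 30 attempts spaced 2 seconds apart before giving up.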

Method 2: Install Apache Airflow Using Docker

If you prefer to use Docker, Airflow provides an official Docker image that you can easily run on Windows.

Step 1: Install Docker Desktop

Download and install Docker Desktop for Windows from the official website and follow the installation instructions. Ensure Docker is running before proceeding.
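
One way to confirm that Docker is running is to ask the daemon directly. The sketch below is a bash function, so it is meant for a bash terminal such as the WSL one from Method 1, assuming Docker Desktop's WSL integration is enabled. `docker_ready` is a hypothetical helper name; `docker info` is the standard CLI command it relies on.

```shell
# Sketch: report whether Docker can be used from this shell.
# Returns 0 if the CLI exists and the daemon answers,
#         1 if the docker CLI is not on PATH,
#         2 if the CLI exists but the daemon does not respond.
docker_ready() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found on PATH" >&2
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        echo "docker CLI found, but the daemon is not responding" >&2
        return 2
    fi
    echo "Docker is running"
}

# Usage:
#   docker_ready && echo "OK to continue"
```

In PowerShell, running `docker info` by hand gives the same information.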

Step 2: Get the Docker Compose File

  1. Open PowerShell or Command Prompt and create a directory for your Airflow installation:

    powershell
    mkdir airflow-docker
    cd airflow-docker
  2. Download the docker-compose.yaml template provided by Airflow. In Windows PowerShell, curl is an alias for Invoke-WebRequest, so call curl.exe explicitly:

    powershell
    curl.exe -LfO "https://airflow.apache.org/docs/apache-airflow/2.5.0/docker-compose.yaml"

Step 3: Initialize and Start Apache Airflow

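  1. Create the .env file that Docker Compose reads. The Linux command id -u is not available in PowerShell, so on Windows use a fixed value instead; 50000 is the default user ID that the Compose file falls back to:

    powershell
    Set-Content -Path .env -Value "AIRFLOW_UID=50000" -Encoding ascii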
  2. Initialize the Airflow database:

    powershell
    docker-compose up airflow-init
  3. Start the Airflow services:

    powershell
    docker-compose up
  4. Access Airflow at http://localhost:8080. The default credentials are:

    • Username: airflow
    • Password: airflow
  5. To stop Airflow, press Ctrl + C in the terminal, then remove the containers:

    powershell
    docker-compose down
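
If you run the Docker steps from a Linux shell such as WSL instead of PowerShell, the preparation before `docker-compose up airflow-init` can be sketched as a single function. This is an assumption-laden sketch, not the official script: `prepare_airflow_dir` is a hypothetical helper, and the dags, logs, and plugins folder names are the ones the 2.5.0 Compose file mounts from the project directory.

```shell
# Sketch: prepare a directory for the Airflow Compose file (Linux/WSL shells only).
# Creates the folders the Compose file mounts and writes AIRFLOW_UID to .env,
# so files created by the containers belong to the current user.
prepare_airflow_dir() {
    target="${1:-.}"
    mkdir -p "$target/dags" "$target/logs" "$target/plugins"
    echo "AIRFLOW_UID=$(id -u)" > "$target/.env"
}

# Usage, from the directory that contains docker-compose.yaml:
#   prepare_airflow_dir .
```

Setting AIRFLOW_UID to your own user ID matters mainly on Linux file systems, which is why a fixed value is enough when working from PowerShell.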

Conclusion

Both methods let you run Apache Airflow on a Windows machine. WSL gives you a more native Linux environment, while Docker needs less manual setup because the Compose file starts every service for you. Choose the method that best fits your workflow.
