Thursday, 17 October 2024

How to Install Apache Airflow on Windows Without Docker

Installing Apache Airflow on Windows without Docker can be more challenging than on Linux-based systems because several of Airflow's dependencies are harder to build and manage on Windows. However, with the right tools and steps, you can set up Apache Airflow directly on your Windows system. Here's a step-by-step guide:

Prerequisites

  1. Python: Airflow 2.7 (used in this guide) supports Python 3.8, 3.9, 3.10, and 3.11. Ensure you have a compatible version installed.
  2. Pip: Make sure you have pip installed to manage Python packages.
  3. Virtual Environment: It's recommended to use a virtual environment to avoid conflicts with system-wide packages.
  4. Microsoft Visual C++ Build Tools: some of Airflow's dependencies need these to compile native extensions. You can download the "Build Tools for Visual Studio" installer from Microsoft's Visual Studio downloads page.

Step-by-Step Guide to Installing Apache Airflow

Step 1: Install Python and Set Up a Virtual Environment

  1. Download and install Python from the official Python website.
  2. Open the Command Prompt and verify the installation:
    sh
    python --version
    pip --version
  3. Create a virtual environment for Airflow:
    sh
    python -m venv airflow_venv
  4. Activate the virtual environment:
    sh
    airflow_venv\Scripts\activate

Step 2: Install Apache Airflow

  1. Before installing Apache Airflow, set the AIRFLOW_HOME environment variable. This directory will be used for Airflow's configuration, logs, and metadata database:
    sh
    setx AIRFLOW_HOME %USERPROFILE%\airflow
    set AIRFLOW_HOME=%USERPROFILE%\airflow
    • setx only takes effect in new Command Prompt sessions, so the extra set command applies the variable to the current session as well.
  2. Install Apache Airflow with the required dependencies. Make sure to pin the Airflow version to avoid compatibility issues:
    sh
    pip install apache-airflow==2.7.0 --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.0/constraints-3.8.txt"
    • Replace 2.7.0 and 3.8 with your specific Airflow and Python versions if needed; the Python version in the constraints URL must match the Python running in your virtual environment.
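
If you already know you will need a particular database backend or provider, you can install it as an Airflow "extra" in the same constrained install. This is an optional sketch; the [postgres] extra is just one example and is only needed if you plan to use PostgreSQL later:
    sh
    pip install "apache-airflow[postgres]==2.7.0" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.0/constraints-3.8.txt"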

Step 3: Initialize the Airflow Database

  1. Initialize the Airflow database using the following command:
    sh
    airflow db init
    • This will create an SQLite database by default in the AIRFLOW_HOME directory, but you can configure Airflow to use other databases by updating the airflow.cfg file.
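
If you later switch to a different metadata database, one option is to override the connection string with an environment variable instead of editing airflow.cfg. The snippet below is only a sketch: the host, database name, and credentials are placeholders, and it assumes a PostgreSQL driver is installed (for example via the [postgres] extra from Step 2):
    sh
    setx AIRFLOW__DATABASE__SQL_ALCHEMY_CONN "postgresql+psycopg2://airflow_user:airflow_pass@localhost:5432/airflow_db"
    • Because setx only affects new sessions, open a new Command Prompt and run airflow db init again so the schema is created in the new database.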

Step 4: Create an Admin User

  1. Create an admin user to access the Airflow web UI:
    sh
    airflow users create ^
        --username admin ^
        --firstname FirstName ^
        --lastname LastName ^
        --role Admin ^
        --email admin@example.com
    • In Command Prompt, use ^ (not the Linux-style \) to continue a command across lines. You will be prompted to set a password for the user unless you pass --password explicitly.

Step 5: Start the Airflow Web Server and Scheduler

  1. Start the Airflow web server, which provides the web-based UI for Airflow:

    sh
    airflow webserver --port 8080
    • You can now access the Airflow UI at http://localhost:8080.
  2. Open a new Command Prompt window, activate the virtual environment again, and start the Airflow scheduler:

    sh
    airflow_venv\Scripts\activate
    airflow scheduler
    • The scheduler will periodically check for new tasks and run them.
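
For quick local testing, recent Airflow releases also provide an airflow standalone command that initializes the database, creates a user, and runs the web server and scheduler together in one process. Treat this as a convenience alternative to the two separate windows above, not as a production setup:
    sh
    airflow standalone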

Step 6: Verify the Installation

  1. To verify the installation, open a web browser and navigate to http://localhost:8080.
  2. Log in with the admin credentials you created and check that the web interface is functioning correctly.
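
You can also verify the installation from the command line in the activated virtual environment, using standard Airflow CLI commands:
    sh
    airflow db check
    airflow dags list
    • airflow db check confirms that Airflow can reach its metadata database, and airflow dags list shows the example DAGs that ship with Airflow (assuming example loading has not been disabled in airflow.cfg).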

Additional Configuration (Optional)

  • Using a different database: Airflow uses SQLite by default, which is fine for trying things out but not recommended for production; for production you might want a more robust database such as PostgreSQL. You can configure this in the airflow.cfg file (see the snippet after this list).
  • Setting up environment variables: You can set environment variables such as AIRFLOW_HOME permanently by going to System Properties > Environment Variables in Windows.
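
The relevant setting is sql_alchemy_conn in the [database] section of airflow.cfg. Assuming AIRFLOW_HOME is set as in Step 2, a quick way to locate it on Windows is:
    sh
    findstr /n "sql_alchemy_conn" "%AIRFLOW_HOME%\airflow.cfg"
    • Edit that line to point at your database, then re-run airflow db init so the schema is created there.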

Troubleshooting Common Issues

  1. Compatibility Issues: Make sure your Python version is supported by the Airflow release you are installing (Airflow 2.7 requires Python 3.8–3.11); a quick version check is shown below.
  2. Permissions Errors: If you encounter permission errors, try running Command Prompt as an administrator.
  3. Installation Failures: Ensure that Microsoft Visual C++ Build Tools are installed if the installation fails due to missing C++ dependencies.
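
To confirm which Python and Airflow versions are actually in use inside the virtual environment:
    sh
    python --version
    pip show apache-airflow
    • pip show apache-airflow prints the installed Airflow version and its install location, which helps confirm the package went into the virtual environment rather than the system-wide Python.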

By following these steps, you should have a working installation of Apache Airflow on your Windows machine, ready for orchestrating workflows without the need for Docker.
