Installation

freshdata requires Python ≥ 3.9 and pandas ≥ 1.5.
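
If you want to confirm these minimums before installing, a small standard-library check is one option. The sketch below is not part of freshdata. The helper names `version_tuple`, `meets_minimum`, and `check_requirements` are made up for illustration, and the version parsing is deliberately simple (it reads only the leading numeric components).

```python
import re
import sys
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.5.3' -> (1, 5, 3)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings by their numeric components only."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements() -> dict:
    """Report whether Python >= 3.9 and pandas >= 1.5 are present."""
    report = {"python": sys.version_info >= (3, 9)}
    try:
        report["pandas"] = meets_minimum(metadata.version("pandas"), "1.5")
    except metadata.PackageNotFoundError:
        # pandas is not installed yet; pip pulls it in with the basic install
        report["pandas"] = False
    return report


if __name__ == "__main__":
    print(check_requirements())
```

A `False` for pandas is not necessarily a problem, since the basic install brings in the pandas + NumPy core.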

Basic install

pip install freshdata-cleaner

This installs the pandas + NumPy core — everything you need for fd.clean, fd.profile, and the decision engine.

Optional extras

Install only what you need:

pip install "freshdata-cleaner[ml]"

Adds scikit-learn for KNN imputation and IsolationForest outlier detection (used in strategy="aggressive").

pip install "freshdata-cleaner[enterprise]"

Adds polars, pyarrow, requests, and pyyaml for the enterprise layer: fuzzy clustering, PII masking, semantic validation, trust scoring, OpenLineage metadata, and the batch CLI.

pip install "freshdata-cleaner[all]"

All extras above plus cleanlab for ML label-noise detection.

pip install "freshdata-cleaner[polars]"

Adds Polars support: pass a Polars DataFrame to fd.clean and get a Polars DataFrame back.
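
To see which optional dependencies are already importable in your environment, something like the following sketch may help. It is not a freshdata API. The `EXTRA_MODULES` mapping is an assumption built from the package lists above, using the usual import names (`sklearn` for scikit-learn, `yaml` for pyyaml), and the entry for the `polars` extra is a guess that it installs only polars.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to importable module names.
# Built from the descriptions above; not read from the package metadata.
EXTRA_MODULES = {
    "ml": ["sklearn"],
    "enterprise": ["polars", "pyarrow", "requests", "yaml"],
    "polars": ["polars"],  # assumption: this extra installs only polars
    "all": ["sklearn", "polars", "pyarrow", "requests", "yaml", "cleanlab"],
}


def is_importable(module_name: str) -> bool:
    """True if the top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def missing_modules(extra: str) -> list:
    """List the modules for an extra that are not importable right now."""
    return [name for name in EXTRA_MODULES[extra] if not is_importable(name)]


if __name__ == "__main__":
    for extra in EXTRA_MODULES:
        missing = missing_modules(extra)
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"[{extra}] {status}")
```

An extra reported as missing can be installed with the matching pip command above.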

Verify the installation

python -c "import freshdata as fd; print(fd.__version__)"

Then run a quick smoke test in Python:

import pandas as pd
import freshdata as fd

df = pd.DataFrame({"a": [1, 2, 2, None], "b": [" x ", "y", "y", "z"]})
print(fd.clean(df))

Note on naming

The PyPI distribution is freshdata-cleaner, but the import name is simply freshdata — so you install one and import the other:

pip install freshdata-cleaner

import freshdata as fd
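
The split between the two names can also be checked from Python: installed-package metadata is keyed by the distribution name, while module lookup uses the import name. The sketch below uses only the standard library, and `describe_install` is a hypothetical helper, not something freshdata provides.

```python
from importlib import metadata
from importlib.util import find_spec

DIST_NAME = "freshdata-cleaner"  # the name you give to pip
IMPORT_NAME = "freshdata"        # the name you use in import statements


def describe_install(dist_name: str = DIST_NAME,
                     import_name: str = IMPORT_NAME) -> dict:
    """Look up the version by distribution name and the module by import name."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # no installed distribution under this name
    return {
        "distribution": dist_name,
        "version": version,
        "importable": find_spec(import_name) is not None,
    }


if __name__ == "__main__":
    print(describe_install())
```

If `version` is None while `importable` is True, a different distribution or a local folder named freshdata is probably shadowing the package.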