Installation¶
freshdata requires Python ≥ 3.9 and pandas ≥ 1.5.
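The version floors above can be checked before installing. The following is a minimal sketch using only the standard library; the helper names `version_tuple` and `check_requirements` are illustrative and not part of freshdata, and only the major and minor version parts are compared.

```python
import sys
from importlib import metadata
from itertools import takewhile

# Minimums stated in this guide.
MIN_PYTHON = (3, 9)
MIN_PANDAS = (1, 5)


def version_tuple(text):
    """Return (major, minor) from a version string such as '2.0.0rc1'."""
    parts = []
    for piece in text.split(".")[:2]:
        # Keep only the leading digits so suffixes like 'rc1' are ignored.
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.9 or newer is required")
    try:
        pandas_version = metadata.version("pandas")
    except metadata.PackageNotFoundError:
        problems.append("pandas is not installed")
    else:
        if version_tuple(pandas_version) < MIN_PANDAS:
            problems.append(f"pandas {pandas_version} is older than 1.5")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty list from `check_requirements()` means the interpreter and the installed pandas both meet the stated minimums.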
Basic install¶
Installing freshdata-cleaner from PyPI (pip install freshdata-cleaner) pulls in the pandas + NumPy core, which is everything you need for fd.clean,
fd.profile, and the decision engine.
Optional extras¶
Install only what you need:
Adds scikit-learn for KNN imputation and IsolationForest outlier
detection (used in strategy="aggressive").
Adds polars, pyarrow, requests, and pyyaml for the enterprise layer: fuzzy clustering, PII masking, semantic validation, trust scoring, OpenLineage metadata, and the batch CLI.
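To see which of these optional dependencies are already present in an environment, something like the sketch below may help. It relies only on `importlib.util.find_spec` from the standard library. The mapping and the `missing_optional` helper are hypothetical names for illustration, not part of freshdata; the import names (`sklearn` for scikit-learn, `yaml` for pyyaml) are the ones those projects publish.

```python
from importlib.util import find_spec

# Distribution name -> top-level module name it provides.
OPTIONAL_MODULES = {
    "scikit-learn": "sklearn",
    "polars": "polars",
    "pyarrow": "pyarrow",
    "requests": "requests",
    "pyyaml": "yaml",
}


def missing_optional(packages):
    """Return the subset of `packages` whose module cannot be found."""
    return [
        name
        for name in packages
        if find_spec(OPTIONAL_MODULES[name]) is None
    ]


if __name__ == "__main__":
    # Dependencies behind KNN imputation and IsolationForest detection.
    print("missing for ML features:", missing_optional(["scikit-learn"]))
    # Dependencies behind the enterprise layer.
    print(
        "missing for enterprise layer:",
        missing_optional(["polars", "pyarrow", "requests", "pyyaml"]),
    )
```

`find_spec` only locates the module without importing it, so the check stays cheap even when heavy packages are installed.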
Verify the installation¶
import pandas as pd
import freshdata as fd
df = pd.DataFrame({"a": [1, 2, 2, None], "b": [" x ", "y", "y", "z"]})
print(fd.clean(df))
Note on naming¶
The PyPI distribution is freshdata-cleaner, but the import name is simply
freshdata, so you install one (pip install freshdata-cleaner) and import the other (import freshdata).
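The split between the two names can be inspected from Python itself. This is a small sketch built on the standard library: `importlib.metadata` looks things up by distribution name, while `find_spec` looks them up by import name. The `describe_install` helper is a hypothetical name introduced here for illustration.

```python
from importlib import metadata
from importlib.util import find_spec

DIST_NAME = "freshdata-cleaner"  # the name pip and PyPI use
IMPORT_NAME = "freshdata"        # the name used in import statements


def describe_install(dist_name=DIST_NAME, import_name=IMPORT_NAME):
    """Report the installed version (by distribution name) and
    whether the package can be imported (by import name)."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed
    return {
        "distribution": dist_name,
        "version": version,
        "import_name": import_name,
        "importable": find_spec(import_name) is not None,
    }


if __name__ == "__main__":
    print(describe_install())
```

Querying `metadata.version("freshdata")` would be expected to fail even on a working install, because metadata is keyed by the distribution name rather than the import name.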