pyspark.sql.functions.regexp_replace
Replace all substrings of the specified string value that match regexp with replacement.
New in version 1.5.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
string : Column or str
    column name or column containing the string value
pattern : Column or str
    column object or str containing the regexp pattern
replacement : Column or str
    column object or str containing the replacement
Returns
Column
    string with all substrings replaced.
Examples
>>> from pyspark.sql.functions import col, regexp_replace
>>> df = spark.createDataFrame([("100-200", r"(\d+)", "--")], ["str", "pattern", "replacement"])
>>> df.select(regexp_replace('str', r'(\d+)', '--').alias('d')).collect()
[Row(d='-----')]
>>> df.select(regexp_replace("str", col("pattern"), col("replacement")).alias('d')).collect()
[Row(d='-----')]
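
The result above may look surprising at first: both digit runs in "100-200" are replaced, and the original hyphen between them is kept, giving five dashes. A minimal local sketch of that replace-all behaviour, using Python's standard `re.sub` as a stand-in, is shown below. This is only an analogy: `replace_all` is a hypothetical helper, not part of PySpark, and Spark evaluates patterns with Java regular expressions, so details such as group references in the replacement string may differ from Python's `re` module.

```python
import re


def replace_all(value: str, pattern: str, replacement: str) -> str:
    """Hypothetical local stand-in for the replace-all semantics.

    Replaces every non-overlapping match of `pattern` in `value`
    with `replacement`, using Python's `re` engine (not Spark's).
    """
    return re.sub(pattern, replacement, value)


# Same inputs as the documented example: two digit runs, one hyphen.
result = replace_all("100-200", r"(\d+)", "--")
print(result)  # -----
```

A value with no match is returned unchanged, for example `replace_all("abc", r"\d+", "--")` gives `"abc"`.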