pyspark.sql.functions.regexp_substr
- pyspark.sql.functions.regexp_substr(str, regexp)
Returns the first substring of the string str that matches the Java regex regexp. If the regular expression does not match anywhere in str, the result is null.
New in version 3.5.0.
- Parameters
str : Column or str
target column to work on.
regexp : Column or str
regex pattern to apply.
- Returns
Column
the substring that matches a Java regex within the string str.
Examples
>>> from pyspark.sql.functions import col, lit, regexp_substr
>>> df = spark.createDataFrame([("1a 2b 14m", r"\d+")], ["str", "regexp"])
>>> df.select(regexp_substr('str', lit(r'\d+')).alias('d')).collect()
[Row(d='1')]
>>> df.select(regexp_substr('str', lit(r'mmm')).alias('d')).collect()
[Row(d=None)]
>>> df.select(regexp_substr("str", col("regexp")).alias('d')).collect()
[Row(d='1')]
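The behaviour shown in the examples (first match returned, null when nothing matches) can be approximated without a Spark session. The sketch below is a rough pure-Python analogue, not Spark's implementation: regexp_substr_sketch is a hypothetical helper name, and it uses Python's re module rather than the Java regex engine, so pattern syntax and edge cases such as null inputs or empty matches may differ from what Spark does.

```python
import re
from typing import Optional


def regexp_substr_sketch(s: str, pattern: str) -> Optional[str]:
    """Hypothetical helper: return the first match of pattern in s, or None.

    Assumption: Python's re stands in for the Java regex engine here.
    """
    match = re.search(pattern, s)  # scans left to right, stops at the first match
    return match.group(0) if match else None


print(regexp_substr_sketch("1a 2b 14m", r"\d+"))  # 1
print(regexp_substr_sketch("1a 2b 14m", r"mmm"))  # None
```

The two calls mirror the first two doctest cases above: the pattern \d+ stops at the first run of digits, and a pattern with no match yields None, which corresponds to null in the resulting column.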