public class SparkUtil
extends java.lang.Object
| Modifier and Type | Method and Description |
|---|---|
| `static boolean` | `caseSensitive(org.apache.spark.sql.SparkSession spark)` |
| `static <C,T> Pair<C,T>` | `catalogAndIdentifier(java.util.List<java.lang.String> nameParts, java.util.function.Function<java.lang.String,C> catalogProvider, java.util.function.BiFunction<java.lang.String[],java.lang.String,T> identiferProvider, C currentCatalog, java.lang.String[] currentNamespace)` A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply; attempts to find the catalog and identifier that a multipart identifier represents. |
| `static org.apache.hadoop.conf.Configuration` | `hadoopConfCatalogOverrides(org.apache.spark.sql.SparkSession spark, java.lang.String catalogName)` Pulls any catalog-specific overrides for the Hadoop conf from the current SparkSession, which can be set via `spark.sql.catalog.$catalogName.hadoop.*`. |
| `static java.util.List<org.apache.spark.sql.catalyst.expressions.Expression>` | `partitionMapToExpression(org.apache.spark.sql.types.StructType schema, java.util.Map<java.lang.String,java.lang.String> filters)` Gets a list of Spark filter Expressions. |
| `static java.lang.String` | `toColumnName(org.apache.spark.sql.connector.expressions.NamedReference ref)` |
| `static void` | `validatePartitionTransforms(PartitionSpec spec)` Checks whether the partition transforms in a spec can be used to write data. |
| `static void` | `validateTimestampWithoutTimezoneConfig(org.apache.spark.sql.RuntimeConfig conf)` |
| `static void` | `validateTimestampWithoutTimezoneConfig(org.apache.spark.sql.RuntimeConfig conf, java.util.Map<java.lang.String,java.lang.String> options)` Checks for properties supplied both by Spark's RuntimeConfig and by the read or write options. |
public static void validatePartitionTransforms(PartitionSpec spec)

Checks whether the partition transforms in a spec can be used to write data.

Parameters:
spec - a PartitionSpec

Throws:
java.lang.UnsupportedOperationException - if the spec contains unknown partition transforms
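For orientation, a minimal sketch of guarding a write with this check; the schema, field ids, column names, and transforms below are invented for illustration and assume Iceberg's PartitionSpec builder API.

```java
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.Schema;
import org.apache.iceberg.spark.SparkUtil;
import org.apache.iceberg.types.Types;

public class ValidateSpecSketch {
  public static void main(String[] args) {
    // Hypothetical table schema; names and types are illustrative.
    Schema schema = new Schema(
        Types.NestedField.required(1, "id", Types.LongType.get()),
        Types.NestedField.optional(2, "ts", Types.TimestampType.withZone()));

    // A spec built from well-known transforms (bucket, day).
    PartitionSpec spec = PartitionSpec.builderFor(schema)
        .bucket("id", 16)
        .day("ts")
        .build();

    // Throws java.lang.UnsupportedOperationException if the spec contains
    // unknown partition transforms; otherwise returns normally.
    SparkUtil.validatePartitionTransforms(spec);
  }
}
```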
public static <C,T> Pair<C,T> catalogAndIdentifier(java.util.List<java.lang.String> nameParts,
                                                   java.util.function.Function<java.lang.String,C> catalogProvider,
                                                   java.util.function.BiFunction<java.lang.String[],java.lang.String,T> identiferProvider,
                                                   C currentCatalog,
                                                   java.lang.String[] currentNamespace)

A modified version of Spark's LookupCatalog.CatalogAndIdentifier.unapply; attempts to find the catalog and identifier that a multipart identifier represents.

Parameters:
nameParts - a multipart identifier representing a table
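To show how the providers plug in, here is a hedged sketch that resolves a three-part name using plain Strings for both C and T. The catalog name "my_catalog" and the convention that the catalog provider returns null for names that are not catalogs are assumptions for illustration, not part of the documented contract.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.iceberg.spark.SparkUtil;
import org.apache.iceberg.util.Pair;

public class CatalogLookupSketch {
  public static void main(String[] args) {
    // "my_catalog.db.events" split into its parts.
    List<String> nameParts = Arrays.asList("my_catalog", "db", "events");

    Pair<String, String> resolved = SparkUtil.catalogAndIdentifier(
        nameParts,
        // catalogProvider: maps a catalog name to a handle; here C is just
        // the name itself. Returning null is assumed to mean "not a catalog".
        name -> "my_catalog".equals(name) ? name : null,
        // identiferProvider: builds T from namespace parts plus a table name.
        (namespace, table) ->
            namespace.length == 0 ? table : String.join(".", namespace) + "." + table,
        "spark_catalog",            // current (fallback) catalog
        new String[] {"default"});  // current (fallback) namespace

    System.out.println(resolved.first() + " -> " + resolved.second());
  }
}
```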
public static org.apache.hadoop.conf.Configuration hadoopConfCatalogOverrides(org.apache.spark.sql.SparkSession spark, java.lang.String catalogName)

Pulls any catalog-specific overrides for the Hadoop conf from the current SparkSession, which can be set via `spark.sql.catalog.$catalogName.hadoop.*`. This mirrors the way Hadoop configurations can be overridden for a given Spark session using `spark.hadoop.*`.

The SparkCatalog allows Hadoop configurations to be overridden per catalog by setting them on the SQLConf. For example, the following adds the property "fs.default.name" with the value "hdfs://hanksnamenode:8020" to the catalog's Hadoop configuration:

```
SparkSession.builder()
    .config(s"spark.sql.catalog.$catalogName.hadoop.fs.default.name", "hdfs://hanksnamenode:8020")
    .getOrCreate()
```

Parameters:
spark - the current Spark session
catalogName - name of the catalog to find overrides for
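A minimal end-to-end sketch of the same idea in Java, setting an override at session build time and reading it back; the catalog name "my_catalog" and the property value are illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.spark.SparkUtil;
import org.apache.spark.sql.SparkSession;

public class CatalogOverridesSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .master("local[1]")
        // Per-catalog Hadoop override for a catalog named "my_catalog".
        .config("spark.sql.catalog.my_catalog.hadoop.fs.default.name",
            "hdfs://hanksnamenode:8020")
        .getOrCreate();

    // The returned Configuration carries the key with the
    // "spark.sql.catalog.my_catalog.hadoop." prefix stripped.
    Configuration conf = SparkUtil.hadoopConfCatalogOverrides(spark, "my_catalog");
    System.out.println(conf.get("fs.default.name")); // hdfs://hanksnamenode:8020

    spark.stop();
  }
}
```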
public static void validateTimestampWithoutTimezoneConfig(org.apache.spark.sql.RuntimeConfig conf)

public static void validateTimestampWithoutTimezoneConfig(org.apache.spark.sql.RuntimeConfig conf, java.util.Map<java.lang.String,java.lang.String> options)

Checks for properties supplied both by Spark's RuntimeConfig and by the read or write options.

Parameters:
conf - the RuntimeConfig of the active Spark session
options - the read or write options supplied when reading/writing a table
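A short sketch of invoking both overloads; the empty options map stands in for the options of a real read or write, and the specific property names the check looks for are not assumed here.

```java
import java.util.Collections;
import java.util.Map;
import org.apache.iceberg.spark.SparkUtil;
import org.apache.spark.sql.SparkSession;

public class TimestampConfigSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder().master("local[1]").getOrCreate();

    // Session-level check only.
    SparkUtil.validateTimestampWithoutTimezoneConfig(spark.conf());

    // Session config checked together with per-operation read/write options.
    Map<String, String> options = Collections.emptyMap();
    SparkUtil.validateTimestampWithoutTimezoneConfig(spark.conf(), options);

    spark.stop();
  }
}
```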
public static java.util.List<org.apache.spark.sql.catalyst.expressions.Expression> partitionMapToExpression(org.apache.spark.sql.types.StructType schema, java.util.Map<java.lang.String,java.lang.String> filters)

Gets a list of Spark filter Expressions.

Parameters:
schema - the table schema
filters - filters in the format of a Map, where each key is a table column name and each value is the specific value to be filtered on that column
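As a usage sketch, the following builds an equality-filter map over two invented columns and converts it to Catalyst expressions; the schema and values are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.iceberg.spark.SparkUtil;
import org.apache.spark.sql.catalyst.expressions.Expression;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class PartitionFilterSketch {
  public static void main(String[] args) {
    // Schema containing the columns the filters refer to (names invented).
    StructType schema = new StructType()
        .add("region", DataTypes.StringType)
        .add("event_date", DataTypes.StringType);

    // Each entry is column name -> value, e.g. region = 'us'.
    Map<String, String> filters = new HashMap<>();
    filters.put("region", "us");
    filters.put("event_date", "2021-06-01");

    List<Expression> expressions = SparkUtil.partitionMapToExpression(schema, filters);
    expressions.forEach(expr -> System.out.println(expr.sql()));
  }
}
```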
public static java.lang.String toColumnName(org.apache.spark.sql.connector.expressions.NamedReference ref)

public static boolean caseSensitive(org.apache.spark.sql.SparkSession spark)